JP2024037556A

JP2024037556A - Information processing device and program

Info

Publication number: JP2024037556A
Application number: JP2022142481A
Authority: JP
Inventors: Tatsuya Mori
Original assignee: Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2022-09-07
Filing date: 2022-09-07
Publication date: 2024-03-19

Abstract

To acquire the transmittance of a transparent region of an image without using a special device or environment. SOLUTION: An information processing device includes a processor that determines, from an image, an object region 53a that is the region of an object including a transparent region; acquires, from the image and the object region 53a, information indicating a first region 54a that is the region of the image other than the object region 53a, a second region 54b that is the transparent region, and a third region 54c that is the region of the object region 53a other than the second region 54b; and determines the transmittance of the second region 54b on the basis of the image and the information. SELECTED DRAWING: Figure 8

Description

The present invention relates to an information processing device and a program.

In the field of image processing, such as the creation of web advertisements and point-of-purchase (POP) advertisements, objects in an image are composited onto a different background. However, when an image of an object that includes a transparent region is pasted onto a different background, the original background shows through in the transparent region. As prior art that calculates the transparency of a transparent region, which is needed for high-accuracy compositing, Patent Document 1, for example, discloses a technique in which a transparent object is photographed with a digital camera against a board bearing a black-and-white pattern, the background attribute is determined for each pixel of the resulting image data, and transparency data and color data of the transparent object are calculated from the color data of the pixels for each attribute. Patent Document 2 discloses a technique in which a measuring device includes an imaging position acquisition unit and a physical quantity calculation unit. For an image captured with a space, or a space and a transparent object present in that space, interposed in front of the background, the imaging position acquisition unit acquires the position in the captured image that corresponds to an imaged position of the background. The physical quantity calculation unit calculates a physical quantity of the transparent object based on that position in the captured image and on the imaged position of the background.

Patent Document 1: Japanese Patent Application Publication No. 2001-143085
Patent Document 2: Japanese Patent Application Publication No. 2018-159671

Here, when the transparency of a transparent region of an image is obtained using a special device or environment, it is difficult to reduce the burden on the user who obtains the transparency.
An object of the present invention is to obtain the transparency of a transparent region of an image without using a special device or environment.

The invention according to claim 1 is an information processing device including a processor, wherein the processor determines, from an image, an object region that is the region of an object including a transparent region; acquires, from the image and the object region, information indicating a first region that is the region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is the region of the object region other than the second region; and determines the transparency of the second region based on the image and the information.
The invention according to claim 2 is the information processing device according to claim 1, wherein the acquired information is information of an image consisting of three values that respectively indicate the first region, the second region, and the third region.
The invention according to claim 3 is the information processing device according to claim 2, wherein the values indicating the second region and the third region are determined by the difference between the color information of the first region and the color information of the second and third regions.
The invention according to claim 4 is the information processing device according to any one of claims 1 to 3, wherein the object region is determined using a machine learning model.
The invention according to claim 5 is the information processing device according to any one of claims 1 to 3, wherein the object region is any one of a plurality of objects included in the image.
The invention according to claim 6 is the information processing device according to claim 1, wherein the device accepts a correction instruction from a user for the second region displayed on a screen.
The invention according to claim 7 is the information processing device according to claim 6, wherein the correction instruction changes the boundary between the second region and the third region by changing the second region relative to the third region.
The invention according to claim 8 is the information processing device according to claim 7, wherein the boundary change selects one of the second region and the third region and changes a designated portion of the other into the selected one.
The invention according to claim 9 is the information processing device according to claim 6, wherein the correction instruction changes the transparency.
The invention according to claim 10 is a program that causes a computer to implement: a function of determining, from an image, an object region that is the region of an object including a transparent region; a function of acquiring, from the image and the object region, information indicating a first region that is the region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is the region of the object region other than the second region; and a function of determining the transparency of the second region based on the image and the information.

According to the invention of claim 1, the transparency of a transparent region of an image can be obtained without using a special device or environment.
According to the invention of claim 2, the information processing load can be reduced compared with a case where the acquired information is not information of an image consisting of three values that respectively indicate the first region, the second region, and the third region.
According to the invention of claim 3, the information processing load can be reduced compared with a case that does not adopt a configuration in which the values indicating the second region and the third region are determined by the difference between the color information of the first region and the color information of the second and third regions.
According to the invention of claim 4, the information processing load can be reduced compared with a case that does not adopt a configuration in which the object region is determined using a machine learning model.
According to the invention of claim 5, the accuracy of the transparency can be improved compared with a case that does not adopt a configuration in which the object region is any one of a plurality of objects included in the image.
According to the invention of claim 6, usability can be improved compared with a case that does not adopt a configuration that accepts a correction instruction from a user for the second region displayed on the screen.
According to the invention of claim 7, usability can be improved compared with a case that does not adopt a configuration in which the correction instruction changes the boundary between the second region and the third region.
According to the invention of claim 8, operability can be improved compared with a case that does not adopt a configuration in which the boundary change selects one of the second region and the third region and changes a designated portion of the other into the selected one.
According to the invention of claim 9, usability can be improved compared with a case that does not adopt a configuration in which the correction instruction changes the transparency.
According to the invention of claim 10, the transparency of a transparent region of an image can be obtained without using a special device or environment.

FIG. 1 is a diagram showing an example of the overall configuration of an image processing system to which this embodiment is applied.
FIG. 2 is a diagram showing an example of the hardware configuration of the image processing device in this embodiment.
FIG. 3 is a diagram showing an example of the hardware configuration of the mobile terminal in this embodiment.
FIG. 4 is a block diagram showing an example of the functional configuration of the image processing device according to the first embodiment.
FIG. 5 is a diagram illustrating the processing of the object region estimation unit.
FIG. 6 is a diagram illustrating the processing of the trimap generation unit.
FIG. 7 is a diagram illustrating the processing of the trimap generation unit, in which (a) illustrates the mask image and the pixels of the image, and (b) is a table showing the results of calculating the differences in pixel values.
FIG. 8 is a diagram illustrating the processing of the trimap generation unit.
FIG. 9 is a diagram illustrating images obtained by compositing the image with a new background, in which (a) shows a composite image produced by the processing of the image processing device according to the first embodiment, and (b) shows, as a comparative example, a composite image produced by conventional processing.
FIG. 10 is a block diagram showing an example of the functional configuration of the image processing device according to the second embodiment.
FIG. 11 is a diagram illustrating the user correction unit, in which (a), (b), and (c) illustrate an operation for changing the transparency.
FIG. 12 is a block diagram showing an example of the functional configuration of the image processing device according to the third embodiment.
FIG. 13 is a diagram illustrating the processing of the user correction unit, in which (a) shows the state before the processing and (b) shows the state after the processing.
FIG. 14 is a diagram illustrating the processing of the image processing device according to the fourth embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[Overview of this embodiment]
This embodiment provides an information processing device that determines, from an image, an object region that is the region of an object including a transparent region; acquires, from the image and the object region, information indicating a first region that is the region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is the region of the object region other than the second region; and determines the transparency of the second region based on the image and the information.

Here, the transparent region refers to a region of the image in which the background of the object is visible through a transparent member that transmits light. The transparent member may have a flat surface or a surface that is partly or wholly curved or bent, and such surfaces may be regular or irregular. The background corresponds to the first region, and the transparent region corresponds to the second region.
Transparency refers to the degree to which light passes through the second region.
The information processing device is described using an image processing device 10 (see FIG. 1) that performs image processing as an example, but it may instead be a mobile terminal 20 (see FIG. 1).

[Overall configuration of image processing system]
FIG. 1 is a diagram showing an example of the overall configuration of an image processing system 1 to which this embodiment is applied. As illustrated, the image processing system 1 includes an image processing device 10 and a mobile terminal 20. The image processing device 10 is connected to a communication line 30. The mobile terminal 20 can connect wirelessly to the communication line 30 via an access point 40. Although the figure shows only one image processing device 10 and one mobile terminal 20, there may be more than one of each. The communication line 30 may be, for example, a LAN (Local Area Network) or the Internet.

The image processing device 10 processes images received from the mobile terminal 20. It estimates an object region from the image; uses the image and the estimation result to generate an image (hereinafter, a trimap image) representing the outside of the object region, the transparent region inside the object region, and the non-transparent region inside the object region; and estimates the transparency based on the image and the trimap image. The image processing device 10 may be implemented by, for example, a personal computer (PC).

The mobile terminal 20 is a terminal device that captures an image of an object and transmits it to the image processing device 10. A camera application is preferably installed on the mobile terminal 20. In response to an operation by the operator of the mobile terminal 20, for example, the camera application captures an image of the object and transmits the captured image to the image processing device 10. The mobile terminal 20 may be implemented by, for example, a smartphone.

[Hardware configuration of image processing device]
FIG. 2 is a diagram showing an example of the hardware configuration of the image processing device 10 in this embodiment. As illustrated, the image processing device 10 includes a processor 10a, a RAM (Random Access Memory) 10b, an HDD (Hard Disk Drive) 10c, a communication interface (hereinafter, "communication I/F") 10d, a display device 10e, and an input device 10f.

The processor 10a executes various software, such as an OS (Operating System) and applications, to implement the functions described below.

The RAM 10b is a memory used as, for example, working memory for the processor 10a. The HDD 10c is, for example, a magnetic disk device that stores input data for various software, output data from that software, and the like.

The communication I/F 10d sends and receives various information to and from the mobile terminal 20 via the communication line 30.
The display device 10e is, for example, a display that shows various information. The input device 10f is, for example, a keyboard or mouse that the user uses to enter information.

[Hardware configuration of mobile terminal]
FIG. 3 is a diagram showing an example of the hardware configuration of the mobile terminal 20 in this embodiment. As illustrated, the mobile terminal 20 includes a processor 21, a RAM 22, a ROM 23, an input/output device 24, an audio input mechanism 25, an audio output mechanism 26, an imaging mechanism 27, a wireless circuit 28, and an antenna 29.

The processor 21 implements the functions of the mobile terminal 20 by loading various programs stored in the ROM 23 or the like into the RAM 22 and executing them.

The RAM 22 is a memory used as, for example, working memory for the processor 21. The ROM 23 is a memory that stores the various programs executed by the processor 21 and the like.

The input/output device 24 is a device that displays various information and accepts operation input from the user, and is, for example, a touch panel. The audio input mechanism 25 is a device that takes in sound from outside, and is, for example, a microphone. The audio output mechanism 26 is a device that outputs sound to the outside, and is, for example, a speaker.

The imaging mechanism 27 is a device that captures an image of a subject, and is, for example, a camera.

The wireless circuit 28 is a circuit that controls wireless communication. The antenna 29 is a device that transmits wireless communication signals output by the wireless circuit 28, and that receives wireless communication signals and outputs them to the wireless circuit 28. The wireless communication includes communication conforming to wireless communication standards such as 5G (5th Generation), LTE (Long Term Evolution), Wi-Fi (registered trademark), Bluetooth (registered trademark), and NFC (Near Field Communication).

The image processing device 10 according to this embodiment estimates transparency from a single image without the special equipment or environment that conventional approaches require, which makes it possible to automate high-accuracy compositing. Various embodiments are described below.

[First embodiment]
FIG. 4 is a block diagram showing an example of the functional configuration of the image processing device 10 according to the first embodiment. As illustrated, the image processing device 10 according to the first embodiment includes an object region estimation unit 11, a trimap generation unit 12, and a transparency estimation unit 13.

The object region estimation unit 11 estimates, from the image, the object region, which is the region of an object including a transparent region. The object region estimation unit 11 thereby calculates an object region mask image representing the object region.

The trimap generation unit 12 uses the image and the object region obtained by the object region estimation unit 11 to divide the inside of the object region into a transparent region and a non-transparent region. The trimap generation unit 12 thereby generates a trimap image representing the positions of three regions: the outside of the object region, and the non-transparent region and the transparent region inside the object region.

The transparency estimation unit 13 estimates the transparency based on the image and the trimap image generated by the trimap generation unit 12. The transparency estimation unit 13 thereby produces a transparency image.
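The data flow between the three units can be sketched as a simple function chain. This is only an illustration: the type aliases, the function name `run_pipeline`, and the idea of passing each unit in as a callable are assumptions made for the sketch, not part of the source.

```python
from typing import Callable, List, Tuple

# Hypothetical data shapes, chosen only for this sketch.
Pixel = Tuple[int, int, int]    # one RGB pixel
Image = List[List[Pixel]]       # rows of RGB pixels
Mask = List[List[int]]          # 1 = inside the object region, 0 = outside
Trimap = List[List[float]]      # three-valued region map
AlphaMap = List[List[float]]    # per-pixel transparency


def run_pipeline(
    image: Image,
    estimate_object_region: Callable[[Image], Mask],
    generate_trimap: Callable[[Image, Mask], Trimap],
    estimate_transparency: Callable[[Image, Trimap], AlphaMap],
) -> AlphaMap:
    """Chain region estimation, trimap generation, and transparency estimation."""
    mask = estimate_object_region(image)         # role of unit 11
    trimap = generate_trimap(image, mask)        # role of unit 12
    return estimate_transparency(image, trimap)  # role of unit 13
```

Each stage receives the original image again, which mirrors the description: the trimap is derived from the image and the mask, and the transparency from the image and the trimap.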

First, the object region estimation unit 11 is described with reference to FIG. 5.
FIG. 5 is a diagram illustrating the processing of the object region estimation unit 11.
So that the region can be estimated accurately even where it is transparent and the color difference at its boundary is small, the object region estimation unit 11 performs region estimation on an image 51 using a machine learning model 52, as shown in FIG. 5, and calculates a mask image 53 representing an object region 53a.

The image 51 was captured and transmitted by the mobile terminal 20 (see FIG. 1). As shown in FIG. 5, the image 51 shows one apple G3 inside a transparent bag G1 that has a rectangular hole G2 cut out as a carrying handle. A plurality of vertical lines appear in the background of the image 51. The transparent bag G1 is made of, for example, vinyl, but the transparent item may instead be a PET bottle or the like.
The vertical lines that appear as the background show through the transparent bag G1. That is, the vertical lines appear directly in the part outside the outer edge G4 of the transparent bag G1 and in the rectangular hole G2 of the bag G1, and they are also visible inside the outer edge G4 of the transparent bag G1 in the parts other than the rectangular hole G2 and the apple G3.

The apple G3 contained in the transparent bag G1 is an example of the object, and the object region 53a is an example of the region of that object.

The machine learning model 52 used by the object region estimation unit 11 has been trained to estimate the object region 53a including the transparent region. Any machine learning model 52 may be used; for example, a deep learning model with an encoder-decoder structure, such as U-Net or DeepLabV3+, may be used. To improve accuracy, a two-stage model may be used, or the model structure or loss function may be refined.

In the mask image 53, the machine learning model 52 has accurately estimated the boundary between the object region 53a and the region 53b other than the object region 53a. In FIG. 5, the region 53b is shown hatched.

Next, the trimap generation unit 12 is described with reference to FIGS. 6, 7, and 8.
FIGS. 6 to 8 are diagrams illustrating the processing of the trimap generation unit 12.
FIG. 6 shows the mask image 53 on the left and a trimap image 54 on the right. The mask image 53 contains the object region 53a and the region 53b, and the trimap image 54 contains an outside-object region 54a, a transparent region 54b, and a non-transparent region 54c.

The trimap generation unit 12 divides the object region 53a of the mask image 53 estimated by the object region estimation unit 11 into the transparent region 54b and the non-transparent region 54c. The trimap generation unit 12 then generates the trimap image 54 representing the outside-object region 54a, the transparent region 54b, and the non-transparent region 54c.
The outside-object region 54a of the trimap image 54 can be taken to be the region 53b of the mask image 53.

Any method may be used to separate the transparent region 54b from the non-transparent region 54c; one option is to calculate the separation using the color information of the image in the outside-object region 54a. That is, the trimap generation unit 12 separates the transparent region 54b from the non-transparent region 54c based on the difference between the color information of the outside-object region 54a and the color information inside the object region.
The outside-object region 54a is an example of the first region, the transparent region 54b is an example of the second region, and the non-transparent region 54c is an example of the third region. The trimap image 54 is an example of the information indicating the first region, the second region, and the third region.

Here, FIG. 7(a) is a diagram illustrating the pixels of the mask image 53 and the image 51, and FIG. 7(b) is a table showing the results of calculating the differences in pixel values.
As shown in FIG. 7(a), let Iin be the set of pixel values in the region 51a of the image 51 that corresponds to the object region 53a of the mask image 53, and let Iout be the set of pixel values in the region 51b of the image 51 that corresponds to the region 53b. Also, let m denote the pixel value of a pixel in the region 51a of the image 51, and let n denote the pixel value of a pixel in the region 51b of the image 51.

Then, as shown in FIG. 7(b), for the pixel value (RGB) of a target 511 in the region 51a of the image 51 that is to be classified, the difference from the pixel value of every pixel in the region 51b outside the object region is calculated, and the minimum of these differences is obtained. In the example shown in FIG. 7(b), the minimum value is 9.
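The description refers to a formula for this minimum difference that is not reproduced in this text. From the surrounding definitions (the sets Iin and Iout, their elements m and n, the L1 norm, and the name Diff used later in the description), it presumably has the following form; this is a reconstruction, not a quotation of the source:

```latex
\mathrm{Diff}(m) = \min_{n \in I_{out}} \lVert m - n \rVert_{1}, \qquad m \in I_{in}
```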

To restate, Iin represents the set of pixel values inside the object region in the image, and Iout represents the set of pixel values outside the object region in the image. m represents an element of Iin, and n represents an element of Iout.
In the above formula, ||·||1 denotes the L1 norm, but any method may be used to convert the three RGB values into a single value: the L2 norm may be used, or the mean, the sum of squares, or the mean square of the components may be calculated.
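The alternative ways of collapsing a three-channel difference into one number can be written side by side. The function names below are hypothetical, and reading "mean of the components" as the mean of the absolute per-channel differences is an assumption of this sketch.

```python
import math
from typing import List, Sequence

Pixel = Sequence[float]  # e.g. an (R, G, B) triple


def _deltas(m: Pixel, n: Pixel) -> List[float]:
    """Per-channel differences between two pixels."""
    return [a - b for a, b in zip(m, n)]


def l1_norm(m: Pixel, n: Pixel) -> float:
    """Sum of absolute channel differences."""
    return float(sum(abs(d) for d in _deltas(m, n)))


def l2_norm(m: Pixel, n: Pixel) -> float:
    """Euclidean distance between the two pixels."""
    return math.sqrt(sum(d * d for d in _deltas(m, n)))


def mean_abs(m: Pixel, n: Pixel) -> float:
    """Mean of the absolute channel differences (assumed reading)."""
    return l1_norm(m, n) / len(m)


def sum_of_squares(m: Pixel, n: Pixel) -> float:
    """Sum of squared channel differences."""
    return float(sum(d * d for d in _deltas(m, n)))


def mean_square(m: Pixel, n: Pixel) -> float:
    """Mean of squared channel differences."""
    return sum_of_squares(m, n) / len(m)
```

For the pixels (10, 20, 30) and (13, 16, 30), the channel differences are -3, 4, and 0, so the L1 norm is 7 and the L2 norm is 5. Whichever reduction is chosen, the threshold has to be expressed on the same scale.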

The color-difference calculation is not limited to the RGB space; it may be performed after conversion to the Lab space or the HSV space, and any color space may be used for the calculation.
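As an illustration of computing the difference in another color space, the sketch below converts 8-bit RGB pixels to HSV with Python's standard `colorsys` module and sums the per-component differences. The function names, the 0 to 255 input range, and the handling of hue as a circular quantity are assumptions of this sketch; the source does not specify how an HSV difference would be formed.

```python
import colorsys
from typing import Sequence, Tuple


def rgb255_to_hsv(pixel: Sequence[int]) -> Tuple[float, float, float]:
    """Convert an 8-bit RGB pixel to HSV, each component in [0, 1]."""
    r, g, b = (c / 255.0 for c in pixel)
    return colorsys.rgb_to_hsv(r, g, b)


def hsv_difference(m: Sequence[int], n: Sequence[int]) -> float:
    """L1-style difference in HSV space, treating hue as circular."""
    hm, sm, vm = rgb255_to_hsv(m)
    hn, sn, vn = rgb255_to_hsv(n)
    dh = abs(hm - hn)
    dh = min(dh, 1.0 - dh)  # hue 0.99 and hue 0.01 are neighbours
    return dh + abs(sm - sn) + abs(vm - vn)
```

The wrap-around step matters for reddish pixels, whose hue values sit at both ends of the [0, 1] range; without it, two nearly identical reds could appear maximally different.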

If the minimum difference is less than or equal to a threshold, the target pixel is close to a background pixel and is therefore assigned to the transparent region 54b; if it is greater than the threshold, the pixel is assigned to the non-transparent region 54c. The threshold may be set in advance or set automatically, for example by using the difference between the mean pixel value inside the object region and the mean pixel value outside the object region as the threshold.
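The thresholding rule and the automatic threshold can be sketched as follows. All function names are hypothetical, and using the L1 norm both for the per-pixel difference and for comparing the two mean pixel values is an assumption; the source leaves the reduction open.

```python
from typing import List, Sequence

Pixel = Sequence[float]


def l1(m: Pixel, n: Pixel) -> float:
    """L1 distance between two pixels."""
    return float(sum(abs(a - b) for a, b in zip(m, n)))


def min_difference(m: Pixel, outside_pixels: Sequence[Pixel]) -> float:
    """Smallest L1 distance from m to any pixel outside the object region."""
    return min(l1(m, n) for n in outside_pixels)


def classify_pixel(m: Pixel, outside_pixels: Sequence[Pixel], threshold: float) -> str:
    """Return 'transparent' if m is close to some background pixel."""
    if min_difference(m, outside_pixels) <= threshold:
        return "transparent"
    return "non_transparent"


def mean_pixel(pixels: Sequence[Pixel]) -> List[float]:
    """Channel-wise mean of a non-empty list of pixels."""
    count = len(pixels)
    return [sum(p[c] for p in pixels) / count for c in range(len(pixels[0]))]


def auto_threshold(inside_pixels: Sequence[Pixel], outside_pixels: Sequence[Pixel]) -> float:
    """Threshold taken as the L1 distance between inside and outside mean pixels."""
    return l1(mean_pixel(inside_pixels), mean_pixel(outside_pixels))
```

For example, with background pixels (200, 200, 200) and (210, 210, 210), the pixel (203, 204, 202) has a minimum difference of 9 and is classified as transparent for a threshold of 10, while (200, 30, 30) is classified as non-transparent.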

As shown in FIG. 8, the trimap generation unit 12 creates a trimap image based on the calculated classification result (see FIG. 7(b)). For example, a pixel is given the value 0.5 if Diff is less than or equal to the threshold and 1 if Diff is greater than the threshold. More specifically, the trimap image is created as an image consisting of three values: 0 for the outside-object region 54a, 0.5 for the transparent region 54b, and 1 for the non-transparent region 54c. In the example of FIG. 8, the three values indicating the regions of the trimap image are 0, 0.5, and 1.0, but the values are not limited to these.
The values 0, 0.5, and 1.0 shown in parentheses in FIG. 8 are an example of the three values.
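As a rough sketch of the trimap creation just described, the following plain-Python function maps an object mask and a per-pixel Diff map to the three example values. The function name `make_trimap`, its parameters, and the list-of-lists image representation are assumptions made for illustration.

```python
from typing import List

# Example region values from FIG. 8; other values may be used.
OUTSIDE, TRANSPARENT, OPAQUE = 0.0, 0.5, 1.0


def make_trimap(object_mask: List[List[bool]],
                diff: List[List[float]],
                threshold: float) -> List[List[float]]:
    """Build a three-valued trimap from an object mask and a per-pixel Diff map."""
    trimap = []
    for mask_row, diff_row in zip(object_mask, diff):
        row = []
        for inside, d in zip(mask_row, diff_row):
            if not inside:
                row.append(OUTSIDE)        # outside the object region
            elif d <= threshold:
                row.append(TRANSPARENT)    # close to the background color
            else:
                row.append(OPAQUE)         # clearly different from the background
        trimap.append(row)
    return trimap


mask = [[False, True, True]]
diff = [[0, 9, 340]]
print(make_trimap(mask, diff, threshold=20))  # [[0.0, 0.5, 1.0]]
```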

Besides the color-information method described above, the trimap image may also be created by other methods, such as using a deep learning model that receives the image and the object region as input and segments the inside of the object region.

Next, the transparency estimation unit 13 will be described.
The transparency estimation unit 13 calculates the transparency of the transparent region based on the image and the trimap image. Here, the transparency is the α mask used in the compositing method of the following formula.
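The formula itself is an image in the published document and is missing from this text. Given the symbols defined in the next paragraph (I, F, B, and α), it is presumably the standard alpha-compositing equation, reconstructed here as an assumption:

```latex
I = \alpha F + (1 - \alpha) B
```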

Here, I denotes the image, F the foreground component, and B the background component, and α denotes the transparency or opacity of the foreground component.

Alternatively, the transparency is ρ as used in environment matting.
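This formula is likewise an image in the published document and is missing from this text. Based on the symbols defined in the next paragraph (m, B, M, and ρ) and on the remark that M may be replaced by B when refraction is ignored, one plausible reconstruction is the following; the exact form is an assumption and should be checked against the published drawing.

```latex
I = (1 - m)\,B + m\,\rho\,M
```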

Here, m is a mask image representing the transparent region, B is the background image, M is an image representing refraction, and ρ represents the transparency. In the above formula, M may be set to B when the effect of refraction in the transparent region is not considered. As described above, the definition of transparency follows the compositing method, and the compositing method is not limited to the two above.

The transparency is calculated by preparing a large number of sets, each consisting of an image, a trimap image, and a transparency image, and having a deep learning model learn the relationship among them so that it can estimate the transparency for a test image. Alternatively, the transparency may be calculated by setting constraints and performing optimization.

In this way, the first embodiment makes it possible to calculate the transparency from an RGB image without using any special device or environment. Therefore, when the image is composited with a new background, a natural-looking composite image can be created. This is described below.

FIG. 9 illustrates images obtained by compositing the image 51 with a new background 55: (a) shows a composite image 56 produced by the processing of the image processing device 10 according to the first embodiment, and (b) shows a composite image 156 produced by conventional processing as a comparative example.
As shown in FIG. 9(a), the image 51 has a background in which a plurality of vertical lines appear, and the vertical lines are visible through the transparent bag G1. The new background 55, in contrast, has no vertical lines. As described above, the first embodiment calculates the transparency from a single RGB image, so in the composite image 56 the background seen through the transparent bag G1 is replaced with the new background 55. Moreover, in the first embodiment, the transparency can be calculated without using any special device or environment.

In contrast, in the composite image 156 shown in FIG. 9(b), the region is simply cut out and pasted as is, so the vertical lines of the original background remain in the transparent bag G1. For this reason, natural compositing is difficult with conventional processing, as opposed to the first embodiment.

In this way, the first embodiment estimates the transparency from a single image, which makes it possible to automate high-accuracy compositing.
To further promote such automation, a new background may be photographed with the mobile terminal 20 after the image 51 is photographed. This eliminates the effort of specifying the background and enables high-accuracy compositing with a simple operation.
In addition, the size and position of the object in the image 51 displayed on the screen may be made changeable by a touch operation or a pinch operation by the user, so that a composite image 56 reflecting the user's intention can be created. For example, even when the size of the new background 55 differs from that of the image 51, the composite image 56 can be made to look natural.

[Second embodiment]
FIG. 10 is a block diagram showing an example of the functional configuration of the image processing device 10 according to the second embodiment. As illustrated, the image processing device 10 according to the second embodiment includes the object region estimation unit 11, the trimap generation unit 12, and the transparency estimation unit 13, as in the first embodiment, and additionally includes an image compositing unit 14 and a user correction unit 15, which the first embodiment does not include.
Therefore, unlike the first embodiment, in which the transparency is determined automatically and the user cannot fine-tune it, the second embodiment allows the transparency to be changed in response to a user instruction given for the compositing result.

The image compositing unit 14 composites the image 51 onto a compositing-destination image, which is the new background, based on the image 51 and the transparency image (see, for example, FIG. 9). When compositing the region cut out from the image 51 with the compositing-destination image, the image compositing unit 14 processes the transparent region according to its transparency. In this way, the image compositing unit 14 creates the composite image.
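A minimal sketch of such transparency-dependent compositing is shown below, assuming the alpha-mask definition of transparency and treating α as the opacity of the foreground (α = 1 keeps the foreground, α = 0 shows only the new background). The function names `composite_pixel` and `composite` and the list-of-lists image layout are hypothetical, and how the foreground component F is obtained from the image 51 is outside the scope of the sketch.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B)


def composite_pixel(alpha: float, fg: Pixel, bg: Pixel) -> Pixel:
    """Blend one pixel: I = alpha * F + (1 - alpha) * B, applied per channel."""
    r, g, b = (alpha * f + (1.0 - alpha) * c for f, c in zip(fg, bg))
    return (r, g, b)


def composite(alpha: List[List[float]],
              fg: List[List[Pixel]],
              bg: List[List[Pixel]]) -> List[List[Pixel]]:
    """Blend a foreground image onto a new background of the same size."""
    return [
        [composite_pixel(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, fg, bg)
    ]


alpha = [[1.0, 0.25, 0.0]]                  # opaque, partly transparent, outside
fg = [[(100.0, 100.0, 100.0)] * 3]          # foreground component F
new_bg = [[(200.0, 0.0, 40.0)] * 3]         # compositing-destination image B
print(composite(alpha, fg, new_bg))
# [[(100.0, 100.0, 100.0), (175.0, 25.0, 55.0), (200.0, 0.0, 40.0)]]
```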

The user correction unit 15 changes the transparency of the transparent region of the composite image in accordance with a correction instruction from the user, and creates a corrected composite image according to the changed transparency.

FIG. 11 illustrates the user correction unit 15; (a), (b), and (c) illustrate the operation of changing the transparency.
As shown in FIGS. 11(a) to 11(c), correction by the user correction unit 15 is performed by operating the slider 62 of the transparency adjustment section 61 while the composite image 56 is displayed on the screen. The transparent region 54b and the non-transparent region 54c are displayed in the composite image 56.

When the user slides the slider 62, the user correction unit 15 accepts an instruction to correct the transparency of the transparent region 54b. The user correction unit 15 then changes the transparency of the transparent region 54b in the composite image 56 in real time in accordance with the accepted correction instruction. The non-transparent region 54c is not modified by the slide operation.

In the transparency adjustment section 61, the transparency of the transparent region 54b decreases as the slider 62 moves to the left and increases as it moves to the right. For example, in FIG. 11(a) the transparent region 54b can be distinguished from the background, whereas in FIG. 11(b) the outline between the transparent region 54b and the background is still visible but the two are harder to distinguish than in (a). In FIG. 11(c), the region is completely transparent, and the outline of the transparent region 54b is difficult to see.
When the user performs a confirmation operation, the user correction unit 15 creates the corrected composite image (see FIG. 10).

The transparency is changed by the simple operation of sliding the slider 62, so the transparency can be adjusted easily.
When the transparency is an α mask, for example, it can be changed by uniformly weakening or strengthening the α values that are neither 0 nor 1, but any method of changing it may be used.
The operation of the slider 62 of the transparency adjustment section 61 is an example of a correction instruction by the user and an example of an instruction that changes the transparency.
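One way to weaken or strengthen only the intermediate α values can be sketched as follows. The function name `adjust_alpha`, the multiplicative `gain`, and the clamping to the range 0 to 1 are assumptions; the description leaves the method open, and how a slider position maps to the gain (and whether a larger α means more or less see-through) depends on the α convention in use.

```python
from typing import List


def adjust_alpha(alpha_mask: List[List[float]], gain: float) -> List[List[float]]:
    """Scale alpha values strictly between 0 and 1 by gain; leave exact 0 and 1 untouched."""
    def adjust(a: float) -> float:
        if a <= 0.0 or a >= 1.0:
            return a                       # outside region and fully opaque region stay fixed
        return min(1.0, max(0.0, a * gain))  # clamp the scaled value to [0, 1]
    return [[adjust(a) for a in row] for row in alpha_mask]


mask = [[0.0, 0.5, 1.0, 0.8]]
print(adjust_alpha(mask, 0.5))  # [[0.0, 0.25, 1.0, 0.4]]
print(adjust_alpha(mask, 1.5))  # [[0.0, 0.75, 1.0, 1.0]]
```

Because only intermediate values are touched, the non-transparent region (α = 1) is unaffected, which matches the behavior described for the slide operation.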

[Third embodiment]
FIG. 12 is a block diagram showing an example of the functional configuration of the image processing device 10 according to the third embodiment. As illustrated, the image processing device 10 according to the third embodiment includes the object region estimation unit 11, the trimap generation unit 12, the transparency estimation unit 13, and a user correction unit 16.
Therefore, unlike the first embodiment, in which the transparent region and the non-transparent region are determined automatically and the user cannot fine-tune them, the third embodiment allows the regions to be corrected in response to a user instruction.
Note that the image processing device 10 according to the third embodiment does not include the image compositing unit 14 of the second embodiment, but it may include one.

The user correction unit 16 accepts a correction instruction from the user regarding the boundary between the transparent region and the non-transparent region, changes the boundary in accordance with the accepted correction instruction, and creates a corrected trimap image according to the changed boundary.
The transparency estimation unit 13 then creates a transparency image based on the image and the corrected trimap image.

FIG. 13 illustrates the processing of the user correction unit 16; (a) shows the state before the processing and (b) shows the state after it. In each of (a) and (b), the upper part is an overwritten image 56 in which the boundary between the transparent region and the non-transparent region is drawn over the image, and the lower part is the trimap image 54.
As shown in FIG. 13(a), when the overwritten image 56 is displayed on the screen and the user checks the transparent region 54b and the non-transparent region 54c, the image of the apple G3 and the non-transparent region 54c do not match. That is, there is a protruding region 57 where the non-transparent region 54c extends into the transparent region 54b. The protruding region 57 is also present in the trimap image 54 shown in the lower part of FIG. 13(a) and results from the processing of the trimap generation unit 12.

The user therefore first touches the transparent region 54b of the overwritten image 56 with the mouse, as in step P1 of FIG. 13(a), and then performs a drag operation toward the protruding region 57, as in step P2. In response to this user operation, the user correction unit 16 corrects the trimap image 54 so that the protruding region 57 becomes smaller.

When the user operation for reducing the protruding region 57 is completed, the protruding region 57 is gone from the overwritten image 56 shown in FIG. 13(b), and the non-transparent region 54c matches the image of the apple G3. This is the result of the user correction unit 16 correcting the trimap image 54 shown in the lower part of FIG. 13(b) by deleting the protruding region 57 in accordance with the user operation.

In FIG. 13, the correction instruction for the protruding region 57 was described for the case where the transparent region 54b is touched before the drag operation, but the instruction is not limited to this; the non-transparent region 54c may be touched instead.
Alternatively, the correction instruction may be given by touching the protruding region 57 and then tracing the outline of the apple G3 with the mouse.

In this way, with the user correction unit 16, the user can change a region to, or add a region to, the transparent region 54b or the non-transparent region 54c by tracing it with a touch operation or a mouse operation.

The user operation on the screen is an example of a correction instruction by the user, and is an example of a boundary change, that is, an operation that changes the transparent region 54b relative to the non-transparent region 54c and thereby changes the boundary between the two.
A correction instruction given by touching the transparent region 54b or the non-transparent region 54c with the mouse is an example of selecting one of the transparent region 54b and the non-transparent region 54c, and a correction instruction given by a drag operation is an example of changing a designated portion of the other region to the selected one.
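The select-then-drag correction described above can be sketched on a three-valued trimap as follows. This is an illustrative assumption, not the embodiment's implementation: the function name `apply_drag`, the (row, column) coordinates, and the representation of the drag as a list of visited pixels are all hypothetical.

```python
from typing import Iterable, List, Tuple

OUTSIDE, TRANSPARENT, OPAQUE = 0.0, 0.5, 1.0  # example trimap values

Point = Tuple[int, int]  # (row, column)


def apply_drag(trimap: List[List[float]],
               start: Point,
               path: Iterable[Point]) -> List[List[float]]:
    """Return a corrected copy of the trimap.

    The region under the touch-down point `start` is the selected region; every
    object pixel visited by the drag `path` is reassigned to that region.
    """
    corrected = [row[:] for row in trimap]
    selected = trimap[start[0]][start[1]]
    if selected == OUTSIDE:
        return corrected                  # a touch outside the object selects nothing
    for r, c in path:
        if corrected[r][c] != OUTSIDE:    # pixels outside the object stay unchanged
            corrected[r][c] = selected
    return corrected


trimap = [[0.0, 0.5, 1.0, 1.0]]
# Touch the transparent pixel, then drag over the neighboring opaque pixel.
print(apply_drag(trimap, start=(0, 1), path=[(0, 2)]))  # [[0.0, 0.5, 0.5, 1.0]]
```

The corrected trimap would then be passed back to the transparency estimation, as described for the user correction unit 16.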

[Fourth embodiment]
FIG. 14 illustrates the processing of the image processing device 10 according to the fourth embodiment.
In the fourth embodiment shown in FIG. 14, when an image 58 containing a plurality of objects is displayed on the screen, the necessary portion is cut out, and the processing described above is performed on the cut-out image 51.
More specifically, the image 58 on the left side of the screen contains an apple G3 in a transparent bag G1 and a lemon G5 in a transparent bag G1.

When the user wants to use the apple G3 in the transparent bag G1, the user designates the lower-right position of the rectangle to be trimmed with a point 58a, as shown in the center image 58, and thereby defines the range of a trimming frame 58b on the screen.
The region outside the trimming frame 58b is thus removed, and the region enclosed by the trimming frame 58b is cut out as the image 51. This makes it possible to handle the case where there are a plurality of objects in transparent bags G1, and also to improve the accuracy of the transparency of the transparent region 54b (see, for example, FIG. 6).
The apple G3 in the transparent bag G1 and the lemon G5 in the transparent bag G1 contained in the image 58 are an example of a plurality of objects included in an image. The object region 53a, or the transparent region 54b and the non-transparent region 54c (see, for example, FIG. 6), is an example of the region of an object.
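The trimming step amounts to cropping the image to the frame before the processing described above is applied. A minimal sketch, assuming the frame is given by its upper-left and lower-right corners in pixel coordinates (the function name `crop` and its half-open index convention are hypothetical):

```python
from typing import List, TypeVar

T = TypeVar("T")


def crop(image: List[List[T]], top: int, left: int, bottom: int, right: int) -> List[List[T]]:
    """Keep rows top..bottom-1 and columns left..right-1; everything outside the frame is dropped."""
    return [row[left:right] for row in image[top:bottom]]


image_58 = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
# Frame covering the upper-left 2 x 2 block of the image.
print(crop(image_58, top=0, left=0, bottom=2, right=2))  # [[1, 2], [5, 6]]
```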

In each of the above embodiments, the term processor refers to a processor in a broad sense, and includes general-purpose processors (for example, a CPU: Central Processing Unit) and dedicated processors (for example, a GPU: Graphics Processing Unit, an ASIC: Application Specific Integrated Circuit, an FPGA: Field Programmable Gate Array, and a programmable logic device).
The operations of the processor in each of the above embodiments may be performed not only by a single processor but also by a plurality of processors located at physically separate positions working in cooperation. The order of the operations of the processor is not limited to the order described in the above embodiments and may be changed as appropriate.

<Additional notes>
(((1)))
An information processing device comprising a processor, wherein the processor determines, from an image, an object region that is a region of an object including a transparent region, acquires, from the image and the object region, information indicating a first region that is a region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is a region of the object region other than the second region, and
determines the transparency of the second region based on the image and the information.
(((2)))
The information processing device according to (((1))), wherein the acquired information is information of an image consisting of three values indicating the first region, the second region, and the third region, respectively.
(((3)))
The information processing device according to (((2))), wherein the values indicating the second region and the third region are determined by the difference between color information of the first region and color information of the second region and the third region.
(((4)))
The information processing device according to any one of (((1))) to (((3))), wherein the object region is determined using a machine learning model.
(((5)))
The information processing device according to any one of (((1))) to (((4))), wherein the object region is any one of a plurality of objects included in the image.
(((6)))
The information processing device according to (((1))), wherein a correction instruction from a user is accepted for the second region displayed on a screen.
(((7)))
The information processing device according to (((6))), wherein the correction instruction performs a boundary change that changes the second region relative to the third region.
(((8)))
The information processing device according to (((7))), wherein the boundary change selects one of the second region and the third region and changes a designated portion of the other to the selected one.
(((9)))
The information processing device according to (((6))), wherein the correction instruction changes the transparency.
(((10)))
A program causing a computer to implement:
a function of determining, from an image, an object region that is a region of an object including a transparent region;
a function of acquiring, from the image and the object region, information indicating a first region that is a region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is a region of the object region other than the second region; and
a function of determining the transparency of the second region based on the image and the information.

According to the invention of (((1))), the transparency of a transparent region of an image can be acquired without using any special device or environment.
According to the invention of (((2))), the information processing load can be reduced compared with a case where the acquired information is not information of an image consisting of three values indicating the first region, the second region, and the third region, respectively.
According to the invention of (((3))), the information processing load can be reduced compared with a case where a configuration in which the values indicating the second region and the third region are determined by the difference between the color information of the first region and the color information of the second region and the third region is not adopted.
According to the invention of (((4))), the information processing load can be reduced compared with a case where a configuration in which the object region is determined using a machine learning model is not adopted.
According to the invention of (((5))), the accuracy of the transparency can be improved compared with a case where a configuration in which the object region is any one of a plurality of objects included in the image is not adopted.
According to the invention of (((6))), usability can be improved compared with a case where a configuration that accepts a correction instruction from the user for the second region displayed on the screen is not adopted.
According to the invention of (((7))), usability can be improved compared with a case where a configuration in which the correction instruction performs a boundary change that changes the second region relative to the third region is not adopted.
According to the invention of (((8))), operability can be improved compared with a case where a configuration in which the boundary change selects one of the second region and the third region and changes a designated portion of the other to the selected one is not adopted.
According to the invention of (((9))), usability can be improved compared with a case where a configuration in which the correction instruction changes the transparency is not adopted.
According to the invention of (((10))), the transparency of a transparent region of an image can be acquired without using any special device or environment.

DESCRIPTION OF SYMBOLS
10... Image processing device, 10a... Processor, 51... Image, 53a... Object region, 54... Trimap image, 54a... Outside-object region, 54b... Transparent region, 54c... Non-transparent region, G1... Transparent bag, G3... Apple

Claims

1. An information processing device comprising a processor,
wherein the processor:
determines, from an image, an object region that is a region of an object including a transparent region;
acquires, from the image and the object region, information indicating a first region that is a region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is a region of the object region other than the second region; and
determines the transparency of the second region based on the image and the information.

2. The information processing device according to claim 1, wherein the acquired information is information of an image consisting of three values indicating the first region, the second region, and the third region, respectively.

3. The information processing device according to claim 2, wherein the values indicating the second region and the third region are determined by the difference between color information of the first region and color information of the second region and the third region.

4. The information processing device according to any one of claims 1 to 3, wherein the object region is determined using a machine learning model.

5. The information processing device according to any one of claims 1 to 3, wherein the object region is any one of a plurality of objects included in the image.

6. The information processing device according to claim 1, wherein a correction instruction from a user is accepted for the second region displayed on a screen.

7. The information processing device according to claim 6, wherein the correction instruction performs a boundary change that changes the second region relative to the third region.

8. The information processing device according to claim 7, wherein the boundary change selects one of the second region and the third region and changes a designated portion of the other to the selected one.

9. The information processing device according to claim 6, wherein the correction instruction changes the transparency.

10. A program causing a computer to implement:
a function of determining, from an image, an object region that is a region of an object including a transparent region;
a function of acquiring, from the image and the object region, information indicating a first region that is a region of the image other than the object region, a second region that is the transparent region within the object region, and a third region that is a region of the object region other than the second region; and
a function of determining the transparency of the second region based on the image and the information.