JP7224931B2

JP7224931B2 - LEARNING MODEL GENERATOR, PROGRAM, AND METHOD OF MANUFACTURING TERMINAL DEVICE

Info

Publication number: JP7224931B2
Application number: JP2019009987A
Authority: JP
Inventors: 繁塩澤
Original assignee: Recruit Holdings Co Ltd
Current assignee: Recruit Holdings Co Ltd
Priority date: 2019-01-24
Filing date: 2019-01-24
Publication date: 2023-02-20
Anticipated expiration: 2039-01-24
Also published as: JP2020119283A

Description

The present invention relates to a learning model generation device, a program, and a method of manufacturing a terminal device.

Patent Document 1 below discloses a method of purchasing a product using a mobile device. In this method, the user scans a product's barcode with the camera of the mobile device and transmits the product identification information contained in the barcode to a server. The server then acquires product information such as the product name and price based on the identification information and stores the acquired product information in a virtual shopping cart. Settlement processing is then performed based on the product information stored in the virtual shopping cart.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2013-541107

In the method of Patent Document 1, checking a product name or price on the mobile device requires scanning the barcode of each product on the premise that the product will be purchased. Therefore, to check the name or price of a product on a display shelf, or of a product the user has not yet decided to buy, the user must scan the barcode, check the product name and price, and then cancel the purchase.

Accordingly, an object of the present invention is to provide a learning model generation device, a program, and a method of manufacturing a terminal device that make it easy to check product names and prices.

A learning model generation device according to one aspect of the present invention includes: a receiving unit that receives images of a product placed in front of a background of a specific color, which is removed during chroma key compositing, photographed from a plurality of angles; an input reception unit that receives input of at least the product name and price of the product corresponding to each image, as product information registered in association with each of the images received by the receiving unit; a background removal unit that removes the background of the specific color from each of the images received by the receiving unit; an image synthesizing unit that generates synthetic images by compositing the images from which the background of the specific color has been removed by the background removal unit onto each of a plurality of background images that can serve as the background of the product when the product is sold in a store; and a learning unit that learns using teacher data including combinations of the synthetic images generated by the image synthesizing unit and the product names and prices included in the corresponding product information, and generates a learning model that outputs the product names and prices corresponding to one or more input product images.

In the above aspect, the device may further include a synthetic image multiplication unit that adjusts attribute information corresponding to each of the synthetic images generated by the image synthesizing unit and generates a plurality of synthetic images having different attribute information, and the learning unit may learn using teacher data including combinations of the synthetic images generated by the image synthesizing unit and the synthetic image multiplication unit and the product names and prices included in the corresponding product information.

In the above aspect, the attribute information may include at least one of contrast, size, rotation angle, and noise.

In the above aspect, the background images may include at least an image whose background is the inside of a shopping basket.

A program according to another aspect of the present invention causes a computer to realize: a function of displaying, on a display, one or more products photographed by a photographing unit; a function of inputting an image of the one or more products displayed on the display into a learning model; a function of displaying, on the display, the product name and price output from the learning model in association with each product; and a function of displaying, on the display, the total amount corresponding to the one or more products displayed on the display in accordance with a totaling instruction from the user.

A method of manufacturing a terminal device according to another aspect of the present invention manufactures the terminal device by distributing, for installation on the terminal device, an application program that causes the terminal device to execute: a process of displaying, on a display, one or more products photographed by a photographing unit; a process of inputting an image of the one or more products displayed on the display into a learning model; a process of displaying, on the display, the product name and price output from the learning model in association with each product; and a process of displaying, on the display, the total amount corresponding to the one or more products displayed on the display in accordance with a totaling instruction from the user.

According to the present invention, it is possible to provide a learning model generation device, a program, and a method of manufacturing a terminal device that make it easy to check product names and prices.

FIG. 1 is a diagram illustrating the configuration of the learning model generation device according to the embodiment.
FIG. 2 is a diagram illustrating the configuration of the terminal device according to the embodiment.
FIG. 3: (A) and (B), and (C) and (D), are schematic diagrams showing examples of images of a product placed in front of a green background and photographed from different angles.
FIG. 4: (A) to (D) are schematic diagrams illustrating the images obtained by removing the green background from the images of FIGS. 3(A) to 3(D).
FIG. 5: (A) illustrates an image of a product display shelf in a store, and (B) is a schematic diagram illustrating a synthetic image created by compositing the image of FIG. 4(A) onto the image of (A).
FIG. 6 is a schematic diagram showing an example of a screen displayed on the touch panel of the terminal device.
FIG. 7 is a schematic diagram showing an example of a screen displayed on the touch panel of the terminal device.
FIG. 8 is a flowchart illustrating an example of the operation procedure of the learning model generation device according to the embodiment.
FIG. 9 is a flowchart illustrating an example of the operation procedure of the terminal device according to the embodiment.

Preferred embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, elements denoted by the same reference numerals have the same or similar configurations.

The configuration of the learning model generation device according to the embodiment will be described with reference to FIG. 1. The learning model generation device 1 is a server device that generates a learning model whose input is an image of a product photographed by the camera of a terminal device, described later, and whose output is the product name and price corresponding to that image.

As a physical configuration, the learning model generation device 1 includes, for example, a control device 10 including a CPU (processor) and memory, a communication device 20, a storage device 30, an input device 40, and an output device 50 (for example, a display and a speaker). A camera 9 can be connected to the learning model generation device 1 by wired or wireless communication. The functions described below are realized when the CPU executes a predetermined program stored in the memory or the storage device 30.

As a functional configuration, the learning model generation device 1 has, for example, a receiving unit 11, an input reception unit 12, a background removal unit 13, an image synthesizing unit 14, a synthetic image multiplication unit 15, and a learning unit 16. Each function is described below.

The receiving unit 11 receives, from the camera 9, images of a product placed in front of a background of a specific color, which is removed during chroma key compositing, photographed from a plurality of angles. Chroma key compositing is a technique for combining video or images in which another video or image is inserted into the region showing a specific color. Green and blue are commonly used as the specific color. This embodiment describes, by way of example, the case where green is used. The more distinct shooting angles are used, the greater the learning effect of the learning model; on the other hand, more angles also increase the shooting effort and the learning time. It is therefore desirable to choose the number of shooting angles by weighing the learning effect against the effort.

FIG. 3 illustrates images of products photographed from different angles. FIG. 3(A) is an image of A coffee Ma standing upright in front of a green background (green screen) B, photographed almost from the front. FIG. 3(B) is an image of the A coffee Ma of FIG. 3(A) laid on its side, photographed almost from the front. FIG. 3(C) is an image of B coffee Mb standing upright in front of the green background B, photographed almost from the front. FIG. 3(D) is an image of the B coffee Mb of FIG. 3(C) photographed from almost directly above.

The input reception unit 12 shown in FIG. 1 receives input of product information to be registered in association with each of the images received by the receiving unit 11. The product information is information about the product shown in the image and includes, for example, the product name, price, place of origin, best-before date, and ratings of the product. In this embodiment, the product information includes at least the product name and price. The product information can be entered, for example, by an administrator operating the input device 40. Registering the entered product information in association with the images reduces the effort of entering product information when the learning model, described later, is generated.

The background removal unit 13 removes the background of the specific color from each of the images received by the receiving unit 11. FIG. 4 illustrates images from which the background of the specific color has been removed. FIGS. 4(A) and 4(B) are images of the A coffee Ma obtained by removing the green background B from the images of FIGS. 3(A) and 3(B). FIGS. 4(C) and 4(D) are images of the B coffee Mb obtained by removing the green background B from the images of FIGS. 3(C) and 3(D).
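
As a rough illustration of this background removal step, the following minimal sketch treats an image as a small grid of RGB tuples and marks every green-dominant pixel as removed. The threshold rule (`is_key_color`, `margin`) is an assumption made for this example; the specification does not state how the specific color is detected.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
# None marks a pixel that was removed as background (i.e. transparent).
Cutout = List[List[Optional[RGB]]]


def is_key_color(pixel: RGB, margin: int = 60) -> bool:
    """Hypothetical rule: green exceeds both red and blue by at least `margin`."""
    r, g, b = pixel
    return g - max(r, b) >= margin


def remove_background(image: List[List[RGB]], margin: int = 60) -> Cutout:
    """Replace every key-colored pixel with None and keep the product pixels."""
    return [
        [None if is_key_color(p, margin) else p for p in row]
        for row in image
    ]


# A 2x3 toy image: green backdrop with two brown "product" pixels.
GREEN = (0, 200, 0)
BROWN = (120, 70, 30)
photo = [
    [GREEN, BROWN, GREEN],
    [GREEN, BROWN, GREEN],
]
cutout = remove_background(photo)
```

A production system would more likely work in a hue-based color space and soften the mask edges; the simple rule here is only meant to show the idea of keying out one color.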

The image synthesizing unit 14 shown in FIG. 1 generates synthetic images by compositing each image from which the background of the specific color has been removed by the background removal unit 13 onto each of a plurality of background images. The background images used for compositing are images that can serve as the background of the product when the product is sold in a store. For example, it is preferable to use an image whose background is the inside of a shopping basket into which products are placed, or an image whose background is a shelf on which products are displayed. The more background images are used, the greater the learning effect of the learning model; on the other hand, more background images also increase the compositing effort and the learning time. It is therefore desirable to choose the number of background images by weighing the learning effect against the effort.

FIG. 5 shows an example of a background image and a synthetic image. FIG. 5(A) is a background image showing a shelf on which beverages are displayed. FIG. 5(B) is a synthetic image created by compositing the image of the A coffee Ma of FIG. 4(A) onto the background image of FIG. 5(A).
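
The compositing step can be sketched as pasting the non-removed pixels of a cutout onto a copy of each background image. This is a minimal sketch under assumptions: images are grids of RGB tuples, `None` marks a removed background pixel, and the function names and the fixed paste position are hypothetical.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]
Cutout = List[List[Optional[RGB]]]  # None marks a removed background pixel


def composite(background: Image, cutout: Cutout, top: int, left: int) -> Image:
    """Paste the cutout's product pixels onto a copy of the background."""
    result = [list(row) for row in background]
    for dy, row in enumerate(cutout):
        for dx, pixel in enumerate(row):
            y, x = top + dy, left + dx
            inside = 0 <= y < len(result) and 0 <= x < len(result[y])
            if pixel is not None and inside:
                result[y][x] = pixel
    return result


def composite_onto_all(backgrounds: List[Image], cutout: Cutout,
                       top: int = 0, left: int = 0) -> List[Image]:
    """Generate one synthetic image per background image."""
    return [composite(bg, cutout, top, left) for bg in backgrounds]


GRAY = (128, 128, 128)  # stands in for a shelf photo
BLUE = (0, 0, 200)      # stands in for a basket-interior photo
BROWN = (120, 70, 30)   # stands in for product pixels
shelf = [[GRAY] * 3 for _ in range(2)]
basket = [[BLUE] * 3 for _ in range(2)]
product = [[None, BROWN], [None, BROWN]]
synthetic_images = composite_onto_all([shelf, basket], product, top=0, left=1)
```

Because `composite` copies the background before pasting, the same background image can be reused for every product cutout.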

The synthetic image multiplication unit 15 shown in FIG. 1 adjusts attribute information corresponding to each of the synthetic images generated by the image synthesizing unit 14 and generates a plurality of synthetic images having different attribute information. The attribute information includes, for example, the contrast of the synthetic image, the size of the synthetic image, the angle by which the whole image is rotated relative to the original synthetic image, and the noise contained in the synthetic image. Because the number of synthetic images can be increased simply by changing the attribute information, the process of generating different synthetic images can be made faster.

The more synthetic images are produced by multiplication, the greater the learning effect of the learning model; on the other hand, a larger number also increases the adjustment effort and the learning time. It is therefore desirable to choose the number of synthetic images to produce by weighing the learning effect against the effort.
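
The four attributes named above (contrast, size, rotation angle, noise) can each be varied with a small transformation. The sketch below is illustrative only: the function names, the restriction of rotation to 90-degree steps, the integer nearest-neighbor scaling, and all parameter values are assumptions for this example and are not taken from the specification.

```python
import random
from typing import List, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]


def clamp(value: float) -> int:
    """Round and limit a channel value to the 0..255 range."""
    return max(0, min(255, int(round(value))))


def adjust_contrast(img: Image, factor: float) -> Image:
    """Scale each channel's distance from mid-gray (128) by `factor`."""
    return [[tuple(clamp(128 + (c - 128) * factor) for c in p) for p in row]
            for row in img]


def resize_nearest(img: Image, scale: int) -> Image:
    """Enlarge by an integer factor using nearest-neighbor repetition."""
    return [[p for p in row for _ in range(scale)]
            for row in img for _ in range(scale)]


def rotate_90(img: Image) -> Image:
    """Rotate the whole image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def add_noise(img: Image, amplitude: int, rng: random.Random) -> Image:
    """Add uniform integer noise in [-amplitude, amplitude] to every channel."""
    return [[tuple(clamp(c + rng.randint(-amplitude, amplitude)) for c in p)
             for p in row] for row in img]


def multiply(img: Image, seed: int = 0) -> List[Image]:
    """Produce one variant per attribute from a single synthetic image."""
    rng = random.Random(seed)
    return [
        adjust_contrast(img, 1.5),
        resize_nearest(img, 2),
        rotate_90(img),
        add_noise(img, 10, rng),
    ]


sample = [[(100, 128, 200), (0, 0, 0)]]
variants = multiply(sample)
```

Each variant keeps the product name and price of the synthetic image it was derived from, so no additional labeling is needed for the multiplied images.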

The learning unit 16 generates a learning model by learning teacher data that includes combinations of the synthetic images generated by the image synthesizing unit 14 and the synthetic image multiplication unit 15 and the product names and prices included in the product information corresponding to those synthetic images. Based on the learned teacher data, the learning unit 16 outputs the product names and prices corresponding to one or more product images input to the learning model.
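
One way to picture the teacher data is as a list of records, each pairing a synthetic image with the product name and price registered for the original photograph. The sketch below is an assumption-laden illustration: the class names, the string image identifiers, and the dictionary layout are hypothetical, while the product names and prices are those shown for FIG. 6.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProductInfo:
    name: str
    price: int  # price in yen


@dataclass(frozen=True)
class TeacherSample:
    image_id: str       # hypothetical identifier of a synthetic image
    label: ProductInfo  # product name and price the model should output


def build_teacher_data(
    synthetic_images: Dict[str, List[str]],
    product_info: Dict[str, ProductInfo],
) -> List[TeacherSample]:
    """Pair every synthetic image with its product's registered name and price."""
    samples: List[TeacherSample] = []
    for product_key, image_ids in synthetic_images.items():
        info = product_info[product_key]
        samples.extend(TeacherSample(image_id, info) for image_id in image_ids)
    return samples


product_info = {
    "Ma": ProductInfo("A coffee", 67),
    "Mb": ProductInfo("B coffee", 95),
}
synthetic_images = {
    "Ma": ["Ma_front_shelf", "Ma_front_basket"],  # hypothetical image IDs
    "Mb": ["Mb_top_shelf"],
}
teacher_data = build_teacher_data(synthetic_images, product_info)
```

Since the product information is registered once per photographed image, every synthetic image derived from that photograph inherits its label automatically.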

The function of the learning unit 16 can be realized using, for example, a deep learning model for object detection such as YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), or R-CNN (Regions with CNN features).

The configuration of the terminal device according to the embodiment will be described with reference to FIG. 2. As a physical configuration, the terminal device 6 includes, for example, a control device 61 including a CPU (processor) and memory, a storage device 62, a touch panel 63 including an input device and a display, a communication device 64, and a camera (photographing device) 65.

An application program incorporating the learning model generated by the learning unit 16 is installed on the terminal device 6. The functions described below are realized when the CPU executes the application program stored in the memory or the storage device 62.

As a functional configuration, the terminal device 6 has, for example, a product image display function, an input function, a product information display function, and a total amount display function. The product image display function displays, on the display, one or more products being photographed by the camera 65. The input function inputs an image of the one or more products displayed on the display into the learning model.

The product information display function displays, on the display, the product name and price output from the learning model in association with the corresponding product. FIG. 6 shows an example of a screen that displays product names and prices in association with products. The display 63 of the terminal device 6 shows the inside of a shopping basket being photographed by the camera 65. The shopping basket contains A coffee Ma, B coffee Mb, an onion Mc, an apple Md, and a banana Me.

In FIG. 6, the product name "A coffee" and the price "¥67" are displayed in association with the A coffee Ma, and the product name "B coffee" and the price "¥95" are displayed in association with the B coffee Mb. Similarly, the product name "Onion" and the price "¥100" are displayed in association with the onion Mc, the product name "Apple" and the price "¥109" in association with the apple Md, and the product name "Banana" and the price "¥201" in association with the banana Me. When the button T at the bottom of the screen is clicked, a totaling instruction is issued and the screen transitions to the screen shown in FIG. 7, described later.

The total amount display function displays the total amount and related information for the one or more products displayed on the display, in accordance with a totaling instruction from the user. FIG. 7 shows an example of a screen that displays the total amount for the products. The display 63 of the terminal device 6 shows itemized information and lowest price information for each product being photographed by the camera 65, together with the total and the lowest price total.

The itemized information is the product name and price of each product being photographed by the camera 65. The lowest price information is the lowest price of that product among the nearby stores that sell it and the name of the store offering that lowest price. The total is the sum of the prices of the products being photographed by the camera 65, and the lowest price total is the sum of the lowest prices of those products. In FIG. 7, "572 yen" is displayed as the total and "468 yen" as the lowest price total.
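
The totals on this screen are simple sums. The sketch below recomputes the displayed total from the five prices shown in FIG. 6 and shows how a lowest price and its store name could be selected. The per-store offers are hypothetical values invented for the example; the specification gives only the lowest price total of 468 yen, not the individual lowest prices.

```python
from typing import Dict, List, Tuple

Offer = Tuple[str, int]  # (store name, price in yen)


def total(prices: Dict[str, int]) -> int:
    """Sum of the prices of all detected products."""
    return sum(prices.values())


def lowest_offer(offers: List[Offer]) -> Offer:
    """Return the (store name, price) pair with the lowest price."""
    return min(offers, key=lambda offer: offer[1])


def lowest_total(offers_by_product: Dict[str, List[Offer]]) -> int:
    """Sum of the lowest price of each product across nearby stores."""
    return sum(lowest_offer(offers)[1] for offers in offers_by_product.values())


# Prices displayed in FIG. 6.
detected = {
    "A coffee": 67,
    "B coffee": 95,
    "Onion": 100,
    "Apple": 109,
    "Banana": 201,
}
print(total(detected))  # 572, matching the total shown in FIG. 7

# Hypothetical offers for one product; store names and prices are invented.
apple_offers = [("Store X", 109), ("Store Y", 98)]
print(lowest_offer(apple_offers))  # ('Store Y', 98)
```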

In this way, when one or more products are shown on the display using the camera 65 of the terminal device 6, the total amount for the products and the lowest price of each product among stores, including nearby stores, are displayed, which makes it possible to support the user's shopping efficiently.

Next, an example of the operation of the learning model generation device 1 according to the embodiment will be described with reference to FIG. 8.

First, the receiving unit 11 of the learning model generation device 1 receives, from the camera 9, images of a product placed in front of a green background and photographed from a plurality of angles (step S101).

Next, the input reception unit 12 receives input of a product name and price as product information to be registered in association with each image received in step S101 (step S102).

Next, the background removal unit 13 removes the green background from each image received in step S101 (step S103).

Next, the image synthesizing unit 14 generates synthetic images by compositing each image from which the green background was removed in step S103 onto each of a plurality of background images (step S104).

Next, the synthetic image multiplication unit 15 adjusts the attribute information corresponding to each synthetic image generated in step S104 and generates a plurality of synthetic images having different attribute information (step S105).

Next, the learning unit 16 generates a learning model using teacher data that includes combinations of the synthetic images generated in steps S104 and S105 and the product names and prices included in the product information corresponding to those synthetic images (step S106). The operation then ends.
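
Because step S106 uses the synthetic images from both step S104 and step S105, the number of teacher images per product grows as (angles) x (backgrounds) x (1 + variants per synthetic image). The following sketch enumerates them to make that growth concrete; the identifier format and the example counts are hypothetical.

```python
from itertools import product
from typing import List


def enumerate_training_images(angles: int, backgrounds: int,
                              variants: int) -> List[str]:
    """List the IDs of all synthetic images produced for one product."""
    ids: List[str] = []
    for a, b in product(range(angles), range(backgrounds)):
        base = f"angle{a}_bg{b}"
        ids.append(base)  # synthetic image from step S104
        # Attribute-adjusted copies from step S105.
        ids.extend(f"{base}_var{v}" for v in range(variants))
    return ids


# Hypothetical counts: 2 angles, 3 backgrounds, 4 variants per synthetic image.
image_ids = enumerate_training_images(angles=2, backgrounds=3, variants=4)
print(len(image_ids))  # 2 * 3 * (1 + 4) = 30
```

This multiplicative growth is why the description recommends weighing the learning effect against effort and learning time when choosing each count.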

Next, an example of the operation of the terminal device 6 according to the embodiment will be described with reference to FIG. 9.

First, the terminal device 6 displays, on the display, one or more products being photographed by the camera 65 (step S201).

Next, the terminal device 6 inputs an image of the one or more products displayed on the display into the learning model (step S202).

Next, the terminal device 6 displays, on the display, the product name and price output from the learning model in association with the corresponding product (step S203).

Next, the terminal device 6 determines whether the user has issued a totaling instruction by clicking the button T (step S204). If the determination is NO (step S204; NO), the terminal device 6 waits until a totaling instruction is issued. If the products being photographed by the camera 65 change while the terminal device 6 is waiting, the process returns to step S201, and steps S201 to S203 are executed for the changed products.

On the other hand, if it is determined in step S204 that a totaling instruction has been issued (step S204; YES), the terminal device 6 displays, on the display, the total amount and related information for the products displayed on the display (step S205). The operation then ends.
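
The flow of steps S201 to S205 can be sketched as an event loop: each new camera frame triggers display and inference, and a totaling instruction ends the loop with the total. This is a minimal sketch under assumptions; the event tuples, the `run_terminal` function, and the stub model (a price lookup standing in for the learning model) are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Detection = Tuple[str, int]    # (product name, price in yen)
Event = Tuple[str, List[str]]  # ("frame", products in view) or ("tally", [])


def run_terminal(events: Iterable[Event],
                 model: Callable[[List[str]], List[Detection]]) -> List[str]:
    """Process camera frames until the user issues a totaling instruction."""
    current: List[Detection] = []
    log: List[str] = []
    for kind, payload in events:
        if kind == "frame":
            log.append("S201 show camera frame")
            current = model(payload)       # S202: input the image to the model
            for name, price in current:    # S203: overlay name and price
                log.append(f"S203 overlay {name} {price} yen")
        elif kind == "tally":              # S204: YES
            amount = sum(price for _, price in current)
            log.append(f"S205 total {amount} yen")
            break
    return log


# Stub standing in for the learning model: a "frame" is just a list of names.
PRICES = {"A coffee": 67, "B coffee": 95}


def stub_model(frame: List[str]) -> List[Detection]:
    return [(name, PRICES[name]) for name in frame]


events = [
    ("frame", ["A coffee"]),
    ("frame", ["A coffee", "B coffee"]),  # products in view changed
    ("tally", []),
]
log = run_terminal(events, stub_model)
```

The second frame event corresponds to the case where the photographed products change while the device is waiting for the totaling instruction, so steps S201 to S203 run again before the total is shown.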

As described above, the learning model generation device 1 according to the embodiment receives images of a product placed in front of a green background and photographed from a plurality of angles, and receives input of the product name and price to be registered in association with each image. It then generates synthetic images by compositing each image, with the green background removed, onto each of a plurality of background images that can serve as the background of the product when the product is sold in a store, and generates a learning model using teacher data that includes combinations of the synthetic images and the corresponding product names and prices. One or more product images can then be input to this learning model to output the product names and prices corresponding to those images.

Therefore, the learning model generation device 1 according to the embodiment makes it possible for users to check product names and prices easily.

In addition, by adjusting the attribute information corresponding to each generated synthetic image and further generating a plurality of synthetic images having different attribute information, the learning model generation device 1 can speed up the process of increasing the number of synthetic images used as teacher data.

[Modifications]
The present invention is not limited to the embodiment described above and can be implemented in various other forms without departing from the gist of the present invention. The above embodiment is therefore merely an example in all respects and should not be construed as limiting. For example, the order of the processing steps described above may be changed arbitrarily, or the steps may be executed in parallel, as long as no inconsistency arises in the processing.

The components of the learning model generation device 1 are not limited to those of the embodiment described above, and any component may be omitted or added as necessary. For example, the synthetic image multiplication unit 15 may be omitted from the functional configuration of the learning model generation device 1.

REFERENCE SIGNS LIST
1: learning model generation device, 6: terminal device, 9: camera, 10: control device, 11: receiving unit, 12: input reception unit, 13: background removal unit, 14: image synthesizing unit, 15: synthetic image multiplication unit, 16: learning unit, 20: communication device, 30: storage device, 40: input device, 61: control device, 62: storage device, 63: touch panel (input device and display), 64: communication device, 65: camera.

Claims

1. A learning model generation device comprising:
a receiving unit that receives images of a product placed in front of a background of a specific color, which is removed during chroma key compositing, photographed from a plurality of angles;
an input reception unit that receives input of at least the product name and price of the product corresponding to each image, as product information registered in association with each of the images received by the receiving unit;
a background removal unit that removes the background of the specific color from each of the images received by the receiving unit;
an image synthesizing unit that generates synthetic images by compositing the images from which the background of the specific color has been removed by the background removal unit onto each of a plurality of background images that can serve as the background of the product when the product is sold in a store; and
a learning unit that learns using teacher data including combinations of the synthetic images generated by the image synthesizing unit and the product names and prices included in the corresponding product information, and generates a learning model that outputs the product names and prices corresponding to one or more input product images.

2. The learning model generation device according to claim 1, further comprising a synthetic image multiplication unit that adjusts attribute information corresponding to each of the synthetic images generated by the image synthesizing unit and generates a plurality of synthetic images having different attribute information,
wherein the learning unit learns using teacher data including combinations of the synthetic images generated by the image synthesizing unit and the synthetic image multiplication unit and the product names and prices included in the corresponding product information.

3. The learning model generation device according to claim 2, wherein the attribute information includes at least one of contrast, size, rotation angle, and noise.

4. The learning model generation device according to any one of claims 1 to 3, wherein the background images include at least an image whose background is the inside of a shopping basket.

5. A program for causing a computer to realize:
a function of displaying, on a display, one or more products photographed by a photographing unit;
a function of inputting an image of the one or more products displayed on the display into a learning model;
a function of displaying, on the display, the product name and price output from the learning model in association with each product; and
a function of displaying, on the display, the total amount corresponding to the one or more products displayed on the display in accordance with a totaling instruction from the user,
wherein the learning model is generated by learning using teacher data so as to output the product names and prices corresponding to one or more input product images, the teacher data including combinations of (i) synthetic images generated by compositing images, obtained by removing a background of a specific color from images of a product placed in front of the background of the specific color, which is removed during chroma key compositing, and photographed from a plurality of angles, onto each of a plurality of background images that can serve as the background of the product when the product is sold in a store, and (ii) the product name and price included in the product information corresponding to the images.

6. A method of manufacturing a terminal device by distributing, for installation on the terminal device, an application program that causes the terminal device to execute:
a process of displaying, on a display, one or more products photographed by a photographing unit;
a process of inputting an image of the one or more products displayed on the display into a learning model;
a process of displaying, on the display, the product name and price output from the learning model in association with each product; and
a process of displaying, on the display, the total amount corresponding to the one or more products displayed on the display in accordance with a totaling instruction from the user,
wherein the learning model is generated by learning using teacher data so as to output the product names and prices corresponding to one or more input product images, the teacher data including combinations of (i) synthetic images generated by compositing images, obtained by removing a background of a specific color from images of a product placed in front of the background of the specific color, which is removed during chroma key compositing, and photographed from a plurality of angles, onto each of a plurality of background images that can serve as the background of the product when the product is sold in a store, and (ii) the product name and price included in the product information corresponding to the images.