JP2020119283A

JP2020119283A - Learning model production system, program, and method for manufacturing terminal device

Info

Publication number: JP2020119283A
Application number: JP2019009987A
Authority: JP
Inventors: 繁塩澤; Shigeru Shiozawa
Original assignee: Recruit Co Ltd
Current assignee: Recruit Co Ltd
Priority date: 2019-01-24
Filing date: 2019-01-24
Publication date: 2020-08-06
Anticipated expiration: 2039-01-24
Also published as: JP7224931B2

Abstract

To provide a learning model production system making it possible to readily check an article name and price. SOLUTION: A learning model production system includes a receiving unit 11 that receives images produced by shooting an article, which is placed in front of a background of a specific color, at plural angles, an input accepting unit 12 that accepts as article information, which is registered in association with the received images, input of an article name and price of the article consistent with the images, a background removing unit 13 that removes the background of the specific color from each of the received images, an image synthesis unit 14 that synthesizes the images, from which the background of the specific color is removed, with plural background images, which may be backgrounds of the article when the article is put on the market at a store, so as to produce synthetic images, and a learning unit 16 that produces a learning model which learns using the produced synthetic images and teacher data, which contains a set of the article name and price contained in the associated article information, and outputs the article name and price associated with one or more inputted article images. SELECTED DRAWING: Figure 1

Description

The present invention relates to a learning model generation device, a program, and a method for manufacturing a terminal device.

Patent Document 1 below discloses a method for purchasing products using a mobile device. In this method, the user scans a product's barcode with the camera of the mobile device and transmits the product identification information contained in the barcode to a server. The server then acquires product information such as the product name and price based on the identification information and places the acquired product information in a virtual shopping cart. Settlement is subsequently performed based on the product information held in the virtual shopping cart.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2013-541107

With the method of Patent Document 1, checking a product name or price on the mobile device requires scanning the barcode of each product on the premise that the product will be purchased. Consequently, to check the name or price of a product that is on a display shelf, or of a product whose purchase has not yet been decided, the user must scan the barcode, check the product name and price, and then cancel the purchase.

Therefore, an object of the present invention is to provide a learning model generation device, a program, and a method for manufacturing a terminal device that make it easy to check a product name and price.

A learning model generation device according to one aspect of the present invention includes: a receiving unit that receives images of a product, placed in front of a background of a specific color that is removed during chroma key composition, photographed from a plurality of angles; an input receiving unit that receives input of at least the product name and price of the product corresponding to the images, as product information to be registered in association with each of the images received by the receiving unit; a background removing unit that removes the background of the specific color from each of the images received by the receiving unit; an image synthesis unit that generates composite images by combining each of the images from which the background of the specific color has been removed by the background removing unit with a plurality of background images that can be the background of the product when the product is sold at a store; and a learning unit that learns using teacher data including combinations of the composite images generated by the image synthesis unit and the product name and price included in the corresponding product information, and generates a learning model that outputs the product name and price corresponding to an input image of one or more products.

In the above aspect, the device may further include a composite image multiplying unit that adjusts attribute information corresponding to each of the composite images generated by the image synthesis unit and generates a plurality of composite images having different attribute information, and the learning unit may learn using teacher data including combinations of the composite images generated by the image synthesis unit and the composite image multiplying unit and the product name and price included in the corresponding product information.

In the above aspect, the attribute information may include at least one of contrast, size, rotation angle, and noise.

In the above aspect, the background images may include at least an image whose background is the inside of a shopping basket.

A program according to another aspect of the present invention causes a computer to implement: a function of displaying, on a display, one or more products photographed by a photographing unit; a function of inputting an image of the one or more products displayed on the display to a learning model; a function of displaying the product name and price output from the learning model on the display in association with the product; and a function of displaying, on the display, the total amount corresponding to the one or more products displayed on the display, in accordance with a totaling instruction from the user.

A method of manufacturing a terminal device according to another aspect of the present invention manufactures the terminal device by distributing, for installation on the terminal device, an application program that causes the terminal device to execute: a process of displaying, on a display, one or more products photographed by a photographing unit; a process of inputting an image of the one or more products displayed on the display to a learning model; a process of displaying the product name and price output from the learning model on the display in association with the product; and a process of displaying, on the display, the total amount corresponding to the one or more products displayed on the display, in accordance with a totaling instruction from the user.

According to the present invention, it is possible to provide a learning model generation device, a program, and a method for manufacturing a terminal device that make it easy to check product names and prices.

FIG. 1 is a diagram illustrating the configuration of the learning model generation device according to the embodiment.

FIG. 2 is a diagram illustrating the configuration of the terminal device according to the embodiment.

FIGS. 3A and 3B, and FIGS. 3C and 3D, are schematic diagrams showing examples of images of products placed in front of a green background and photographed from different angles.

FIGS. 4A to 4D are schematic diagrams illustrating images obtained by removing the green background from the images of FIGS. 3A to 3D.

FIG. 5A illustrates an image of a product display shelf in a store, and FIG. 5B is a schematic diagram illustrating a composite image created by combining the image of FIG. 4A onto the image of FIG. 5A.

FIG. 6 is a schematic diagram showing an example of a screen displayed on the touch panel of the terminal device.

FIG. 7 is a schematic diagram showing an example of a screen displayed on the touch panel of the terminal device.

FIG. 8 is a flowchart explaining an example of the operation procedure of the learning model generation device according to the embodiment.

FIG. 9 is a flowchart explaining an example of the operation procedure of the terminal device according to the embodiment.

Preferred embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, elements denoted by the same reference numerals have the same or similar configurations.

The configuration of the learning model generation device according to the embodiment will be described with reference to FIG. 1. The learning model generation device 1 is a server device that generates a learning model which takes as input an image of a product photographed by the camera of a terminal device (described later) and outputs the product name and price corresponding to that image.

As its physical configuration, the learning model generation device 1 includes, for example, a control device 10 including a CPU (processor) and memory, a communication device 20, a storage device 30, an input device 40, and an output device 50 (for example, a display and a speaker). A camera 9 can be connected to the learning model generation device 1 by wired or wireless communication. The functions described below are realized when the CPU executes a predetermined program stored in the memory or the storage device 30.

As its functional configuration, the learning model generation device 1 has, for example, a receiving unit 11, an input receiving unit 12, a background removing unit 13, an image synthesis unit 14, a composite image multiplying unit 15, and a learning unit 16. Each function is described below.

The receiving unit 11 receives, from the camera 9, images of a product placed in front of a background of a specific color that is removed during chroma key composition, photographed from a plurality of angles. Chroma key composition is a technique for combining video (images) in which another video (image) is fitted into the region where a specific color appears. Green or blue is generally used as the specific color. In the present embodiment, the case where green is used as the specific color is described as an example. The more distinct shooting angles there are, the greater the learning effect on the learning model. On the other hand, the more angles there are, the more shooting effort and learning time are required. It is therefore desirable to determine the number of shooting angles appropriately, weighing the learning effect against the effort involved.

FIG. 3 illustrates images of products photographed from different angles. FIG. 3A is an image of A coffee Ma standing upright in front of a green background (green screen) B, photographed from almost directly in front. FIG. 3B is an image of the A coffee Ma of FIG. 3A laid on its side, photographed from almost directly in front. FIG. 3C is an image of B coffee Mb standing upright in front of the green background B, photographed from almost directly in front. FIG. 3D is an image of the B coffee Mb of FIG. 3C photographed from nearly directly above.

The input receiving unit 12 shown in FIG. 1 receives input of product information to be registered in association with each of the images received by the receiving unit 11. The product information is information about the product corresponding to the image and includes, for example, the product name, price, place of production, best-before date, and product ratings. In the present embodiment, the product information includes at least the product name and price. The product information can be entered, for example, by an administrator operating the input device 40. Registering the entered product information in association with the images reduces the effort of entering product information when the learning model described later is generated.
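The association between images and product information can be pictured as a small registry. The following is a minimal sketch under stated assumptions: the names `ProductInfo` and `ImageRegistry`, the image identifiers, and the choice of optional fields are hypothetical and do not come from the embodiment, which only requires that the product name and price be registered for each image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProductInfo:
    """Product information entered by the administrator (hypothetical layout)."""
    name: str                          # product name (required in the embodiment)
    price_yen: int                     # price (required in the embodiment)
    origin: Optional[str] = None       # optional: place of production
    best_before: Optional[str] = None  # optional: best-before date


@dataclass
class ImageRegistry:
    """Maps an image identifier to the product information registered for it."""
    by_image: Dict[str, ProductInfo] = field(default_factory=dict)

    def register(self, image_ids: List[str], info: ProductInfo) -> None:
        # A single entry covers every angle photographed for the same product.
        for image_id in image_ids:
            self.by_image[image_id] = info

    def lookup(self, image_id: str) -> ProductInfo:
        return self.by_image[image_id]


registry = ImageRegistry()
# "fig3a" and "fig3b" stand in for the two shots of A coffee in FIG. 3.
registry.register(["fig3a", "fig3b"], ProductInfo("A coffee", 67))
```

Entering the information once per product and sharing it across all angles is one way to obtain the reduction in input effort that the paragraph above describes.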

The background removing unit 13 removes the background of the specific color from each of the images received by the receiving unit 11. FIG. 4 illustrates images from which the background of the specific color has been removed. FIGS. 4A and 4B are images of A coffee Ma obtained by removing the green background B from the images of FIGS. 3A and 3B. FIGS. 4C and 4D are images of B coffee Mb obtained by removing the green background B from the images of FIGS. 3C and 3D.
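Background removal of this kind can be sketched as a simple color-key test. The following is an illustration only, not the method of the embodiment: the green-dominance thresholds `min_green` and `margin`, and the representation of an image as nested lists of RGB tuples, are assumptions made here to keep the example dependency-free.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def is_key_color(pixel: RGB, min_green: int = 120, margin: int = 40) -> bool:
    """Treat a pixel as background when green is bright and clearly dominant."""
    r, g, b = pixel
    return g >= min_green and g - max(r, b) >= margin


def remove_background(image: List[List[RGB]]) -> List[List[RGBA]]:
    """Return an RGBA image whose key-colored pixels are fully transparent."""
    return [
        [(r, g, b, 0 if is_key_color((r, g, b)) else 255) for (r, g, b) in row]
        for row in image
    ]


GREEN = (0, 200, 0)    # stand-in for the green background B
BROWN = (120, 70, 30)  # stand-in for a product pixel
shot = [[GREEN, BROWN],
        [GREEN, GREEN]]
cutout = remove_background(shot)
```

Here only the brown pixel keeps an alpha of 255; the three green pixels receive an alpha of 0 and drop out when the cutout is later composited.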

The image synthesis unit 14 shown in FIG. 1 generates composite images by combining each image from which the background of the specific color has been removed by the background removing unit 13 with each of a plurality of background images. The background images used for composition are images that can be the background of the product when the product is sold at a store. For example, it is preferable to use an image whose background is the inside of a shopping basket into which products are placed, or an image whose background is a shelf on which products are displayed. The more background images there are, the greater the learning effect on the learning model. On the other hand, the more background images there are, the more composition effort and learning time are required. It is therefore desirable to determine the number of background images appropriately, weighing the learning effect against the effort involved.

FIG. 5 shows an example of a background image and a composite image. FIG. 5A is a background image showing a shelf on which beverages are displayed. FIG. 5B is a composite image created by combining the image of A coffee Ma of FIG. 4A onto the background image of FIG. 5A.
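Combining one cutout with several backgrounds can be sketched as follows. This is a minimal sketch under assumptions: images are nested lists of pixel tuples, the cutout carries a binary alpha channel (0 or 255), and the function names `composite` and `composite_onto_all` and the paste offset parameters are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def composite(background: List[List[RGB]], cutout: List[List[RGBA]],
              top: int = 0, left: int = 0) -> List[List[RGB]]:
    """Paste the opaque pixels of `cutout` onto a copy of `background`."""
    result = [list(row) for row in background]  # leave the input untouched
    for y, row in enumerate(cutout):
        for x, (r, g, b, a) in enumerate(row):
            ty, tx = top + y, left + x
            inside = 0 <= ty < len(result) and 0 <= tx < len(result[ty])
            if a > 0 and inside:
                result[ty][tx] = (r, g, b)
    return result


def composite_onto_all(backgrounds: List[List[List[RGB]]],
                       cutout: List[List[RGBA]]) -> List[List[List[RGB]]]:
    """One composite image per background image."""
    return [composite(bg, cutout) for bg in backgrounds]


SHELF = (200, 200, 200)   # stand-in for a shelf background pixel
BASKET = (90, 90, 160)    # stand-in for a shopping-basket background pixel
backgrounds = [[[SHELF, SHELF], [SHELF, SHELF]],
               [[BASKET, BASKET], [BASKET, BASKET]]]
cutout = [[(0, 200, 0, 0), (120, 70, 30, 255)]]  # transparent, then product
composites = composite_onto_all(backgrounds, cutout)
```

Each background yields one composite, so the number of composites per cutout equals the number of background images.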

The composite image multiplying unit 15 shown in FIG. 1 adjusts attribute information corresponding to each composite image generated by the image synthesis unit 14 and generates a plurality of composite images having different attribute information. The attribute information includes, for example, the contrast of the composite image, the size of the composite image, the angle by which the entire image is rotated relative to the original composite image, and the noise contained in the composite image. Because the number of composite images can be increased simply by changing attribute information, the process of generating different composite images can be sped up.
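The four attributes named above can each be turned into a simple image transform. The sketch below is illustrative and rests on assumptions: images are nested lists of RGB tuples, rotation is restricted to 90 degrees (arbitrary angles would need interpolation), resizing is nearest-neighbor upscaling, and the specific factors in `multiply` are arbitrary choices, not values from the embodiment.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, ...]
Image = List[List[Pixel]]


def _clamp(value: float) -> int:
    return max(0, min(255, int(round(value))))


def adjust_contrast(img: Image, factor: float) -> Image:
    """Scale each channel's distance from mid-gray (128) by `factor`."""
    return [[tuple(_clamp(128 + (c - 128) * factor) for c in px) for px in row]
            for row in img]


def upscale(img: Image, k: int) -> Image:
    """Nearest-neighbor enlargement by an integer factor `k`."""
    return [[px for px in row for _ in range(k)] for row in img for _ in range(k)]


def rotate_90(img: Image) -> Image:
    """Rotate the whole image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def add_noise(img: Image, amplitude: int, seed: int = 0) -> Image:
    """Add uniform noise in [-amplitude, amplitude] to every channel."""
    rng = random.Random(seed)  # seeded so the variants are reproducible
    return [[tuple(_clamp(c + rng.randint(-amplitude, amplitude)) for c in px)
             for px in row] for row in img]


def multiply(img: Image) -> List[Image]:
    """Generate one variant per attribute from a single composite image."""
    return [adjust_contrast(img, 1.5), upscale(img, 2),
            rotate_90(img), add_noise(img, 10)]


a, b, c, d = (128, 128, 128), (228, 28, 128), (0, 0, 0), (255, 255, 255)
sample = [[a, b],
          [c, d]]
variants = multiply(sample)
```

Each variant keeps the product name and price of the composite it was derived from, so no additional product information needs to be entered.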

The more composite images are generated by multiplication, the greater the learning effect on the learning model. On the other hand, the more images are generated, the more adjustment effort and learning time are required. It is therefore desirable to determine the number of composite images to generate appropriately, weighing the learning effect against the effort involved.
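The trade-off among angles, backgrounds, and multiplied variants can be made concrete with a small count. This is a back-of-the-envelope sketch: the function name and the example numbers are hypothetical, and it assumes every angle is combined with every background and every composite receives the same number of variants.

```python
def training_image_count(angles: int, backgrounds: int,
                         variants_per_composite: int) -> int:
    """Number of training images for one product.

    Each shot (one per angle) is combined with every background, and each
    resulting composite is kept alongside its multiplied variants.
    """
    composites = angles * backgrounds
    return composites * (1 + variants_per_composite)


# Hypothetical setting: 4 angles, 10 backgrounds, 4 variants per composite.
count = training_image_count(4, 10, 4)
print(count)  # 200
```

Because the three factors multiply, doubling any one of them doubles both the size of the teacher data and, roughly, the learning time.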

The learning unit 16 generates a learning model by learning teacher data including combinations of the composite images generated by the image synthesis unit 14 and the composite image multiplying unit 15 and the product name and price included in the product information corresponding to each composite image. Based on the learned teacher data, the learning unit 16 outputs the product name and price corresponding to an image of one or more products input to the learning model.

The function of the learning unit 16 can be realized using, for example, a deep learning model for object detection such as YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), or R-CNN (Regions with CNN features).
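Object detectors of this kind typically emit a class index, a bounding box, and a confidence score per detected object. One plausible way to obtain a product name and price from such output is to look the class index up in a catalog built from the registered product information. The sketch below assumes this arrangement; the `Detection` type, the `CATALOG` list, the score threshold, and the sample detections are all hypothetical and are not taken from the embodiment or from any particular detector's API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    class_id: int                           # class index predicted by the detector
    box: Tuple[float, float, float, float]  # (x, y, width, height), normalized
    score: float                            # confidence in [0, 1]


# Hypothetical catalog: index i holds the name and price of class i.
CATALOG: List[Tuple[str, int]] = [("A coffee", 67), ("B coffee", 95)]


def label_detections(detections: List[Detection],
                     catalog: List[Tuple[str, int]],
                     min_score: float = 0.5):
    """Attach a product name and price to each sufficiently confident detection."""
    labels = []
    for det in detections:
        if det.score < min_score:
            continue  # drop low-confidence detections
        name, price = catalog[det.class_id]
        labels.append((name, price, det.box))
    return labels


detections = [
    Detection(0, (0.10, 0.20, 0.15, 0.40), 0.91),
    Detection(1, (0.50, 0.20, 0.15, 0.40), 0.88),
    Detection(1, (0.80, 0.70, 0.10, 0.10), 0.12),  # too uncertain, discarded
]
labels = label_detections(detections, CATALOG)
```

The bounding box retained in each label is what would allow a terminal to draw the name and price next to the corresponding product on screen.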

The configuration of the terminal device according to the embodiment will be described with reference to FIG. 2. As its physical configuration, the terminal device 6 includes, for example, a control device 61 including a CPU (processor) and memory, a storage device 62, a touch panel 63 including an input device and a display, a communication device 64, and a camera (photographing device) 65.

An application program incorporating the learning model generated by the learning unit 16 is installed on the terminal device 6. The functions described below are realized when the CPU executes the application program stored in the memory or the storage device 62.

As its functional configuration, the terminal device 6 has, for example, a product image display function, an input function, a product information display function, and a total amount display function. The product image display function displays, on the display, one or more products being photographed by the camera 65. The input function inputs an image of the one or more products displayed on the display to the learning model.

The product information display function displays the product name and price output from the learning model on the display in association with the product. FIG. 6 shows an example of a screen that displays product names and prices in association with products. The display of the touch panel 63 of the terminal device 6 shows the inside of a shopping basket being photographed by the camera 65. The shopping basket contains A coffee Ma, B coffee Mb, an onion Mc, an apple Md, and a banana Me.

In FIG. 6, the product name "A coffee" and the price "¥67" are displayed in association with A coffee Ma, and the product name "B coffee" and the price "¥95" are displayed in association with B coffee Mb. Similarly, the product name "Onion" and the price "¥100" are displayed in association with the onion Mc, the product name "Apple" and the price "¥109" in association with the apple Md, and the product name "Banana" and the price "¥201" in association with the banana Me. Clicking the button T at the bottom of the screen issues a totaling instruction, and the screen transitions to the one shown in FIG. 7, described later.

The total amount display function displays, in accordance with a totaling instruction from the user, the total amount and related information for the one or more products displayed on the display. FIG. 7 shows an example of a screen displaying the total amount for the products. The display of the touch panel 63 of the terminal device 6 shows itemized information and lowest-price information for each product being photographed by the camera 65, together with the total and the lowest-price total.

The itemized information is the product name and price of each product being photographed by the camera 65. The lowest-price information is the lowest price among nearby stores selling the product and the name of the store offering that price. The total is the sum of the prices of the products being photographed by the camera 65, and the lowest-price total is the sum of the lowest prices of those products. In FIG. 7, "572 yen" is displayed as the total and "468 yen" as the lowest-price total.
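The total shown in FIG. 7 follows directly from the five prices shown in FIG. 6, as the short sketch below confirms. The function names are hypothetical. The text gives only the lowest-price total (468 yen), not the per-product lowest prices, so `lowest_price_total` is shown without sample data rather than with invented figures.

```python
from typing import Dict, List, Tuple


def total(prices: Dict[str, int]) -> int:
    """Sum of the displayed prices of all recognized products."""
    return sum(prices.values())


def lowest_price_total(offers: Dict[str, List[Tuple[str, int]]]) -> int:
    """Sum, over products, of the lowest price among (store, price) offers."""
    return sum(min(price for _, price in stores) for stores in offers.values())


# Prices displayed in FIG. 6.
fig6_prices = {"A coffee": 67, "B coffee": 95, "Onion": 100,
               "Apple": 109, "Banana": 201}
print(total(fig6_prices))  # 572
```

The result, 67 + 95 + 100 + 109 + 201 = 572, matches the "572 yen" total displayed in FIG. 7.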

In this way, when one or more products are shown on the display using the camera 65 of the terminal device 6, the total amount for the products, the lowest price of each product among stores including nearby stores, and similar information are displayed, making it possible to support the user's shopping efficiently.

Next, an example of the operation of the learning model generation device 1 according to the embodiment will be described with reference to FIG. 8.

First, the receiving unit 11 of the learning model generation device 1 receives, from the camera 9, images of a product placed in front of a green background and photographed from a plurality of angles (step S101).

Next, the input receiving unit 12 receives input of the product name and price as product information to be registered in association with each image received in step S101 (step S102).

Next, the background removing unit 13 removes the green background from each image received in step S101 (step S103).

Next, the image synthesis unit 14 generates composite images by combining each image from which the green background was removed in step S103 with each of a plurality of background images (step S104).

Next, the composite image multiplying unit 15 adjusts the attribute information corresponding to each composite image generated in step S104 and generates a plurality of composite images having different attribute information (step S105).

Next, the learning unit 16 generates a learning model using teacher data including combinations of the composite images generated in steps S104 and S105 and the product name and price included in the product information corresponding to each composite image (step S106). The operation then ends.

Next, an example of the operation of the terminal device 6 according to the embodiment will be described with reference to FIG. 9.

First, the terminal device 6 displays, on the display, one or more products being photographed by the camera 65 (step S201).

Next, the terminal device 6 inputs an image of the one or more products displayed on the display to the learning model (step S202).

Next, the terminal device 6 displays the product name and price output from the learning model on the display in association with the product (step S203).

Next, the terminal device 6 determines whether the user has issued a totaling instruction by clicking the button T (step S204). If the determination is NO (step S204; NO), the terminal device 6 waits until a totaling instruction is issued. If the products being photographed by the camera 65 change while the terminal device 6 is waiting, the process returns to step S201, and steps S201 to S203 are executed for the changed products.

If, on the other hand, it is determined in step S204 that a totaling instruction has been issued (step S204; YES), the terminal device 6 displays, on the display, the total amount and related information for the products displayed on the display (step S205). The operation then ends.
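The control flow of steps S201 to S205 can be sketched as a small event loop. This is a simplified model under assumptions: camera frames and the button press arrive as a sequence of `(kind, payload)` events, and `recognize` stands in for the learning model. Both the event format and the function names are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Label = Tuple[str, int]           # (product name, price in yen)
Event = Tuple[str, Optional[str]]  # ("frame", frame_id) or ("total", None)


def run_session(events: Iterable[Event],
                recognize: Callable[[str], List[Label]]) -> Optional[int]:
    """Process events until a totaling instruction arrives; return the total."""
    shown: List[Label] = []
    for kind, payload in events:
        if kind == "frame":
            # S201 to S203: a new view is displayed, passed to the model,
            # and the resulting names and prices replace the previous ones.
            shown = recognize(payload)
        elif kind == "total":
            # S204 YES, then S205: total the products currently displayed.
            return sum(price for _, price in shown)
    return None  # no totaling instruction was issued


# Stand-in for the learning model: a fixed answer per frame identifier.
fake_model = {
    "basket-1": [("A coffee", 67)],
    "basket-2": [("A coffee", 67), ("B coffee", 95)],
}
events = [("frame", "basket-1"), ("frame", "basket-2"), ("total", None)]
result = run_session(events, lambda frame_id: fake_model[frame_id])
print(result)  # 162
```

The second frame replaces the labels from the first, which mirrors the return to step S201 when the photographed products change before the totaling instruction is issued.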

As described above, the learning model generation device 1 of the embodiment receives images of a product placed in front of a green background and photographed from a plurality of angles, and receives input of the product name and price to be registered in association with each image. It then generates composite images by combining each image from which the green background has been removed with a plurality of background images that can be the background of the product when it is sold at a store, and generates a learning model using teacher data including combinations of the composite images and the corresponding product names and prices. Inputting an image of one or more products to this learning model then yields the product name and price corresponding to that image.

Therefore, the learning model generation device 1 of the embodiment makes it possible to check product names and prices easily.

In addition, by adjusting the attribute information corresponding to each generated composite image and thereby generating a further plurality of composite images having different attribute information, the learning model generation device 1 can speed up the process of increasing the number of composite images used as teacher data.

[Modification]
The present invention is not limited to the embodiment described above and can be implemented in various other forms without departing from the gist of the present invention. The above embodiment is therefore merely an example in every respect and should not be construed as limiting. For example, the order of the processing steps described above may be changed arbitrarily, or the steps may be executed in parallel, as long as no contradiction arises in the processing.

The components of the learning model generation device 1 are likewise not limited to those of the embodiment described above, and any component may be omitted or added as needed. For example, the composite image multiplying unit 15 may be omitted from the functional configuration of the learning model generation device 1.

DESCRIPTION OF SYMBOLS 1 ... learning model generation device, 6 ... terminal device, 9 ... camera, 10 ... control device, 11 ... receiving unit, 12 ... input receiving unit, 13 ... background removing unit, 14 ... image synthesis unit, 15 ... composite image multiplying unit, 16 ... learning unit, 20 ... communication device, 30 ... storage device, 40 ... input device, 61 ... control device, 62 ... storage device, 63 ... touch panel (input device and display), 64 ... communication device, 65 ... camera.

Claims

A learning model generation device comprising:
a receiving unit that receives images of a product, placed in front of a background of a specific color that is removed during chroma key composition, photographed from a plurality of angles;
an input receiving unit that receives input of at least the product name and price of the product corresponding to the images, as product information to be registered in association with each of the images received by the receiving unit;
a background removing unit that removes the background of the specific color from each of the images received by the receiving unit;
an image synthesis unit that generates composite images by combining each of the images from which the background of the specific color has been removed by the background removing unit with a plurality of background images that can be the background of the product when the product is sold at a store; and
a learning unit that learns using teacher data including combinations of the composite images generated by the image synthesis unit and the product name and price included in the corresponding product information, and generates a learning model that outputs the product name and price corresponding to an input image of one or more products.

The learning model generation device according to claim 1, further comprising a composite image multiplying unit that adjusts attribute information corresponding to each of the composite images generated by the image synthesis unit and generates a plurality of composite images having different attribute information,
wherein the learning unit learns using teacher data including combinations of the composite images generated by the image synthesis unit and the composite image multiplying unit and the product name and price included in the corresponding product information.

The learning model generation device according to claim 2,
wherein the attribute information includes at least one of contrast, size, rotation angle, and noise.

The learning model generation device according to any one of claims 1 to 3,
wherein the background images include at least an image whose background is the inside of a shopping basket.

A program for causing a computer to realize:
a function of displaying, on a display, one or more products photographed by a photographing unit;
a function of inputting an image of the one or more products displayed on the display to a learning model;
a function of displaying the product name and price output from the learning model on the display in association with the product; and
a function of displaying, on the display, the total amount corresponding to the one or more products displayed on the display, in accordance with a totaling instruction from the user.

A method of manufacturing a terminal device by distributing, for installation on the terminal device, an application program that causes the terminal device to execute:
a process of displaying, on a display, one or more products photographed by a photographing unit;
a process of inputting an image of the one or more products displayed on the display to a learning model;
a process of displaying the product name and price output from the learning model on the display in association with the product; and
a process of displaying, on the display, the total amount corresponding to the one or more products displayed on the display, in accordance with a totaling instruction from the user.