JP7155295B2 - Image recognition device, image recognition program, and image recognition method

Info

Publication number: JP7155295B2
Application number: JP2020565674A
Authority: JP
Inventors: 一成岩永; 海斗笹尾
Original assignee: Hitachi Kokusai Electric Inc
Current assignee: Hitachi Kokusai Electric Inc
Priority date: 2019-01-08
Filing date: 2019-12-20
Publication date: 2022-10-18
Anticipated expiration: 2039-12-20
Also published as: JPWO2020145085A1; WO2020145085A1

Description

The present invention relates to an image recognition device, an image recognition program, and an image recognition method.

In businesses that serve food and drink to customers, providing service in a timely manner is important for both customer satisfaction and efficient business operation.

For this reason, employees have had to walk around the tables in the store, repeating visual checks, to keep track of how far the meal has progressed at each table.

Another known method places a customer-operated call button or ordering tablet on each table and leaves the timing of service requests to the customer's judgment.

In addition, Patent Document 1 discloses a technique that "photographs an image of a container used for a meal and the food contained in the container, measures the area of the remaining food from the image to determine whether or not the meal has finished, and outputs information indicating the end of the meal."

JP 2015-138452 A

Besides serving customers, employees have other duties such as ordering ingredients, food preparation, cooking, clearing up, handling payment at the register, and cleaning the store. A method in which employees closely watch the progress of the meal at each table and constantly judge the timing of service instructions therefore places a heavy burden on them.

On the other hand, with a call button or ordering tablet, employees only respond passively to customer requests, so it is difficult to raise customer satisfaction any further.

Moreover, the technique of Patent Document 1 discloses only a method of measuring the area of the remaining food, and does not consider flexible determination of the many other ways in which a meal can progress.

Accordingly, an object of the present invention is to provide a technique for providing services with high customer satisfaction.

To solve the above problem, one representative image recognition device of the present invention includes: a video acquisition unit that acquires video capturing a dining space; an image recognition unit that, using a learning model machine-trained on the situation related to eating and drinking (hereinafter, the "meal status") for a group of videos of the dining space, recognizes the meal status in the video acquired by the video acquisition unit; and a state determination unit that determines the progress state of a customer's eating and drinking based on the recognized meal status.

The image recognition technology of the present invention makes it possible to provide services with high customer satisfaction.
Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

A diagram showing the configuration of the business support system and the image recognition device.
A diagram showing the configuration of the image recognition unit and the state determination unit.
An explanatory diagram showing an example of training data.
A flowchart explaining the operation of the image recognition device.
A diagram showing the video of the dining space acquired from the imaging device.
A diagram showing a detection area.
A diagram showing a detection area divided into blocks.
A diagram showing per-customer detection areas.
A diagram showing the inference results of the learning model for each detection area.
A diagram explaining a 3D convolutional neural network.
A diagram showing the result of object detection by YOLO, SSD, or the like.
A diagram showing the class classification output from the learning model.
A diagram showing the state transition model.
A diagram showing an example of transition conditions of the state transition model.
A diagram showing an example of service instructions corresponding to progress states.
A diagram showing a display example of a service instruction.

Embodiments of the present invention will be described with reference to the drawings.
<Configuration of Example>
FIG. 1 is a block diagram showing the overall configuration of a business support system 1 and an image recognition device 100.
The hardware of the image recognition device 100 may be a computer (including an information processing system or the like) equipped with a CPU (Central Processing Unit), a memory unit, a communication unit, and the like.

The various functions of the image recognition device 100 described later are realized by executing an image recognition program on this hardware.

Part or all of the hardware may be replaced by a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), a GPU (Graphics Processing Unit), or the like. Part or all of the hardware may also be placed, centrally or in distributed form, on servers on a network as a cloud service and shared by multiple users via the network.

As shown in FIG. 1, the business support system 1 includes the image recognition device 100, an imaging device 101, a display/instruction terminal 102, and a display output device 103. The image recognition device 100 in turn includes a video acquisition unit 110, an image recognition unit 120, a state determination unit 130, a data communication unit 140, a recording control unit 150, a display control unit 170, and a recording device 160.

The imaging device 101 consists of one or more cameras and is installed on a ceiling, a wall, a tabletop, or the like so as to capture the dining space. For example, a CCTV camera, a camera built into a robot placed on a table, a fisheye-lens camera, or a web camera can be used as the imaging device 101.

The video acquisition unit 110 acquires video of the dining space from the imaging device 101, via a wireless link, a wired cable, or a network, as image data in a one-dimensional or two-dimensional array. The video acquisition unit 110 also acquires groups of videos of the dining space from an external video database and from the recording device 160.

The image recognition unit 120 performs image recognition on the video acquired by the video acquisition unit 110 to recognize the meal status, the menu status, and the presence or absence of a person.

In the present disclosure, the "meal status" indicates how far eating and drinking has progressed, for example "meal not served", "meal served", or "meal finished".

In the present disclosure, the "menu status" indicates the state of the menu that the customer reads, for example "no menu", "menu present (closed state)", or "menu present (open state)".

"Menu present (closed state)" may include, in addition to the situation where the menu book is closed and its inside pages cannot be read, the situation where the menu is stored in a partition or the like and cannot be read. "Menu present (open state)" may include, in addition to the situation where the menu book is open and its inside pages can be read, the situation where the menu has been taken out of a partition or the like and can be read, and the situation where the customer is holding the menu.

In the present disclosure, the "presence status of a person" indicates whether a person is present, for example "occupied" or "vacant".
"Occupied" may include, in addition to the situation where a person is sitting on a chair, the situation where a person is standing in a space for standing diners in a stand-up dining space.

The state determination unit 130 determines the progress state of the customer's eating and drinking based on the image recognition results, and reports the timing of the service instruction corresponding to that progress state.

The data communication unit 140 carries data between the state determination unit 130, the image recognition unit 120, the display/instruction terminal 102, a data center (not shown), and the like via a wireless link, a wired cable, or a network.

The display/instruction terminal 102 is an information processing device such as a POS (point-of-sale) terminal. It receives, via the data communication unit 140, the service instruction timing reported by the state determination unit 130, and notifies the employee in charge of the customer by means of an event, an image, vibration, sound, or the like.

Customer information (order data, payment data, and the like) that an employee enters on the display/instruction terminal 102 is transmitted to the state determination unit 130 via the data communication unit 140. Other information that the employee enters on the display/instruction terminal 102 (operations to acknowledge or stop a notification, whether a service instruction has been carried out, and the like) is likewise transmitted to the state determination unit 130 via the data communication unit 140.

The recording control unit 150 controls recording on the recording device 160 (turning recording on and off, and controlling the frame rate, compression ratio, recording interval, and the like) based on the determination made by the state determination unit 130.

The recording device 160 records the video acquired by the video acquisition unit 110 under the control of the recording control unit 150. The recording device 160 also records the results of image recognition and state determination by the image recognition device 100.

The display control unit 170 displays, on the display output device 103, the video acquired by the video acquisition unit 110 together with the results of image recognition, state determination, and service instruction by the image recognition device 100.

FIG. 2 is a block diagram showing the configuration of the image recognition unit 120 and the state determination unit 130.
In the figure, the image recognition unit 120 includes a parameter setting unit 121, a preprocessing unit 122, a feature extraction unit 123, a recognition unit 124, and a learning model 125. The state determination unit 130 includes a state transition unit 131 and an instruction output unit 132.

The parameter setting unit 121 sets, changes, loads, and saves the parameters needed for image processing.

The preprocessing unit 122 trims and masks detection areas in the video acquired by the video acquisition unit 110 according to the parameters held by the parameter setting unit 121. The preprocessing unit 122 may also apply preprocessing such as a smoothing filter, an edge enhancement filter, or density conversion to reduce noise, flicker, and the like in the video. It may further perform color conversion, for example to RGB color or monochrome, depending on the purpose of the image recognition, and may downscale the image data to reduce the processing load.
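The trimming, masking, and downscaling performed by the preprocessing unit 122 can be illustrated with a minimal sketch. The frame is modelled as a plain list of pixel rows, and the function names `crop`, `apply_mask`, and `downscale` are hypothetical helpers chosen for this illustration; they are not taken from the patent.

```python
from typing import List, Tuple

# Assumption: a grayscale frame is represented as rows of integer pixel values.
Image = List[List[int]]


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Trim a rectangular detection area out of the full frame."""
    return [row[left:left + width] for row in image[top:top + height]]


def apply_mask(region: Image, masked: List[Tuple[int, int, int, int]], fill: int = 0) -> Image:
    """Overwrite fixed areas (for example table decorations) with a constant value.

    Each entry of `masked` is (top, left, height, width) inside the region.
    """
    out = [row[:] for row in region]  # copy so the input is not modified
    for top, left, height, width in masked:
        for y in range(top, min(top + height, len(out))):
            for x in range(left, min(left + width, len(out[y]))):
                out[y][x] = fill
    return out


def downscale(region: Image, factor: int) -> Image:
    """Reduce resolution by keeping every `factor`-th pixel in both directions."""
    return [row[::factor] for row in region[::factor]]
```

In practice an image library would replace these loops; the sketch only shows the order of operations: trim first, then mask, then optionally shrink.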

The feature extraction unit 123 applies image processing to the preprocessed video of the detection area that emphasizes or extracts the image features relevant to recognition, in order to improve the accuracy of the subsequent image recognition.

The recognition unit 124 performs image recognition, using the learning model 125, on the video of the detection area from which the feature extraction unit 123 has extracted features.

The learning model 125 is a convolutional neural network or the like. The training data for the learning model 125 is created by collecting a group of videos of detection areas in advance and assigning, as teacher values, a class for the presence of a person, a class for the meal status, and a class for the menu status.

For example, the following classes serve as teacher values.
<Meal status>
・Meal not served
・Meal served
・Meal finished
<Menu status>
・Menu present (closed state)
・Menu present (open state)
・No menu
<Presence status of a person>
・Occupied
・Vacant
FIG. 3 is an explanatory diagram showing an example of training data to which such teacher values have been assigned.

Group A in the figure is a group of videos given the teacher values vacant, meal not served, and no menu.

Group B is a group of videos given the teacher values occupied, meal not served, and menu present (open state).

Group C is a group of videos given the teacher values occupied, meal served, and no menu.

Group D is a group of videos given the teacher values occupied, meal finished, and no menu.
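As an illustration, the teacher values of groups A to D can be written down as a small data structure. The enum and class names below (`Presence`, `Meal`, `Menu`, `Label`, `GROUP_LABELS`) are assumptions made for this sketch; the patent specifies only the classes themselves.

```python
from dataclasses import dataclass
from enum import Enum


class Presence(Enum):
    OCCUPIED = "occupied"
    VACANT = "vacant"


class Meal(Enum):
    NOT_SERVED = "meal not served"
    SERVED = "meal served"
    FINISHED = "meal finished"


class Menu(Enum):
    NONE = "no menu"
    CLOSED = "menu present (closed state)"
    OPEN = "menu present (open state)"


@dataclass(frozen=True)
class Label:
    """One teacher value: a class from each of the three classifications."""
    presence: Presence
    meal: Meal
    menu: Menu


# Teacher values for the video groups A to D described for FIG. 3.
GROUP_LABELS = {
    "A": Label(Presence.VACANT, Meal.NOT_SERVED, Menu.NONE),
    "B": Label(Presence.OCCUPIED, Meal.NOT_SERVED, Menu.OPEN),
    "C": Label(Presence.OCCUPIED, Meal.SERVED, Menu.NONE),
    "D": Label(Presence.OCCUPIED, Meal.FINISHED, Menu.NONE),
}
```

Keeping the three classifications as separate fields mirrors the description, in which each video carries one class per classification rather than a single combined class.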

The videos of this training data are fed to the input layer of the learning model 125, and machine learning (for example, backpropagation) is performed so as to minimize the error between the output layer of the learning model 125 and the teacher values.

As a result of this machine learning, when the video of a detection area is given to the input layer of the learning model 125, the output layer outputs inference results such as the "person presence class", the "meal status class", and the "menu status class".

The state transition unit 131 has a state transition model that models how a customer's eating and drinking state changes from entering the store to leaving it. The state transition unit 131 applies the image recognition results of the recognition unit 124 (inference results such as the "person presence class", the "meal status class", and the "menu status class") to this model as transition conditions and transitions the customer's state accordingly, thereby determining the customer's current progress state.

The state transition unit 131 also predicts the customer's state transition based on elapsed time and moves the customer to the next progress state.

Furthermore, the state transition unit 131 acquires, via the data communication unit 140, customer information such as orders and payments entered on the display/instruction terminal 102. The state transition unit 131 gives this customer information priority when deciding the customer's progress state.

The information that caused such state transitions is backed up to the history 160a in the recording device 160 via the recording control unit 150.

The instruction output unit 132 reports the service instruction corresponding to the customer's progress state to the display/instruction terminal 102 via the data communication unit 140.

<Operation of Example>
FIG. 4 is a flowchart explaining the operation of the image recognition device 100.
The operation of the embodiment is described below following the step numbers shown in FIG. 4.

Step S101: The video acquisition unit 110 acquires video of the dining space from the imaging device 101. FIG. 5 is a diagram showing the acquired video 301 of the dining space. The video 301 contains a plurality of table-level detection areas 302 to 305, each covering one table and its seats.

Step S102: After correcting distortion in the video 301, the parameter setting unit 121 sets sections for trimming the table-level detection areas 302 to 305. The sections may be set by the user specifying boundary lines between tables. Alternatively, the parameter setting unit 121 may set the table-level sections automatically by recognizing the color and shape of the tables in the video 301.

The parameter setting unit 121 may also set image quality correction (luminance correction, color correction, gamma correction, and the like) for each of the table-level detection areas 302 to 305, based on differences in shooting conditions such as the lighting at each table.

Furthermore, the parameter setting unit 121 may set masks to exclude fixed areas unrelated to meals and menus, such as decorations on the table.

Based on this series of settings made by the parameter setting unit 121, the preprocessing unit 122 trims the table-level detection areas 302 to 305 from the video 301. FIG. 6 is a diagram showing the trimmed detection area 302.

For simplicity, only the processing of the detection area 302 is described from here on; the same processing is performed on the remaining detection areas 303 to 305, either in parallel or sequentially.

Step S103: The preprocessing unit 122 further divides the table-level detection area 302.
FIG. 7 is a diagram showing how the table-level detection area 302 is divided into block-level detection areas 501 arranged in m columns by n rows. Subdividing the processing unit in this way makes the units handled by the subsequent feature detection and image recognition smaller and allows them to be sped up, for example by parallelization.
FIG. 8 is a diagram showing how per-customer detection areas 601 to 603 are determined. The preprocessing unit 122 detects whether customers are present by image processing such as background subtraction, face detection, or a learning model. The preprocessing unit 122 then determines the detection areas 601 to 603 so that each covers the area of a detected customer together with the area of the food, drink, and menu provided to that customer. Dividing the detection areas 601 to 603 by each customer's area of activity in this way makes per-customer image recognition possible.
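The block division of FIG. 7 can be sketched as follows. This is a minimal illustration under the assumption that blocks are axis-aligned rectangles that tile the detection area without gaps; the function name `split_into_blocks` and the integer rounding scheme are choices made here, not details from the patent.

```python
from typing import List, Tuple


def split_into_blocks(height: int, width: int, n_rows: int, m_cols: int) -> List[Tuple[int, int, int, int]]:
    """Divide a height x width detection area into n_rows x m_cols blocks.

    Returns (top, left, block_height, block_width) tuples in row-major order.
    Integer boundaries are computed so that the blocks tile the area exactly,
    even when the size is not divisible by the number of rows or columns.
    """
    blocks = []
    for r in range(n_rows):
        top = r * height // n_rows
        bottom = (r + 1) * height // n_rows
        for c in range(m_cols):
            left = c * width // m_cols
            right = (c + 1) * width // m_cols
            blocks.append((top, left, bottom - top, right - left))
    return blocks
```

Because each block is independent of the others, the returned list can be handed to a pool of workers, which is one way to obtain the parallel speed-up mentioned above.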

Furthermore, where the seating consists of individual chairs each seating one person, rather than a bench on which several people sit freely as shown in FIG. 8, detection areas may be divided per chair instead of per customer as with the detection areas 601 to 603.

The types of detection area described above are chosen according to the design and processing load of the downstream feature extraction unit 123 and recognition unit 124 and to circumstances specific to the dining space.

Step S104: For the detection area, the feature extraction unit 123 extracts, for example, motion features such as the amount of change from a past image obtained by background subtraction or the like and motion vectors based on optical flow, and image features such as the edge information, color information, and luminance information contained in the image. The extracted features are output to the recognition unit 124 as video data of the detection area.
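As a rough illustration of the "amount of change from a past image", the sketch below computes the fraction of pixels that differ noticeably between two frames. It uses simple frame differencing as a stand-in for background subtraction, and the function name `change_amount` and the default threshold are assumptions made for this example.

```python
from typing import List

Image = List[List[int]]  # assumption: grayscale frame as rows of pixel values


def change_amount(previous: Image, current: Image, threshold: int = 20) -> float:
    """Return the fraction of pixels whose absolute difference exceeds `threshold`.

    Both frames are assumed to have the same size. A value near 0 means the
    detection area is static; a larger value indicates motion or a scene change.
    """
    changed = 0
    total = 0
    for prev_row, cur_row in zip(previous, current):
        for p, c in zip(prev_row, cur_row):
            total += 1
            if abs(c - p) > threshold:
                changed += 1
    return changed / total if total else 0.0
```

A scalar like this could be attached to the detection area alongside the image data; optical-flow motion vectors would need a dedicated library and are not shown here.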

The feature extraction unit 123 may also attach to the detection area rule-based feature quantities that focus on the shapes of people, plates, and the like, such as the number of circles or straight lines detected by a Hough transform. The feature extraction unit 123 may also extract feature information on human body parts such as hands and feet by skeleton recognition of a person's limbs (for example, OpenPose) and attach it to the detection area.

The recognition unit 124 feeds the multidimensional array data of the detection area received from the feature extraction unit 123 to the input layer of the learning model 125.

Inside the learning model 125, array operations by convolution layers, pooling layers, and fully connected layers, and nonlinear operations by activation functions, are applied to the multidimensional array data of the detection area. Non-image feature information attached to the detection area may be input to the fully connected layers.

Because these array operations use the coefficient values, bias values, and the like of each layer produced by the machine learning described above, the output layer of the learning model 125 outputs inference results such as the "person presence class", the "meal status class", and the "menu status class" described above.

FIG. 9 is a diagram showing the inference results of the learning model 125 for each of the detection areas 601, 602, and 603.

Actions such as using a fork or other utensil are also useful information for estimating the meal status correctly. The learning model 125 may therefore perform inference that takes motion into account, using a method such as a 3D convolutional neural network (see FIG. 10) that convolves several past frames of the time series simultaneously. In this case, the learning model 125 can detect customer movements such as "the customer eating", "the customer reading the menu", and "the customer looking at a smartphone during the meal".
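To make concrete what "convolving several frames simultaneously" means, the following sketch applies a single 3D kernel to a short clip in pure Python. It shows only the core operation of one 3D convolution layer (computed as cross-correlation, without padding, stride, bias, or activation); the function name `conv3d_valid` is hypothetical, and a real model would use a deep learning framework instead.

```python
from typing import List

Clip = List[List[List[float]]]  # indexed as clip[t][y][x]: time, row, column


def conv3d_valid(clip: Clip, kernel: Clip) -> Clip:
    """Slide a 3D kernel over time, height, and width and sum the products.

    The output shrinks by (kernel size - 1) along each axis ("valid" mode).
    """
    t_len, h_len, w_len = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(t_len - kt + 1):
        plane = []
        for y in range(h_len - kh + 1):
            row = []
            for x in range(w_len - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out
```

For example, a kernel of shape 2x1x1 with weights -1 and +1 responds to per-pixel change between consecutive frames, which is why a kernel that spans the time axis can pick up movements that a single-frame 2D convolution cannot.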

The learning model 125 can be applied not only to per-customer detection areas such as the detection areas 601, 602, and 603, but also to uniformly divided block-shaped detection areas 501 as in FIG. 7. In that case, for example, even when only one person appears in the video, several blocks will indicate "occupied". By merging adjacent blocks that all indicate "occupied", the video range for one person can be determined. Then, by taking a majority vote over the classification results of the blocks within that range, the one-person video range as a whole can be classified.
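The merging and voting described above can be sketched as a connected-component grouping followed by a majority vote. The sketch assumes that "adjacent" means sharing an edge (4-connectivity) and that per-block results are available as a grid; the function names `merge_occupied_blocks` and `majority_label` are hypothetical.

```python
from collections import Counter, deque
from typing import List, Set, Tuple

Block = Tuple[int, int]  # (row, column) index of a block in the grid


def merge_occupied_blocks(occupied: List[List[bool]]) -> List[Set[Block]]:
    """Group edge-adjacent blocks flagged "occupied" into one range per person."""
    rows = len(occupied)
    cols = len(occupied[0]) if rows else 0
    seen: Set[Block] = set()
    groups: List[Set[Block]] = []
    for r in range(rows):
        for c in range(cols):
            if not occupied[r][c] or (r, c) in seen:
                continue
            group: Set[Block] = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first search over neighbouring blocks
                y, x = queue.popleft()
                group.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and occupied[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            groups.append(group)
    return groups


def majority_label(labels: List[List[str]], group: Set[Block]) -> str:
    """Return the most frequent per-block class within one merged range."""
    counts = Counter(labels[y][x] for y, x in group)
    return counts.most_common(1)[0][0]
```

Ties in the vote are resolved arbitrarily in this sketch; a real system would need an explicit rule, for example preferring the class with the highest summed confidence.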

When the state of a table does not need to be separated by individual, that is, when the state of the table as a whole is to be managed, inference can also be performed by inputting the entire table-level section, such as the detection area 302 shown in FIG. 6, to the learning model 125.

Furthermore, if there is a need to avoid forgetting to serve cups of water or other drinks, object detection of the kind typified by You Only Look Once (YOLO) or Single Shot MultiBox Detector (SSD) may be applied to the learning model 125. Things that can be present on or around the table, such as "drink", "meal", and "person", are defined as classes and learned, so that the position and presence of objects are recognized as shown in FIG. 11.

FIG. 12 is a diagram showing the class classification performed by the learning model 125.
In the figure, the classification consists of the following classes.
<Meal status>
・Meal not served
・Meal served
・Meal finished
<Menu status>
・Menu present (closed state)
・Menu present (open state)
・No menu
<Presence status of a person>
・Occupied
・Vacant

ステップＳ１０５：図１３は、状態遷移部１３１が使用する状態遷移モデルを示す図である。同図において、状態遷移モデルは、次の（１）～（６）の進捗状態を有する。（１）入店（２）メニュー選択（３）待ち（４）食事中（５）食後（６）退店 Step S105: FIG. 13 is a diagram showing a state transition model used by the state transition section 131. FIG. In the figure, the state transition model has the following progress states (1) to (6). (1) Enter (2) Menu selection (3) Wait (4) During meal (5) After meal (6) Exit

これらの進捗状態は、（１）～（６）の昇順に状態遷移（図１３に示す実線矢印）が起こる。この昇順の状態遷移は、顧客が入店してから退店するまでの標準的な状態遷移である。その他に、状態遷移モデルには、図１３に示す点線矢印のような例外的な状態遷移が存在する。 These progress states undergo state transitions (solid line arrows shown in FIG. 13) in ascending order of (1) to (6). This ascending state transition is a standard state transition from when a customer enters the store to when he/she leaves the store. In addition, the state transition model has exceptional state transitions such as the dotted line arrows shown in FIG.

図１４は、この状態遷移モデルの遷移条件の一例を示す図である。 FIG. 14 is a diagram showing an example of the transition conditions of this state transition model.
状態遷移部１３１は、状態遷移モデルに対して、認識部１２４の画像認識の結果（「人物有無のクラス分類」、「食事状況のクラス分類」、および「メニュー状況のクラス分類」などの推論結果）を組み合わせて遷移条件に該当すると、顧客の進捗状態を状態遷移させる。この状態遷移により、現時点における顧客の飲食に関する進捗状態が決定する。 The state transition unit 131 combines the image recognition results of the recognition unit 124 (inference results such as the person presence/absence classification, the meal status classification, and the menu status classification) and, when the combination satisfies a transition condition of the state transition model, transitions the customer's progress state. This state transition determines the customer's current progress state regarding eating and drinking.

例えば、前回が「退店」の進捗状態にあった検知領域において、人物有無のクラス分類が空席から在席に変化すると、進捗状態は「入店」に初期設定される。 For example, in a detection area whose previous progress state was "leaving", when the person presence classification changes from vacant to seated, the progress state is initialized to "entering".

さらに、「入店」の進捗状態において、在席状態でメニューが開かれるという遷移条件を満足すると、進捗状態を「入店」から「メニュー選択」へ状態遷移させる。 Furthermore, in the "entering" progress state, when the transition condition that the menu is opened while the customer is seated is satisfied, the progress state is changed from "entering" to "menu selection".

また、「メニュー選択」の進捗状態において、開かれていたメニューが閉じるという遷移条件を満足すると、進捗状態を「メニュー選択」から「待ち」へ状態遷移させる。 In addition, in the progress state of "menu selection", when the transition condition that the opened menu is closed is satisfied, the progress state is changed from "menu selection" to "waiting".

さらに、「待ち」の進捗状態において、食事状況が食事提供済みになるという遷移条件を満足すると、進捗状態を「待ち」から「食事中」へ状態遷移させる。 Furthermore, in the progress state of "waiting", when the transition condition that the meal status becomes "meal provided" is satisfied, the progress state is changed from "waiting" to "eating".

また、「食事中」の進捗状態において、食事状況が食事終了になるという遷移条件を満足すると、進捗状態を「食事中」から「食後」へ状態遷移させる。 In addition, in the "eating" progress state, when the transition condition that the meal status becomes "meal finished" is satisfied, the progress state is changed from "eating" to "after meal".

さらに、「食後」の進捗状態において、メニューが開かれるという遷移条件を満足すると、進捗状態を「食後」から「メニュー選択」へ状態遷移させる。 Furthermore, when the transition condition that the menu is opened is satisfied in the progress state of "after meal", the progress state is changed from "after meal" to "menu selection".

また、「食後」の進捗状態において、在席から空席へ変化するという遷移条件を満足すると、進捗状態を「食後」から「退店」へ状態遷移させる。 In addition, in the "after meal" progress state, when the transition condition that the person presence classification changes from seated to vacant is satisfied, the progress state is changed from "after meal" to "leaving".
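The transition conditions described above can be sketched as a small state machine. The Python below is an illustration under stated assumptions, not the implementation of the state transition unit 131: the names `Progress`, `next_progress`, and `run`, and the string labels used for recognition results, are hypothetical. Only the transitions named in the text are encoded.

```python
from enum import Enum


class Progress(Enum):
    ENTERING = 1
    MENU_SELECTION = 2
    WAITING = 3
    EATING = 4
    AFTER_MEAL = 5
    LEAVING = 6


def next_progress(state, prev, cur):
    """Return the progress state after observing recognition result `cur`.

    `prev` and `cur` are dicts with keys "presence" ("seated"/"vacant"),
    "meal" ("not_served"/"served"/"finished"), "menu" ("closed"/"open"/"none").
    """
    arrived = prev["presence"] == "vacant" and cur["presence"] == "seated"
    departed = prev["presence"] == "seated" and cur["presence"] == "vacant"
    menu_opened = prev["menu"] != "open" and cur["menu"] == "open"
    menu_closed = prev["menu"] == "open" and cur["menu"] == "closed"

    if state is Progress.LEAVING and arrived:
        return Progress.ENTERING
    if (state is Progress.ENTERING and cur["presence"] == "seated"
            and cur["menu"] == "open"):
        return Progress.MENU_SELECTION
    if state is Progress.MENU_SELECTION and menu_closed:
        return Progress.WAITING
    if state is Progress.WAITING and cur["meal"] == "served":
        return Progress.EATING
    if state is Progress.EATING and cur["meal"] == "finished":
        return Progress.AFTER_MEAL
    if state is Progress.AFTER_MEAL and menu_opened:
        return Progress.MENU_SELECTION  # exceptional transition: additional order
    if state is Progress.AFTER_MEAL and departed:
        return Progress.LEAVING
    return state  # no transition condition satisfied


def run(observations, state=Progress.LEAVING):
    """Feed a sequence of recognition results and record each resulting state."""
    prev = {"presence": "vacant", "meal": "not_served", "menu": "none"}
    history = []
    for cur in observations:
        state = next_progress(state, prev, cur)
        history.append(state)
        prev = cur
    return history


visit = [
    {"presence": "seated", "meal": "not_served", "menu": "closed"},
    {"presence": "seated", "meal": "not_served", "menu": "open"},
    {"presence": "seated", "meal": "not_served", "menu": "closed"},
    {"presence": "seated", "meal": "served", "menu": "closed"},
    {"presence": "seated", "meal": "finished", "menu": "closed"},
    {"presence": "vacant", "meal": "finished", "menu": "closed"},
]
states = run(visit)
```

Feeding `visit` walks one detection area through the standard sequence from "entering" to "leaving".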

ステップＳ１０６：状態遷移部１３１は、状態遷移後の時間経過を計測する。この時間経過に基づいて、次の状態遷移の可能性を予測する。この予測により、画像認識の結果に変化がない場合でも、状態遷移部１３１は進捗状態を次に進めることができる。 Step S106: The state transition unit 131 measures the elapsed time after the state transition. Based on this passage of time, the possibility of the next state transition is predicted. With this prediction, the state transition unit 131 can advance the progress state to the next even if the result of image recognition does not change.
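One simple way to realize such a time-based prediction is a dwell-time limit per progress state. The sketch below is an assumption for illustration: the source gives no concrete time values or prediction rule, so the limits in `DWELL_LIMIT`, the mapping `NEXT`, and the function `advance_on_timeout` are all hypothetical.

```python
# Hypothetical dwell-time limits in seconds; the source gives no concrete values.
DWELL_LIMIT = {"menu_selection": 600, "waiting": 1200, "eating": 2400}

# Standard successor of each state that may be advanced by elapsed time alone.
NEXT = {"menu_selection": "waiting", "waiting": "eating", "eating": "after_meal"}


def advance_on_timeout(state, entered_at, now):
    """Return (state, entered_at), advanced if the dwell limit has elapsed.

    `entered_at` and `now` are timestamps in seconds.
    """
    limit = DWELL_LIMIT.get(state)
    if limit is not None and now - entered_at >= limit:
        return NEXT[state], now  # restart the timer for the new state
    return state, entered_at
```

For example, `advance_on_timeout("waiting", 0, 1200)` yields `("eating", 1200)` even when the image recognition result has not changed.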

ステップＳ１０７：状態遷移部１３１は、注文などの顧客情報を収集する表示／指示端末１０２などの情報処理装置から顧客情報をデータ通信部１４０を介して取得すると、状態遷移モデルの進捗状態を優先的（強制的）に変更する。 Step S107: When the state transition unit 131 acquires customer information via the data communication unit 140 from an information processing device that collects customer information such as orders, for example the display/instruction terminal 102, it changes the progress state of the state transition model preferentially (forcibly).

例えば、図１４に示すように、「入店」または「メニュー選択」の進捗状態において、「注文完了」の顧客情報を取得すると、進捗状態は「待ち」へ強制的に変更される。 For example, as shown in FIG. 14, when the customer information of "order completed" is obtained in the progress state of "entering" or "menu selection", the progress state is forcibly changed to "waiting".
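The priority of customer information over image recognition can be sketched as a lookup that overrides the image-based result. In this hedged example, only the "order completed" rule comes from the text; the event label `order_completed`, the table `FORCED_TRANSITIONS`, and the function `decide_state` are hypothetical names.

```python
# Forced transitions keyed by (current state, customer-information event).
# Only the "order completed" rule is taken from the text.
FORCED_TRANSITIONS = {
    ("entering", "order_completed"): "waiting",
    ("menu_selection", "order_completed"): "waiting",
}


def decide_state(state, image_based_next, event=None):
    """Customer information, when present and applicable, overrides the
    state proposed by image recognition."""
    forced = FORCED_TRANSITIONS.get((state, event))
    return forced if forced is not None else image_based_next
```

With this rule, an order entered on a terminal moves the table to "waiting" even if the camera never saw the menu being opened or closed.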

ステップＳ１０８：指示出力部１３２は、検知領域ごとに進捗状態に対応するサービス指示（サービス内容：サービス提供のタイミング）を決定する。 Step S108: The instruction output unit 132 determines, for each detection area, a service instruction (service content: timing of service provision) corresponding to the progress state.
図１５は、進捗状態に対応するサービス指示の一例を示す図である。 FIG. 15 is a diagram showing an example of service instructions corresponding to progress states.

例えば、「入店」の進捗状況に対応して、サービス指示（飲料水コップの提供：入店になってから）が決定する。 For example, a service instruction (provide drinking water cup: after entering the store) is determined in accordance with the progress of "entering the store".

さらに、「メニュー選択」の進捗状態に対応して、サービス指示（注文の確認：メニュー選択になってから所定時間経過後）が決定する。 Further, a service instruction (confirmation of order: after a predetermined time has elapsed since menu selection) is determined in accordance with the progress of "menu selection".

また、「待ち」の進捗状態に対応して、サービス指示（食事の提供：待ちになってから調理完了後）が決定する。 In addition, service instructions (meal provision: after completion of cooking after waiting) are determined in accordance with the progress state of "waiting".

さらに、「食後」の進捗状態に対応して、サービス指示（食器の回収：食後になってから）が決定する。 Further, the service instruction (collection of tableware: after eating) is determined in accordance with the progress of "after eating".

また、「退店」の進捗状態に対応して、サービス指示（片付け指示：退店になってから）が決定する。 Also, a service instruction (tidying up instruction: after leaving the store) is determined in accordance with the progress of "leaving the store".

さらに、「食後」の進捗状態に対応して、サービス指示（食後デザートの提供：食後かつ食後デザートの注文がある場合）が決定する。 Further, corresponding to the "after meal" progress state, a service instruction (serve after-meal dessert: after the meal, if an after-meal dessert has been ordered) is determined.
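The correspondence between progress states and service instructions listed above can be sketched as a lookup table of (service content, timing) pairs. The table name `SERVICE_INSTRUCTIONS`, the function `instructions_for`, and the `dessert_ordered` flag are assumptions for illustration; the entries paraphrase the examples in the text.

```python
# (service content, timing) pairs per progress state, following the examples.
SERVICE_INSTRUCTIONS = {
    "entering": [("serve a cup of drinking water", "once the customer has entered")],
    "menu_selection": [("confirm the order",
                        "a predetermined time after menu selection begins")],
    "waiting": [("serve the meal", "once cooking is complete")],
    "after_meal": [("collect tableware", "after the meal")],
    "leaving": [("clear the table", "after the customer has left")],
}


def instructions_for(progress, dessert_ordered=False):
    """Return the service instructions for one detection area."""
    result = list(SERVICE_INSTRUCTIONS.get(progress, []))
    # The dessert instruction applies only after the meal and only if ordered.
    if progress == "after_meal" and dessert_ordered:
        result.append(("serve after-meal dessert", "after the meal"))
    return result
```

States without an entry, such as "eating" here, simply yield no instruction.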

ステップＳ１０９：顧客単位やテーブル単位にサービス指示が生じるため、複数のサービス指示が短期間に集中するケースも生じる。そこで、指示出力部１３２は、顧客やテーブルの間で発生するサービス指示に対して優先レベルを設定する。 Step S109: Since service instructions are generated for each customer or table, there are cases where multiple service instructions are concentrated in a short period of time. Therefore, the instruction output unit 132 sets a priority level for service instructions generated between customers and tables.

一般に、顧客のために行うサービス（「注文の確認」や「食事の提供」や「食後デザートの提供」など）は、飲食店のために行うサービス（「食器の回収」など）よりも優先レベルが高くなる。 In general, services performed for the customer (such as "confirming the order", "serving the meal", and "serving after-meal dessert") have a higher priority level than services performed for the restaurant (such as "collecting tableware").

ステップＳ１１０：指示出力部１３２は、優先レベルに応じて、サービス指示のタイミングを入れ替えることにより、従業員別の接客スケジュールを予定する。 Step S110: The instruction output unit 132 plans a customer service schedule for each employee by changing the timing of service instructions according to the priority level.
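Steps S109 and S110 can be sketched as a sort by priority level followed by due time. This is an assumption-laden illustration: the two-level priority, the `due` field (minutes from now), and the names `CUSTOMER_FACING`, `priority_level`, and `plan` are hypothetical, and assignment of instructions to individual employees is left out.

```python
# Services performed for the customer get the higher priority (level 0).
CUSTOMER_FACING = {"confirm order", "serve meal", "serve after-meal dessert"}


def priority_level(service):
    """0 = performed for the customer, 1 = performed for the restaurant."""
    return 0 if service in CUSTOMER_FACING else 1


def plan(instructions):
    """Order instructions by priority level, then by due time."""
    return sorted(instructions,
                  key=lambda i: (priority_level(i["service"]), i["due"]))


pending = [
    {"table": "T3", "service": "collect tableware", "due": 5},
    {"table": "T1", "service": "serve meal", "due": 8},
    {"table": "T2", "service": "confirm order", "due": 6},
]
ordered_tables = [i["table"] for i in plan(pending)]  # ['T2', 'T1', 'T3']
```

Although the tableware collection at T3 is due first, it is moved behind the two customer-facing instructions.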

ステップＳ１１１：指示出力部１３２は、従業員別の接客スケジュールに所定時間以上の空き時間があるか否かを判定する。空き時間がある場合、指示出力部１３２はステップＳ１１２に動作を移行する。それ以外の場合、指示出力部１３２はステップＳ１１３に動作を移行する。 Step S111: The instruction output unit 132 determines whether or not the customer service schedule for each employee has a vacant time equal to or longer than a predetermined time. If there is vacant time, the instruction output unit 132 moves the operation to step S112. Otherwise, the instruction output unit 132 moves the operation to step S113.

ステップＳ１１２：指示出力部１３２は、空き時間に応じて、接客以外の仕事（食材発注・下ごしらえ・調理・後片付け・レジ清算・店内清掃など）や休憩のタイミングをデータ通信部１４０を介して、従業員の表示／指示端末１０２に報知する。 Step S112: According to the free time, the instruction output unit 132 notifies the employee's display/instruction terminal 102, via the data communication unit 140, of the timing of work other than customer service (ordering ingredients, food preparation, cooking, cleanup, register settlement, store cleaning, and so on) or of breaks.
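The free-time check of steps S111 and S112 amounts to finding gaps of at least a predetermined length in one employee's schedule. The sketch below assumes a hypothetical representation in which the schedule is a list of `(start, end)` busy intervals in minutes; the function name `free_slots` and its parameters are not from the source.

```python
def free_slots(busy, horizon_start, horizon_end, min_gap):
    """Return gaps of at least `min_gap` between busy intervals.

    `busy` is a list of (start, end) pairs for one employee; all values
    are in the same time unit (for example, minutes).
    """
    slots = []
    cursor = horizon_start
    for start, end in sorted(busy):
        if start - cursor >= min_gap:
            slots.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping intervals
    if horizon_end - cursor >= min_gap:
        slots.append((cursor, horizon_end))
    return slots


schedule = [(10, 20), (50, 60)]
gaps = free_slots(schedule, 0, 100, min_gap=15)  # [(20, 50), (60, 100)]
```

Each returned gap is a candidate window for non-service work or a break.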

ステップＳ１１３：指示出力部１３２は、予定された接客スケジュールのタイミングでサービス指示と、サービスすべきテーブルまたは顧客の位置情報とをデータ通信部１４０を介して、従業員の表示／指示端末１０２に報知する。 Step S113: At the timing given by the planned customer service schedule, the instruction output unit 132 notifies the employee's display/instruction terminal 102, via the data communication unit 140, of the service instruction and the position information of the table or customer to be served.

なお、サービス指示を、図１６に示すようなマップ情報７００にして、バックヤードの表示出力装置１０３に表示してもよい。例えば、「退店」の進捗状態が発生すると、マップ上のレジ箇所に「レジ待ちあり」が表示される。このような表示出力装置１０３を従業員が視認しやすい場所に置くことで、全体のオペレーションを把握できる他、指示端末を持たない従業員も、各自が処理すべき仕事を視認しやすいようになる。 The service instructions may also be rendered as map information 700, as shown in FIG. 16, and displayed on the display output device 103 in the back-of-house area. For example, when the "leaving" progress state occurs, "customer waiting at register" is displayed at the register location on the map. Placing the display output device 103 where employees can easily see it lets them grasp the overall operation, and lets employees without an instruction terminal easily see the tasks they should handle.

ステップＳ１１４：認識部１２４は、食事状況の画像認識と併せて、コップの中の飲料水の残量についての補充物状況を画像認識する。状態遷移部１３１は、補充物状況に応じてコップの中の飲料水が不足状態か否かを判定する。 Step S114: The recognition unit 124 performs image recognition of the replenishment status regarding the remaining amount of drinking water in the cup, together with the image recognition of the meal status. The state transition unit 131 determines whether the drinking water in the cup is in a shortage state according to the replenishment state.

指示出力部１３２は、不足状態の判定に対して、飲料水の補充を行うサービス指示と、サービスすべきテーブルまたは顧客の位置情報とをデータ通信部１４０を介して、従業員の表示／指示端末１０２に報知する。 When a shortage state is determined, the instruction output unit 132 notifies the employee's display/instruction terminal 102, via the data communication unit 140, of a service instruction to replenish the drinking water and the position information of the table or customer to be served.

なお、補充物は飲料水に限らず、ドリンク、お茶、調味料、付け合わせ、ご飯、パン、卓上ケース内の箸やフォークやナイフなどのお代わりまたは補充を行うものでもよい。 Note that the replenishment is not limited to drinking water, and may be drinks, tea, seasonings, garnishes, rice, bread, chopsticks, forks, knives, etc. in the desktop case.
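The shortage determination of step S114 could be as simple as a threshold on a recognized remaining amount. The sketch below assumes, hypothetically, that recognition yields a remaining fraction between 0 and 1 per item and table; the threshold value and the names `REFILL_THRESHOLD` and `refill_instructions` are not from the source.

```python
# Hypothetical threshold: at or below 20 % remaining counts as a shortage.
REFILL_THRESHOLD = 0.2


def refill_instructions(levels):
    """Build refill instructions from recognized remaining amounts.

    `levels` maps a table id to {item name: remaining fraction (0.0 to 1.0)}.
    """
    return [
        {"table": table, "service": f"refill {item}"}
        for table, items in sorted(levels.items())
        for item, remaining in sorted(items.items())
        if remaining <= REFILL_THRESHOLD
    ]


recognized = {
    "T2": {"water": 0.1, "tea": 0.8},
    "T1": {"water": 0.5},
}
todo = refill_instructions(recognized)  # [{'table': 'T2', 'service': 'refill water'}]
```

The same structure covers the other replenishment items mentioned above, since each is just another key in the per-table mapping.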

上述した一連の動作（ステップＳ１０１～１１４）を完了すると、画像認識装置１００は動作をステップＳ１０１に戻すことで、従業員の接客支援が繰り返し継続的に行われる。 After completing the above-described series of operations (steps S101 to S114), the image recognition device 100 returns to step S101, so that customer service support for employees is performed repeatedly and continuously.
＜実施例の効果＞ <Effects of the Embodiment>

（１）実施例では、機械学習した学習モデルを用いて、飲食スペースの映像から顧客の食事状況を画像認識する。したがって、特許文献１の飲食残量の面積測定のような固定的な画像認識とは異なり、顧客の食事状況を柔軟に画像認識することが可能になる。 (1) In the embodiment, a machine-learning model is used to image-recognize the customer's eating situation from the video of the eating and drinking space. Therefore, unlike the fixed image recognition such as area measurement of remaining amount of food and drink in Patent Document 1, it is possible to flexibly recognize the customer's meal status by image recognition.

（２）特許文献１の飲食残量の面積測定では、顧客に「注文」の意思があるか否かは判定のしようがない。それに対して、実施例では、食事状況の他に、メニュー状況について画像認識を行う。その結果、メニューが開かれた状況を検知して顧客が「注文」の意思があるなどを総合的に判定し、顧客の注文意思を的確に捉えることが可能になる。したがって、実施例の採用により、顧客が従業員に注文の声をかける前に、従業員が顧客に注文を伺うことが可能になる。そのため、顧客満足度を高めることが可能になる。 (2) There is no way to determine whether or not the customer has the intention of "ordering" in the area measurement of the remaining amount of food and drink in Patent Document 1. On the other hand, in the embodiment, image recognition is performed on the menu status in addition to the meal status. As a result, it is possible to comprehensively determine whether the customer has an intention to place an order by detecting when the menu is open, and to accurately grasp the customer's intention to place an order. Therefore, by employing the embodiment, it is possible for the employee to ask the customer for the order before the customer calls the employee for the order. Therefore, it becomes possible to improve customer satisfaction.

（３）また、食事状況が変化するのは、「食事提供」の直後であるのに対して、メニュー状況が変化するのはそれよりも前の「入店」後からである。そうしてみると、食事状況にメニュー状況を加えることにより、より広範囲の期間についてサービス指示を報知することが可能になる。 (3) In addition, the meal situation changes immediately after "meal provision", whereas the menu situation changes after "entering the restaurant". In this way, by adding the menu status to the meal status, it becomes possible to notify service instructions over a wider range of time periods.

（４）また、実施例では、顧客の飲食の進捗状態を状態遷移モデルで表す。そのため、現在の進捗状態と、画像認識の結果（遷移条件）とに基づいて、進捗状態を状態遷移させる。例えば、「食後」の進捗状態において、画像認識によりメニューが開かれたことを検知することで、顧客に「追加注文」の意思があるなどのより詳細な注文意思を的確に捉えることが可能になる。一般に「最初の注文」に比べて「追加注文」は必ず発生するわけではないため、その注文意思を的確に捉えるには、従業員側に接客についての長い経験が必要になる。しかし、実施例の採用により、経験の浅い従業員であっても顧客に追加注文を適切なタイミングで伺うことが可能になる。そのため、顧客満足度を一段と高めることが可能になる。 (4) In addition, in the embodiment, the progress of the customer's eating and drinking is represented by a state transition model. The progress state is therefore transitioned based on the current progress state and the result of image recognition (the transition condition). For example, by detecting through image recognition that the menu has been opened in the "after meal" progress state, it becomes possible to accurately capture a more detailed ordering intention, such as the customer's intention to place an "additional order". Unlike the "first order", an "additional order" does not always occur, so accurately capturing that intention normally requires long customer service experience on the part of the employee. By adopting the embodiment, however, even an inexperienced employee can ask the customer about an additional order at an appropriate time. Customer satisfaction can therefore be further improved.

（５）さらに実施例では、時間経過による予測によって状態遷移モデルを状態遷移させる。そのため、何らかの原因により画像認識の結果が得られない場合にも、顧客の進捗状態が進み、従業員は注文や配膳などのサービス指示を受けることができる。したがって、実施例においてサービス指示が停止してしまって顧客に迷惑がかかるといった事態を防ぐことができる。 (5) Furthermore, in the embodiment, the state transition model is changed by prediction based on the passage of time. Therefore, even if the image recognition result cannot be obtained for some reason, the customer's progress progresses and the employee can receive service instructions such as ordering and serving. Therefore, in the embodiment, it is possible to prevent the customer from being inconvenienced by the service instruction being stopped.

（６）また、実施例では、ＰＯＳ端末などから収集した顧客情報に基づいて状態遷移モデルを優先的（強制的）に状態遷移させる。そのため、何らかの原因により画像認識の結果が得られない場合にも、顧客の進捗状態が進み、従業員はサービス指示を受けることができる。したがって、実施例においてサービス指示が停止してしまって顧客に迷惑がかかるといった事態を防ぐことができる。 (6) In the embodiment, the state transition model preferentially (forcibly) transitions based on the customer information collected from the POS terminal or the like. Therefore, even if the image recognition result cannot be obtained for some reason, the customer's progress progresses and the employee can receive service instructions. Therefore, in the embodiment, it is possible to prevent the customer from being inconvenienced by the service instruction being stopped.

（７）さらに実施例では、顧客の有無に応じて画像認識を行う検知領域を決定する。そのため、顧客のいない領域について無駄に画像認識を行うことがなくなり、効率的な処理が可能になる。 (7) Furthermore, in the embodiment, the detection area for image recognition is determined according to the presence or absence of a customer. Therefore, image recognition is not performed wastefully for regions where there are no customers, and efficient processing becomes possible.

（８）また実施例では、複数の顧客や顧客グループについて、サービス指示の優先レベルを勘案した接客スケジュールを作成する。そのため、多数のサービス指示が集中して発生して混乱するなどの状況を緩和することが可能になる。従来このような混乱を避けるためには、従業員側に接客についての深く長い経験が必要になる。しかし、実施例の採用により、経験の浅い従業員であっても優先レベルの高いサービス指示から順に実施すればよくなる。したがって、顧客満足度を一段と高めることが可能になる。 (8) In addition, in the embodiment, a customer service schedule is created in consideration of the priority levels of service instructions for a plurality of customers or customer groups. Therefore, it is possible to alleviate the situation in which a large number of service instructions are generated intensively and cause confusion. Conventionally, in order to avoid such confusion, it is necessary for employees to have deep and long experience in customer service. However, by adopting the embodiment, even an inexperienced employee can perform service orders in order from the highest priority level. Therefore, it is possible to further improve customer satisfaction.

（９）さらに実施例では、接客スケジュールに基づいて空き時間を予測する。したがって、従業員に対してサービス指示だけではなく、空き時間を指示することが可能になる。したがって、実施例の採用により、従業員の行動に無為な空き時間が生まれるといったことがなくなり、空き時間を意識して効率的に活用することが可能になる。 (9) Furthermore, in the embodiment, free time is predicted based on the customer service schedule. Therefore, it is possible to instruct the employee not only about the service but also about the free time. Therefore, by adopting the embodiment, it is possible to prevent idle time from being generated in the behavior of employees, and to make efficient use of idle time while being conscious of it.

（１０）また、実施例では、お茶や飲料水などの補充物状況についても画像認識を行い、補充指示を適時に報知することができる。したがって、実施例の採用により、従業員が補充物の不足を何度も目視確認する必要がなくなり、従業員の手間を減らすことが可能になる。 (10) In addition, in the embodiment, it is possible to perform image recognition on the replenishment status of tea, drinking water, etc., and to notify the replenishment instruction in a timely manner. Therefore, the adoption of the embodiment eliminates the need for the employee to visually confirm the lack of replenishment many times, and it is possible to reduce the labor of the employee.

（１１）以上述べたように、実施例では、店舗の従業員や経営者は、顧客の食事状況などに基づいて、次にやるべき仕事について、必要なタイミングでサービス指示を受けることが可能になる。 (11) As described above, in the embodiment, the employees and managers of the store can receive service instructions at the necessary timing regarding the next work to be done based on the customer's meal situation. Become.

（１２）さらに、実施例では、従業員は、必要なタイミングでサービス指示を受けるため、顧客の状態を目視確認するなど意識を払う必要が少なく、レジや調理／片付け業務や清掃など、その他業務に集中することが可能となり、業務効率の向上につながる。 (12) Furthermore, in the embodiment, since the employee receives service instructions at the necessary timing, there is little need to pay attention such as visually confirming the customer's condition, and other tasks such as cash register, cooking/cleaning, cleaning, etc. It will be possible to concentrate on the work, leading to improvement of work efficiency.

（１３）この業務効率の向上の結果、実施例では、食事提供までの待ち時間が短縮される。したがって、効率的かつ顧客満足度の高い速やかな業務遂行が期待できる。 (13) As a result of this improvement in work efficiency, the waiting time until the meal is served is shortened in the embodiment. Therefore, efficient and speedy business execution with high customer satisfaction can be expected.

＜実施例の補足＞ <Supplement to the Embodiment>
なお、実施例では、ニューラルネットワークの学習モデル１２５について説明した。しかしながら、本発明はこれに限定されない。本発明は、クラス分類が可能な学習モデルであればよく、決定木学習などの学習モデルを使用することもできる。 In the embodiment, a neural network was described as the learning model 125. However, the present invention is not limited to this. Any learning model capable of classification may be used, including a learning model based on decision tree learning.

なお、本発明は上記した実施例に限定されるものではなく、様々な変形例が含まれる。例えば、上記した実施例は本発明を分かりやすく説明するために詳細に説明したものであり、必ずしも説明した全ての構成を備えるものに限定されるものではない。 In addition, the present invention is not limited to the above-described embodiments, and includes various modifications. For example, the above-described embodiments have been described in detail in order to explain the present invention in an easy-to-understand manner, and are not necessarily limited to those having all the described configurations.

また、ある構成の一部を他の構成に置き換えることが可能である。 Also, it is possible to replace part of a configuration with another configuration.

さらに、実施例の構成の一部について、他の構成の追加・削除・置換をすることが可能である。 Furthermore, it is possible to add, delete, or replace a part of the configuration of the embodiment with another configuration.

１…業務支援システム、１００…画像認識装置、１０１…撮像装置、１０２…表示／指示端末、１０３…表示出力装置、１１０…映像取得部、１２０…画像認識部、１２１…パラメータ設定部、１２２…前処理部、１２３…特徴抽出部、１２４…認識部、１２５…学習モデル、１３０…状態判定部、１３１…状態遷移部、１３２…指示出力部、１４０…データ通信部、１５０…記録制御部、１６０…記録装置、１６０ａ…履歴、１７０…表示制御部 REFERENCE SIGNS LIST: 1... business support system, 100... image recognition device, 101... imaging device, 102... display/instruction terminal, 103... display output device, 110... video acquisition unit, 120... image recognition unit, 121... parameter setting unit, 122... preprocessing unit, 123... feature extraction unit, 124... recognition unit, 125... learning model, 130... state determination unit, 131... state transition unit, 132... instruction output unit, 140... data communication unit, 150... recording control unit, 160... recording device, 160a... history, 170... display control unit

Claims

1. An image recognition device comprising:
a video acquisition unit that acquires video of an eating and drinking space;
an image recognition unit that, using a learning model obtained by machine learning of a person presence/absence classification, a meal status classification, and a menu status classification on a group of images of eating and drinking spaces, performs image recognition of the person presence/absence classification, the meal status classification, and the menu status classification on the video acquired by the video acquisition unit; and
a state determination unit that determines a progress state of a customer's eating and drinking by a state transition that combines the image-recognized person presence/absence classification, meal status classification, and menu status classification.

2. The image recognition device according to claim 1, wherein
the state determination unit has a state transition model that models state transitions of the progress state, and determines the progress state by causing the state transition model to transition using the recognition result of the image recognition unit as a transition condition.

3. The image recognition device according to claim 2, wherein
the state determination unit predicts the state transition of the customer's eating and drinking based on the passage of time.

4. The image recognition device according to any one of claims 1 to 3, wherein
the state determination unit acquires customer information regarding the customer's order or settlement from an information processing device that collects the customer information, and determines the progress state by giving priority to a determination based on the customer information.

5. The image recognition device according to any one of claims 1 to 4, wherein
the image recognition unit detects the presence or absence of the customer in the video, and determines, according to the presence or absence of the customer, a detection area in the video on which image recognition is performed.

6. The image recognition device according to any one of claims 1 to 5, wherein
the state determination unit issues a notification of a service instruction corresponding to the progress state.

7. The image recognition device according to any one of claims 1 to 6, wherein
the state determination unit manages, for each customer or each customer group, a priority level of the service instruction corresponding to the progress state, and issues a notification of the service instruction corresponding to the progress state according to the priority level.

8. The image recognition device according to any one of claims 1 to 7, wherein
the state determination unit determines free time in customer service based on the progress state, and issues a notification of the timing of work other than customer service, or of a break, according to the free time.

9. The image recognition device according to any one of claims 1 to 8, wherein
the image recognition unit performs image recognition of the status of replenishment items to be refilled or replenished in the eating and drinking space (hereinafter referred to as "replenishment status"), and
the state determination unit determines a shortage of the replenishment items based on the replenishment status, and issues a notification of a service instruction corresponding to the shortage.

10. An image recognition program that causes a computer to function as the image recognition unit and the state determination unit according to any one of claims 1 to 9.

11. An image recognition method comprising:
a video acquisition step of acquiring video of an eating and drinking space;
an image recognition step of, using a learning model obtained by machine learning of a person presence/absence classification, a meal status classification, and a menu status classification on a group of images of eating and drinking spaces, performing image recognition of the person presence/absence classification, the meal status classification, and the menu status classification on the video acquired in the video acquisition step; and
a state determination step of determining a progress state of a customer's eating and drinking by a state transition that combines the image-recognized person presence/absence classification, meal status classification, and menu status classification.

12. The image recognition method according to claim 11, wherein
the state determination step has a state transition model that models state transitions of the progress state, and
determines the progress state by causing the state transition model to transition using the recognition result of the image recognition step as a transition condition.