JP2023074751A

JP2023074751A - Information processing system, information processing device, information processing method, and program

Info

Publication number: JP2023074751A
Application number: JP2021187864A
Authority: JP
Inventors: 賢治進藤; Kenji Shindo
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-11-18
Filing date: 2021-11-18
Publication date: 2023-05-30

Abstract

PROBLEM TO BE SOLVED: To provide an information processing system, device, method, and program that suppress the occurrence of false recognition and reduce processing costs.
SOLUTION: In a recognition device 12, an image acquisition unit 400 receives imaging data that includes images captured by an in-store imaging device or a cash register imaging device together with the imaging times of those images, and sequentially transmits the received imaging data to a face detection unit 401. The face detection unit 401 performs face detection processing on the images received by the image acquisition unit 400 and extracts the area of each image that contains a face as a bounding box of the face. In the face detection processing, the face detection unit 401 acquires the coordinates, width, and height of the bounding box of the face, along with information indicating the feature points of the facial organs and the likelihood of each facial organ. A feature quantity extraction unit 402 includes a high-precision processing unit 408 configured to extract a high-precision feature quantity, which is a second feature quantity, through second extraction processing, and a medium-precision processing unit 409 configured to extract a medium-precision feature quantity, which is a first feature quantity, through first extraction processing.
SELECTED DRAWING: Figure 7

Description

The present invention relates to an information processing technology that can be used for person matching and the like based on person recognition.

Patent Literature 1 discloses a payment system using face recognition. In Patent Literature 1, face images of visitors captured by an in-store camera are matched against the face images in all registered member information to create a list of members currently in the store. At the time of payment at the cash register, the face images in that list are matched against the face image of the customer who has come to the register in order to identify the member. Patent Literature 2 discloses using a multi-layer neural network, called a deep neural network, as a technique for performing face detection processing and feature quantity extraction processing on image data.

JP 2018-101420 A
JP 2020-030480 A

In-store cameras are generally installed at high positions, so the images they capture are bird's-eye view images that look down on the store. Because a person's face in a bird's-eye view image is seen obliquely from above, matching accuracy is lower than with a face image captured from the front, and erroneous recognition may occur. If a member list such as the one disclosed in Patent Literature 1 is created while a person is erroneously recognized, then when matching is performed again at the time of payment using the face image of the customer who has come to the cash register, no matching member may be found in the member list and the re-matching may fail. In addition, depending on the size of the store, a large number of in-store cameras and cash register cameras may be installed, and the more cameras there are, the higher the processing cost of face recognition using face images.

Accordingly, an object of the present invention is to suppress the occurrence of erroneous recognition while also making it possible to reduce the processing cost.

The present invention is an information processing system including a first information processing device and a second information processing device. The first information processing device acquires person information including first feature information extracted from face images of a plurality of persons by a first extraction process, and second feature information extracted by a second extraction process that has a larger processing amount than the first extraction process; performs a first matching process that obtains a degree of similarity by comparing third feature information, extracted by the first extraction process from a first face image in which a person is captured, with the first feature information included in the person information; and transmits the person information, including the degree of similarity obtained by the first matching process, to the second information processing device. The second information processing device acquires the person information including the degree of similarity from the first information processing device; creates, from the person information, a list including the first feature information and the second feature information according to the degree of similarity; and performs a second matching process that identifies the person in a second face image from among the person information by comparing fourth feature information, extracted by the second extraction process from the second face image in which a person is captured, with the second feature information in the list in descending order of the degree of similarity.

According to the present invention, the occurrence of erroneous recognition can be suppressed, and the processing cost can also be reduced.

FIG. 1 is a schematic configuration diagram of an information processing system.
FIG. 2 is a conceptual diagram showing an example of the arrangement within a store.
FIG. 3 is a diagram showing an example of the hardware configuration of a management device.
FIG. 4 is a diagram showing an example of the hardware configuration of a recognition device.
FIG. 5 is a block diagram showing the functional configuration of the management device.
FIG. 6 is a diagram showing an example of a member information list.
FIG. 7 is a block diagram showing the functional configuration of the recognition device.
FIG. 8 is a diagram showing an example of a visitor verification list according to the first embodiment.
FIG. 9 is a diagram used for explaining facial organ detection.
FIG. 10 is a flowchart of member registration processing.
FIG. 11 is a flowchart of the first face matching according to the first embodiment.
FIG. 12 is a flowchart of the second face matching.
FIG. 13 is a flowchart of the first face matching according to the second embodiment.
FIG. 14 is a diagram showing an example of a visitor verification list according to the third embodiment.
FIG. 15 is a diagram showing the functional configuration of an information processing device according to the fourth embodiment.
FIG. 16 is a flowchart of information processing according to the fourth embodiment.
FIG. 17 is a sequence diagram of information processing according to the fourth embodiment.

Hereinafter, embodiments according to the present invention will be described with reference to the drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the present invention. The configuration of each embodiment may be modified or changed as appropriate according to the specifications of the device to which the present invention is applied and to various conditions (use conditions, use environment, and the like). Parts of the embodiments described later may also be combined as appropriate. In the following embodiments, identical components are given the same reference numerals.

In this embodiment, an information processing system including an information processing device for performing payment using face recognition is described as an example. In an information processing system that performs payment using face recognition, a member's face image and payment information are associated with each other in advance and registered as person information (hereinafter referred to as member information), and a customer's face image captured by a camera installed at the cash register of a store or the like is matched against the face images registered in the member information. When the matching succeeds, payment processing is executed based on the payment information associated with the matched face image. As the amount of registered member information grows, matching at the cash register takes longer and the payment time becomes longer; the technique of Patent Literature 1 described above is thought to make it possible to shorten the payment time. However, with a bird's-eye view image obtained by a typical in-store camera, matching accuracy may decrease and erroneous recognition may occur, and if the member list disclosed in Patent Literature 1 is created based on an erroneously recognized person, the re-matching at the time of payment at the cash register may fail, as described above. In addition, when the number of in-store cameras and cash register cameras increases with the size of the store, the processing cost of face recognition using face images increases.
Therefore, the information processing system of this embodiment has the configuration described below for performing payment processing using face recognition and, by performing the processing described later, makes it possible to shorten the payment time, suppress the occurrence of erroneous recognition, and reduce the computational cost.

<First embodiment>
FIG. 1 is a diagram showing a configuration example of the information processing system according to the first embodiment.
In the information processing system of this embodiment, a management device 10 serving as the first information processing device and a recognition device 12a, which is one of the second information processing devices, are connected via a network 11. Similarly, the management device 10 and a recognition device 12b, which is another of the second information processing devices, are connected via the network 11. The recognition device 12a, an in-store imaging device 14a, and a cash register imaging device 15a are connected via a network 13a; similarly, the recognition device 12b, an in-store imaging device 14b, and a cash register imaging device 15b are connected via a network 13b. Although FIG. 1 shows two each of the recognition devices 12a and 12b, the in-store imaging devices 14a and 14b, the cash register imaging devices 15a and 15b, and the networks 13a and 13b, there may be one of each, or three or more. The network 11 and the networks 13a and 13b are implemented by, for example, a plurality of routers, switches, cables, and the like that comply with a communication standard such as Ethernet (registered trademark). The network 11 and the networks 13a and 13b may also be implemented by the Internet, a wireless LAN (Wireless LAN), a WAN (Wide Area Network), or the like. Hereinafter, when the recognition devices 12a and 12b, the in-store imaging devices 14a and 14b, the cash register imaging devices 15a and 15b, and the networks 13a and 13b are described without being distinguished from one another, they are simply referred to as the recognition device 12, the in-store imaging device 14, the cash register imaging device 15, and the network 13.

FIG. 2 is a diagram showing an example of a store to which the information processing system of this embodiment is applied.
As shown in FIG. 2, an in-store imaging device 202, a cash register imaging device 204, and a cash register 201 are arranged in a store 20. The in-store imaging device 202 corresponds to the in-store imaging device 14 in FIG. 1, and the cash register imaging device 204 corresponds to the cash register imaging device 15 in FIG. 1. The in-store imaging device 202 is placed at a position from which the inside of the store can be captured from above, and the cash register imaging device 204 is placed in front of the cash register 201. That is, the in-store imaging device 202 captures a bird's-eye view of persons (customers and the like) who have come to the store and are present in it, and the cash register imaging device 204 captures, from the front, a person who has come to the cash register 201 (a customer about to make a payment). Although FIG. 2 shows only one in-store imaging device 202 and one cash register imaging device 204, a plurality of each may be installed. The in-store imaging device 14 is not limited to being installed inside the store and may be installed outside it, for example near the store entrance. The number of cash registers 201 is likewise not limited to one and may be more than one. When a plurality of cash registers 201 are provided, a cash register imaging device 204 is provided for each cash register.

The in-store imaging device 202 captures a person in the store 20 (referred to as a visitor 203), and the cash register imaging device 204 captures the face of a person 205 who has come to the cash register 201 to make a payment. The image of the visitor 203 captured by the in-store imaging device 202 is sent to the recognition device 12 via the network 13 in FIG. 1, and is further sent to the management device 10 via the network 11. The face image of the person 205 (the person about to make a payment) captured by the cash register imaging device 204 is sent to the recognition device 12 via the network 13 in FIG. 1. As another example, the cash register imaging device 15 need not be an independent device and may be provided together with the recognition device 12 in a smartphone, a tablet, or the like.

The description now returns to FIG. 1.
The management device 10 is a device that stores and manages, for all members, payment information such as credit card information, personal information such as name and age, face images, and face information used for face recognition. In this embodiment, the payment information, the personal information such as name and age, the face image, and the face information used for face recognition of a given member are collectively referred to as member information. The management device 10 stores the member information of all members in a database or the like as a member information list and manages it. Details of the member information list will be described later.

In this embodiment, face information is feature information that represents the features of a face and is extracted from a face image by a predetermined extraction process; in this embodiment it is called a feature quantity. As an example, the process of extracting a feature quantity (face information) from an image uses a multi-layer neural network, called a deep neural network, that has been trained by deep learning. In this embodiment, there are two processes for extracting a feature quantity from an image: a first extraction process and a second extraction process. The neural networks used for the first and second extraction processes differ in circuit scale, and the second extraction process uses a neural network with a larger circuit scale than the first extraction process. In general, the circuit scale of a neural network corresponds to its processing amount, that is, its processing cost: a network with a smaller circuit scale requires less processing than one with a larger circuit scale. In this embodiment, therefore, the second extraction process has a larger processing amount and a higher processing cost than the first extraction process.
A feature quantity obtained by a neural network with a large circuit scale is, in many cases, more precise than one obtained by a neural network with a smaller circuit scale. For this reason, in this embodiment the second feature quantity, obtained by the second extraction process using the neural network with the larger circuit scale, is called a high-precision feature quantity (or high-precision face information). The first feature quantity, obtained by the first extraction process using a neural network with a relatively smaller circuit scale than the one used in the second extraction process, is called a medium-precision feature quantity (or medium-precision face information).

The in-store imaging device 14 and the cash register imaging device 15 are each devices (cameras) that capture images; their form is not particularly limited and may be, for example, a network camera or a module camera. The in-store imaging device 14 and the cash register imaging device 15 transmit imaging data, which includes the captured image data, the imaging time, and the like, to the recognition device 12 via the network 13. In the following description, image data is simply referred to as an image for brevity.

The recognition device 12 acquires an image captured by the in-store imaging device 14 or the cash register imaging device 15 connected via the network 13, detects a person's face image in that image, and extracts a feature quantity from the detected face image. As described in detail later, when the recognition device 12 of the first embodiment extracts a feature quantity from the face image of a person (the visitor 203) appearing in an image from the in-store imaging device 14, it obtains the first feature quantity by the first extraction process, that is, the medium-precision feature quantity (medium-precision face information). It then transmits the extracted medium-precision feature quantity to the management device 10 via the network 11.
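The routing described above, in which the extraction process depends on which camera supplied the image, can be sketched as follows. This is a minimal illustration rather than the implementation in the publication: the camera labels `in_store` and `register`, the function names, and the two toy extractors are assumptions that stand in for the smaller and larger neural networks.

```python
from typing import Callable, Dict, List, Tuple

Feature = List[float]


def medium_precision_extract(face: List[float]) -> Feature:
    """Toy stand-in for the first extraction process (smaller network)."""
    half = len(face) // 2
    # Coarse summary: one value per half of the input.
    return [sum(face[:half]), sum(face[half:])]


def high_precision_extract(face: List[float]) -> Feature:
    """Toy stand-in for the second extraction process (larger network)."""
    # Keeps every input value, so the result is more detailed.
    return list(face)


# Hypothetical camera labels mapped to (precision name, extractor).
EXTRACTOR_BY_CAMERA: Dict[str, Tuple[str, Callable[[List[float]], Feature]]] = {
    "in_store": ("medium", medium_precision_extract),
    "register": ("high", high_precision_extract),
}


def extract_for_camera(camera_kind: str, face: List[float]) -> Tuple[str, Feature]:
    """Pick the extraction process based on the camera that captured the face."""
    precision, extractor = EXTRACTOR_BY_CAMERA[camera_kind]
    return precision, extractor(face)
```

In this sketch the cheaper extractor runs on the many in-store images, while the costlier one runs only on images from the cash register camera.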

When the management device 10 receives a medium-precision feature quantity that the recognition device 12 extracted from an image from the in-store imaging device 14, it performs face matching processing using the received medium-precision feature quantity and the medium-precision face information (medium-precision feature quantities) registered in advance for the registered members. In the first embodiment, this face matching processing, which the management device 10 performs using the medium-precision feature quantity extracted from the image from the in-store imaging device 14 and the medium-precision face information registered for the members, is referred to as the first face matching. The management device 10 then sends the result of the first face matching and member information to the recognition device 12. As described in detail later, in this embodiment a similarity score indicating the degree to which facial feature quantities resemble each other is used as the information indicating the result of the first face matching. Also as described later, the member information sent to the recognition device 12 together with the result of the first face matching is the member information selected, according to that result, from the member information list managed by the management device 10. The recognition device 12 holds the result of the first face matching and the member information in a visitor verification list, which will be described later.
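As a rough illustration of the first face matching, the sketch below scores a medium-precision query feature against every registered member and returns the best candidates together with their similarity scores. The text specifies only that a similarity score is used; cosine similarity, the `top_k` cutoff, the dictionary keys, and the function names are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_face_matching(
    query_medium: Sequence[float],
    members: List[Dict],
    top_k: int = 3,
) -> List[Dict]:
    """Score every member's medium-precision feature and keep the top_k.

    Each member dict is assumed to carry a "medium_feature" entry.
    The returned copies gain a "similarity" entry holding the score.
    """
    scored = [
        {**m, "similarity": cosine_similarity(query_medium, m["medium_feature"])}
        for m in members
    ]
    scored.sort(key=lambda r: r["similarity"], reverse=True)
    return scored[:top_k]
```

The returned records correspond to the result of the first face matching plus the member information that would be sent on to the recognition device.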

The recognition device 12 creates, holds, and manages a visitor verification list that includes the result of the first face matching by the management device 10 and the member information. As described in detail later, the visitor verification list contains the similarity score obtained as the result of the first face matching and, from the member information selected according to that result, at least the member's identification information, payment information, and face information used for face recognition.
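One possible shape for the visitor verification list is sketched below. The fields follow the items named above (similarity score, member identification, payment information, face information); the class and attribute names are hypothetical, and the rule of keeping only the best score per member is an assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VisitorEntry:
    member_id: str             # member identification information
    similarity: float          # score from the first face matching
    payment_info: str          # e.g. a token for credit card information
    high_feature: List[float]  # high-precision face information


class VisitorVerificationList:
    """Holds one entry per member believed to be in the store."""

    def __init__(self) -> None:
        self._entries: Dict[str, VisitorEntry] = {}

    def add(self, entry: VisitorEntry) -> None:
        # Assumption: keep only the best score seen for each member.
        current = self._entries.get(entry.member_id)
        if current is None or entry.similarity > current.similarity:
            self._entries[entry.member_id] = entry

    def by_similarity(self) -> List[VisitorEntry]:
        """Entries ordered from highest to lowest similarity score."""
        return sorted(
            self._entries.values(), key=lambda e: e.similarity, reverse=True
        )
```

Ordering by similarity lets a later matching step try the most likely members first.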

After that, when the recognition device 12 acquires, from the cash register imaging device 15 via the network 13, an image of the person in front of the cash register (the person 205 in FIG. 2), it detects a face image in that image and extracts a feature quantity from the face image. As described in detail later, when the recognition device 12 of this embodiment extracts a feature quantity from the face image of the person appearing in the image from the cash register imaging device 15 (the person 205 about to make a payment), it obtains the second feature quantity by the second extraction process, that is, the high-precision feature quantity (high-precision face information). The recognition device 12 then performs face recognition using the high-precision feature quantity obtained from the face image of the person in front of the cash register and the high-precision face information (high-precision feature quantities) included in the member information of the visitor verification list. In this embodiment, this face matching processing, which uses the high-precision feature quantity extracted from the face image of the person in front of the cash register and the high-precision face information included in the member information of the visitor verification list, is referred to as the second face matching.
Based on the result of the second face matching, the recognition device 12 identifies which member in the member information of the visitor verification list the person in front of the cash register is.
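The second face matching can be sketched as a search over the visitor verification list in descending order of the first-stage similarity score. This is a minimal sketch under stated assumptions: cosine similarity, the acceptance `threshold`, stopping at the first accepted candidate, and the dictionary keys are all hypothetical choices that the text does not specify.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def second_face_matching(
    query_high: Sequence[float],
    visitor_list: List[Dict],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return the member_id of the first sufficiently similar entry, or None.

    Entries are tried from the highest first-stage "similarity" downward,
    and each comparison uses the entry's high-precision "high_feature".
    """
    ordered = sorted(visitor_list, key=lambda e: e["similarity"], reverse=True)
    for entry in ordered:
        if cosine_similarity(query_high, entry["high_feature"]) >= threshold:
            return entry["member_id"]
    return None
```

Because only members already in the visitor verification list are compared, the costly high-precision comparison runs against a short list rather than against every registered member.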

In this embodiment, one management device 10 is assumed to be installed in a data center, but a plurality of management devices may be installed, and the management device may be placed in a store or on the cloud. In the example described above, one recognition device 12 is placed in each store and connected to that store's in-store imaging device 14 and cash register imaging device 15, but the recognition device 12 may instead be placed in a data center, on the cloud, or the like. As yet another example, a program for implementing the functions of the recognition device 12 may be installed in a cash register terminal, in which case the recognition device 12 is included in the cash register terminal. Although this embodiment gives an example in which the management device 10 and the recognition device 12 are separate, they may be a single integrated device. In that case, the management device 10 also executes the processing of the recognition device 12, including the second face matching.

FIG. 3 is a diagram showing an example of the hardware configuration of the management device 10.
As shown in FIG. 3, the management device 10 has a CPU 101, a ROM 102, a RAM 103, a recording device 104, a communication device 105, an input device 106, and a display device 107.

The CPU 101 reads a control program recorded in the ROM 102 and executes various processes. The RAM 103 is used as a temporary storage area such as a main memory and a work area. The recording device 104 is, for example, a hard disk drive (HDD) or a solid state drive (SSD), and is used to store image files and member information.
The communication device 105 is a circuit that performs communication, which may be wireless or wired. In this embodiment, the communication device 105 is connected to the network 11. The input device 106 may include an imaging device (camera) in addition to a keyboard or a touch panel. The display device 107 includes a liquid crystal panel or an organic EL panel.

As illustrated in FIG. 3, the hardware configuration of the management device 10 has the same components as the hardware installed in a personal computer (PC). The various functions of the management device 10 can therefore be implemented as software running on a PC. When the CPU 101 executes a program, the management device 10 can implement the functions of FIG. 5 and the processing of the flowcharts of FIGS. 11 and 13, all described later.

FIG. 4 is a diagram showing an example of the hardware configuration of the recognition device 12.
As shown in FIG. 4, the recognition device 12 has a CPU 121, a ROM 122, a RAM 123, a recording device 124, a communication device 125, an input device 126, and a display device 127.

The CPU 121 reads a control program recorded in the ROM 122 and executes various processes. The RAM 123 is used as a temporary storage area such as a main memory and a work area. The recording device 124 is, for example, a hard disk drive (HDD) or a solid state drive (SSD), and is used to store image files and member information.
The communication device 125 is a circuit that performs communication, which may be wireless or wired. In this embodiment, the communication device 125 is connected to the network 11 and the network 13. The input device 126 has a keyboard or a touch panel. The display device 127 includes a liquid crystal panel or an organic EL panel.

As illustrated in FIG. 4, the hardware configuration of the recognition device 12 has the same components as the hardware of a personal computer (PC). The various functions of the recognition device 12 can therefore be implemented as software running on a PC. When the CPU 121 executes a program, the recognition device 12 realizes the functions of FIG. 7 and the processing of the flowcharts of FIGS. 11, 12, and 13, all described later.

FIG. 5 is a functional block diagram showing an example of the functional configuration of the management device 10. The functional configuration of the management device 10 according to this embodiment is described below with reference to FIG. 5.
The member information registration unit 300 of the management device 10 holds a member information list covering all members. Although this embodiment assumes that the member information registration unit 300 holds the entire member information list, the unit may instead acquire a member information list provided on a cloud (not shown), a member information list held in a database, or the like.

FIG. 6 is a diagram showing an example of the member information list.
As shown in FIG. 6, the member information list includes a member ID 501, payment information 502 such as credit card information, high-precision face information 503 used for face recognition, and medium-precision face information 504 used for face recognition.
The member ID 501 holds identification information for individually identifying each registered member.
The payment information 502 holds information usable for payment, such as each member's credit card information.
The high-precision face information 503 holds the high-precision face information (high-precision feature amount) extracted by the second extraction process from a face image photographed in advance for each member.
The medium-precision face information 504 holds the medium-precision face information (medium-precision feature amount) extracted by the first extraction process from a face image photographed in advance for each member.
Although not shown in FIG. 6, the member information list may further hold attribute information such as the member's name and age, a face image, and the like, in addition to the information described above. In this embodiment, the information stored for each member in the member information list is called member information. The member information in the member information registration unit 300 is entered in advance from the keyboard of the input device 106 or extracted from an image captured by the camera, and is then held. As long as the information, including the payment information and face information, can be input and held, the input method and holding method are not limited.
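The member information list of FIG. 6 can be pictured as a keyed collection of records. The following is a minimal sketch in Python, not the implementation of the embodiment: the names `MemberRecord`, `member_list`, and `register_member`, the use of plain float lists for the feature amounts, and the placeholder values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemberRecord:
    """One row of the member information list (hypothetical layout)."""
    member_id: str                         # member ID 501
    payment_info: str                      # payment information 502
    high_precision_feature: list[float]    # high-precision face information 503
    medium_precision_feature: list[float]  # medium-precision face information 504
    # Optional extras mentioned in the text (name, age, face image, ...).
    attributes: dict = field(default_factory=dict)


# The list itself, keyed by member ID (an assumed storage choice).
member_list: dict[str, MemberRecord] = {}


def register_member(record: MemberRecord) -> None:
    """Store or overwrite the record for one member."""
    member_list[record.member_id] = record


register_member(
    MemberRecord(
        member_id="00001",
        payment_info="card-token-placeholder",
        high_precision_feature=[0.1, 0.2, 0.3, 0.4],
        medium_precision_feature=[0.1, 0.2],
    )
)
```

Keeping both feature amounts in the same record mirrors the text: the medium-precision one serves the first face matching and the high-precision one serves the second.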

Returning to FIG. 5.
The face information acquisition unit 301 acquires the face information (feature amount) that the recognition device 12 extracted from a face image, and sends it to the face matching unit 302. In the first embodiment, the face information acquired by the face information acquisition unit 301 is medium-precision face information: the recognition device 12 detects a face image in the image from the in-store imaging device 14 and then extracts the information by the first extraction process.

The face matching unit 302 performs the first face matching using the medium-precision face information that the face information acquisition unit 301 acquired from the recognition device 12 and the medium-precision face information in the member information held in the member information registration unit 300. As the result of the first face matching, the face matching unit 302 obtains a similarity score indicating the degree of similarity between the pieces of face information.

The information transmission unit 303 acquires, from the member information registration unit 300, the member information corresponding to each piece of face information for which a similarity score was obtained in the first face matching. The information transmission unit 303 then transmits to the recognition device 12 the similarity scores resulting from the first face matching together with the corresponding member information acquired from the member information registration unit 300.
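The first face matching on the management device side can be sketched as scoring one query feature amount against every member and returning score and member pairs. The source does not say how the similarity score is computed, so this sketch assumes cosine similarity mapped to an integer range of 0 to 1000; the names `Member`, `similarity_score`, and `first_face_matching` are hypothetical.

```python
import math
from typing import NamedTuple


class Member(NamedTuple):
    member_id: str
    medium_feature: tuple[float, ...]  # medium-precision feature amount


def similarity_score(a, b) -> int:
    """Assumed metric: cosine similarity clipped at 0 and scaled to 0..1000."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0
    return round(max(0.0, dot / norm) * 1000)


def first_face_matching(query, members):
    """Score the query against every member, highest score first."""
    scored = [(similarity_score(query, m.medium_feature), m) for m in members]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


members = [Member("A", (1.0, 0.0)), Member("B", (0.0, 1.0))]
results = first_face_matching((1.0, 0.0), members)
```

The returned pairs correspond to what the information transmission unit 303 sends back: each similarity score together with the member information it belongs to.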

Next, the functional configuration of the recognition device 12 in this embodiment is described with reference to FIG. 7. FIG. 7 is a functional block diagram showing an example of the functional configuration of the recognition device 12.
The image acquisition unit 400 of the recognition device 12 receives imaging data that includes the images captured by the in-store imaging device 14 or the cash register imaging device 15 and their imaging times, and sequentially transmits the received imaging data to the face detection unit 401. A captured image may be a still image or a moving image. In the following description of this embodiment, still images and moving images are both referred to simply as images unless otherwise specified.

The face detection unit 401 performs face detection processing on the image received by the image acquisition unit 400 and extracts the area of the image that contains a face as the boundary frame of the face. In this embodiment the boundary frame of the face is extracted by the so-called background subtraction method, but any method that detects a face and its position may be used. In the face detection processing, as shown for example in FIG. 9(a), the face detection unit 401 acquires the coordinates, width, and height of the face boundary frame 1500, together with information 1501 indicating feature points of facial organs such as the eyes, nose, and mouth, and a likelihood indicating how plausible those facial organs are.

The feature amount extraction unit 402 is composed of a high-precision processing unit 408 and a medium-precision processing unit 409.
The high-precision processing unit 408 extracts a high-precision feature amount (high-precision face information), which is the second feature amount, by the second extraction process from the image within the face boundary frame extracted by the face detection unit 401. The medium-precision processing unit 409 extracts a medium-precision feature amount (medium-precision face information), which is the first feature amount, by the first extraction process from the image within the same boundary frame. In this embodiment, the feature amount extraction unit 402 performs these extraction processes by computation using a multi-layered neural network, called a deep neural network, that has been trained by deep learning. The neural network model used by the high-precision processing unit 408 is larger than the model used by the medium-precision processing unit 409. The high-precision processing unit 408 can therefore generally perform face recognition (face detection and feature amount extraction) with higher precision than the medium-precision processing unit 409, but at a higher processing cost.

In this embodiment, for a face image that the image acquisition unit 400 acquired from the in-store imaging device 14 and the face detection unit 401 detected, the feature amount extraction unit 402 performs the medium-precision feature amount calculation processing with the medium-precision processing unit 409. The medium-precision feature amount obtained by this processing, that is, the medium-precision feature amount extracted from the face image of a visitor captured by the in-store imaging device 14, is sent from the information transmission unit 403 to the management device 10.
The management device 10 then performs the first face matching described above using the visitor's medium-precision feature amount (medium-precision face information) obtained from the image of the in-store imaging device 14, and the result of the first face matching and the member information are sent to the member information acquisition unit 404 of the recognition device 12.

The member information acquisition unit 404 acquires the result of the first face matching by the management device 10 and the member information, and outputs them to the list management unit 405.
The list management unit 405 holds and manages the result of the first face matching in the management device 10 and the member information as a visitor matching list.

As shown in FIG. 8, the visitor matching list held and managed by the list management unit 405 contains a member ID 801, payment information 802, high-precision face information 805, medium-precision face information 803, and a similarity score 804. A visitor matching list is created for each visitor, based on the result of the first face matching performed by the face matching unit 302 of the management device 10. The similarity score 804 holds the similarity scores resulting from the first face matching by the management device 10, in descending order. The member ID 801 holds the identification information of the members that the information transmission unit 303 of the management device 10 selected from the member information list of FIG. 6 according to the result of the first face matching and sent. The payment information 802 holds the payment information that the information transmission unit 303 selected from the member information list of FIG. 6 for the identification information in the member ID 801 and sent. The high-precision face information 805 holds the high-precision face information (feature amount) selected and sent in the same way for the identification information in the member ID 801. The medium-precision face information 803 holds the medium-precision face information (feature amount) selected and sent in the same way for the identification information in the member ID 801.

The visitor matching list may contain the member information of every member in the member information list, but when the number of members is very large, an upper limit such as the top 100 entries may be set. The visitor matching list is updated by adding member information only when a visitor is newly found; it is not updated for a visitor who is being tracked in the store as described later. A visitor matching list may also be deleted when the visitor can no longer be tracked in the store and a fixed time has then elapsed since the visitor was added to the list, or it may be reset at the store's closing time. The visitor matching list may further include the time at which the member information acquisition unit 404 acquired the member information, the image acquired by the image acquisition unit 400, and the like.
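The list-management rules above (an upper limit on entries, no update while a visitor is tracked, deletion after tracking is lost and a fixed time has elapsed, reset at closing time) can be sketched as follows. This is an illustrative sketch only: the class `ListManager`, the visitor identifiers, and the one-hour retention period are assumptions, and only the limit of 100 comes from the example in the text.

```python
from dataclasses import dataclass

MAX_CANDIDATES = 100        # "top 100" example from the text
RETENTION_SECONDS = 3600.0  # assumed value; the source only says "a fixed time"


@dataclass
class VisitorList:
    created_at: float  # time the visitor was added
    candidates: list   # (similarity_score, member_id), highest score first


class ListManager:
    def __init__(self):
        self.lists = {}  # visitor id -> VisitorList

    def add_new_visitor(self, visitor_id, scored, now):
        # Lists are created only for newly found visitors; a visitor who is
        # already tracked keeps the existing list unchanged.
        if visitor_id in self.lists:
            return
        top = sorted(scored, key=lambda p: p[0], reverse=True)[:MAX_CANDIDATES]
        self.lists[visitor_id] = VisitorList(now, top)

    def purge(self, lost_visitor_ids, now):
        # Delete lists of visitors who are no longer tracked once the
        # retention period since they were added has elapsed.
        for vid in list(self.lists):
            age = now - self.lists[vid].created_at
            if vid in lost_visitor_ids and age >= RETENTION_SECONDS:
                del self.lists[vid]

    def reset(self):
        # For example at the store's closing time.
        self.lists.clear()


manager = ListManager()
manager.add_new_visitor("v1", [(660, "00001"), (780, "00003")], now=0.0)
manager.add_new_visitor("v1", [(900, "00009")], now=10.0)  # ignored: tracked
```

After these two calls, the list for `v1` still holds the original two candidates, ordered with member 00003 first.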

In this embodiment, the medium-precision face information that the medium-precision processing unit 409 of the feature amount extraction unit 402 calculated for the visitor's face image, detected by the face detection unit 401 from the image of the in-store imaging device 14, is also sent to the face matching unit 406. The face matching unit 406 uses this medium-precision face information to track the visitor. Because one visitor matching list is associated with each visitor in this embodiment, when a tracked visitor arrives in front of the cash register imaging device 15, the second face matching is performed using the visitor matching list associated with that visitor.

In this embodiment, the second face matching uses the feature amount (face information) that the high-precision processing unit 408 of the feature amount extraction unit 402 extracts, through the high-precision feature amount calculation processing, from the face image detected by the face detection unit 401 in the image obtained by the cash register imaging device 15. That is, when a tracked visitor arrives in front of the cash register, the face matching unit 406 performs the second face matching using the high-precision face information (feature amount) obtained from the visitor's face image captured by the cash register imaging device 15 and the visitor matching list associated with that visitor.

In the second face matching, the face matching unit 406 matches the payer's high-precision face information (feature amount), detected from the image of the cash register imaging device 15 and acquired by the high-precision processing unit 408, against the high-precision feature amounts in the visitor matching list, in descending order of the similarity score 804. The face matching unit 406 then notifies the payment unit 407 of the member ID 801 and the payment information 802 of the member information whose similarity score in the second face matching is equal to or greater than a predetermined score threshold.

As another example, the list management unit 405 may create a single visitor matching list for a plurality of visitors. In that case, the visitor matching list stores the similarity scores between any of the visitors in the store and the members. In the second face matching, the face matching unit 406 then performs matching in order from the member information with the highest similarity score in the visitor matching list, regardless of who the payer is. In this embodiment the second face matching is performed in descending order of the similarity scores in the visitor matching list, but it may be skipped for members whose similarity score is below a separately defined threshold.

An example of the second face matching is described with reference to the visitor matching list shown in FIG. 8. In this example, the predetermined score threshold for judging that recognition (matching) has succeeded is set to 700; that is, recognition is judged successful if the similarity score is 700 or more.

First, the face matching unit 406 selects member ID 00003, which has the highest value (780) among the similarity scores 804 in the visitor matching list managed on the basis of the first face matching. The face matching unit 406 then performs the second face matching using the high-precision face information 805 associated with member ID 00003 and the high-precision feature amount (face information) of the person extracted from the image of the cash register imaging device 15. Suppose the similarity score obtained from this second face matching is 680. Because 680 is below the predetermined score threshold of 700, the face matching unit 406 determines that the person in front of the cash register is not the person with member ID 00003, that is, not a payer for whom payment is possible.

Next, the face matching unit 406 selects member ID 00001, which has the next highest value (660) among the similarity scores 804 in the visitor matching list. The face matching unit 406 then performs the second face matching using the high-precision face information 805 associated with member ID 00001 and the high-precision feature amount (face information) of the person extracted from the image of the cash register imaging device 15. Suppose the similarity score obtained from this second face matching is 720. Because 720 is at or above the predetermined score threshold of 700, the face matching unit 406 determines that the person in front of the cash register is the person with member ID 00001, that is, a payer for whom payment is possible.
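The worked example above can be traced with a short sketch. The function name `second_face_matching` and the dictionary of second-stage scores are hypothetical stand-ins: in the described system the second-stage score comes from comparing high-precision feature amounts, whereas here the two scores from the example (680 and 720) are simply looked up.

```python
SCORE_THRESHOLD = 700  # threshold used in the example


def second_face_matching(candidates, score_fn, threshold=SCORE_THRESHOLD):
    """Try candidates in descending order of first-stage score.

    candidates: list of (first_stage_score, member_id).
    score_fn:   returns the second-stage (high-precision) score for a member.
    Returns the first member whose second-stage score reaches the threshold,
    or None if no candidate does.
    """
    ordered = sorted(candidates, key=lambda c: c[0], reverse=True)
    for _first_score, member_id in ordered:
        if score_fn(member_id) >= threshold:
            return member_id
    return None


# First-stage scores from the example's visitor matching list.
visitor_list = [(660, "00001"), (780, "00003")]
# Second-stage scores assumed in the example.
second_stage_scores = {"00003": 680, "00001": 720}

matched = second_face_matching(visitor_list, second_stage_scores.__getitem__)
```

Here `matched` is `"00001"`: member 00003 is tried first but its score of 680 falls short of 700, and member 00001 then passes with 720. Stopping at the first success means the costly high-precision comparison runs only as often as needed.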

When a member is identified as the payer as a result of the second face matching by the face matching unit 406 in this way, the payment unit 407 performs payment processing using the payment information 802 corresponding to the member ID 801 of the identified member. In this embodiment, for example, the payment processing receives the payment amount from a purchase amount totaling system (not shown) such as a cash register, and transmits the payment amount, the member ID, and the payment information to a payment system (not shown).

FIG. 10 is a flowchart showing the flow of the member registration processing executed by the management device 10 according to the first embodiment. In this embodiment, member registration is performed before the member makes a payment at the store. The member registration processing shown in the flowchart of FIG. 10 is assumed to start when the system administrator presses a registration processing start button (not shown) on the management device 10. In the flowcharts and sequence diagrams that follow, the symbol S denotes a processing step.

In S501, high-precision face information and medium-precision face information are input to the member information registration unit 300 of the management device 10. They are obtained in advance by performing the high-precision feature amount calculation processing and the medium-precision feature amount calculation processing on a still image of the face of the person to be registered, captured by an imaging device (not shown).
In S502, the member information registration unit 300 acquires the member ID, payment information, and the like that the system administrator or another person has input in advance via an input device such as a keyboard (not shown). The member information registration unit 300 then saves the high-precision face information and medium-precision face information input in S501, together with the member ID, payment information, and the like acquired in S502, as member information in the member information list shown in FIG. 6. The member information may also be acquired over a network from a mobile terminal owned by the member, such as a mobile phone or smartphone, or from an information device such as a personal computer.

FIG. 11 is a flowchart showing the flow of the first face matching processing, which uses the image of the in-store imaging device 14 and is executed jointly by the recognition device 12 and the management device 10 according to this embodiment. The first face matching processing shown in the flowchart of FIG. 11 is assumed to start when the system administrator presses, for example, a first face matching processing start button (not shown) on the recognition device 12. Apart from an instruction from the system administrator, the store's opening time may be stored in the recognition device 12 in advance so that the recognition device 12 starts the first face matching processing when the opening time arrives.

In S600, the image acquisition unit 400 of the recognition device 12 acquires an image of the inside of the store captured by the in-store imaging device 14.
Next, in S601, the face detection unit 401 of the recognition device 12 performs face detection on the image received from the image acquisition unit 400 and acquires the boundary frame of the face as described with reference to FIG. 9(a).
In S602, the feature amount extraction unit 402 of the recognition device 12 uses the medium-precision processing unit 409 to extract a medium-precision feature amount (face information) from the image within the boundary frame.
In S603, the information transmission unit 403 of the recognition device 12 transmits the medium-precision feature amount (face information) extracted in S602 to the face information acquisition unit 301 of the management device 10.

In the management device 10, in S604, the face information acquisition unit 301 receives the medium-precision feature amount (face information) transmitted from the recognition device 12.
Next, in S605, the face matching unit 302 of the management device 10 obtains similarity scores by comparing the received medium-precision feature amount with the medium-precision face information (feature amounts) acquired from the member information list held by the member information registration unit 300 of the management device 10.
In S606, the information transmission unit 303 of the management device 10 transmits the similarity scores obtained in S605, together with the member information acquired from the member information registration unit 300, to the member information acquisition unit 404 of the recognition device 12.

In the recognition device 12, in S607, the member information acquisition unit 404 receives the similarity scores and member information transmitted from the management device 10.
Next, in S608, the list management unit 405 of the recognition device 12 saves the similarity scores and member information transmitted from the management device 10 in the visitor matching list.

While the in-store imaging device 14 is running, the recognition device 12 and the management device 10 jointly repeat the first face matching processing described above, thereby associating every visitor in the store with registered member information.

This embodiment describes an example in which every visitor is associated with registered member information and registered in the visitor matching list. As another example, only member information whose similarity score is equal to or greater than a preset score threshold may be registered in the visitor matching list. As yet another example, the management device 10 may execute part or all of the processing that the recognition device 12 executes in the flowchart of the first face matching processing. Conversely, the recognition device 12 may execute part or all of the processing that the management device 10 executes.

FIG. 12 is a flowchart showing the flow of the second face matching processing that the recognition device 12 according to this embodiment executes using the image of the cash register imaging device 15. The second face matching processing shown in the flowchart of FIG. 12 is assumed to start when the system administrator presses a second face matching processing start button (not shown) on the recognition device 12. Apart from an instruction from the system administrator, the store's opening time may be stored in the recognition device 12 in advance so that the recognition device 12 starts the second face matching processing when the opening time arrives.

In S700, the image acquisition unit 400 acquires, from the cash register imaging device 15, an image capturing from the front the face of the person who is about to pay for goods at the cash register.
Next, in S701, the face detection unit 401 performs face detection on the image received from the image acquisition unit 400 and acquires the boundary frame of the face.
In S702, the feature amount extraction unit 402 uses the high-precision processing unit 408 to extract a high-precision feature amount (face information) from the image within the boundary frame.

Next, in S703, the face matching unit 406 compares the high-precision feature amount acquired in S702 with the high-precision face information (feature amount) of the member information that has the highest similarity score among the member information in the visitor matching list managed by the list management unit 405, and obtains a similarity score. That is, in S703 the face matching unit 406 matches the high-precision feature amount (face information) acquired in S702 against the high-precision face information in the visitor matching list, starting with the member information that has the highest similarity score. This identifies which member in the visitor matching list is the person shown in the image of the cash register imaging device 15, that is, the person about to make a payment at the cash register.

In this embodiment, as described above, a visitor is tracked using images from the in-store imaging device 14. While the visitor is being tracked, the first face matching is performed using the medium-precision feature amount, and one visitor matching list is generated from the result of the first face matching and linked to that visitor. Then, when the person tracked in the images from the in-store imaging device 14 arrives in front of the cash register, the face matching unit 406 performs, in S703, the second face matching using the high-precision feature amount against the visitor matching list linked to that visitor, and identifies which member the visitor is.

Next, in S704, the face matching unit 406 determines whether the similarity score obtained by matching with the high-precision feature amount is equal to or greater than a predetermined score threshold. If the score is equal to or greater than the score threshold, the person corresponding to the member information acquired from the list management unit 405 in S703 is identified as a person who can make a payment. If the score is less than the score threshold, the face matching unit 406 returns the process to S703 and performs the comparison using the high-precision face information of the member information with the next highest similarity score. If the similarity score does not reach the score threshold even after comparison with the high-precision face information of all the member information in the visitor matching list, the face matching unit 406 determines that the person in front of the cash register is an unregistered person and instructs the settlement unit 407 to perform settlement by other means, such as cash payment (not shown).
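The loop formed by S703 and S704 can be sketched as follows. This is a minimal illustration and not the implementation described in this publication: the names `Candidate`, `cosine_similarity`, and `second_face_matching` are hypothetical, and cosine similarity is assumed as the similarity measure because the text does not specify one.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Candidate:
    """One entry of a visitor matching list (hypothetical layout)."""
    member_id: str
    first_score: float                 # similarity score from the first face matching
    high_precision: Sequence[float]    # registered high-precision face information


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure; returns 0.0 for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def second_face_matching(
    query: Sequence[float],
    visitor_list: List[Candidate],
    score_threshold: float,
) -> Optional[str]:
    """Return the member ID of the first candidate that clears the threshold."""
    # S703: visit candidates in descending order of the first-matching score.
    ordered = sorted(visitor_list, key=lambda c: c.first_score, reverse=True)
    for candidate in ordered:
        score = cosine_similarity(query, candidate.high_precision)
        # S704: accept the candidate once the score reaches the threshold.
        if score >= score_threshold:
            return candidate.member_id
    # No candidate matched: the person is treated as unregistered.
    return None
```

A return value of `None` corresponds to the case in which the person is treated as unregistered and settlement falls back to another means such as cash payment.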

In S705, when the person about to make a payment has been identified as a member in the preceding processing, the settlement unit 407 performs settlement processing using the settlement information 802 of that member (the payer).
As another example, the management device 10 may execute part or all of the processing that the recognition device 12 executes in the flowchart of the second face matching process.

As described above, in the first embodiment, the first face matching is performed using the medium-precision feature amount (face information) extracted from a face image detected in an image (overhead image) acquired by the in-store imaging device 14, which narrows all members down to candidates for the members who are visiting the store. Then, for the narrowed-down candidates, the second face matching compares the high-precision face information (feature amount) of the face image registered in the member information with the high-precision feature amount extracted from the face image captured from the front by the cash register imaging device 15. As a result, even when the recognition accuracy for images from the in-store imaging device 14 is low, the payer can be identified by highly accurate face recognition while the recognition time at the cash register is shortened. In addition, because face recognition for the in-store imaging device 14 uses the medium-precision feature amount, processing cost can be reduced.

<Second Embodiment>
Next, a second embodiment will be described. Only the functions and processes that differ from the first embodiment are described here; everything else is the same as in the first embodiment unless otherwise noted. In the second embodiment, recognition processing using the high-precision feature amount (high-precision face information) is performed under conditions in which high face recognition accuracy is required for an image acquired by the in-store imaging device 14. Specifically, a predetermined analysis process is performed on the image from the in-store imaging device 14, whether a condition requiring high face recognition accuracy is satisfied is determined from the result of the analysis, and recognition processing using the high-precision feature amount is performed when high accuracy is determined to be necessary. Conditions that require high face recognition accuracy for an image acquired by the in-store imaging device 14 include, for example, cases where part of the face is hidden by an occluding object such as hair, a mask, or sunglasses, and cases where the person is hidden behind an object. In the second embodiment, the predetermined analysis process analyzes whether part of a person's face is hidden, and when the analysis indicates that part of the face is hidden, the high-precision feature amount is extracted from the image of the in-store imaging device 14. Details of the predetermined analysis process in this embodiment are described later.

In the first embodiment described above, the visitor matching list is created only when a visitor is newly found and is not updated for a visitor who is being tracked. In contrast, in the second embodiment, the visitor matching list is created only when a visitor is newly found and is then updated each time the visitor's face is detected by the in-store camera.

In the second embodiment, when extracting a feature amount from the face image of a person (visitor 203) appearing in an image from the in-store imaging device 14, the recognition device 12 analyzes whether part of the face is hidden by an occluding object or the like. When the analysis indicates that part of the face is hidden, the recognition device 12 acquires, from the face image from the in-store imaging device 14, the second feature amount obtained by the second extraction process, that is, the high-precision feature amount (high-precision face information). The recognition device 12 then transmits the extracted high-precision feature amount to the management device 10 via the network 11.

In the second embodiment, the management device 10 performs the first face matching using the high-precision feature amount extracted from the image of the in-store imaging device 14 and the high-precision face information (high-precision feature amount) of the registered members. The management device 10 then sends the result of the first face matching according to the second embodiment, together with the member information, to the recognition device 12. In the second embodiment as well, the information indicating the result of the first face matching is a similarity score indicating the degree to which the facial feature amounts are similar.

The recognition device 12 creates, holds, and manages a visitor matching list that includes the result of the first face matching by the management device 10 and the member information. The structure of the visitor matching list is the same as in the first embodiment: the list includes at least the similarity score from the first face matching and, for the members corresponding to that result, the identification information, settlement information, and face information used for face recognition.

FIG. 13 is a flowchart showing the flow of the first face matching process using images from the in-store imaging device 14, executed by the recognition device 12 and the management device 10 of the second embodiment in cooperation. The differences from FIG. 11 described above are as follows. In the recognition device 12, the determination of S801 is performed after S601, S802 or S803 is performed according to the result, the process then proceeds to S603, and S805 is executed after S607. In the management device 10, S804 is executed after S604, and S606 is executed after S804.

In S801, the feature amount extraction unit 402 of the recognition device 12 performs a determination as the predetermined analysis process. Here, the feature amount extraction unit 402 obtains the degree to which the face in the face image is occluded; for example, it determines whether the likelihood of the facial organs acquired during face detection in S601 is equal to or greater than a predetermined likelihood threshold. When the face is not hidden, as shown in FIG. 9(a), and the likelihood of the facial organs is therefore equal to or greater than the likelihood threshold, the facial organs are plausible and each facial organ is likely to be exposed. In that case, the feature amount extraction unit 402 advances the process to S802. In contrast, when part of the face is hidden by some occluding object 1502, as shown in FIG. 8(b), the likelihood of the facial organs may fall below the likelihood threshold; the facial organs are then implausible, and part of the facial organs is likely to be hidden. In that case, the feature amount extraction unit 402 advances the process to S803.

When the process proceeds to S802, the medium-precision processing unit 409 of the feature amount extraction unit 402 in the recognition device 12 calculates a medium-precision feature amount (medium-precision face information) from the image within the boundary frame of the face.
When the process proceeds to S803, the high-precision processing unit 408 of the feature amount extraction unit 402 in the recognition device 12 calculates a high-precision feature amount (high-precision face information) from the image within the boundary frame of the face.
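The branch from S801 to S802 or S803 can be illustrated with a short sketch. The function name, the dictionary of per-organ likelihoods, and the use of the minimum likelihood as the value compared with the threshold are all assumptions made for illustration; the text states only that the likelihood of the facial organs is compared with a predetermined likelihood threshold.

```python
from typing import Mapping


def select_precision_by_likelihood(
    organ_likelihoods: Mapping[str, float],
    likelihood_threshold: float,
) -> str:
    """Return "medium" (the S802 branch) or "high" (the S803 branch).

    Assumption: the face counts as exposed only when every facial organ
    is plausible, so the lowest organ likelihood is compared with the
    threshold.
    """
    if not organ_likelihoods:
        raise ValueError("at least one facial organ likelihood is required")
    lowest = min(organ_likelihoods.values())
    # Likelihood at or above the threshold: organs are likely exposed,
    # so the cheaper medium-precision extraction is used.
    if lowest >= likelihood_threshold:
        return "medium"
    # Otherwise part of the face is likely hidden: use high precision.
    return "high"
```

For instance, a detection in which the mouth likelihood drops because of a mask would select the high-precision branch under this rule, while a fully visible face would select the medium-precision branch.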

In the management device 10, when the process proceeds from S604 to S804, the face matching unit 302 compares the feature amount (face information) sent from the recognition device 12 with the face information (feature amount) of the members acquired from the member information registration unit 300, and obtains a similarity score. That is, when a high-precision feature amount is received from the recognition device 12, the face matching unit 302 obtains a high-precision similarity score by comparison with the members' high-precision face information acquired from the member information registration unit 300. When medium-precision face information is received from the recognition device 12, the face matching unit 302 obtains a medium-precision similarity score by comparison with the members' medium-precision face information acquired from the member information registration unit 300.
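As a rough sketch of S804, the comparison can be written so that the received feature amount is always matched against registered face information of the same precision. The record layout (`member_id`, `high`, `medium`), the function names, and the cosine similarity measure are assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure; returns 0.0 for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_face_matching(
    query: Sequence[float],
    precision: str,
    members: List[dict],
) -> Dict[str, float]:
    """Return a similarity score per member for the given precision.

    `precision` is "high" or "medium" and selects which registered
    face information of each member is compared with the query.
    """
    if precision not in ("high", "medium"):
        raise ValueError("precision must be 'high' or 'medium'")
    return {
        member["member_id"]: cosine_similarity(query, member[precision])
        for member in members
    }
```

The returned scores would then be the high-precision or medium-precision similarity scores, depending on which kind of feature amount was received.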

In the recognition device 12, when the process proceeds from S607 to S805, the list management unit 405 saves, for each visitor, the members' similarity scores and member information in the visitor matching list. In the second embodiment, the similarity scores saved in the visitor matching list include the medium-precision similarity score or the high-precision similarity score obtained by the management device 10 in S804. That is, the list management unit 405 saves in the visitor matching list either the medium-precision similarity score that the management device 10 obtained for the medium-precision feature amount calculated in S802, or the high-precision similarity score that the management device 10 obtained for the high-precision feature amount calculated in S803.

If the visitor is one whose face has already been detected and for whom a similarity score has already been obtained, the list management unit 405 updates the value of the similarity score. If only one of the high-precision similarity score and the medium-precision similarity score has been obtained for the visitor, the list management unit 405 uses that similarity score as the evaluation score.

If both the high-precision similarity score and the medium-precision similarity score have already been obtained for the visitor, the list management unit 405 calculates and saves an evaluation score that weights each similarity score. For example, the high-precision similarity score 806 and the medium-precision similarity score are weighted 0.5:0.5, and their weighted sum is used as the evaluation score.
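The evaluation score rule described above can be written as a small function. The function name and the use of `None` to mark a score that has not yet been obtained are assumptions; the 0.5:0.5 weights are the example given in the text.

```python
from typing import Optional


def evaluation_score(
    high_score: Optional[float],
    medium_score: Optional[float],
    high_weight: float = 0.5,
    medium_weight: float = 0.5,
) -> float:
    """Combine the similarity scores available for one candidate member."""
    if high_score is not None and medium_score is not None:
        # Both scores exist: use their weighted sum.
        return high_weight * high_score + medium_weight * medium_score
    if high_score is not None:
        # Only the high-precision score exists: use it as is.
        return high_score
    if medium_score is not None:
        # Only the medium-precision score exists: use it as is.
        return medium_score
    raise ValueError("at least one similarity score is required")
```

With the default weights, a high-precision score of 0.8 and a medium-precision score of 0.6 give an evaluation score of approximately 0.7.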

FIG. 14 shows an example of a visitor matching list that includes a high-precision similarity score 1401, a medium-precision similarity score 1402, and an evaluation score 1403. In FIG. 14, the items from member ID 810 through medium-precision face information 803 are the same as in FIG. 8, so their description is omitted. In the second embodiment, the similarity score from the first face matching using the high-precision feature amount is stored as the high-precision similarity score 1401, and the similarity score from the first face matching using the medium-precision feature amount is stored as the medium-precision similarity score 1402. In addition, the evaluation score calculated with the weights is stored as the evaluation score 1403.

In the second embodiment, when settlement processing is subsequently performed according to the flowchart shown in FIG. 12, the face matching unit 406 performs, as the processing of S704, the second face matching at the cash register in order of the evaluation scores calculated as described above. In this way, the second embodiment takes into account the case where face detection and feature amount extraction are executed several times for one visitor.

As described above, in the second embodiment, when a visitor's face is uncovered and can be seen, face recognition is performed using the medium-precision feature amount, which lowers processing cost; when part of the visitor's face is hidden, face recognition is performed using the high-precision feature amount. The second embodiment thus makes it possible to maintain high recognition accuracy while limiting the processing cost of the system as a whole.

In the second embodiment, the predetermined analysis process uses a value such as the likelihood of the facial organs to check how exposed the face is, in other words, how hidden it is, but the analysis is not limited to this. For example, the predetermined analysis process may detect occluding objects such as sunglasses or a mask in the image. In that case, the recognition device 12 performs face recognition using the high-precision feature amount when an occluding object is detected.

In face recognition, recognition accuracy is generally higher when the face directly faces the imaging device. Therefore, as the predetermined analysis process, the face detection unit 401 may detect the orientation of the face along with the face itself, and whether the face is directly facing the imaging device may also be determined in S801 to switch the precision of face recognition. That is, the recognition device 12 performs face recognition using the medium-precision feature amount when the face is directly facing the imaging device, and using the high-precision feature amount when it is not.

This embodiment uses an evaluation score based on the high-precision similarity score and the medium-precision similarity score, but the evaluation score is not limited to this; an accumulation of the similarity scores obtained in the past may be used as the evaluation score.

<Third Embodiment>
Next, a third embodiment will be described. Only the functions and processes that differ from the second embodiment are described here; everything else is the same as in the first and second embodiments unless otherwise noted.
The recognition device 12 of the first or second embodiment basically performs matching using the medium-precision feature amount in the first face matching. The recognition device 12 of the third embodiment basically performs matching using the high-precision feature amount for all face recognition, and performs matching using the medium-precision feature amount when a highly accurate recognition result cannot be expected even with the high-precision feature amount, or when the processing cost would increase excessively. In this way, the third embodiment limits processing cost while maintaining recognition accuracy.

The third embodiment is described using, as an example of a case in which a highly accurate recognition result cannot be expected even with the high-precision feature amount, the case where the boundary frame of a face detected in an image is small. That is, in the third embodiment, the predetermined analysis process obtains the size of the boundary frame of the face detected in the image, and whether to perform matching with the medium-precision feature amount or with the high-precision feature amount is selected according to that size. When the boundary frame of the face is small, the face image carries little information, so using the high-precision feature amount often does not improve recognition accuracy. For this reason, in this embodiment, when the size of the boundary frame of the face is less than a predetermined threshold, the medium-precision feature amount is used to limit processing cost. When the size of the boundary frame of the face is equal to or greater than the predetermined threshold, the high-precision feature amount is used to enable matching with high recognition accuracy.

In the third embodiment, the first face matching process using images from the in-store imaging device 14, which the recognition device 12 executes in cooperation with the management device 10, follows largely the same flow as the flowchart of FIG. 13 described above. However, the determination in S801 of the flowchart of FIG. 13 differs in the third embodiment.

In the third embodiment, in S801, the feature amount extraction unit 402 of the recognition device 12 determines whether the size of the boundary frame detected by the face detection in S601 is less than a predetermined size threshold. More specifically, the feature amount extraction unit 402 determines whether the width of the boundary frame is less than a predetermined width threshold or the height of the boundary frame is less than a predetermined height threshold. If either or both of the width and the height are less than their respective thresholds, the feature amount extraction unit 402 determines that the image does not carry enough information to calculate a high-precision feature amount (face information) and that processing with the medium-precision feature amount, which has a lower processing cost, is sufficient; it then executes the medium-precision feature amount calculation of S802. If both the width and the height are equal to or greater than their respective thresholds, the feature amount extraction unit 402 determines that there is enough information to calculate a high-precision feature amount and executes the high-precision feature amount calculation of S803.
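The size check performed in S801 of the third embodiment amounts to the following rule. The function name and the choice of pixels as the unit are hypothetical; the text specifies only that the width and the height of the boundary frame are each compared with a threshold.

```python
def select_precision_by_frame_size(
    width: int,
    height: int,
    width_threshold: int,
    height_threshold: int,
) -> str:
    """Return "medium" for a small boundary frame and "high" otherwise.

    The frame counts as small when its width, its height, or both fall
    below the corresponding threshold (all values assumed to be pixels).
    """
    if width < width_threshold or height < height_threshold:
        # Too little image information for high precision to pay off.
        return "medium"
    # Both dimensions reach their thresholds: high precision is worthwhile.
    return "high"
```

With thresholds of 64 by 64, for example, a 40 by 80 frame would select the medium-precision calculation because its width is below the width threshold, while a 64 by 64 frame would select the high-precision calculation.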

As described above, according to the third embodiment, when it is determined that sufficient accuracy cannot be obtained even by extracting the high-precision feature amount (high-precision face information), the medium-precision feature amount (medium-precision face information) is extracted instead, which makes it possible both to maintain accuracy and to reduce processing cost.

<Modification of the Third Embodiment>
The third embodiment uses the size of the boundary frame of the detected face. As another example of the predetermined analysis process, the number of faces detected in the captured image may be obtained, and whether to perform the high-precision or the medium-precision feature amount calculation may be selected (switched) according to that result. When many faces are detected in an image, processing all of those face images is expected to increase the processing cost sharply. Therefore, when performing the first face matching, the recognition device 12 obtains the number of faces in the image as the predetermined analysis process, and performs the medium-precision feature amount calculation when the number of detected faces is equal to or greater than a predetermined count threshold. When the number of faces is less than the predetermined count threshold, the recognition device 12 performs the high-precision feature amount calculation. According to this modification, when there are many faces and the high-precision feature amount calculation is judged likely to increase the processing cost, performing the medium-precision feature amount calculation instead reduces the processing cost.
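The modification can be sketched in the same way. The function name, the `BoundaryFrame` tuple layout, and the idea of passing the list of detected boundary frames are assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical layout of one detected face: (x, y, width, height).
BoundaryFrame = Tuple[int, int, int, int]


def select_precision_by_face_count(
    frames: List[BoundaryFrame],
    count_threshold: int,
) -> str:
    """Return "medium" for a crowded image and "high" otherwise."""
    # Many faces: high-precision extraction for all of them would be costly.
    if len(frames) >= count_threshold:
        return "medium"
    # Few faces: the cost of high-precision extraction stays acceptable.
    return "high"
```

Under this rule the precision is chosen once per captured image, so every face detected in a crowded image is processed with the medium-precision calculation.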

<Fourth Embodiment>
The first to third embodiments described above use system configurations in which the management device and the recognition device are separate, but the functions of both devices may be implemented by a single information processing device. The fourth embodiment describes an information processing device that has the functions of both the management device and the recognition device.

FIG. 15 shows an example of the functional configuration of the information processing device according to the fourth embodiment. The hardware configuration of the information processing device according to the fourth embodiment is largely the same as the configurations of FIG. 3 and FIG. 4 described above, so its illustration and description are omitted. The information processing device of the fourth embodiment can perform the processing described for the management device and the recognition device in any of the first to third embodiments; here, as an example, the case in which the first embodiment is applied to the fourth embodiment is described.

In the fourth embodiment, the image acquisition unit 141 of the information processing device acquires an image from the in-store imaging device 14 or the cash register imaging device 15, and the face detection unit 142 performs face detection on the image.
In the feature amount extraction unit 146, when a face image detected in an image from the in-store imaging device 14 is received, the medium-precision processing unit 146M extracts a medium-precision feature amount (face information) and sends it to the first face matching unit 143. At this time, the member information acquisition unit 140 sends the face information registered in the member information prepared in advance to the first face matching unit 143. The first face matching unit 143 then performs the first face matching in the same way as described above, and the list management unit 144 creates and holds a visitor matching list based on the result of the first face matching.

In the feature amount extraction unit 146, when a face image detected in an image from the cash register imaging device 15 is received, the high-precision processing unit 146H extracts a high-precision feature amount (face information) and sends it to the second face matching unit 147. The second face matching unit 147 then acquires high-precision feature amounts from the previously created and held visitor matching list in descending order of similarity and performs the second face matching. When the second face matching unit 147 identifies the person in front of the cash register as a member, the settlement unit 148 performs settlement processing.

FIG. 16 is a flowchart showing the flow of processing in the information processing device according to the fourth embodiment.
In S160, the image acquisition unit 141 acquires an image from the in-store imaging device 14 or the cash register imaging device 15. In S161, the information processing device proceeds to S162 and the subsequent steps if the acquired image is from the in-store imaging device 14, and to S166 and the subsequent steps if it is from the cash register imaging device 15.

When the processing proceeds to S162, the face detection unit 142 performs face detection on the image from the in-store imaging device 14.
Next, in S163, the feature amount extraction unit 146 uses the medium-precision processing unit 146M to extract a medium-precision feature amount (face information) from the face image (boundary frame image) detected by the face detection unit 142.
In S164, the first face matching unit 143 acquires the member information from the member information acquisition unit 140 and performs the first face matching, using the medium-precision face information (feature amount) in the member information and the medium-precision feature amount extracted in S163, to obtain a similarity score.
In S165, the list management unit 144 creates and holds the visitor matching list based on the member information obtained from the member information acquisition unit 140 and the similarity score obtained in S164. The information processing device then ends the processing of the flowchart of FIG. 16.

When the processing proceeds to S166, on the other hand, the face detection unit 142 performs face detection on the image from the cash register imaging device 15.
Next, in S167, the feature amount extraction unit 146 uses the high-precision processing unit 146H to extract a high-precision feature amount (face information) from the face image (boundary frame image) detected by the face detection unit 142.
In S168, the second face matching unit 147 performs the second face matching using the high-precision face information (feature amount) of the visitor matching list held by the list management unit 144 and the high-precision feature amount extracted in S167.
In S169, if the second face matching yields a similarity score equal to or greater than the threshold, the information processing device advances the processing to S170, where the settlement unit 148 performs settlement processing.
If the similarity score obtained in S169 is less than the threshold, the information processing device ends the processing of the flowchart of FIG. 16. In this case, the payer makes the payment by cash or the like.
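The branch at S161 and the two paths that follow it might be sketched as below. This is a hedged illustration only: `handle_frame`, the `"in_store"` / `"register"` source labels, the dictionary layout, and the injected callables are all hypothetical, and replacing the held list on every in-store frame is a simplification of the list management in S165.

```python
from typing import Callable, Dict, List, Optional, Sequence

Feature = Sequence[float]


def handle_frame(
    source: str,
    image: object,
    detect_face: Callable[[object], object],
    extract_medium: Callable[[object], Feature],
    extract_high: Callable[[object], Feature],
    similarity: Callable[[Feature, Feature], float],
    members: Dict[str, Dict[str, Feature]],
    visitor_list: List[dict],
    threshold: float,
) -> Optional[str]:
    """Return the member ID to settle for, or None if no settlement is triggered."""
    face = detect_face(image)                        # S162 or S166
    if source == "in_store":                         # S161: branch on the image source
        probe = extract_medium(face)                 # S163
        scored = [
            {"member_id": mid,
             "score": similarity(probe, info["medium"]),   # S164
             "high": info["high"]}
            for mid, info in members.items()
        ]
        # S165: hold the list, sorted so the best first-stage score comes first.
        visitor_list[:] = sorted(scored, key=lambda e: e["score"], reverse=True)
        return None
    probe = extract_high(face)                       # S167
    for entry in visitor_list:                       # S168
        if similarity(probe, entry["high"]) >= threshold:  # S169
            return entry["member_id"]                # S170 is left to the caller
    return None                                      # below threshold: cash or the like
```

Passing the detector, extractors, and similarity function in as arguments keeps the sketch independent of any particular face recognition library.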

FIG. 17 is a sequence diagram showing the flow of processing in the fourth embodiment when the information processing device 1700 in the store performs everything from the first face matching and the creation and management of the visitor matching list through the second face matching and the settlement request. In this example, a management device 1000 is assumed to be provided as a device that manages the member information of all registered members and performs settlement processing.

In S900, the information processing device 1700 requests from the management device 1000 the member information of all members, including their feature amounts (face information). In S901, the information processing device 1700 acquires the member information including the feature amounts from the management device 1000.

In S902, the information processing device 1700 sends an image acquisition request to the in-store imaging device 14, and when the in-store imaging device 14 sends an image, acquires that image in S903.
Next, in S904, the information processing device 1700 detects a face image of a person from the image acquired from the in-store imaging device 14 and extracts a medium-precision feature amount (face information) from the face image. In S905, the information processing device 1700 performs the first face matching using the medium-precision feature amount extracted in S904 and the medium-precision face information (feature amount) of the member information list, and creates the visitor matching list.

Thereafter, in S906, the information processing device 1700 sends an image acquisition request to the cash register imaging device 15, and when the cash register imaging device 15 sends an image, acquires that image in S907.
Next, in S908, the information processing device 1700 detects a face image of a person from the image of the cash register imaging device 15 and extracts a high-precision feature amount (face information) from the face image. In S909, the information processing device 1700 performs the second face matching, which compares the high-precision feature amount extracted in S908 with the high-precision face information (feature amount) of the visitor matching list.

When the information processing device 1700 identifies a member in S910 as a result of the second face matching, it sends settlement information to the management device 1000 in S911 to request settlement processing. When the settlement processing is completed in the management device 1000, the information processing device 1700 receives a notification of the settlement processing result in S912.
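The exchange between the information processing device 1700 and the management device 1000 could be mimicked as in the following sketch. Everything here is an assumption made for illustration: `ManagementDeviceStub`, its method names, the `amount` parameter standing in for settlement information, and the two probe callables, which fold image acquisition, face detection, and feature extraction into a single call.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Feature = Sequence[float]


class ManagementDeviceStub:
    """Hypothetical stand-in for the management device 1000."""

    def __init__(self, members: Dict[str, Dict[str, Feature]]):
        self._members = members
        self.settled: List[Tuple[str, int]] = []

    def get_member_info(self) -> Dict[str, Dict[str, Feature]]:
        # Answers the request of S900 with the member information of S901.
        return dict(self._members)

    def settle(self, member_id: str, amount: int) -> str:
        # Answers the settlement request of S911 with the result of S912.
        self.settled.append((member_id, amount))
        return "ok"


def store_session(
    mgmt: ManagementDeviceStub,
    get_store_probe: Callable[[], Feature],
    get_register_probe: Callable[[], Feature],
    similarity: Callable[[Feature, Feature], float],
    threshold: float,
    amount: int,
) -> Optional[str]:
    members = mgmt.get_member_info()                 # S900-S901
    probe_medium = get_store_probe()                 # S902-S904
    ranked = sorted(                                 # S905: visitor matching list
        members,
        key=lambda mid: similarity(probe_medium, members[mid]["medium"]),
        reverse=True,
    )
    probe_high = get_register_probe()                # S906-S908
    for mid in ranked:                               # S909
        if similarity(probe_high, members[mid]["high"]) >= threshold:  # S910
            return mgmt.settle(mid, amount)          # S911-S912
    return None                                      # no member identified
```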

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.
The above-described embodiments are merely specific examples of carrying out the present invention, and the technical scope of the present invention should not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical concept or its main features.

10: Management device, 12: Recognition device, 14: In-store imaging device, 15: Cash register imaging device, 300: Member information registration unit, 301: Face information acquisition unit, 302: Face matching unit, 400: Image acquisition unit, 401: Face detection unit, 402: Feature amount extraction unit, 404: Member information acquisition unit, 405: List management unit, 406: Face matching unit, 407: Settlement unit

Claims

1. An information processing system including a first information processing device and a second information processing device, wherein
the first information processing device
acquires person information including first feature information extracted by a first extraction process, and second feature information extracted by a second extraction process having a larger processing amount than the first extraction process, from face images of a plurality of persons,
performs a first matching process of comparing third feature information, extracted by the first extraction process from a first face image obtained by photographing a person, with the first feature information included in the person information to acquire a degree of similarity, and transmits the person information including the degree of similarity acquired by the first matching process to the second information processing device, and
the second information processing device
acquires the person information including the degree of similarity from the first information processing device, and creates, from the person information, a list including the first feature information and the second feature information according to the degree of similarity, and
performs a second matching process of comparing fourth feature information, extracted by the second extraction process from a second face image obtained by photographing a person, with the second feature information in the list in descending order of the degree of similarity, to identify the person in the second face image from the person information.

2. The information processing system according to claim 1, wherein the first extraction process and the second extraction process are processes using neural networks having different circuit scales, and the second extraction process is a process using a neural network having a larger circuit scale than that of the first extraction process.

3. The information processing system according to claim 1, wherein the second information processing device transmits, to the first information processing device, feature information extracted by the first extraction process from the first face image obtained by photographing a person, and the first information processing device acquires the feature information received from the second information processing device as the third feature information.

4. The information processing system according to claim 1, wherein
the first matching process in the first information processing device further includes a process of comparing feature information extracted from the first face image by the second extraction process with the second feature information included in the person information to acquire a degree of similarity,
the second information processing device selects either the first extraction process or the second extraction process as the extraction process for the first face image based on a result of a predetermined analysis process using a photographed face image of a person, and transmits the feature information extracted by the selected extraction process to the first information processing device, and
in the first matching process, the first information processing device acquires the degree of similarity by comparing the feature information received from the second information processing device with the feature information of the person information corresponding to the extraction process selected by the second information processing device.

5. The information processing system according to claim 4, wherein the second information processing device weights the degree of similarity according to which of the first extraction process and the second extraction process was selected based on the result of the predetermined analysis process.

6. The information processing system according to claim 4, wherein the predetermined analysis process is a process of acquiring the number of face images appearing in the photographed image, and the second information processing device selects the first extraction process when the number is equal to or greater than a predetermined number threshold.

7. The information processing system according to claim 4, wherein the predetermined analysis process is a process of acquiring a degree of masking of the face in the photographed face image, and the second information processing device acquires the likelihood of facial organs in the face image as the degree of masking of the face and selects the first extraction process when the likelihood is equal to or greater than a likelihood threshold.

8. The information processing system according to claim 4, wherein the predetermined analysis process is a process of detecting an image of a predetermined shielding object from the photographed face image, and the second information processing device selects the second extraction process when the image of the predetermined shielding object is detected.

9. The information processing system according to claim 4, wherein the predetermined analysis process is a process of detecting the orientation of the face in the photographed face image, and the second information processing device selects the second extraction process when the face is not facing forward.

10. The information processing system according to claim 4, wherein the predetermined analysis process is a process of acquiring the size of the face in the photographed face image, and the second information processing device selects the first extraction process when the size of the face is less than a predetermined size threshold.

11. The information processing system according to any one of claims 1 to 10, wherein the first face image is an image detected from an image captured by a first imaging device that captures the interior of a store, and the second face image is an image detected from an image captured by a second imaging device that captures a person making a payment at a cash register.

12. The information processing system according to any one of claims 1 to 11, wherein the second information processing device does not perform the second matching process for face images whose degree of similarity is less than a threshold.

13. The information processing system according to any one of claims 1 to 12, wherein the second information processing device performs settlement processing based on settlement information associated with the identified person.

14. An information processing device comprising:
information acquisition means for acquiring person information including first feature information extracted by a first extraction process, and second feature information extracted by a second extraction process having a larger processing amount than the first extraction process, from face images of a plurality of persons;
processing means for performing a process of comparing third feature information, extracted by the first extraction process from a first face image obtained by photographing a person, with the first feature information included in the person information to acquire a degree of similarity;
management means for creating and managing a list including the degree of similarity and the first feature information and the second feature information extracted from the person information according to the degree of similarity; and
identification means for performing an identification process of comparing fourth feature information, extracted by the second extraction process from a second face image obtained by photographing a person, with the second feature information in the list in descending order of the degree of similarity, to identify the person in the second face image from the person information.

15. The information processing device according to claim 14, wherein the first extraction process and the second extraction process are processes using neural networks having different circuit scales, and the second extraction process is a process using a neural network having a larger circuit scale than that of the first extraction process.

16. The information processing device according to claim 14 or 15, wherein the processing means selects either the first extraction process or the second extraction process as the extraction process for the first face image based on a result of a predetermined analysis process using the photographed face image, and acquires the degree of similarity by comparing the feature information extracted from the first face image by the selected extraction process with the feature information of the person information corresponding to the selected extraction process.

17. The information processing device according to claim 16, wherein the processing means weights the degree of similarity according to which of the first extraction process and the second extraction process was selected based on the result of the predetermined analysis process.

18. The information processing device according to claim 16, wherein the predetermined analysis process is a process of acquiring the number of face images appearing in the photographed image, and the processing means selects the first extraction process when the number is equal to or greater than a predetermined number threshold.

19. The information processing device according to claim 16 or 17, wherein the predetermined analysis process is a process of acquiring a degree of masking of the face in the photographed face image, and the processing means acquires the likelihood of facial organs in the face image as the degree of masking of the face and selects the first extraction process when the likelihood is equal to or greater than a likelihood threshold.

20. The information processing device according to claim 16, wherein the predetermined analysis process is a process of detecting an image of a predetermined shielding object from the photographed face image, and the processing means selects the second extraction process when the image of the predetermined shielding object is detected.

21. The information processing device according to claim 16, wherein the predetermined analysis process is a process of detecting the orientation of the face in the photographed face image, and the processing means selects the second extraction process when the face is not facing forward.

22. The information processing device according to claim 16, wherein the predetermined analysis process is a process of acquiring the size of the face in the photographed face image, and the processing means selects the first extraction process when the size of the face is less than a predetermined size threshold.

23. The information processing device according to any one of claims 14 to 22, wherein the first face image is an image detected from an image captured by a first imaging device that captures the interior of a store, and the second face image is an image detected from an image captured by a second imaging device that captures a person making a payment at a cash register.

24. The information processing device according to any one of claims 14 to 23, wherein the identification means does not perform the identification process for face images whose degree of similarity is less than a threshold.

25. The information processing device according to any one of claims 14 to 24, further comprising settlement means for performing settlement processing based on settlement information associated with the identified person.

26. An information processing method comprising:
an information acquisition step of acquiring person information including first feature information extracted by a first extraction process, and second feature information extracted by a second extraction process having a larger processing amount than the first extraction process, from face images of a plurality of persons;
a processing step of performing a process of comparing third feature information, extracted by the first extraction process from a first face image obtained by photographing a person, with the first feature information included in the person information to acquire a degree of similarity;
a management step of creating and managing a list including the degree of similarity and the first feature information and the second feature information extracted from the person information according to the degree of similarity; and
an identification step of comparing fourth feature information, extracted by the second extraction process from a second face image obtained by photographing a person, with the second feature information in the list in descending order of the degree of similarity, to identify the person in the second face image from the person information.

27. A program for causing a computer to function as the information processing device according to any one of claims 1 to 25.