JP7185537B2

JP7185537B2 - Imaging system

Info

Publication number: JP7185537B2
Application number: JP2019004370A
Authority: JP
Inventors: 健介上田; 信貴松嶌
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2019-01-15
Filing date: 2019-01-15
Publication date: 2022-12-07
Anticipated expiration: 2039-01-15
Also published as: JP2020113109A

Description

The present invention relates to an imaging system.

Services are known in which multiple fixed-point cameras are installed on the grounds of, for example, a theme park, and the scenes captured by the cameras are stitched together to generate a video. In such a service, it is conceivable to generate, for each visitor, a video that stitches together the scenes in which that visitor appears. When a video is generated for each visitor, persons are identified by face recognition technology such as that described in Patent Literature 1.

Patent Literature 1: JP 2007-272896 A

However, when, for example, only a face photograph is used as the information for identifying a person, there is little prior information for identification, and the person cannot be extracted with high accuracy. For example, when extraction relies on a face photograph alone, the person cannot be extracted from the image of a fixed-point camera that captured only the person's back, so extraction cannot be performed with high accuracy.

The present invention has been made in view of the above circumstances, and its object is to extract a target person from captured images with high accuracy in an imaging system that includes a plurality of imaging devices.

An imaging system according to one aspect of the present invention is an imaging system including a plurality of imaging devices, and comprises: a captured image acquisition unit that acquires a captured image; a detection unit that detects, from the captured image, at least one of face information of a target person and related information, which is information other than the target person's face that is related to the target person, and that extracts a scene in which the target person appears based on the detection result; an output unit that outputs the detection result of the detection unit so that it is shared among the plurality of imaging devices; and a storage unit that stores, in association with each other, the face information and related information of the target person shared among the plurality of imaging devices. Based on the information stored in association in the storage unit, the detection unit detects related information related to the target person from the captured image, and extracts a scene in which the related information appears as a scene in which the target person to whom the related information pertains appears.

In the imaging system according to one aspect of the present invention, the detection results for the target person's face information and related information are shared among the imaging devices, and the storage unit stores the face information and related information of the target person in association with each other. Therefore, if related information can be detected from a captured image, the imaging system can appropriately extract the scene in which the target person to whom that related information pertains appears. In other words, because the face information and related information of the target person are associated, even when the target person's face information cannot be detected, a scene in which the related information appears can be appropriately extracted as a scene in which the target person appears, as long as the related information is detected. Extracting scenes of the target person even from captured images in which face information cannot be detected allows the target person to be extracted with higher accuracy.

In the imaging system described above, the detection unit may, when it has successfully detected the face information of the target person, detect information carried by the target person as related information. Information carried by the target person is strongly associated with the target person and is likely to be imaged together with the target person. When such information is detected as related information by an imaging device that has successfully detected the target person's face information, the target person can be extracted with higher accuracy based on that related information.

In the imaging system described above, the detection unit may, when it has successfully detected the face information of the target person, detect surrounding persons around the target person as related information. Persons around the target person are strongly associated with the target person and are likely to be imaged together with the target person. When such information is detected as related information by an imaging device that has successfully detected the target person's face information, the target person can be extracted with higher accuracy based on that related information.

In the imaging system described above, a surrounding person may be treated as related information when that person has also been detected as a surrounding person of the target person by another imaging device. A surrounding person is then treated as related information only when detected as such by imaging devices at mutually different locations. This keeps a person who merely happened to be near the target person by chance (temporarily) from being treated as a related person (related information), and allows the target person to be extracted with higher accuracy.

In the imaging system described above, the storage unit may further store position information of each of the plurality of imaging devices, information indicating the imaging device that detected at least one of the face information and related information of the target person, and the time at which that imaging device made the detection. The detection unit may then detect at least one of the face information and related information of the target person from the captured image in consideration of the position information of each imaging device, the information indicating the detecting imaging device, and the detection time stored in the storage unit. Taking these into account when detecting the target person makes it possible to estimate the frames in which the target person can enter the imaging range and the direction from which the target person will enter it (that is, to narrow the detection range), which improves both the accuracy and the speed of detection.

In the imaging system described above, the detection unit may identify, for each item of detected related information, how easily it changes according to its type, and may decide whether to extract a scene in which the related information appears in consideration of that ease of change. Some related information changes easily within a relatively short period (for example, clothing or food being carried), while other related information changes little (for example, a ring). Taking this into account, the system can, for example, decline to extract a scene on the basis of easily changed related information alone and extract it only when other related information is also detected, which allows the target person to be extracted with higher accuracy.

The storage unit may store the face information of the target person in an initial state, and may store related information in association with that face information based on information from an imaging device that has successfully detected both the face information and the related information of the target person from a captured image. Associating related information with the target person based on information that was actually detected allows the target person to be extracted more accurately using the related information.

According to the present invention, a target person can be extracted from captured images with high accuracy in an imaging system that includes a plurality of imaging devices.

FIG. 1 is a diagram illustrating the basic operation of the imaging system according to the present embodiment.
FIG. 2 is a block diagram showing the functional configuration of a camera included in the imaging system according to the present embodiment.
FIG. 3 is a flowchart showing the processing performed by a camera included in the imaging system according to the present embodiment.
FIG. 4 is a diagram illustrating the operation of an imaging system according to a comparative example.
FIG. 5 is a diagram showing the hardware configuration of a camera included in the imaging system.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the description of the drawings, the same or equivalent elements are given the same reference numerals, and duplicate descriptions are omitted.

FIG. 1 is a diagram illustrating the basic operation of the imaging system 1 according to the present embodiment. The imaging system 1 shown in FIG. 1 includes a plurality of cameras 10 (imaging devices) and a server 50. The cameras 10 are fixed-point cameras installed at mutually different points within a specific site. Each camera 10 is configured to be able to upload the video it captures to the server 50 via a network (the Internet). The imaging system 1 uses the input from the cameras 10 to extract scenes in which a target person designated in advance appears. For example, as shown in FIG. 1, when person A is the target person, each camera 10 uploads its video to the server 50 if it was able to detect (identify) person A in the captured video, and does not upload the video if it was not. The server 50 then stitches together the videos acquired from the cameras 10 (videos consisting of scenes in which person A appears) to generate a video of person A within the site where the cameras 10 are installed. The imaging system 1 is used, for example, in a service that automatically generates and provides a video for each visitor within a site such as a theme park or ski resort.

In the imaging system 1, the detection results for the target person are shared among the cameras 10. The detection result for a target person includes not only the target person's face information but also related information related to the target person (described in detail later). Because such detection results are shared among the cameras 10, even a camera 10 that cannot image the target person from the front, and therefore cannot detect the face information, can detect the related information and extract the scene of the target person to whom that related information pertains, since the face information and related information have been shared in advance. By sharing detection results among the cameras 10 in this way, the imaging system 1 extracts scenes in which the target person appears more accurately and with fewer omissions.

The installation angle of the cameras 10 placed around the site is not particularly limited. For example, a camera may be installed on the bisector of the corner at a turn in an open space. In that case, footage of the person's front side (45 degrees ventral) and back side (45 degrees dorsal) can be obtained from the video before and after the person turns the corner, so that data on, for example, the person's face, the front of the clothing (an example of related information), and the back of the clothing (an example of related information) can be appropriately linked.

FIG. 2 is a block diagram showing the functional configuration of a camera 10 included in the imaging system 1 according to the present embodiment. As shown in FIG. 2, each of the cameras 10 includes a storage unit 11, an acquisition unit 12 (captured image acquisition unit), a detection unit 13, a data generation unit 14, and an output unit 15.

The storage unit 11 stores, in association with each other, the face information and related information of the target person shared among the cameras 10. Related information is information other than the target person's face that is related to the target person (described in detail later). Specifically, the storage unit 11 is a database that stores person records, camera records, and camera relation records. The information stored in the storage unit 11 (at least the person records, camera records, and camera relation records) is also stored on the server 50. That is, the cameras 10 and the server 50 hold common information.

A person record is information set for each target person to be imaged. It associates a person ID, a face image (face information), a list of clothing images (related information), a list of surrounding-person IDs, a list of related-person IDs (related information), a last-detection camera ID, a last detection time, and a last detected movement direction. The person ID is identification information that uniquely identifies a person. The face image is the face image of that person (more precisely, feature data of the face image). The list of clothing images is a list of images of the person's clothing (for example, a jacket) acquired from multiple angles, such as the ventral side and the dorsal side. The list of surrounding-person IDs is a list of identification information uniquely identifying the persons around that person. The list of related-person IDs is a list of identification information uniquely identifying those persons, among the persons indicated by the surrounding-person IDs, who are presumed to be related to the target person (described in detail later). The identification information of each person in these two lists is associated with feature data of that person's face image. The last-detection camera ID is identification information uniquely identifying the camera 10 that most recently detected at least one of the face information and related information of the target person. The last detection time is the time at which the camera 10 indicated by the last-detection camera ID detected the target person's face information or the like. The last detected movement direction is the direction in which the target person was moving when that camera 10 made the detection. Among the items of the person record, the face image is the "face information of the target person", and the list of clothing images and the list of related-person IDs are "related information related to the target person". More specifically, the list of clothing images is an example of information carried by the target person, and the list of related-person IDs is an example of surrounding persons around the target person. The person record may also store, as related information (specifically, information carried by the target person), images of a bag, food, sunglasses, a mask, a hat, a ring, a hairstyle, and the like.
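The person record described above can be pictured as a small data structure. The following Python sketch is only an illustration under assumptions: the class name, field names, and the `record_detection` helper are hypothetical and do not appear in the patent, and face feature data is represented as a plain list of floats.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class PersonRecord:
    """One record per target person (all names here are illustrative)."""
    person_id: str
    face_features: List[float]  # face information, registered in advance
    clothing_images: List[str] = field(default_factory=list)         # related information
    surrounding_person_ids: List[str] = field(default_factory=list)
    related_person_ids: List[str] = field(default_factory=list)      # related information
    last_camera_id: Optional[str] = None
    last_detected_at: Optional[datetime] = None
    last_direction_deg: Optional[float] = None


def record_detection(rec: PersonRecord, camera_id: str, when: datetime,
                     direction_deg: float,
                     clothing_image: Optional[str] = None) -> None:
    """Update the 'last detection' items and optionally add a clothing image."""
    rec.last_camera_id = camera_id
    rec.last_detected_at = when
    rec.last_direction_deg = direction_deg
    if clothing_image is not None and clothing_image not in rec.clothing_images:
        rec.clothing_images.append(clothing_image)
```

In this sketch only the person ID and face features are required at construction time, which mirrors the statement that those two items exist from the initial state while the remaining items are filled in as cameras report detections.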

The information in a person record is rewritten (items are added or removed) as time passes. Of the items in the person record, the person ID and face image are, for example, set (stored) in advance on the server 50 and shared with each camera 10; they are stored from the initial state, without waiting for a camera 10 to detect the target person. The list of clothing images, the list of surrounding-person IDs, and the list of related-person IDs are stored in association with the person ID and face image based on information from at least one camera 10 that has successfully detected both the face information and related information of the target person. However, clothing images acquired from the dorsal angle may be stored based on information from a camera 10 that successfully detected only the related information (clothing) of the target person. The last-detection camera ID, last detection time, and last detected movement direction are stored in association with the person ID and face image based on information from a camera 10 that successfully detected at least one of the face information and related information of the target person.

A camera record is information set for each of the cameras 10, and associates a camera ID, a camera installation position, and a camera installation direction. The camera ID is identification information that uniquely identifies a camera 10. The camera installation position is information indicating the position (place) where the camera 10 is installed. The camera installation direction is information indicating the imaging direction of the camera 10. A camera relation record is information indicating, for each camera 10, its relationship with one other camera 10. For two cameras (for example, a first camera and a second camera), a camera relation record associates the distance between the imaging areas of the first and second cameras, the movement direction when moving from the first camera to the second camera, and the movement direction when moving from the second camera to the first camera. Camera records and camera relation records do not change unless the installation location or the like of a camera 10 is changed.
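As a rough illustration of the camera record and camera relation record, the sketch below models both as immutable Python dataclasses and adds a lookup for the direction of travel between two cameras. The names, the use of degrees for directions, and the `approach_direction` helper are assumptions made for the example and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CameraRecord:
    camera_id: str
    position: Tuple[float, float]  # installation position (assumed x, y in meters)
    direction_deg: float           # imaging direction (assumed compass degrees)


@dataclass(frozen=True)
class CameraRelationRecord:
    first_id: str
    second_id: str
    distance_m: float                # distance between the two imaging areas
    dir_first_to_second_deg: float   # movement direction, first -> second
    dir_second_to_first_deg: float   # movement direction, second -> first


def approach_direction(rel: CameraRelationRecord, from_id: str, to_id: str) -> float:
    """Return the movement direction for a person walking from one camera to the other."""
    if (from_id, to_id) == (rel.first_id, rel.second_id):
        return rel.dir_first_to_second_deg
    if (from_id, to_id) == (rel.second_id, rel.first_id):
        return rel.dir_second_to_first_deg
    raise ValueError("relation record does not cover this camera pair")
```

Marking the dataclasses as frozen reflects the statement that these records stay fixed unless a camera is relocated.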

The acquisition unit 12 acquires the captured images captured by the image sensor. The acquisition unit 12 reads the captured image from the image sensor frame by frame and outputs each frame to the detection unit 13.

The detection unit 13 detects, from the captured image, at least one of the face information of the target person and related information, which is information other than the target person's face that is related to the target person, and extracts a scene in which the target person appears based on the detection result. The detection unit 13 refers to the person records in the storage unit 11 and determines whether the face image (face information of the target person) associated with each person ID can be detected in the captured image. Face images are detected using, for example, a conventionally known face recognition technique. When the detection unit 13 successfully detects the face image (face information) of a target person, it extracts the scene in which that target person appears.

When the detection unit 13 has successfully detected the face image (face information) of the target person, it also detects related information. Specifically, the detection unit 13 detects information carried by the detected target person as related information. The present embodiment describes an example in which an image of clothing (a jacket) is detected as the carried information, but the detection unit 13 may also detect images of a bag, food, sunglasses, a mask, a hat, a ring, and the like as carried information.

The detection unit 13 may also detect a surrounding person near the detected target person as a related person (related information). For example, the detection unit 13 detects as a surrounding person anyone whose distance from the target person is within the range expected when people move as a group (two or more persons). A surrounding person may be treated as a related person (related information) when also detected as a surrounding person of the target person by another camera 10. Alternatively, the detection unit 13 may treat a surrounding person as a related person when it detects that the surrounding person and the target person are walking side by side at the same speed while keeping a constant distance, when the two are facing each other and talking (which can be judged from mouth movements), or when the two are holding hands. With these methods, a related person can be detected from the information of a single camera 10 using object detection AI or the like, without using information from multiple cameras 10. Whether a surrounding person is a related person is determined, for example, by the detection unit 13 referring to the person record in the storage unit 11. In this case, the detection unit 13 determines that the surrounding person is a related person (related information) when a face image associated with the surrounding-person ID list of the person record (the face image of a surrounding person detected by another camera 10) matches the face image of the surrounding person it has detected. This determination may also be made by the server 50, which stores the same information as the cameras 10.
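The multi-camera rule for promoting a surrounding person to a related person can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes that face matching has already resolved each surrounding person to an ID, and the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Set


def observe_surrounding(seen_at: Dict[str, Set[str]], person_id: str,
                        camera_id: str) -> bool:
    """Record that person_id was seen near the target person at camera_id.

    Returns True when the same person was already recorded as a surrounding
    person at a different camera, i.e. when they qualify as a related person.
    """
    cameras = seen_at.setdefault(person_id, set())
    is_related = any(c != camera_id for c in cameras)
    cameras.add(camera_id)
    return is_related
```

Repeated sightings at the same camera do not promote the person in this sketch, which matches the stated aim of excluding someone who was only near the target person temporarily.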

The detection unit 13 may also detect various kinds of information other than those described above as related information. For example, it may use as related information an image of the person captured from an angle other than the front (a side view or a back view), or information about the person's own body rather than carried items, such as hairstyle, hair color, or ear shape. For example, installing a camera 10 at a corner makes the imaging angle of a person differ before and after the turn, so that such non-frontal images can be acquired. Likewise, installing a camera 10 where it can capture a ticket vending machine yields images before the purchase (the back view of a person facing the machine) and after the purchase (the front view of the person with their back to the machine). That is, when the person is successfully detected by face in a frame after the purchase, related information on the back view can be acquired retroactively from the frames before the purchase.

Even when, for example, the face image (face information) of the target person cannot be detected, the detection unit 13 detects related information related to the target person from the captured image based on the information stored in association in the storage unit 11, and extracts a scene in which the related information appears as a scene in which the target person to whom that related information pertains appears. For example, the detection unit 13 refers to the person record in the storage unit 11 and determines whether a clothing image in the list of clothing images can be detected in the captured image. The detection unit 13 extracts a scene in which such a clothing image appears as a scene in which the target person appears. Alternatively, the detection unit 13 refers to the person record in the storage unit 11 and determines whether a face image associated with the list of related-person IDs (the face image of a related person) can be detected in the captured image. The detection unit 13 extracts a scene in which such a face image appears as a scene in which the target person appears.
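The fallback from face information to related information can be sketched as a per-frame check followed by grouping of consecutive matching frames into scenes. The sketch assumes that upstream recognizers have already reduced each frame to sets of matched face IDs and clothing IDs; those inputs, the function names, and the definition of a scene as a run of consecutive frames are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def frame_matches(face_ids: Set[str], clothing_ids: Set[str], target_id: str,
                  known_clothing: Set[str], related_ids: Set[str]) -> bool:
    """True if the frame shows the target's face, known clothing, or a related person."""
    if target_id in face_ids:
        return True
    if known_clothing & clothing_ids:
        return True
    return bool(related_ids & face_ids)


def group_scenes(hits: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive matching frames into (first_frame, last_frame) scenes."""
    scenes: List[Tuple[int, int]] = []
    start = None
    for i, hit in enumerate(hits):
        if hit and start is None:
            start = i
        elif not hit and start is not None:
            scenes.append((start, i - 1))
            start = None
    if start is not None:
        scenes.append((start, len(hits) - 1))
    return scenes
```

A frame that shows only the target person's back can still match here through a stored clothing image, which is the situation the related information is meant to cover.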

検出部１３は、検出した関連情報について、種別に応じた変化のしやすさを特定し、該変化のしやすさを考慮して、該関連情報が写っているシーンを対象人物が写っているシーンとして抽出するか否かを決定してもよい。変化のしやすさとは、例えば変化にかかる推定時間である。関連情報のうち、例えば服装（上着）や持ち運んでいる食べ物等については、比較的変化しやすい（短期間で変化する）と考えられる。一方で、関連情報のうち、例えば関連人物や指輪等については、比較的変化しにくい（短期間で変化しない）と考えられる。このような関連情報の変化のしやすさを考慮することにより、例えば変化しやすい関連情報については、その情報だけでは該関連情報が写っているシーンを対象人物が写っているシーンとして抽出せずに、他の関連情報についても写っていることを条件として対象人物が写っているシーンとする等の判断が可能となる。 The detection unit 13 may identify, for the detected related information, its susceptibility to change according to its type, and may decide, in consideration of that susceptibility, whether to extract a scene showing the related information as a scene showing the target person. The susceptibility to change is, for example, the estimated time until the information changes. Among related information, clothing (outerwear) and carried food, for example, are considered relatively likely to change (they change within a short period). On the other hand, related persons and rings, for example, are considered relatively unlikely to change (they do not change within a short period). Taking this susceptibility into account makes decisions such as the following possible: for related information that changes easily, a scene showing it is not extracted as a scene showing the target person on the basis of that information alone, but is treated as a scene showing the target person only on condition that other related information also appears.
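The decision rule described above can be sketched as follows. This is only an illustrative sketch: the cue types, the estimated change times, the 120-minute threshold, and the names `Cue`, `is_volatile`, and `should_extract` are assumptions introduced here and do not appear in the specification.

```python
from dataclasses import dataclass

# Hypothetical estimated time until each cue type changes, in minutes.
# The values are illustrative assumptions, not taken from the specification.
ESTIMATED_CHANGE_MINUTES = {
    "food": 10,
    "outerwear": 60,
    "ring": 24 * 60,
    "related_person": 24 * 60,
}

# Cues expected to change sooner than this are treated as volatile (assumed threshold).
VOLATILE_BELOW_MINUTES = 120


@dataclass(frozen=True)
class Cue:
    kind: str       # type of related information, e.g. "outerwear" or "ring"
    person_id: str  # target person the cue is associated with


def is_volatile(cue: Cue) -> bool:
    # Unknown cue types default to 0 minutes, i.e. they are treated as volatile.
    return ESTIMATED_CHANGE_MINUTES.get(cue.kind, 0) < VOLATILE_BELOW_MINUTES


def should_extract(cues: list[Cue], person_id: str) -> bool:
    """Decide whether a scene counts as showing the given target person."""
    matched = [c for c in cues if c.person_id == person_id]
    if not matched:
        return False
    # A cue that rarely changes is sufficient on its own.
    if any(not is_volatile(c) for c in matched):
        return True
    # Volatile cues need corroboration by a cue of a different kind.
    return len({c.kind for c in matched}) >= 2
```

With these assumed values, a scene showing only the target person's outerwear is not extracted, while a scene showing the outerwear together with carried food, or a ring alone, is.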

検出部１３は、記憶部１１に記憶されている情報を更に利用して、撮像画像から対象人物の顔情報及び関連情報の少なくともいずれか一方を検出してもよい。例えば、検出部１３は、記憶部１１のカメラレコードを参照して得られる複数のカメラ１０の位置情報、人物レコードを参照して得られる最終検出カメラＩＤ（検出したカメラ１０を示す情報）、及び、人物レコードを参照して得られる最終検出時刻（カメラ１０が検出した時刻）を考慮して、対象人物の顔情報及び関連情報の少なくともいずれか一方を検出してもよい。上記の情報を考慮することによって、検出部１３は、対象人物が撮像範囲に入りうるフレームや、対象人物が撮像範囲に流入してくる方向が推定できる（すなわち、検出範囲の絞り込みができる）ため、対象人物の検出精度及び検出速度を向上させることができる。また、検出部１３は、例えば、遠く（例えば数百ｍ等）離れたカメラ１０において所定時間内（例えば数分等）に顔情報又は関連情報が検出された対象人物については、検出され得ない対象人物として検出対象から除外し、その他の対象人物の検出のみを試みる等の処理が可能になる。また、検出部１３は、例えば記憶部１１のカメラ関係レコードを更に参照して、対象人物を検出したカメラ１０の撮像対象エリアから自らのカメラ１０の撮像対象エリアまでの距離（例えば５０ｍ）を特定すると共に、対象人物を検出したカメラ１０の撮像エリアから自らのカメラ１０の撮像エリアに向かって人物が歩いてくる場合の移動方向を特定し、これらの特定した情報と、人物の通常想定され得る移動速度とを考慮して、自らのカメラ１０において対象人物が撮像され得る撮像フレームを絞り込んで、対象人物の検出を行ってもよい。 The detection unit 13 may further use information stored in the storage unit 11 to detect at least one of the target person's face information and related information from the captured image. For example, the detection unit 13 may perform this detection in consideration of the position information of the plurality of cameras 10 obtained by referring to the camera records in the storage unit 11, the last detection camera ID (information indicating the camera 10 that made the detection) obtained by referring to the person record, and the last detection time (the time at which that camera 10 made the detection) obtained by referring to the person record. By considering this information, the detection unit 13 can estimate the frames in which the target person may enter the imaging range and the direction from which the target person will enter it (that is, the detection range can be narrowed down), so that the accuracy and speed of detecting the target person can be improved. In addition, for example, a target person whose face information or related information was detected within a predetermined time (for example, the last few minutes) by a camera 10 located far away (for example, several hundred meters) can be excluded from the detection targets as a person who cannot be detected, and detection can be attempted only for the other target persons. Further, the detection unit 13 may, for example, additionally refer to the camera relation record in the storage unit 11 to identify the distance (for example, 50 m) from the imaging target area of the camera 10 that detected the target person to the imaging target area of its own camera 10, and to identify the direction of movement of a person walking from the imaging area of that camera 10 toward the imaging area of its own camera 10; considering this information together with the normally expected moving speed of a person, the detection unit 13 may narrow down the frames in which the target person can be imaged by its own camera 10 and then detect the target person.
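The narrowing described above can be sketched as follows. This is a minimal sketch under stated assumptions: the walking-speed range, the shared clock between cameras, and the function names `arrival_frame_window` and `is_reachable` are hypothetical and are not given in the specification.

```python
import math

# Assumed plausible (slow, fast) walking speeds in metres per second.
WALKING_SPEED_MPS = (0.5, 2.0)


def arrival_frame_window(distance_m, last_seen_s, fps, speeds=WALKING_SPEED_MPS):
    """Return (first_frame, last_frame) in which the person could appear at this camera.

    Frame indices are counted from time 0 on a clock shared by the cameras
    (an assumption made for this sketch).
    """
    slow, fast = speeds
    earliest_s = last_seen_s + distance_m / fast  # fastest walker arrives first
    latest_s = last_seen_s + distance_m / slow    # slowest walker arrives last
    return math.floor(earliest_s * fps), math.ceil(latest_s * fps)


def is_reachable(distance_m, elapsed_s, max_speed_mps=WALKING_SPEED_MPS[1]):
    """Return False if the person could not have covered the distance in the elapsed time."""
    return distance_m <= max_speed_mps * elapsed_s
```

For the 50 m example in the text, a person last detected at t = 100 s would, under these assumed speeds and a 10 fps stream, be searched for only in frames 1250 to 2000; a person detected 300 m away two minutes ago would be excluded from the detection targets.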

検出部１３は、抽出したシーンをデータ生成部１４に出力する。データ生成部１４は、検出部１３によって抽出された各フレームのシーンについてエンコード処理を行って動画データを生成し、抽出されたシーンに係る動画データをサーバ５０に送信する。サーバ５０は、動画データを受信してデコード処理を行う。更に、サーバ５０は、例えば動画の時間幅を所定長以下に収めるための動画ハイライトシーン抽出をフレーム毎に繰り返し行い、対象人物を含むハイライト動画を生成する。 The detection unit 13 outputs the extracted scenes to the data generation unit 14. The data generation unit 14 encodes the scene of each frame extracted by the detection unit 13 to generate moving image data, and transmits the moving image data for the extracted scenes to the server 50. The server 50 receives the moving image data and decodes it. Further, the server 50 repeats, frame by frame, highlight scene extraction that keeps, for example, the duration of the video within a predetermined length, and generates a highlight video including the target person.

また、検出部１３は、検出結果に基づき、検出した対象人物の顔画像の特徴データ、関連情報（服装画像、関連人物の顔画像の特徴データ等）、及び周囲人物の顔画像の特徴データを生成し、出力部１５に出力する。出力部１５は、検出部１３による検出結果について、複数のカメラ１０間で共有されるように出力する。出力部１５は、例えば、検出部１３による検出結果をサーバ５０に送信する。サーバ５０は、カメラ１０における検出結果を記憶すると共に、各カメラ１０に検出結果を送信する。これにより、複数のカメラ１０及びサーバ５０間で、各カメラ１０における検出結果が共有される。なお、上述したように、サーバ５０において、周囲人物が関連人物であるか否かの判定が行われてもよい。 Further, based on the detection result, the detection unit 13 generates feature data of the face image of the detected target person, related information (clothing images, feature data of the face images of related persons, and the like), and feature data of the face images of surrounding persons, and outputs them to the output unit 15. The output unit 15 outputs the detection results of the detection unit 13 so that they are shared among the plurality of cameras 10. For example, the output unit 15 transmits the detection results of the detection unit 13 to the server 50. The server 50 stores the detection results of the cameras 10 and transmits them to each camera 10. In this way, the detection results of each camera 10 are shared among the plurality of cameras 10 and the server 50. As described above, the server 50 may determine whether a surrounding person is a related person.

次に、図３を参照して、撮像システム１に含まれるカメラ１０が行う処理を説明する。図３は、本実施形態に係る撮像システム１に含まれるカメラ１０が行う処理を示すフローチャートである。なお、ステップＳ４及びＳ５の処理は、ステップＳ６及びステップＳ７の処理と同時に行われてもよいし、ステップＳ６及びステップＳ７の処理よりも後に行われてもよい。 Next, processing performed by the camera 10 included in the imaging system 1 will be described with reference to FIG. FIG. 3 is a flowchart showing processing performed by the camera 10 included in the imaging system 1 according to this embodiment. The processing of steps S4 and S5 may be performed simultaneously with the processing of steps S6 and S7, or may be performed after the processing of steps S6 and S7.

図３に示されるように、カメラ１０は、まず、撮像素子からフレーム毎に撮像画像を読み込む（ステップＳ１）。つづいて、カメラ１０は、撮像画像から、人物検出を行う（ステップＳ２）。具体的には、カメラ１０は、人物レコードを参照し、各人物ＩＤに対応付けられた顔画像（対象人物の顔情報）を撮像画像から検出可能か否かを判定する。また、カメラ１０は、対象人物の関連情報の検出を行う。具体的には、カメラ１０は、人物レコードを参照し、服装画像のリストに示された服装画像を撮像画像から検出可能か否かを判定する。或いは、カメラ１０は、人物レコードを参照し、関連人物ＩＤのリストに対応付けられた顔画像を撮像画像から検出可能か否かを判定する。 As shown in FIG. 3, the camera 10 first reads a captured image frame by frame from the image sensor (step S1). Subsequently, the camera 10 performs person detection from the captured image (step S2). Specifically, the camera 10 refers to the person record and determines whether or not the face image (face information of the target person) associated with each person ID can be detected from the captured image. In addition, the camera 10 detects related information of the target person. Specifically, the camera 10 refers to the person record and determines whether or not the clothing image shown in the clothing image list can be detected from the captured image. Alternatively, the camera 10 refers to the person record and determines whether or not the face image associated with the list of related person IDs can be detected from the captured image.

カメラ１０は、撮像画像から、対象人物の顔情報、及び、対象人物に関連する関連情報の少なくともいずれか一方を検出すると、検出結果に基づき対象人物が写っているシーンを抽出する（ステップＳ３）。そして、カメラ１０は、抽出された各フレームのシーンについてエンコード処理を行い動画データを生成し（ステップＳ４）、該動画データをサーバ５０に送信する（ステップＳ５）。また、カメラ１０は、検出結果に基づき、検出した対象人物の顔画像の特徴データ、関連人物の顔画像の特徴データ等を生成し（ステップＳ６）、該特徴データをサーバ５０に送信する（ステップＳ７）。なお、サーバ５０に送信された特徴データは、各カメラ１０に共有される。 When the camera 10 detects, from the captured image, at least one of the target person's face information and related information related to the target person, it extracts a scene showing the target person based on the detection result (step S3). The camera 10 then encodes the extracted scene of each frame to generate moving image data (step S4) and transmits the moving image data to the server 50 (step S5). The camera 10 also generates, based on the detection result, feature data of the face image of the detected target person, feature data of the face images of related persons, and the like (step S6), and transmits the feature data to the server 50 (step S7). The feature data transmitted to the server 50 is shared with each camera 10.
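The flow of steps S1 to S7 can be sketched as follows. This is an illustrative sketch only: `run_camera` and its `detect`, `encode`, and `send` callables are hypothetical stand-ins for the detection, encoding, and transmission functions, whose concrete form the specification does not give. For brevity the sketch collects the extracted frames and encodes them once at the end, and sends the feature data before the video, which corresponds to one of the step orderings the text allows.

```python
from typing import Callable, Iterable, Optional


def run_camera(frames: Iterable[bytes],
               detect: Callable[[bytes], Optional[dict]],
               encode: Callable[[list], bytes],
               send: Callable[[str, object], None]) -> int:
    """Process frames one at a time and return the number of extracted frames."""
    extracted = []
    for frame in frames:                  # S1: read one captured image per frame
        result = detect(frame)            # S2: face or related-information detection
        if result is None:
            continue                      # nothing detected, so the frame is dropped
        extracted.append(frame)           # S3: frame belongs to a target-person scene
        send("features", result)          # S6, S7: share the detection result
    if extracted:
        send("video", encode(extracted))  # S4, S5: encode and send only extracted scenes
    return len(extracted)
```

Only frames in which something was detected reach the encoder, so the data sent to the server is limited to scenes showing a target person.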

次に、本実施形態に係る撮像システム１の作用効果について説明する。 Next, the effects of the imaging system 1 according to this embodiment will be described.

例えばテーマパーク等において敷地内に定点カメラを複数設置し、各定点カメラにおいて撮像されたシーンをつなぎ合わせて、動画を生成するサービスが知られている。このようなサービスにおいては、例えば入場者毎にその人物が写っているシーンのみをつなぎ合わせた動画を生成することが考えられる。このようなサービスを実現する撮像システムについて、例えば図４に示されるような動作イメージが考えられる。図４に示される比較例に係る撮像システムでは、各カメラにおいて、撮像素子からフレーム毎の撮像画像が読み込まれ、動画データが生成されて、全ての動画データ（全シーン）がネットワークを介してサーバに送信されている。そして、サーバにおいて、動画データ（全シーン）が受信され、動画データがデコードされて、フレーム毎に対象人物の人物検出が行われて、対象人物を含むハイライト動画が生成されている。このような撮像システムでは、カメラからサーバに対して撮像した全ての動画データが送信されているため、データ量が大きくなりコストが高くなることが問題となる。 For example, a service is known in which a plurality of fixed-point cameras are installed on the grounds of a theme park or the like, and scenes captured by the fixed-point cameras are joined together to generate a video. In such a service, it is conceivable, for example, to generate for each visitor a video that joins only the scenes in which that visitor appears. For an imaging system that realizes such a service, an operation such as that shown in FIG. 4 is conceivable. In the imaging system according to the comparative example shown in FIG. 4, each camera reads a captured image for each frame from the imaging element and generates moving image data, and all of the moving image data (all scenes) is transmitted to the server via the network. The server then receives the moving image data (all scenes), decodes it, detects the target person frame by frame, and generates a highlight video including the target person. In such an imaging system, all captured moving image data is transmitted from the cameras to the server, so the amount of data becomes large and the cost becomes high.

この点、本実施形態に係る撮像システム１では、カメラ１０において対象人物が写っているシーンが抽出されるため、カメラ１０からサーバ５０に送信するデータ量を必要最小限に抑えることができ、上述した比較例に係る撮像システムの課題（データ量が大きくなりコストが高くなる）を解決することができる。なお、例えば図４に示される構成以外の比較例として、カメラで人物検出を行うものの、人物検出をバッチ処理（例えば過去１時間分の映像を全部集めてきて処理する等）で行う構成が考えられる。このような構成の場合、一定時間分の映像を一時的に保存するための記憶装置が各カメラに必要になるか、或いは、別の装置で記憶する場合にはカメラ及び装置間で膨大な通信が生じるため、コスト及びスケーラビリティの点が問題となる。この点、本実施形態に係る撮像システム１では、バッチ処理ではなく同時処理（カメラ１０が撮像すると同時に検出処理及び情報共有処理）を行うため、不要なデータについてはその都度削除することとなり、各カメラ１０に上述したような記憶装置が不要になる。また、同時処理であることによって、例えば施設内の迷子を捜したい場合等、即時に人物抽出結果を得たい場合に、タイムラグなく、人物検出を行うことができる。 In this regard, in the imaging system 1 according to the present embodiment, the scenes showing the target person are extracted in the camera 10, so the amount of data transmitted from the camera 10 to the server 50 can be kept to the necessary minimum, and the problem of the imaging system according to the comparative example described above (a large amount of data and high cost) can be solved. As a comparative example other than the configuration shown in FIG. 4, a configuration is conceivable in which person detection is performed in the camera but as batch processing (for example, collecting and processing all video from the past hour). In such a configuration, either each camera needs a storage device for temporarily holding video for a certain period, or, if the video is stored in another device, an enormous amount of communication arises between the camera and that device; either way, cost and scalability become problems. In this regard, the imaging system 1 according to the present embodiment performs simultaneous processing rather than batch processing (detection processing and information sharing processing are performed at the same time as the camera 10 captures images), so unnecessary data is deleted each time and each camera 10 does not need a storage device as described above. In addition, because the processing is simultaneous, person detection can be performed without a time lag when a person extraction result is wanted immediately, for example when searching for a lost child in a facility.

より詳細には、本実施形態に係る撮像システム１は、複数のカメラ１０を含んで構成される撮像システムであって、複数のカメラ１０それぞれは、撮像画像を取得する取得部１２と、撮像画像から、対象人物の顔情報、及び、該対象人物の顔以外の情報であって該対象人物に関連する関連情報の少なくともいずれか一方を検出し、検出結果に基づき対象人物が写っているシーンを抽出する検出部１３と、検出部１３による検出結果について、複数のカメラ１０間で共有されるように出力する出力部１５と、複数のカメラ１０間で共有される、対象人物の顔情報及び関連情報を対応付けて記憶する記憶部１１と、を備え、検出部１３は、記憶部１１において対応付けて記憶された情報に基づき、撮像画像から対象人物に関連する関連情報を検出し、該関連情報が写っているシーンを、該関連情報に係る対象人物が写っているシーンとして抽出する。 More specifically, the imaging system 1 according to the present embodiment is an imaging system including a plurality of cameras 10. Each of the plurality of cameras 10 includes: an acquisition unit 12 that acquires a captured image; a detection unit 13 that detects, from the captured image, at least one of face information of a target person and related information that is information other than the face of the target person and is related to the target person, and extracts a scene showing the target person based on the detection result; an output unit 15 that outputs the detection result of the detection unit 13 so that it is shared among the plurality of cameras 10; and a storage unit 11 that stores, in association with each other, the face information and the related information of the target person shared among the plurality of cameras 10. The detection unit 13 detects related information related to the target person from the captured image based on the information stored in association in the storage unit 11, and extracts a scene showing the related information as a scene showing the target person to whom the related information pertains.

このような撮像システム１では、対象人物の顔情報及び関連情報の検出結果が各カメラ１０間で共有され、記憶部１１において、対象人物の顔情報及び関連情報が対応付けて記憶されている。このため、撮像システム１においては、撮像画像から関連情報を検出することができれば、該関連情報に係る対象人物が写っているシーンを適切に抽出することができる。すなわち、対象人物の顔情報及び関連情報が対応付けられていることによって、例えば対象人物の顔情報を検出することができない場合であっても、関連情報さえ検出できれば、該関連情報が写っているシーンを対象人物が写っているシーンとして適切に抽出することができる。このように、顔情報を検出することができない撮像画像からも対象人物のシーンを抽出することによって、対象人物の抽出をより高精度に行うことができる。なお、関連情報は、顔情報と比べて形状やパターンが変化しにくい場合が多く、認識しやすいというメリットもある。これにより、対象人物の抽出をより高精度に行うことができる。 In such an imaging system 1, the detection results for the target person's face information and related information are shared among the cameras 10, and the storage unit 11 stores the target person's face information and related information in association with each other. Therefore, in the imaging system 1, if related information can be detected from a captured image, the scene showing the target person to whom the related information pertains can be appropriately extracted. That is, because the target person's face information and related information are associated with each other, even when the target person's face information cannot be detected, as long as the related information can be detected, the scene showing the related information can be appropriately extracted as a scene showing the target person. By extracting scenes of the target person even from captured images in which face information cannot be detected, the target person can be extracted with higher accuracy. Related information also has the advantage that its shape and pattern often change less than those of face information, which makes it easier to recognize. This also allows the target person to be extracted with higher accuracy.

撮像システム１では、検出部１３は、対象人物の顔情報の検出に成功した場合において、対象人物が携行している情報を関連情報として検出する。対象人物が携行している情報は、対象人物との関連度が高く、対象人物と共に撮像される可能性が高いと考えられる。このような情報が、対象人物の顔情報の検出に成功したカメラ１０によって関連情報として検出されることにより、該関連情報に基づいてより高精度に対象人物の抽出を行うことができる。 In the imaging system 1, the detection unit 13 detects information carried by the target person as related information when the detection of the target person's face information is successful. The information carried by the target person has a high degree of relevance to the target person, and is likely to be imaged together with the target person. Such information is detected as related information by the camera 10 that has successfully detected the face information of the target person, so that the target person can be extracted with higher accuracy based on the related information.

撮像システム１では、検出部１３は、対象人物の顔情報の検出に成功した場合において、対象人物の周囲にいる周囲人物を関連人物（関連情報）として検出する。対象人物の周囲人物は、対象人物との関連度が高く、対象人物と共に撮像される可能性が高いと考えられる。このような情報が、対象人物の顔情報の検出に成功したカメラ１０によって関連情報として検出されることにより、該関連情報に基づいてより高精度に対象人物の抽出を行うことができる。すなわち、対象人物が例えば関連人物等により遮蔽されており検出が困難な状況でも、対象人物が存在すると想定されるシーンの抽出を適切に行うことができる。なお、関連人物を含めた複数の人物からなるグループを考えた場合には、例えばグループに含まれる各人物に正解情報を用意することなく、代表者１人の正解情報のみがあればグループ全体のシーンを抽出することができる。 In the imaging system 1, when the detection unit 13 has successfully detected the target person's face information, it detects surrounding persons near the target person as related persons (related information). Persons around the target person are considered to be highly related to the target person and likely to be imaged together with the target person. When such information is detected as related information by the camera 10 that has successfully detected the target person's face information, the target person can be extracted with higher accuracy based on that related information. That is, even when the target person is hidden by, for example, a related person and is difficult to detect, scenes in which the target person is assumed to be present can be appropriately extracted. When a group of a plurality of persons including related persons is considered, scenes of the entire group can be extracted from the correct-answer information of a single representative, without preparing correct-answer information for each person in the group.

撮像システム１において、周囲人物は、他のカメラ１０においても対象人物の周囲人物として検出されていた場合に、関連情報とされる。これにより、互い異なるロケーションのカメラ１０のいずれにおいても周囲人物として検出されていた場合にのみ、該周囲人物が関連情報とされるため、例えば偶然、対象人物の周囲にいたような人物が関連人物（関連情報）とされることを抑制し、より高精度に対象人物の抽出を行うことができる。 In the imaging system 1, a surrounding person is treated as related information when that person has also been detected as a surrounding person of the target person by another camera 10. Thus, a surrounding person is treated as related information only when detected as a surrounding person by each of the cameras 10 at mutually different locations, which keeps a person who merely happened to be near the target person from being treated as a related person (related information) and allows the target person to be extracted with higher accuracy.
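The rule above can be sketched as follows. This is a minimal sketch under stated assumptions: the observation tuple layout, the `min_cameras` parameter, and the function name `related_persons` are hypothetical; the specification only states that a surrounding person must also have been detected as such by another camera.

```python
from collections import defaultdict


def related_persons(observations, target_id, min_cameras=2):
    """Return the surrounding persons seen near the target at enough distinct cameras.

    observations: iterable of (camera_id, target_id, surrounding_person_id) tuples
    (a layout assumed for this sketch).
    """
    cameras_seen = defaultdict(set)
    for camera_id, tid, surrounding_id in observations:
        if tid == target_id:
            cameras_seen[surrounding_id].add(camera_id)
    # Repeated sightings at the same camera count once, so a passer-by who
    # lingers in front of a single camera is not promoted to a related person.
    return {pid for pid, cams in cameras_seen.items() if len(cams) >= min_cameras}
```

Counting distinct cameras rather than sightings reflects the text's aim of excluding people who were near the target person only by chance.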

撮像システム１において、記憶部１１は、複数のカメラ１０それぞれの位置情報と、対象人物の顔情報及び関連情報の少なくともいずれか一方を検出したカメラ１０を示す情報と、該カメラ１０が検出した時刻とを更に記憶しており、検出部１３は、記憶部１１に記憶されている、複数のカメラ１０それぞれの位置情報、検出したカメラ１０を示す情報、及び、該カメラ１０が検出した時刻を考慮して、撮像画像から対象人物の顔情報及び関連情報の少なくともいずれか一方を検出する。対象人物を検出するに際し、上記の内容が考慮されることにより、対象人物が撮像範囲に入りうるフレームや、対象人物が撮像範囲に流入してくる方向が推定できる（検出範囲の絞り込みができる）ため、対象人物の検出精度及び検出速度を向上させることができる。 In the imaging system 1, the storage unit 11 further stores the position information of each of the plurality of cameras 10, information indicating the camera 10 that detected at least one of the target person's face information and related information, and the time at which that camera 10 made the detection. The detection unit 13 detects at least one of the target person's face information and related information from the captured image in consideration of the position information of each of the plurality of cameras 10, the information indicating the camera 10 that made the detection, and the time of that detection, all of which are stored in the storage unit 11. Considering this information when detecting the target person makes it possible to estimate the frames in which the target person may enter the imaging range and the direction from which the target person will enter it (the detection range can be narrowed down), so that the accuracy and speed of detecting the target person can be improved.

撮像システム１において、検出部１３は、検出した関連情報について、種別に応じた変化のしやすさを特定し、該変化のしやすさを考慮して、該関連情報が写っているシーンを抽出するか否かを決定する。関連情報については、比較的短期間で情報が変化しやすいもの（例えば服装や持ち運んでいる食べ物等）と、変化しにくいもの（例えば指輪等）とがある。このような変化のしやすさを考慮して、例えば変化しやすい関連情報についてはその情報だけでは関連情報が写っているシーンを抽出せず他の関連情報を検出した場合にのみ関連情報が写っているシーンを抽出する等を行うことによって、より高精度に対象人物の抽出を行うことができる。 In the imaging system 1, the detection unit 13 identifies, for the detected related information, its susceptibility to change according to its type, and decides, in consideration of that susceptibility, whether to extract the scene showing the related information. Related information includes items that change easily within a relatively short period (for example, clothing and carried food) and items that do not (for example, rings). By taking this susceptibility into account, for example by not extracting a scene on the basis of easily changed related information alone and extracting it only when other related information is also detected, the target person can be extracted with higher accuracy.

撮像システム１において、記憶部１１は、初期状態において対象人物の顔情報を記憶しており、撮像画像から対象人物の顔情報及び関連情報の双方の検出に成功したカメラ１０からの情報に基づき、対象人物の顔情報に関連情報を対応付けて記憶する。このように、実際に検出された情報に基づき関連情報を対象人物に対応付けることにより、関連情報を用いた対象人物の抽出をより高精度に行うことができる。 In the imaging system 1, the storage unit 11 stores the face information of the target person in the initial state, and based on the information from the camera 10 that has successfully detected both the face information of the target person and related information from the captured image, Related information is associated with the face information of the target person and stored. In this way, by associating the relevant information with the target person based on the actually detected information, it is possible to extract the target person using the relevant information with higher accuracy.

最後に、撮像システム１に含まれたカメラ１０のハードウェア構成について、図５を参照して説明する。上述のカメラ１０は、物理的には、プロセッサ１００１、メモリ１００２、ストレージ１００３、通信装置１００４、入力装置１００５、出力装置１００６、バス１００７などを含むコンピュータ装置として構成されてもよい。 Finally, the hardware configuration of the camera 10 included in the imaging system 1 will be described with reference to FIG. The camera 10 described above may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.

なお、以下の説明では、「装置」という文言は、回路、デバイス、ユニットなどに読み替えることができる。カメラ１０のハードウェア構成は、図に示した各装置を１つ又は複数含むように構成されてもよいし、一部の装置を含まずに構成されてもよい。 Note that in the following description, the term "apparatus" can be read as a circuit, device, unit, or the like. The hardware configuration of the camera 10 may be configured to include one or more of each device shown in the figure, or may be configured without some of the devices.

カメラ１０における各機能は、プロセッサ１００１、メモリ１００２などのハードウェア上に所定のソフトウェア（プログラム）を読み込ませることで、プロセッサ１００１が演算を行い、通信装置１００４による通信や、メモリ１００２及びストレージ１００３におけるデータの読み出し及び／又は書き込みを制御することで実現される。 Each function of the camera 10 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, whereby the processor 1001 performs computations and controls communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.

プロセッサ１００１は、例えば、オペレーティングシステムを動作させてコンピュータ全体を制御する。プロセッサ１００１は、周辺装置とのインターフェース、制御装置、演算装置、レジスタなどを含む中央処理装置（ＣＰＵ：Central Processing Unit）で構成されてもよい。例えば、カメラ１０の取得部１２等の制御機能はプロセッサ１００１で実現されてもよい。 The processor 1001, for example, operates an operating system to control the entire computer. The processor 1001 may be configured with a central processing unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like. For example, the control functions of the acquisition unit 12 and the like of the camera 10 may be implemented by the processor 1001 .

また、プロセッサ１００１は、プログラム（プログラムコード）、ソフトウェアモジュールやデータを、ストレージ１００３及び／又は通信装置１００４からメモリ１００２に読み出し、これらに従って各種の処理を実行する。プログラムとしては、上述の実施の形態で説明した動作の少なくとも一部をコンピュータに実行させるプログラムが用いられる。例えば、カメラ１０の取得部１２等の制御機能は、メモリ１００２に格納され、プロセッサ１００１で動作する制御プログラムによって実現されてもよく、他の機能ブロックについても同様に実現されてもよい。上述の各種処理は、１つのプロセッサ１００１で実行される旨を説明してきたが、２以上のプロセッサ１００１により同時又は逐次に実行されてもよい。プロセッサ１００１は、１以上のチップで実装されてもよい。なお、プログラムは、電気通信回線を介してネットワークから送信されても良い。 The processor 1001 also reads programs (program codes), software modules, and data from the storage 1003 and/or the communication device 1004 to the memory 1002, and executes various processes according to them. As the program, a program that causes a computer to execute at least part of the operations described in the above embodiments is used. For example, the control functions of the acquisition unit 12 and the like of the camera 10 may be implemented by a control program stored in the memory 1002 and operated by the processor 1001, and other functional blocks may be similarly implemented. Although it has been described that the above-described various processes are executed by one processor 1001, they may be executed by two or more processors 1001 simultaneously or sequentially. Processor 1001 may be implemented with one or more chips. Note that the program may be transmitted from a network via an electric communication line.

メモリ１００２は、コンピュータ読み取り可能な記録媒体であり、例えば、ＲＯＭ（Read Only Memory）、ＥＰＲＯＭ（Erasable Programmable ＲＯＭ）、ＥＥＰＲＯＭ（Electrically Erasable Programmable ＲＯＭ）、ＲＡＭ（Random Access Memory）などの少なくとも１つで構成されてもよい。メモリ１００２は、レジスタ、キャッシュ、メインメモリ（主記憶装置）などと呼ばれてもよい。メモリ１００２は、本発明の一実施の形態に係る無線通信方法を実施するために実行可能なプログラム（プログラムコード）、ソフトウェアモジュールなどを保存することができる。 The memory 1002 is a computer-readable recording medium and may be composed of at least one of, for example, ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and RAM (Random Access Memory). The memory 1002 may also be called a register, a cache, a main memory (main storage device), or the like. The memory 1002 can store programs (program code), software modules, and the like that are executable to implement a wireless communication method according to an embodiment of the present invention.

ストレージ１００３は、コンピュータ読み取り可能な記録媒体であり、例えば、ＣＤ－ＲＯＭ（Compact Disc ＲＯＭ）などの光ディスク、ハードディスクドライブ、フレキシブルディスク、光磁気ディスク(例えば、コンパクトディスク、デジタル多用途ディスク、Ｂｌｕ－ｒａｙ（登録商標）ディスク)、スマートカード、フラッシュメモリ(例えば、カード、スティック、キードライブ)、フロッピー（登録商標）ディスク、磁気ストリップなどの少なくとも１つで構成されてもよい。ストレージ１００３は、補助記憶装置と呼ばれてもよい。上述の記憶媒体は、例えば、メモリ１００２及び／又はストレージ１００３を含むデータベース、サーバその他の適切な媒体であってもよい。 The storage 1003 is a computer-readable recording medium, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disc, a magneto-optical disc (for example, a compact disc, a digital versatile disc, a Blu-ray disk), smart card, flash memory (eg, card, stick, key drive), floppy disk, magnetic strip, and/or the like. Storage 1003 may also be called an auxiliary storage device. The storage medium described above may be, for example, a database, server, or other suitable medium including memory 1002 and/or storage 1003 .

通信装置１００４は、有線及び／又は無線ネットワークを介してコンピュータ間の通信を行うためのハードウェア（送受信デバイス）であり、例えばネットワークデバイス、ネットワークコントローラ、ネットワークカード、通信モジュールなどともいう。 The communication device 1004 is hardware (transmitting/receiving device) for communicating between computers via a wired and/or wireless network, and is also called a network device, network controller, network card, communication module, or the like.

入力装置１００５は、外部からの入力を受け付ける入力デバイス（例えば、キーボード、マウス、マイクロフォン、スイッチ、ボタン、センサなど）である。出力装置１００６は、外部への出力を実施する出力デバイス（例えば、ディスプレイ、スピーカー、LEDランプなど）である。なお、入力装置１００５及び出力装置１００６は、一体となった構成（例えば、タッチパネル）であってもよい。 The input device 1005 is an input device (for example, keyboard, mouse, microphone, switch, button, sensor, etc.) that receives input from the outside. The output device 1006 is an output device (eg, display, speaker, LED lamp, etc.) that outputs to the outside. Note that the input device 1005 and the output device 1006 may be integrated (for example, a touch panel).

また、プロセッサ１００１やメモリ１００２などの各装置は、情報を通信するためのバス１００７で接続される。バス１００７は、単一のバスで構成されてもよいし、装置間で異なるバスで構成されてもよい。 Devices such as the processor 1001 and the memory 1002 are connected by a bus 1007 for communicating information. The bus 1007 may be composed of a single bus, or may be composed of different buses between devices.

また、カメラ１０は、マイクロプロセッサ、デジタル信号プロセッサ（ＤＳＰ：Digital Signal Processor）、ＡＳＩＣ（Application Specific Integrated Circuit）、ＰＬＤ（Programmable Logic Device）、ＦＰＧＡ（Field Programmable Gate Array）などのハードウェアを含んで構成されてもよく、当該ハードウェアにより、各機能ブロックの一部又は全てが実現されてもよい。例えば、プロセッサ１００１は、これらのハードウェアの少なくとも１つで実装されてもよい。 In addition, the camera 10 includes hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array). A part or all of each functional block may be implemented by the hardware. For example, processor 1001 may be implemented with at least one of these hardware.

以上、本実施形態について詳細に説明したが、当業者にとっては、本実施形態が本明細書中に説明した実施形態に限定されるものではないということは明らかである。本実施形態は、特許請求の範囲の記載により定まる本発明の趣旨及び範囲を逸脱することなく修正及び変更態様として実施することができる。したがって、本明細書の記載は、例示説明を目的とするものであり、本実施形態に対して何ら制限的な意味を有するものではない。例えば、撮像システム１にはサーバ５０が含まれているとして説明したがこれに限定されず、撮像システムはサーバを有さずに複数のカメラ（撮像装置）で構成されていてもよい。この場合においても、複数のカメラ（撮像装置）が互いに通信を行い、検出結果を共有することによって、上述した撮像システム１と同様の効果を奏することができる。また、上述した実施形態では、撮像システム１に含まれる複数のカメラ１０それぞれが、取得部１２、検出部１３、出力部１５、及び記憶部１１を備えているとして説明したがこれに限定されず、例えば、記憶部等の一部の構成はカメラ以外の撮像システムに含まれる構成（サーバ等）が備えていてもよいし、一部のカメラは上記の各構成を備えると共に他の一部のカメラは上記の各構成の一部のみを備えていてもよい。 Although the present embodiment has been described in detail above, it will be apparent to those skilled in the art that the present embodiment is not limited to the embodiments described herein. The present embodiment can be implemented with modifications and changes without departing from the spirit and scope of the present invention defined by the claims. Therefore, the description in this specification is for illustrative purposes and has no restrictive meaning with respect to the present embodiment. For example, although the imaging system 1 has been described as including the server 50, the imaging system is not limited to this and may be composed of a plurality of cameras (imaging devices) without a server. Even in this case, the plurality of cameras (imaging devices) communicate with each other and share detection results, thereby achieving the same effects as the imaging system 1 described above. Further, in the embodiment described above, each of the plurality of cameras 10 included in the imaging system 1 has been described as including the acquisition unit 12, the detection unit 13, the output unit 15, and the storage unit 11, but the configuration is not limited to this. For example, some of these components, such as the storage unit, may be provided in a component of the imaging system other than the cameras (such as a server), or some cameras may include all of the above components while other cameras include only some of them.

本明細書で説明した各態様／実施形態は、ＬＴＥ（Long Term Evolution）、ＬＴＥ－Ａ（LTE-Advanced）、ＳＵＰＥＲ３Ｇ、ＩＭＴ－Ａｄｖａｎｃｅｄ、４Ｇ、５Ｇ、ＦＲＡ（Future Radio Access）、Ｗ－ＣＤＭＡ（登録商標）、ＧＳＭ（登録商標）、ＣＤＭＡ２０００、ＵＭＢ（Ultra Mobile Broad-band）、ＩＥＥＥ８０２．１１（Ｗｉ－Ｆｉ）、ＩＥＥＥ８０２．１６（ＷｉＭＡＸ）、ＩＥＥＥ８０２．２０、ＵＷＢ（Ultra-Wide Band）、Ｂｌｕｅｔｏｏｔｈ（登録商標）、その他の適切なシステムを利用するシステム及び／又はこれらに基づいて拡張された次世代システムに適用されてもよい。 Each aspect/embodiment described herein may be applied to systems that use LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G, 5G, FRA (Future Radio Access), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broad-band), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, UWB (Ultra-Wide Band), Bluetooth (registered trademark), or other suitable systems, and/or to next-generation systems extended on the basis of these.

The processing procedures, flowcharts, and the like of each aspect/embodiment described in this specification may be reordered as long as no inconsistency arises. For example, the methods described in this specification present the elements of the various steps in an exemplary order and are not limited to the specific order presented.

Input and output information and the like may be stored in a specific location (for example, a memory) or may be managed in a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.

A determination may be made using a value represented by one bit (0 or 1), using a truth value (Boolean: true or false), or by comparing numerical values (for example, comparison with a predetermined value).
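
The three forms of determination listed above can be illustrated with a minimal sketch. The function names and the direction of the comparison (greater than or equal to the predetermined value) are assumptions made for illustration; the specification does not fix them.

```python
def decide_by_bit(flag: int) -> bool:
    """Determination from a one-bit value: 1 means positive, 0 means negative."""
    if flag not in (0, 1):
        raise ValueError("a one-bit value must be 0 or 1")
    return flag == 1


def decide_by_boolean(flag: bool) -> bool:
    """Determination from a truth value, used as-is."""
    return flag


def decide_by_comparison(value: float, predetermined: float) -> bool:
    """Determination by comparing a number with a predetermined value.

    Assumption: the result is positive when value >= predetermined.
    """
    return value >= predetermined
```

For instance, a detection score could be turned into a yes/no result with `decide_by_comparison(score, 0.8)`, where the threshold 0.8 is likewise only an example value.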

Each aspect/embodiment described in this specification may be used alone, may be used in combination, or may be switched during execution. In addition, notification of predetermined information (for example, notification of "being X") is not limited to explicit notification and may be performed implicitly (for example, by not notifying the predetermined information).

Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by any other name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.

Software, instructions, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using wired technologies such as coaxial cable, fiber-optic cable, twisted pair, and digital subscriber line (DSL) and/or wireless technologies such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of a transmission medium.

The information, signals, and the like described in this specification may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.

Note that the terms described in this specification and/or the terms necessary for understanding this specification may be replaced with terms having the same or similar meanings.

The information, parameters, and the like described in this specification may be represented by absolute values, by values relative to a predetermined value, or by other corresponding information.
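
As a small illustration of the three representations mentioned above, the same parameter can be carried as an absolute value, as an offset from a predetermined value, or as an index into a table. The names `PREDETERMINED`, `TABLE`, and the three helper functions are hypothetical, as are the numbers used.

```python
PREDETERMINED = 100.0                    # assumed reference value
TABLE = {0: 90.0, 1: 100.0, 2: 110.0}    # assumed "corresponding other information"


def from_absolute(value: float) -> float:
    """The parameter is given directly as an absolute value."""
    return value


def from_relative(offset: float, reference: float = PREDETERMINED) -> float:
    """The parameter is given as an offset from a predetermined value."""
    return reference + offset


def from_index(index: int, table: dict = TABLE) -> float:
    """The parameter is given indirectly, as a key into a lookup table."""
    return table[index]
```

With these example values, `from_absolute(110.0)`, `from_relative(10.0)`, and `from_index(2)` all denote the same parameter.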

A user terminal may also be referred to by those skilled in the art as a mobile communication terminal, subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or some other appropriate term.

As used in this specification, the terms "judging (determining)" and "deciding (determining)" may encompass a wide variety of operations. "Judging" and "deciding" may include, for example, regarding calculating, computing, processing, deriving, investigating, looking up (for example, looking up in a table, a database, or another data structure), or ascertaining as having "judged" or "decided". "Judging" and "deciding" may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as having "judged" or "decided". "Judging" and "deciding" may further include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "judged" or "decided". In other words, "judging" and "deciding" may include regarding some operation as having "judged" or "decided".

As used in this specification, the phrase "based on" does not mean "based only on" unless expressly stated otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on".

Where designations such as "first" and "second" are used in this specification, any reference to the elements so designated does not generally limit the quantity or order of those elements. These designations may be used in this specification as a convenient way of distinguishing between two or more elements. Thus, a reference to first and second elements does not mean that only two elements may be employed or that the first element must precede the second element in some way.

To the extent that "include", "including", and variations thereof are used in this specification or the claims, these terms are intended to be inclusive in the same manner as the term "comprising". Furthermore, the term "or" as used in this specification or the claims is intended not to be an exclusive OR.

In this specification, a plurality of devices is also included, except where the context or the technology clearly indicates that only one such device exists.

Throughout this disclosure, the plural is included unless the context clearly indicates the singular.

DESCRIPTION OF SYMBOLS: 1 ... imaging system, 10 ... camera, 11 ... storage unit, 12 ... acquisition unit (captured image acquisition unit), 13 ... detection unit, 15 ... output unit.

Claims

An imaging system including a plurality of imaging devices, the imaging system comprising:
a captured image acquisition unit that acquires a captured image;
a detection unit that detects, from the captured image, at least one of face information of a target person and related information that is information other than the face of the target person and is related to the target person, and that extracts, based on the detection result, a scene in which the target person is captured;
an output unit that outputs the detection result of the detection unit so that the detection result is shared among the plurality of imaging devices; and
a storage unit that stores, in association with each other, the face information of the target person and the related information shared among the plurality of imaging devices,
wherein the detection unit
detects the related information related to the target person from the captured image based on the information stored in association in the storage unit, and extracts a scene in which the related information is captured as a scene in which the target person related to that related information is captured, and
identifies an ease of change according to the type of the detected related information, and determines, in consideration of the ease of change, whether or not to extract the scene in which the related information is captured.
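
One possible reading of the final limitation of the claim, in which an ease of change is identified from the type of related information and then taken into account when deciding whether to extract a scene, is sketched below. Everything concrete here is an assumption made for illustration: the type names, the numeric scores, the threshold rule, and the treatment of unknown types are hypothetical and are not taken from the claim.

```python
# Hypothetical ease-of-change scores per related-information type
# (0.0 = hardly ever changes, 1.0 = changes very easily).
EASE_OF_CHANGE = {
    "clothing": 0.2,
    "bag": 0.5,
    "handheld_item": 0.9,
}


def ease_of_change(info_type: str) -> float:
    """Identify the ease of change from the type of related information.

    Assumption: unknown types are treated as maximally changeable.
    """
    return EASE_OF_CHANGE.get(info_type, 1.0)


def should_extract(info_type: str, max_ease: float = 0.5) -> bool:
    """Decide whether a scene showing this related information is extracted.

    Assumption: a scene is extracted only when the ease of change does not
    exceed the threshold max_ease.
    """
    return ease_of_change(info_type) <= max_ease


def extract_scenes(detections, max_ease: float = 0.5):
    """Return ids of scenes to extract from (scene_id, info_type) pairs."""
    return [
        scene_id
        for scene_id, info_type in detections
        if should_extract(info_type, max_ease)
    ]
```

A simple threshold is used here only because it is the smallest rule that takes the ease of change into account; the claim itself leaves the decision rule open.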