JP2024059428A - Signal processing device, signal processing method, and storage medium

Info

Publication number: JP2024059428A
Application number: JP2022167102A
Authority: JP
Inventors: 一優鬼木
Original assignee: Sony Semiconductor Solutions Corp
Current assignee: Sony Semiconductor Solutions Corp
Priority date: 2022-10-18
Filing date: 2022-10-18
Publication date: 2024-05-01
Also published as: WO2024085023A1

Abstract

[Problem] To estimate density locally without using multiple captured images. [Solution] A signal processing device includes an image processing unit that sets, on a captured image, bounding boxes corresponding to objects detected in the captured image, and calculates a density for the objects according to the degree of overlap between the bounding boxes. [Selected drawing] FIG. 12

Description

The present technology relates to a signal processing device, a signal processing method, and a storage medium, and in particular to a technology for calculating the density of objects based on a captured image.

Known processes detect people within the angle of view from images captured by surveillance cameras or the like and calculate their density, for example to help avoid crowded conditions or to present popular areas at a leisure destination.

Patent Document 1 below describes an example of calculating density from a captured image. Specifically, it discloses a technology that divides a captured image into multiple blocks and detects a crowd in each block.

Patent Document 1: JP 2017-068598 A

Block-based crowd detection has limited locality: it gives a rough picture of how densely people are gathered in a captured image, but may not be usable for more detailed analysis.
Patent Document 1 also describes detecting crowds by applying background subtraction, but that approach has the drawback of requiring multiple captured images.

The present technology was made in view of these circumstances, and aims to estimate density locally without using multiple captured images.

A signal processing device according to the present technology includes an image processing unit that sets, on a captured image, bounding boxes corresponding to objects detected in the captured image, and calculates a density for the objects according to the degree of overlap between the bounding boxes.
This makes it possible to calculate the density through simple processing using a single captured image.

FIG. 1 is a diagram showing a configuration example of the density calculation system.
FIG. 2 is a block diagram showing an example of the internal configuration of the camera device.
FIG. 3 is an example of an image input to the AI image processing unit.
FIG. 4 is an example of an image on which a bounding box for a person is superimposed.
FIG. 5 is an example of an image on which key points for a person are superimposed.
FIG. 6 is a functional block diagram of the AI image processing unit.
FIG. 7 is an example of a captured image captured by the camera device.
FIG. 8 is an explanatory diagram of the association between box centroid positions and key point centroid positions.
FIG. 9 is an example of a target person and an overlapping person.
FIG. 10 is an example of IoU calculation.
FIG. 11 is an example of a flowchart of the processing executed by the AI image processing unit and the like to calculate the density.
FIG. 12 is an example of a flowchart of the density calculation process.
FIG. 13 is a functional block diagram of the management device.
FIG. 14 is an example of an image for presenting density information to a user.
FIG. 15 is an example in which density information is superimposed on bounding boxes.
FIG. 16 is an example in which density information is superimposed as circles.
FIG. 17 is a diagram showing an example of a density calculation system configured with a cloud server.
FIG. 18 is a diagram for explaining the devices that register and download AI models and AI applications via the marketplace function of the cloud-side information processing device.
FIG. 19 is a diagram for explaining the connection mode between the cloud-side information processing device and the edge-side information processing device.
FIG. 20 is a functional block diagram of the cloud-side information processing device.
FIG. 21 is a block diagram showing the software configuration of the camera.
FIG. 22 is a block diagram showing the operating environment of containers when container technology is used.
FIG. 23 is a block diagram showing an example of the hardware configuration of the information processing device.
FIG. 24 is a diagram for explaining the flow of processing described in the "Others" section.
FIG. 25 is a diagram showing an example of a login screen for logging in to the marketplace.
FIG. 26 is a diagram showing an example of the developer screen presented to each developer who uses the marketplace.
FIG. 27 is a diagram showing an example of the user screen presented to an application user who uses the marketplace.

The embodiments will be described in the following order.
<1. Configuration of the density calculation system>
<2. Configuration of the camera device>
<3. Functions of the AI image processing unit>
<4. Likelihood of key points>
<5. Person detection based on the non-detection rate>
<6. Processing example>
<7. Display processing>
<8. Modifications>
<9. Example as a cloud application>
<10. Functional overview of the cloud-side information processing device>
<11. Deployment of AI models and AI applications>
<12. Hardware configuration of the information processing device>
<13. Others>
<14. Marketplace screen examples>
<15. Summary>
<16. Present technology>

<1. Configuration of the density calculation system>
The configuration of a density calculation system 1 according to this embodiment will be described with reference to the accompanying drawings.

The density calculation system 1 includes, for example, a management device 2 and multiple camera devices 3. Each camera device 3 executes a process that calculates the density in a captured image obtained by its imaging operation.

The management device 2 is an information processing device that performs various processes for managing the camera devices 3. The management device 2 also performs processes such as identifying congested or uncrowded areas based on the density calculated by each camera device 3 and on information such as the location where each camera device 3 is installed.

The management device 2 notifies users and others of this information as appropriate. For example, in an amusement park, identifying high-density areas as congested areas makes it possible to tell users which areas are popular.

Alternatively, by tracking changes in density and detecting an increase, the system can notify users that an event has started.

Identifying high-density areas also makes it possible to notify users of congested areas that they should avoid.

Identifying low-density areas is also useful to users. For example, it makes it possible to recommend areas with short waiting times or areas that are easy to get around.
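The uses of per-area density described above (flagging congested areas, recommending uncrowded ones, and detecting the start of an event from a rise in density) can be sketched as follows. This is a minimal illustration only: the `AreaDensity` record, the function names, and all threshold values are hypothetical, and the text does not specify the scale of the density value or how the management device 2 makes these decisions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AreaDensity:
    area: str       # location label of a camera device (hypothetical field)
    density: float  # density reported for that area (scale is an assumption)


def classify_areas(readings: List[AreaDensity],
                   high: float = 0.6,
                   low: float = 0.2) -> Tuple[List[str], List[str]]:
    """Split areas into congested and uncrowded lists using assumed thresholds."""
    congested = [r.area for r in readings if r.density >= high]
    uncrowded = [r.area for r in readings if r.density <= low]
    return congested, uncrowded


def event_started(previous: float, current: float, rise: float = 0.3) -> bool:
    """Treat a density increase of at least `rise` as a sign that an event began."""
    return current - previous >= rise
```

With these assumed thresholds, an area reporting 0.8 would be listed as congested and one reporting 0.1 as uncrowded; real thresholds would depend on how the density is defined and on the venue.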

The management device 2 performs notification processing to provide such information to users, and display processing to show the information on the screen of a user terminal (not shown) carried by a user.

The management device 2 and the camera devices 3 are configured to communicate data with each other via a communication network 4.

The management device 2 is configured as an information processing device equipped with a microcomputer having a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), and the like.

The management device 2 may be an information processing terminal used by an observer, or a server device such as a fog server.

The camera device 3 is a surveillance camera or the like, and is installed at each location with its position and angle adjusted so that people within the angle of view can be imaged from above. Specifically, the camera device 3 is set so that the angle between the horizontal direction and the optical axis is about 30 to 50 degrees.
Installing the camera device 3 so that images are captured from a somewhat elevated viewpoint prevents people P who are far apart and not crowded together from appearing to overlap in the image, and thus prevents an inappropriate density from being calculated.

The configuration of the camera device 3 will be described later.

Various configurations are possible for the communication network 4. For example, the communication network 4 may be the Internet, an intranet, an extranet, a LAN (Local Area Network), a CATV (Community Antenna TeleVision) communication network, a virtual private network, a telephone line network, a mobile communication network, a satellite communication network, or the like.
Various transmission media may also constitute all or part of the communication network 4. For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB (Universal Serial Bus), power line carrier, and telephone lines can be used, as can wireless media such as infrared (for example, IrDA (Infrared Data Association)), Bluetooth (registered trademark), 802.11 wireless, mobile phone networks, satellite links, and terrestrial digital networks.

<2. Configuration of the camera device>
FIG. 2 is a block diagram showing an example of the internal configuration of the camera device 3.
As shown in the figure, the camera device 3 includes an imaging optical system 31, an optical system drive unit 32, an image sensor IS, a control unit 33, a memory unit 34, and a communication unit 35. The image sensor IS, the control unit 33, the memory unit 34, and the communication unit 35 are connected via a bus 36 and can communicate data with one another.

The imaging optical system 31 includes lenses such as a cover lens, a zoom lens, and a focus lens, as well as an aperture (iris) mechanism. The imaging optical system 31 guides light from the subject (incident light) and focuses it on the light-receiving surface of the image sensor IS.

The optical system drive unit 32 collectively represents the drive units for the zoom lens, focus lens, and aperture mechanism of the imaging optical system 31. Specifically, the optical system drive unit 32 has actuators for driving the zoom lens, the focus lens, and the aperture mechanism, and drive circuits for those actuators.

The control unit 33 includes, for example, a microcomputer having a CPU, ROM, and RAM, and performs overall control of the camera device 3 by having the CPU execute various processes according to programs stored in the ROM or loaded into the RAM.

The control unit 33 also issues drive instructions for the zoom lens, focus lens, aperture mechanism, and the like to the optical system drive unit 32. In response to these instructions, the optical system drive unit 32 moves the focus lens and zoom lens, opens and closes the aperture blades of the aperture mechanism, and so on.

The control unit 33 also controls the writing and reading of various data to and from the memory unit 34.
The memory unit 34 is a non-volatile storage device such as an HDD (Hard Disk Drive) or a flash memory device, and is used as the storage (recording) destination for image data output from the image sensor IS.

Furthermore, the control unit 33 performs various data communications with external devices via the communication unit 35. In this example, the communication unit 35 is configured to be able to communicate data with at least the management device 2 shown in FIG. 1.

The image sensor IS is configured as, for example, a CCD or CMOS image sensor, and can capture, for example, RGB images. The image sensor IS may instead be an IR sensor that is sensitive to the IR (infrared) band.

The image sensor IS includes an imaging unit 41, an image signal processing unit 42, an in-sensor control unit 43, an AI image processing unit 44, a memory unit 45, and a communication I/F 46, which can communicate data with one another via a bus 47.

The imaging unit 41 includes a pixel array unit in which pixels having photoelectric conversion elements such as photodiodes are arranged two-dimensionally, and a readout circuit that reads out the electrical signals obtained by photoelectric conversion from each pixel of the pixel array unit. The imaging unit 41 can output these electrical signals as a captured image signal.

The readout circuit performs, for example, CDS (Correlated Double Sampling) processing and AGC (Automatic Gain Control) processing on the electrical signals obtained by photoelectric conversion, and then performs A/D (Analog/Digital) conversion.

The image signal processing unit 42 performs pre-processing, synchronization processing, YC generation processing, resolution conversion processing, codec processing, and the like on the captured image signal, which is digital data after A/D conversion.
The pre-processing includes clamp processing that clamps the R, G, and B black levels of the captured image signal to a predetermined level, and correction processing between the R, G, and B color channels. The synchronization processing applies color separation so that the image data for each pixel has all of the R, G, and B color components. For example, for an image sensor that uses a Bayer-array color filter, demosaicing is performed as the color separation. The YC generation processing generates (separates) a luminance (Y) signal and a color (C) signal from the R, G, and B image data. The resolution conversion processing converts the resolution of the image data that has undergone the various signal processing steps.
The codec processing encodes the image data that has undergone the above processing, for example for recording or communication, and generates files. For video, files can be generated in formats such as MPEG-2 (MPEG: Moving Picture Experts Group) or H.264. For still images, files may be generated in formats such as JPEG (Joint Photographic Experts Group), TIFF (Tagged Image File Format), or GIF (Graphics Interchange Format). When the image sensor IS is a distance measurement sensor, the image signal processing unit 42 calculates distance information about the subject based on, for example, the two signals output from the image sensor IS operating as an iToF (indirect Time of Flight) sensor, and outputs a distance image.
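The YC generation step described above, which derives a luminance (Y) signal and color (C) signals from R, G, and B data, can be sketched for a single pixel as follows. The text does not state which conversion the image signal processing unit 42 uses; the ITU-R BT.601 luma weights and the scaled color-difference form below are an assumption chosen for illustration, and `rgb_to_yc` is a hypothetical name.

```python
from typing import Tuple


def rgb_to_yc(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to luminance and two color-difference values.

    Assumes BT.601 weights; the actual coefficients used in the device
    are not given in the text.
    """
    # Luminance as a weighted sum of the three color components.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Color-difference signals, scaled so each spans roughly -0.5..0.5
    # of the input range.
    cb = (b - y) / 1.772
    cr = (r - y) / 1.402
    return y, cb, cr
```

For a neutral gray or white pixel (equal R, G, and B), both color-difference values come out at approximately zero, which is the property that makes the Y/C split useful for later processing.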

In the following description, the image data output from the image signal processing unit 42 may be referred to simply as the "captured image."

The in-sensor control unit 43 issues instructions to the imaging unit 41 to control execution of the imaging operation. It likewise controls the execution of processing by the image signal processing unit 42.

The in-sensor control unit 43 also performs processing to output information such as the density calculated by the AI image processing unit 44 as metadata of the captured image data. For example, the in-sensor control unit 43 generates data conforming to the MIPI (Mobile Industry Processor Interface) standard, with information such as the density stored in the payload area.

The AI image processing unit 44 performs image processing using AI (Artificial Intelligence) on the captured image. In the following description, image processing using AI is referred to as "AI image processing." AI image processing is, for example, image recognition processing.

The AI image processing unit 44 is implemented by a DSP (Digital Signal Processor).

In this embodiment, the AI image processing unit 44 executes more than one kind of AI image processing. One kind takes a captured image as input and outputs the detection result for a person P in the captured image. The detection result for the person P is output, for example, as information on a bounding box BB.

The bounding box BB information may be output as coordinate information on the captured image, or as an image in which the bounding box BB is superimposed on the captured image.

FIGS. 3 and 4 show an example of the detection result for a person P.

FIG. 3 shows an image input to the AI image processing unit 44, which is a captured image captured by the imaging unit 41. As shown, one person P, a soccer player, is located within the angle of view of the camera device 3.

FIG. 4 shows an image in which a bounding box BB is superimposed on the person P detected by the AI image processing unit 44. As shown, a rectangular bounding box BB is superimposed so as to surround the person P, the soccer player located within the angle of view of the camera device 3.

The other kind of AI image processing executed by the AI image processing unit 44 takes a captured image as input and outputs information about the key points KP of a detected person P.

FIG. 5 shows an example of key points KP.

Key points KP are set at positions corresponding to the main joints and similar parts of a person P detected in the captured image. In this example, as shown in FIG. 5, a total of 13 locations can be set as key points KP: the head, both shoulders, both elbows, both hands, the left and right hips, both knees, and both ankles of the person P. However, no key point KP is set for a part that cannot be detected in the captured image. For example, for a person P of whom only the head is captured at the edge of the captured image, only the key point KP for the head is set.
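One way to picture the 13 key points, and the rule that undetected parts simply have no key point, is the data structure sketch below. The enum member names, the `Person` mapping type, and the helper `missing_keypoints` are all hypothetical; the text only lists the body parts and does not define any data format.

```python
from enum import Enum
from typing import Dict, List, Tuple


class KeyPoint(Enum):
    """The 13 body locations listed in the text (names are illustrative)."""
    HEAD = 0
    LEFT_SHOULDER = 1
    RIGHT_SHOULDER = 2
    LEFT_ELBOW = 3
    RIGHT_ELBOW = 4
    LEFT_HAND = 5
    RIGHT_HAND = 6
    LEFT_HIP = 7
    RIGHT_HIP = 8
    LEFT_KNEE = 9
    RIGHT_KNEE = 10
    LEFT_ANKLE = 11
    RIGHT_ANKLE = 12


# A person is a mapping from detected key points to (x, y) image coordinates.
# Parts that could not be detected are absent from the mapping.
Person = Dict[KeyPoint, Tuple[float, float]]


def missing_keypoints(person: Person) -> List[KeyPoint]:
    """Return the key points that were not set for this person."""
    return [kp for kp in KeyPoint if kp not in person]


# Example from the text: only the head is visible at the edge of the image.
head_only: Person = {KeyPoint.HEAD: (5.0, 12.0)}
```

Representing undetected parts by absence, rather than by a placeholder coordinate, keeps them out of any later averaging over a person's key points.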

The key points KP shown in FIG. 5 are merely an example, and various arrangements are possible, such as setting a single key point KP at the navel instead of one key point KP at each of the left and right hips.

The AI image processing unit 44 performs the various calculation processes described later based on the bounding box BB information and the key point KP information obtained by AI image processing. These calculation processes derive the information used to calculate the density described above. They may instead be executed by the in-sensor control unit 43.

The AI image processing unit 44 may perform the process of superimposing the bounding box BB (or of setting it without superimposing it) and the process of setting the key points KP using separate AI models, or may perform both with a single AI model.

The AI model is switched, for example, based on processing by the control unit 33 of the camera device 3 or by the in-sensor control unit 43. The switch is made, for example, among multiple AI models stored in the memory unit 45.

When it is sufficient to run the bounding box superimposition process and the key point setting process on captured images regularly or irregularly at relatively long intervals, such as once every few minutes or once every few hours, the AI model may be switched by receiving it from the cloud-side information processing device and deploying it for each process. Receiving the AI model from the cloud-side information processing device at each switch removes the need to keep AI models in the memory unit 45, which reduces the required capacity of the memory unit 45 and allows a smaller size, lower power consumption, and lower cost.

The memory unit 45 can be used as a so-called frame memory that stores the captured image data (RAW image data) obtained by the image signal processing unit 42 and the image data after synchronization processing. The memory unit 45 can also be used to temporarily store data that the AI image processing unit 44 uses during AI image processing.

The memory unit 45 also stores information on the AI applications and AI models used by the AI image processing unit 44. An AI application is an application that uses an AI model. One example is an application that calculates the density for persons P based on the bounding box BB information and the key point KP setting information obtained by AI image processing using an AI model.

The AI application and AI model information may be deployed in the memory unit 45 as containers or the like using the container technology described later, or may be deployed using microservice technology. Deploying the AI model used for AI image processing in the memory unit 45 makes it possible to change the type of AI image processing function, or to switch to an AI model whose performance has been improved by retraining.
As noted above, this embodiment is described using AI models and AI applications for image recognition as examples, but the technology is not limited to these and may also apply to other programs executed using AI technology.
If the capacity of the memory unit 45 is small, the AI application and AI model information may first be deployed as containers or the like, using container technology, in memory outside the image sensor IS such as the memory unit 34, and then only the AI model may be stored in the memory unit 45 inside the image sensor IS via the communication I/F 46 described below.

The communication I/F 46 is an interface for communicating with the control unit 33, the memory unit 34, and other components outside the image sensor IS. The communication I/F 46 performs the communication needed to acquire, from outside, the programs executed by the image signal processing unit 42 and the AI applications and AI models used by the AI image processing unit 44, and stores them in the memory unit 45 of the image sensor IS.
As a result, the AI model is stored in part of the memory unit 45 of the image sensor IS and becomes available to the AI image processing unit 44.

The AI image processing unit 44 uses the AI applications and AI models obtained in this way to perform predetermined image recognition processing, thereby recognizing subjects as required by the purpose.

Log information, such as the density obtained by calculation using the bounding box BB and key point KP information from AI image processing, is output to the outside of the image sensor IS via the communication I/F 46.

That is, the communication I/F 46 of the image sensor IS outputs not only the image data from the image signal processing unit 42 but also information such as the density calculated from the recognition results of AI image processing, as metadata of the image data.
The communication I/F 46 of the image sensor IS can also be made to output only one of the image data and the density information.

例えば、上述したＡＩモデルの再学習機能を利用する場合には、再学習機能に用いられる撮像画像データが通信Ｉ／Ｆ４６及び通信部３５を介してイメージセンサＩＳからクラウド側情報処理装置にアップロードされる。 For example, when using the re-learning function of the AI model described above, the captured image data used in the re-learning function is uploaded from the image sensor IS to the cloud-side information processing device via the communication I/F 46 and the communication unit 35.

また、ＡＩモデルを用いた推論を行う場合には、推論結果としての密集度の情報が通信Ｉ／Ｆ４６及び通信部３５を介してイメージセンサＩＳからカメラ装置３外の他の情報処理装置に出力される。
In addition, when inference is performed using an AI model, density information as the inference result is output from the image sensor IS to another information processing device outside the camera device 3 via the communication I/F 46 and the communication unit 35.

＜３．ＡＩ画像処理部の機能＞
ＡＩ画像処理部４４は、密集度を算出するための各種の機能を有する。具体的には、ＡＩ画像処理部４４は、重心位置算出機能Ｆ１と、ＩｏＵ（Intersection of Union）算出機能Ｆ２と、未検出率算出機能Ｆ３と、密集度算出機能Ｆ４とを有する（図６参照）。 <3. Functions of AI image processing unit>
The AI image processing unit 44 has various functions for calculating the density. Specifically, the AI image processing unit 44 has a center of gravity position calculation function F1, an IoU (Intersection over Union) calculation function F2, an undetected rate calculation function F3, and a density calculation function F4 (see FIG. 6).

なお、これらの各機能の少なくとも一部が画像信号処理部４２やセンサ内制御部４３に設けられていてもよい。 At least some of these functions may be provided in the image signal processing unit 42 or the sensor internal control unit 43.

重心位置算出機能Ｆ１は、撮像画像において検出された人物Ｐについてのボックス重心位置ＢＢＰとキーポイント重心位置ＫＢＰとを算出する。 The center of gravity calculation function F1 calculates the box center of gravity position BBP and the key point center of gravity position KBP for the person P detected in the captured image.

ボックス重心位置ＢＢＰは、バウンディングボックスＢＢの重心位置であり、具体的にボックス重心位置ＢＢＰのｘ座標は、バウンディングボックスＢＢの左端のｘ座標と右端のｘ座標の平均値とされ、ボックス重心位置ＢＢＰのｙ座標は、バウンディングボックスＢＢの上端のｙ座標と下端のｙ座標の平均値とされる。 The box center of gravity position BBP is the center of gravity position of the bounding box BB. Specifically, the x coordinate of the box center of gravity position BBP is the average value of the x coordinate of the left end and the x coordinate of the right end of the bounding box BB, and the y coordinate of the box center of gravity position BBP is the average value of the y coordinate of the top end and the y coordinate of the bottom end of the bounding box BB.

キーポイント重心位置ＫＢＰは、同一人物のものとして検出されたキーポイントＫＰの重心位置である。具体的には、キーポイント重心位置ＫＢＰのｘ座標は同一人物についての各キーポイントＫＰのｘ座標の平均値とされ、キーポイント重心位置ＫＢＰのｙ座標は同一人物についての各キーポイントＫＰのｙ座標の平均値とされる。 The keypoint center of gravity KBP is the center of gravity of the keypoints KP detected as belonging to the same person. Specifically, the x coordinate of the keypoint center of gravity KBP is the average value of the x coordinates of each keypoint KP for the same person, and the y coordinate of the keypoint center of gravity KBP is the average value of the y coordinates of each keypoint KP for the same person.

なお、キーポイント重心位置ＫＢＰについては、未検出のキーポイントＫＰを考慮してもよい。例えば、ＡＩ画像処理部４４は、ＡＩモデルを用いてキーポイントＫＰの検出を行う際に、当該人物Ｐについての未検出のキーポイントＫＰの位置を推定する処理を行う。重心位置算出機能Ｆ１では、推定された未検出のキーポイントＫＰも含めてキーポイント重心位置ＫＢＰについてのｘ座標とｙ座標を算出してもよい。 In addition, undetected keypoints KP may be taken into consideration when determining the keypoint center of gravity position KBP. For example, when detecting keypoints KP using an AI model, the AI image processing unit 44 performs processing to estimate the positions of undetected keypoints KP for the person P. The center of gravity position calculation function F1 may calculate the x and y coordinates for the keypoint center of gravity position KBP, including the estimated undetected keypoints KP.
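The two center of gravity calculations described above can be sketched as follows. This is only an illustrative sketch: the `(left, top, right, bottom)` box layout, the use of `None` for an undetected key point, and the function names are assumptions that do not appear in the original text.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom), an assumed layout


def box_centroid(box: Box) -> Point:
    """BBP: mean of the left/right x coordinates and of the top/bottom y coordinates."""
    left, top, right, bottom = box
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def keypoint_centroid(keypoints: Sequence[Optional[Point]]) -> Optional[Point]:
    """KBP: mean of the available key points of one person.

    None marks an undetected key point (an assumed convention). Returns None
    when no key point is available at all.
    """
    found = [kp for kp in keypoints if kp is not None]
    if not found:
        return None
    n = len(found)
    return (sum(x for x, _ in found) / n, sum(y for _, y in found) / n)
```

To take estimated positions of undetected key points into account, the caller would pass those estimated coordinates in place of `None`.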

重心位置算出機能Ｆ１は、同一の人物ＰについてのバウンディングボックスＢＢとキーポイントＫＰを対応付ける処理を行う。
例えば、図７に示すように一枚の撮像画像に複数の人物Ｐが含まれている場合がある。このような場合には、画像認識により検出された人物ＰごとにバウンディングボックスＢＢとキーポイントＫＰを対応付ける処理が行われる。 The center of gravity position calculation function F1 performs processing to associate a bounding box BB and key points KP for the same person P.
For example, as shown in FIG. 7, a single captured image may contain multiple persons P. In such a case, a process is performed to associate a bounding box BB with key points KP for each person P detected by image recognition.

具体的には、ＡＩ画像処理によって設定されたバウンディングボックスＢＢごとに、ボックス重心位置ＢＢＰに対して座標上最も近いキーポイント重心位置ＫＢＰを特定し、対応付ける処理を行う（例えば図８参照）。
なお、対応付けの際には、既に対応付けが済んでいるキーポイント重心位置ＫＢＰを除外することにより、一つのキーポイント重心位置ＫＢＰに対して複数のバウンディングボックスＢＢが紐付けられてしまうことを防止することができる。 Specifically, for each bounding box BB set by AI image processing, the key point center of gravity position KBP that is closest in coordinates to the box center of gravity position BBP is identified and a matching process is performed (see Figure 8, for example).
Note that, during this association, keypoint center of gravity positions KBP that have already been associated are excluded, which prevents multiple bounding boxes BB from being linked to a single keypoint center of gravity position KBP.

また、ボックス重心位置ＢＢＰに対して最も近いキーポイント重心位置ＫＢＰであっても、当該バウンディングボックスＢＢの領域外にキーポイント重心位置ＫＢＰが位置する場合には当該バウンディングボックスＢＢとキーポイントＫＰの対応付けを行わない。 Even if the keypoint center of gravity position KBP is closest to the box center of gravity position BBP, if the keypoint center of gravity position KBP is located outside the area of the bounding box BB, the bounding box BB will not be associated with the keypoint KP.

このようにして対応付けを行った場合には、キーポイントＫＰと対応付けが行われない独立したバウンディングボックスＢＢが生じる可能性がある。独立したバウンディングボックスＢＢについては、密集度の算出が行われない。これにより、処理負担の軽減が図られる。 When matching is performed in this manner, there is a possibility that an independent bounding box BB that is not matched with a key point KP may occur. Density calculation is not performed for the independent bounding box BB. This reduces the processing load.
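The association rules above (nearest keypoint center of gravity, exclusion of already matched ones, and rejection when the nearest one lies outside the box) might be sketched as below. The greedy processing in box order, the box layout, and the function name `associate` are assumptions made for illustration; the original text does not specify the order in which boxes are processed.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom), an assumed layout


def associate(boxes: List[Box], kp_centroids: List[Point]) -> Dict[int, Optional[int]]:
    """Map each box index to the index of its keypoint centroid, or None if unmatched."""
    taken = set()
    result: Dict[int, Optional[int]] = {}
    for bi, (left, top, right, bottom) in enumerate(boxes):
        bx, by = (left + right) / 2.0, (top + bottom) / 2.0
        # Only keypoint centroids that are not matched yet are candidates.
        candidates = [ki for ki in range(len(kp_centroids)) if ki not in taken]
        best = min(
            candidates,
            key=lambda ki: math.hypot(kp_centroids[ki][0] - bx, kp_centroids[ki][1] - by),
            default=None,
        )
        if best is not None:
            kx, ky = kp_centroids[best]
            # Even the nearest centroid is rejected when it lies outside the box.
            if not (left <= kx <= right and top <= ky <= bottom):
                best = None
        if best is not None:
            taken.add(best)
        result[bi] = best
    return result
```

Boxes mapped to `None` correspond to the independent bounding boxes BB for which no density is calculated.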

なお、一つのＡＩモデルを用いたＡＩ画像処理により、検出された人物ＰについてのバウンディングボックスＢＢとキーポイントＫＰについての情報の双方が得られる場合には対応付けの処理は不要である。 Note that if both the bounding box BB and information about key points KP for a detected person P can be obtained through AI image processing using a single AI model, matching processing is not necessary.

重心位置算出機能Ｆ１により、撮像画像上において検出された人物ＰごとにバウンディングボックスＢＢとキーポイントＫＰが対応付けられる。また、バウンディングボックスＢＢとキーポイントＫＰの何れか一方のみが対応付けられた人物Ｐについては、密集度が算出されない。 The center of gravity position calculation function F1 associates a bounding box BB with a key point KP for each person P detected in the captured image. In addition, for a person P with only one of a bounding box BB and a key point KP associated with it, the density is not calculated.

ＩｏＵ算出機能Ｆ２は、処理対象の人物Ｐに紐付いたバウンディングボックスＢＢとそれ以外の人物Ｐに紐付いたバウンディングボックスＢＢの重なり具合を数値化する処理を行う。 The IoU calculation function F2 performs processing to quantify the degree of overlap between the bounding box BB associated with the person P being processed and the bounding boxes BB associated with other people P.

ここで、処理対象の人物Ｐを対象人物Ｐｔとし、対象人物ＰｔについてのバウンディングボックスＢＢを対象バウンディングボックスＢＢｔとする。 Here, the person P to be processed is defined as the target person Pt, and the bounding box BB for the target person Pt is defined as the target bounding box BBt.

また、対象バウンディングボックスＢＢｔと一部の領域が重なっているバウンディングボックスＢＢを重複バウンディングボックスＢＢｄとし、該重複バウンディングボックスＢＢｄが紐付けられる人物Ｐを重複人物Ｐｄとする。 Furthermore, a bounding box BB that partially overlaps with the target bounding box BBt is referred to as an overlapping bounding box BBd, and a person P to which the overlapping bounding box BBd is linked is referred to as an overlapping person Pd.

図９は、対象人物Ｐｔと二人の重複人物Ｐｄ１、Ｐｄ２を示した例である。なお、図９においては、撮像画像の図示を省略することにより対象人物Ｐｔと重複人物Ｐｄ１、Ｐｄ２についてのバウンディングボックスＢＢとキーポイントＫＰのみを図示している。具体的には、対象人物Ｐｔについての対象バウンディングボックスＢＢｔと、重複人物Ｐｄ１についての重複バウンディングボックスＢＢｄ１と、重複人物Ｐｄ２についての重複バウンディングボックスＢＢｄ２と、それぞれのキーポイントＫＰのみが図示されている。 Figure 9 is an example showing a target person Pt and two overlapping people Pd1 and Pd2. Note that in Figure 9, the captured image is omitted, and only the bounding boxes BB and key points KP for the target person Pt and overlapping people Pd1 and Pd2 are shown. Specifically, only the target bounding box BBt for the target person Pt, the overlapping bounding box BBd1 for the overlapping person Pd1, and the overlapping bounding box BBd2 for the overlapping person Pd2, and their respective key points KP are shown.

また、対象人物Ｐｔの対象バウンディングボックスＢＢｔは実線で示しており、重複人物Ｐｄ１、Ｐｄ２についての重複バウンディングボックスＢＢｄ１、ＢＢｄ２については線幅の異なる一点鎖線で示している。 The target bounding box BBt of the target person Pt is shown with a solid line, and the overlapping bounding boxes BBd1 and BBd2 of the overlapping persons Pd1 and Pd2 are shown with dash-dot lines of different line widths.

ＩｏＵ算出機能Ｆ２は、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄ１、ＢＢｄ２の組ごとに基礎ＩｏＵを算出し、基礎ＩｏＵから対象人物ＰｔについてのＩｏＵを算出する。 The IoU calculation function F2 calculates a basic IoU for each pair of the target bounding box BBt and the overlapping bounding boxes BBd1 and BBd2, and calculates the IoU for the target person Pt from the basic IoU.

具体的に、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄ１の組についての基礎ＩｏＵ（ＢＢｔ、ＢＢｄ１）について図１０を参照して説明する。 Specifically, the basic IoU(BBt, BBd1) for the pair of the target bounding box BBt and the overlapping bounding box BBd1 is explained with reference to FIG. 10.

基礎ＩｏＵ（ＢＢｔ、ＢＢｄ１）は、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄ１の共通領域ＣＡ１を、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄ１の論理和領域ＤＡ１で除算した値として算出される。基礎ＩｏＵは値が大きいほど二つのバウンディングボックスＢＢの重なり度合いが高いことを意味する。 The basic IoU(BBt, BBd1) is calculated as the area of the common region CA1 (the intersection) of the target bounding box BBt and the overlapping bounding box BBd1 divided by the area of the union region DA1 of the target bounding box BBt and the overlapping bounding box BBd1. The larger the basic IoU value, the greater the degree of overlap between the two bounding boxes BB.

基礎ＩｏＵは、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄの組ごとに算出される。即ち、図９に示す状態においては、基礎ＩｏＵ（ＢＢｔ、ＢＢｄ１）と基礎ＩｏＵ（ＢＢｔ、ＢＢｄ２）とが算出される。 The basic IoU is calculated for each pair of a target bounding box BBt and an overlapping bounding box BBd. That is, in the state shown in FIG. 9, basic IoU(BBt, BBd1) and basic IoU(BBt, BBd2) are calculated.

ＩｏＵ算出機能Ｆ２は、基礎ＩｏＵから対象人物ＰｔについてのＩｏＵを算出する。対象人物ＰｔについてのＩｏＵは、対象人物Ｐｔについての基礎ＩｏＵの平均値とされる。即ち、図９に示す例における対象人物ＰｔについてのＩｏＵは、基礎ＩｏＵ（ＢＢｔ、ＢＢｄ１）と基礎ＩｏＵ（ＢＢｔ、ＢＢｄ２）の平均値とされる。 The IoU calculation function F2 calculates the IoU for the target person Pt from the basic IoU. The IoU for the target person Pt is taken as the average value of the basic IoU for the target person Pt. That is, the IoU for the target person Pt in the example shown in FIG. 9 is taken as the average value of the basic IoU (BBt, BBd1) and the basic IoU (BBt, BBd2).
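The basic IoU and the per-person IoU described above can be illustrated with the following sketch. The box layout and the function names are assumptions, and returning 0.0 when no box overlaps the target is a choice made here that the original text does not state.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), an assumed layout


def basic_iou(a: Box, b: Box) -> float:
    """Area of the common region divided by the area of the union of two boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def person_iou(target: Box, others: Sequence[Box]) -> float:
    """Mean basic IoU over the boxes that overlap the target box.

    Returns 0.0 when nothing overlaps (an assumption, not stated in the text).
    """
    values = [v for v in (basic_iou(target, o) for o in others) if v > 0.0]
    return sum(values) / len(values) if values else 0.0
```

For the situation of FIG. 9, `others` would contain BBd1 and BBd2, and the result is the mean of the two basic IoU values.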

未検出率算出機能Ｆ３は、人物ＰについてのキーポイントＫＰの未検出率を算出する。未検出率は、人物Ｐが密集しているほど高くなるものと推定される。 The undetected rate calculation function F3 calculates the undetected rate of key points KP for person P. It is estimated that the undetected rate is higher the more densely people P are located.

具体的に、図９に示す対象人物Ｐｔについての未検出率は、対象人物Ｐｔと重複人物Ｐｄ１、Ｐｄ２についての未検出のキーポイントＫＰの数を、対象人物Ｐｔと重複人物Ｐｄ１、Ｐｄ２についての全てのキーポイントＫＰの数で除算した値として算出される。 Specifically, the undetected rate for the target person Pt shown in FIG. 9 is calculated as the number of undetected key points KP for the target person Pt and overlapping persons Pd1 and Pd2 divided by the total number of key points KP for the target person Pt and overlapping persons Pd1 and Pd2.

一人の人物Ｐ当たりのキーポイントＫＰの総数は１３個であるため、図９に示す例においては対象人物Ｐｔと重複人物Ｐｄ１、Ｐｄ２のキーポイントＫＰの総数は３９個となる。 The total number of key points KP per person P is 13, so in the example shown in Figure 9, the total number of key points KP for the target person Pt and overlapping persons Pd1 and Pd2 is 39.

また、対象人物Ｐｔの未検出のキーポイントＫＰは０個とされ、重複人物Ｐｄ１の未検出のキーポイントＫＰは３個とされ、重複人物Ｐｄ２の未検出のキーポイントＫＰは４個とされるため、未検出のキーポイントＫＰの数は７個となる。なお、図９においては、未検出のキーポイントＫＰは白抜きの円で示している。
従って、対象人物ＰｔのキーポイントＫＰの未検出率は、「７」を「３９」で除算した値とされる。 In addition, the target person Pt has 0 undetected key points KP, the overlapping person Pd1 has 3, and the overlapping person Pd2 has 4, so the total number of undetected key points KP is 7. Note that in FIG. 9, undetected key points KP are indicated by white circles.
Therefore, the undetected rate of the key points KP of the target person Pt is the value obtained by dividing "7" by "39" (approximately 0.18).

なお、未検出率算出機能Ｆ３は、バウンディングボックスＢＢが撮像画像の縁部に掛かって検出された人物Ｐ、或いは、未検出のキーポイントＫＰが撮像画像の枠外に位置すると推定された人物Ｐは見切れ人物として扱うことにより、当該人物ＰについてのキーポイントＫＰを除外して対象人物ＰｔのキーポイントＫＰの未検出率を算出してもよい。 The undetected rate calculation function F3 may treat a person P whose bounding box BB is detected as overlapping the edge of the captured image, or a person P whose undetected key point KP is estimated to be located outside the frame of the captured image, as an out-of-view person, and calculate the undetected rate of the key points KP of the target person Pt by excluding the key points KP for that person P.

例えば、図９に示す対象人物Ｐｔと重複人物Ｐｄ１、Ｐｄ２のうち、重複人物Ｐｄ２が見切れ人物である場合には、対象人物Ｐｔと重複人物Ｐｄ１のキーポイントＫＰの総数が２６個とされ、未検出のキーポイントＫＰはそれぞれ０個と３個とされ、未検出率は「３」を「２６」で除算した値として算出される。 For example, if, among the target person Pt and the overlapping persons Pd1 and Pd2 shown in FIG. 9, the overlapping person Pd2 is an out-of-view person, the total number of key points KP for the target person Pt and the overlapping person Pd1 is 26, the numbers of undetected key points KP are 0 and 3, respectively, and the undetected rate is calculated as the value obtained by dividing "3" by "26".
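The undetected rate calculation, including the optional exclusion of out-of-view persons, can be written as a short sketch. The function name, the argument layout, and the 0.0 fallback when every person is excluded are assumptions; the figure of 13 key points per person comes from the text.

```python
from typing import Sequence

KEYPOINTS_PER_PERSON = 13  # stated in the text


def undetected_rate(undetected_counts: Sequence[int], out_of_view: Sequence[bool]) -> float:
    """Undetected key points divided by all key points of the target and overlapping persons.

    Persons flagged as out of view are dropped from both the numerator and
    the denominator.
    """
    kept = [n for n, cut in zip(undetected_counts, out_of_view) if not cut]
    if not kept:
        return 0.0  # assumption: nothing to evaluate when every person is excluded
    return sum(kept) / (KEYPOINTS_PER_PERSON * len(kept))
```

With the counts of FIG. 9 (0, 3, and 4 undetected key points), the function gives 7/39 when nobody is excluded and 3/26 when Pd2 is treated as an out-of-view person.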

密集度算出機能Ｆ４は、ＩｏＵ算出機能Ｆ２によって算出された対象人物ＰｔについてのＩｏＵと、未検出率算出機能Ｆ３によって算出された対象人物Ｐｔについての未検出率に基づいて対象人物Ｐｔについての密集度を算出する。 The density calculation function F4 calculates the density of the target person Pt based on the IoU for the target person Pt calculated by the IoU calculation function F2 and the undetected rate for the target person Pt calculated by the undetected rate calculation function F3.

また、密集度算出機能Ｆ４は、撮像画像において検出された人物Ｐを順次対象人物Ｐｔとして選択することにより、人物Ｐごとに密集度を算出する。このとき、見切れ人物については対象人物Ｐｔとして選択しないようにしてもよい。 The density calculation function F4 also calculates the density for each person P by sequentially selecting the people P detected in the captured image as target people Pt. At this time, it is possible not to select people who are out of view as target people Pt.

密集度の算出方法としては幾つか考えられる。例えば、ＩｏＵと未検出率の平均値を算出して密集度としてもよいし、ＩｏＵと未検出率を乗算して得た値を密集度としてもよい。 There are several possible methods for calculating the density. For example, the density may be calculated by averaging the IoU and the undetected rate, or the density may be calculated by multiplying the IoU and the undetected rate.

また、ＩｏＵと未検出率の重要度の違いを考慮してそれぞれに係数を乗算した値を補正値として用いて密集度を算出してもよい。具体的には、ＩｏＵの数値の方が密集度にとってより重要である場合には、ＩｏＵの数値に対して密集度に高い影響を与えるように未検出率よりも高い係数を乗算してもよい。
Alternatively, to reflect the difference in importance between the IoU and the undetected rate, each may be multiplied by a coefficient, and the resulting corrected values may be used to calculate the density. Specifically, when the IoU is more important for the density, the IoU may be multiplied by a larger coefficient than the one applied to the undetected rate so that it has a greater influence on the density.
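The three ways of combining the IoU and the undetected rate mentioned above could look like the following sketch. The function names are hypothetical, combining the weighted values by addition is an assumption (the text only says that coefficient-multiplied values are used), and the default coefficients 0.7 and 0.3 are purely illustrative.

```python
def density_mean(iou: float, undetected: float) -> float:
    """Density as the average of the IoU and the undetected rate."""
    return (iou + undetected) / 2.0


def density_product(iou: float, undetected: float) -> float:
    """Density as the product of the IoU and the undetected rate."""
    return iou * undetected


def density_weighted(iou: float, undetected: float,
                     w_iou: float = 0.7, w_undetected: float = 0.3) -> float:
    """Density from coefficient-weighted values, summed here as an assumption.

    The default weights are illustrative only; a larger w_iou gives the IoU
    more influence on the result.
    """
    return w_iou * iou + w_undetected * undetected
```

Note that the product form yields 0 whenever either input is 0, whereas the mean and weighted forms still reflect the other input.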

＜４．キーポイントの尤度について＞
ＡＩ画像処理部４４は、ＡＩ画像処理によって人物ＰのキーポイントＫＰを設定すると共に当該キーポイントＫＰごとの尤度を出力するものであってもよい。 <4. Likelihood of key points>
The AI image processing unit 44 may set key points KP of the person P by AI image processing and output the likelihood for each key point KP.

その場合には、キーポイントＫＰごとの尤度を考慮してキーポイントＫＰについての未検出率を算出してもよい。キーポイントＫＰの尤度は、値が高いほど当該キーポイントＫＰが確からしいことを示す。即ち、尤度が低いキーポイントＫＰは、人物Ｐとは異なる物体を人物Ｐの関節等と誤認したことによって検出されたキーポイントＫＰである可能性が高いこととなる。 In that case, the non-detection rate for the keypoint KP may be calculated by taking into account the likelihood of each keypoint KP. The higher the likelihood value of a keypoint KP, the more likely the keypoint KP is. In other words, a keypoint KP with a low likelihood is likely to be a keypoint KP that was detected by misidentifying an object other than person P as a joint of person P, etc.

キーポイントＫＰの尤度が所定の値とされた尤度閾値Ｔｈ１よりも低い場合には、当該キーポイントＫＰは未検出として扱う。
これにより、不確かなキーポイントＫＰによって未検出率が低く算出されること、及び、密集度が低く算出されてしまうことを防止することができる。 If the likelihood of a keypoint KP is lower than a predetermined likelihood threshold Th1, the keypoint KP is treated as undetected.
This prevents uncertain key points KP from causing the undetected rate, and consequently the density, to be calculated too low.

尤度が「０」から「１００」の数値として出力される場合には、尤度閾値Ｔｈ１は例えば「２０」や「３０」などとされる。
When the likelihood is output as a numerical value from "0" to "100", the likelihood threshold Th1 is set to, for example, "20" or "30".
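Treating low-likelihood key points as undetected can be sketched as follows. The function name and the list-of-scores input are assumptions; the default threshold of 20 is one of the example values given in the text, and 13 is the number of key points per person stated earlier.

```python
from typing import Sequence


def count_undetected(likelihoods: Sequence[float], th1: float = 20.0, expected: int = 13) -> int:
    """Number of key points treated as undetected for one person.

    `likelihoods` holds one score (0 to 100) per key point the model returned;
    key points the model did not return at all are simply absent from the list.
    A key point whose likelihood is below `th1` counts as undetected.
    """
    confident = sum(1 for score in likelihoods if score >= th1)
    return expected - confident
```

For example, a person with twelve returned key points, one of which scores 10, is counted as having two undetected key points when the threshold is 20.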

＜５．未検出率に基づく人物の検出について＞
未検出率が著しく高い場合には、当該人物Ｐを検出していないものとして扱ってもよい。
例えば、人物Ｐの右手のみが検出された場合には、未検出率は１３分の１２とされる。このように未検出率が高い場合には、当該人物Ｐを適切に検出したとは言い難い。 <5. Person detection based on undetected rate>
If the undetected rate is extremely high, the person P may be treated as not having been detected.
For example, when only the right hand of the person P is detected, the undetected rate is 12/13. When the undetected rate is this high, it is difficult to say that the person P has been appropriately detected.

従って、キーポイントＫＰの未検出率が未検出率閾値Ｔｈ２以上である場合には、当該人物Ｐは未検出の人物として扱ってもよい。換言すれば、キーポイントＫＰの未検出率が未検出率閾値Ｔｈ２未満である場合に当該人物Ｐを検出したと判定してもよい。換言すれば、未検出率が未検出率閾値Ｔｈ２以上である場合に当該キーポイントＫＰを除外して、即ち、当該人物Ｐを除外して密集度の算出を行ってもよい。 Therefore, if the undetected rate of the key points KP is equal to or greater than the undetected rate threshold Th2, the person P may be treated as an undetected person. In other words, the person P may be determined to have been detected only if the undetected rate of the key points KP is less than the undetected rate threshold Th2. Put another way, if the undetected rate is equal to or greater than the undetected rate threshold Th2, the density may be calculated with those key points KP, that is, with the person P, excluded.

なお、キーポイントＫＰの未検出率が未検出率閾値Ｔｈ２以上である場合、当該人物Ｐは、上述の対象人物Ｐｔや重複人物Ｐｄとして選択しないものとする。 If the undetected rate of a key point KP is equal to or greater than the undetected rate threshold Th2, the person P will not be selected as the target person Pt or overlapping person Pd described above.

未検出率は、例えば、「０．１」や「０．２」などとされる。
The undetected rate is, for example, "0.1" or "0.2."
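The per-person check against the undetected rate threshold Th2 can be expressed in a few lines. The function name is hypothetical, and the threshold is deliberately left as a required argument because the sketch does not assume any particular value for Th2.

```python
def is_detected(undetected_keypoints: int, th2: float, total_keypoints: int = 13) -> bool:
    """True when the person's undetected rate is below the threshold Th2.

    Persons for which this returns False would not be selected as a target
    person or as an overlapping person.
    """
    rate = undetected_keypoints / total_keypoints
    return rate < th2
```

In the right-hand-only example, 12 of the 13 key points are undetected, so the person is rejected for any threshold of 12/13 or lower.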

＜６．処理例＞
イメージセンサＩＳのＡＩ画像処理部４４が実行する処理の流れについて一例を図１１及び図１２に示す。なお、各処理の実行主体はＡＩ画像処理部４４とするが、前述したように、少なくともその一部の処理が画像信号処理部４２やセンサ内制御部４３によって実行されてもよい。また、一部の処理が、イメージセンサＩＳの外部にある制御部３３によって実行されてもよい。 <6. Processing example>
An example of the flow of processing executed by the AI image processing unit 44 of the image sensor IS is shown in Figures 11 and 12. Note that each process is executed by the AI image processing unit 44, but as described above, at least a part of the process may be executed by the image signal processing unit 42 or the sensor internal control unit 43. Also, a part of the process may be executed by the control unit 33 outside the image sensor IS.

例えば、バウンディングボックスＢＢを設定する処理とキーポイントＫＰを設定する処理をカメラ装置３のイメージセンサＩＳ内で行い、それらの情報に基づいて密集度を算出する処理をイメージセンサＩＳ外で行ってもよい。 For example, the process of setting the bounding box BB and the process of setting the key points KP may be performed within the image sensor IS of the camera device 3, and the process of calculating the density based on this information may be performed outside the image sensor IS.

また、それぞれの処理の少なくとも一部が、例えば、管理装置２などカメラ装置３の外部において実行されてもよい。 In addition, at least a portion of each process may be executed outside the camera device 3, for example, by the management device 2.

ＡＩ画像処理部４４は、図１１に示すステップＳ１０１において、前処理を実行する。前処理は、例えば、撮像部４１から出力される撮像画像の正規化やリサイズを行う処理である。この処理は、ＡＩ画像処理に用いられるＡＩモデルへ入力する入力データとして適切となるように撮像画像を整える処理とも言える。 The AI image processing unit 44 executes pre-processing in step S101 shown in FIG. 11. Pre-processing is, for example, processing for normalizing and resizing the captured image output from the imaging unit 41. This processing can also be said to be processing for preparing the captured image so that it is suitable as input data to be input to an AI model used in AI image processing.

ＡＩ画像処理部４４はステップＳ１０２において、ＡＩ画像処理としての推論処理を行う。この処理によって、ＡＩモデルに入力された前処理後の撮像画像について、人物Ｐが検出されると共に、人物ＰについてのバウンディングボックスＢＢとキーポイントＫＰが設定される。 In step S102, the AI image processing unit 44 performs inference processing as AI image processing. Through this processing, a person P is detected from the preprocessed captured image input to the AI model, and a bounding box BB and key points KP for the person P are set.

なお、前述したように、ステップＳ１０２の推論処理では、複数のＡＩモデルを用いてもよいし、一つのＡＩモデルのみを用いてもよい。 As mentioned above, the inference process in step S102 may use multiple AI models or only one AI model.

ＡＩ画像処理部４４はステップＳ１０３において、後処理を行う。後処理は、例えば、バウンディングボックスＢＢとキーポイントＫＰを人物Ｐごとに紐付ける処理である。 In step S103, the AI image processing unit 44 performs post-processing. The post-processing is, for example, a process of linking the bounding box BB and key points KP for each person P.

ＡＩ画像処理部４４はステップＳ１０４において、検出された人物Ｐのうちの一人を対象人物Ｐｔとして選択する処理を行う。なお、上述したように、キーポイントＫＰの未検出率が未検出率閾値Ｔｈ２未満とされた人物Ｐのみを検出された人物Ｐとして扱ってもよい。 In step S104, the AI image processing unit 44 performs a process of selecting one of the detected persons P as a target person Pt. Note that, as described above, only persons P whose undetected rate of key points KP is less than the undetected rate threshold Th2 may be treated as detected persons P.

続いて、ＡＩ画像処理部４４はステップＳ１０５において、対象人物Ｐｔについての密集度算出処理を実行する。 Next, in step S105, the AI image processing unit 44 executes a density calculation process for the target person Pt.

密集度算出処理の一例を図１２に示す。
ＡＩ画像処理部４４はステップＳ２０１において、対象人物Ｐｔについての重複人物Ｐｄを特定し、重複人物Ｐｄの重複バウンディングボックスＢＢｄを取得する。 An example of the density calculation process is shown in FIG. 12.
In step S201, the AI image processing unit 44 identifies an overlapping person Pd for the target person Pt and obtains an overlapping bounding box BBd of the overlapping person Pd.

続いて、ＡＩ画像処理部４４はステップＳ２０２において、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄの組ごとに基礎ＩｏＵを算出し、基礎ＩｏＵに基づいて対象人物ＰｔについてのＩｏＵを算出する。 Next, in step S202, the AI image processing unit 44 calculates the basic IoU for each pair of the target bounding box BBt and the overlapping bounding box BBd, and calculates the IoU for the target person Pt based on the basic IoU.

ＡＩ画像処理部４４はステップＳ２０３において、重複人物ＰｄごとのキーポイントＫＰを取得する。このとき、上述したようにキーポイントＫＰごとに出力される尤度に基づいてキーポイントＫＰを取得してもよい。具体的には、尤度が尤度閾値Ｔｈ１以上となるキーポイントＫＰのみを取得してもよい。 In step S203, the AI image processing unit 44 acquires key points KP for each overlapping person Pd. At this time, the key points KP may be acquired based on the likelihood output for each key point KP as described above. Specifically, only key points KP whose likelihood is equal to or greater than the likelihood threshold value Th1 may be acquired.

ＡＩ画像処理部４４はステップＳ２０４において、見切れ人物を除外する処理を行う。続くステップＳ２０５において、ＡＩ画像処理部４４は、見切れ人物を除外した状態で対象人物Ｐｔ及び重複人物Ｐｄについてそれぞれ検出されているキーポイントＫＰに基づいて未検出率を算出する。 In step S204, the AI image processing unit 44 performs a process of excluding out-of-view persons. In the following step S205, the AI image processing unit 44 calculates a non-detection rate based on the key points KP detected for the target person Pt and the overlapping person Pd with the out-of-view persons excluded.

ＡＩ画像処理部４４はステップＳ２０６において、対象人物ＰｔについてのＩｏＵと未検出率に基づいて密集度を算出する。 In step S206, the AI image processing unit 44 calculates the density based on the IoU and non-detection rate for the target person Pt.

図１１の説明に戻る。
対象人物Ｐｔについての密集度を算出した後、ＡＩ画像処理部４４はステップＳ１０６において、通信Ｉ／Ｆ４６を介して密集度をログ出力する。なお、出力されるログには、密集度の情報以外にも検出された人物Ｐの座標情報や未検出率の情報やＩｏＵの情報や重複バウンディングボックスＢＢｄの数などが含まれていてもよい。 Returning to the explanation of FIG. 11.
After calculating the density of the target person Pt, in step S106, the AI image processing unit 44 outputs the density as a log via the communication I/F 46. Note that the output log may include, in addition to the density information, coordinate information of the detected person P, information on the undetected rate, information on IoU, the number of overlapping bounding boxes BBd, and the like.

続いて、ＡＩ画像処理部４４はステップＳ１０７において、密集度が未算出の人物が存在するか否かを判定する。密集度を算出していない人物Ｐが存在すると判定した場合、ＡＩ画像処理部４４はステップＳ１０４において新たな人物Ｐを対象人物Ｐｔとして選択し、ステップＳ１０５の密集度算出処理を実行する。 Then, in step S107, the AI image processing unit 44 determines whether or not there is a person whose density has not been calculated. If it is determined that there is a person P whose density has not been calculated, the AI image processing unit 44 selects a new person P as a target person Pt in step S104, and executes the density calculation process in step S105.

一方、密集度を算出していない人物Ｐが存在しないと判定した場合、即ち、全ての人物Ｐについての密集度の算出を終えたと判定した場合、ＡＩ画像処理部４４は図１１に示す一連の処理を終了する。
On the other hand, if it is determined that there is no person P for which the density has not been calculated, that is, if it is determined that the calculation of the density for all people P has been completed, the AI image processing unit 44 terminates the series of processes shown in Figure 11.

＜７．表示処理＞
撮像画像に写ったそれぞれの人物Ｐについての密集度の情報は、カメラ装置３以外の情報処理装置である管理装置２に出力されることにより、管理装置２において通知処理が実行される。
また、管理装置２において適切な形で表示処理を行うことにより密集度の情報がユーザ端末において提示されてもよい。 <7. Display Processing>
Information on the density of each person P captured in the captured image is output to a management device 2, which is an information processing device other than the camera device 3, and a notification process is executed in the management device 2.
Moreover, the density information may be presented on a user terminal by having the management device 2 perform display processing in an appropriate form.

なお、管理装置２ではなく、カメラ装置３が備えるモニタなどの表示部を利用して密集度の情報を表示させる表示処理がなされてもよい。この場合には、カメラ装置３の制御部３３によって表示処理がなされる。 In addition, display processing that displays the density information may be performed using a display unit, such as a monitor, provided in the camera device 3 instead of the management device 2. In this case, the display processing is performed by the control unit 33 of the camera device 3.

ここでは、図１に示す管理装置２において密集度についての表示処理が行われる例について説明する。 Here, we will explain an example in which the display process for density is performed by the management device 2 shown in Figure 1.

管理装置２のＣＰＵが所定のプログラムを実行することにより、管理装置２は、図１３に示すように、通信機能Ｆ１１や表示制御機能Ｆ１２などを有する。 When the CPU of the management device 2 executes a specific program, the management device 2 has a communication function F11, a display control function F12, and the like, as shown in FIG. 13.

通信機能Ｆ１１は、各カメラ装置３から撮像画像と密集度の情報を受信する。密集度の情報は、例えば、ＭＩＰＩ規格のデータ構造で撮像画像データを受信する際の拡張領域に格納することで、撮像画像データと同時に取得可能である。 The communication function F11 receives captured images and density information from each camera device 3. The density information can be acquired simultaneously with the captured image data, for example, by storing it in an extension area when receiving captured image data in a data structure conforming to the MIPI standard.

また、密集度の情報に加えてキーポイントＫＰの未検出率の情報やＩｏＵの情報をカメラ装置３から取得してもよい。 In addition to the density information, information on the non-detection rate of key points KP and IoU information may also be obtained from the camera device 3.

表示制御機能Ｆ１２は、密集度についての情報をユーザに提示するための表示処理を行う。一例を図１４に示す。 The display control function F12 performs display processing to present information about the density to the user. An example is shown in FIG. 14.

図１４は、カメラ装置３から受信した撮像画像に密集度の情報を重畳させて表示させたものである。撮像画像に写る各人物Ｐについて密集度に応じた画像を重畳させることで図１４に示すような画像がユーザに提示される。このような画像の表示は、管理装置２が表示制御機能Ｆ１２を実行することで、例えばユーザが所持するスマートフォンなどのユーザ端末においてなされてもよい。 Figure 14 shows an image received from the camera device 3 with density information superimposed on it. By superimposing an image corresponding to the density of each person P appearing in the captured image, an image such as that shown in Figure 14 is presented to the user. Such an image may be displayed on a user terminal such as a smartphone carried by the user by executing the display control function F12 by the management device 2.

図１５は、図１４に示す撮像画像に表示された一人の人物ＰについてのバウンディングボックスＢＢと重畳された密集度の情報を示したものである。なお、図１５のバウンディングボックスＢＢは説明のために図示したものであり、ユーザに密集度の情報を提示する際には撮像画像に重畳させなくてもよい。 Figure 15 shows density information superimposed on a bounding box BB for one person P displayed in the captured image shown in Figure 14. Note that the bounding box BB in Figure 15 is shown for explanatory purposes, and does not need to be superimposed on the captured image when presenting density information to the user.

密集度の情報は、バウンディングボックスＢＢ内で二次元のガウス分布を定義し、ガウス分布に応じたヒートマップ画像を生成して撮像画像に重畳することにより可視化される。 The density information is visualized by defining a two-dimensional Gaussian distribution within the bounding box BB, generating a heat map image according to the Gaussian distribution, and superimposing it on the captured image.

ヒートマップ画像は、バウンディングボックスＢＢのボックス重心位置ＢＢＰを最大値としてボックス重心位置ＢＢＰから離れた位置（画素）ほど数値が小さくなり、バウンディングボックスＢＢの外周縁において「０」となるようにされる。 The heat map image has a maximum value at the box center of gravity BBP of the bounding box BB, and the values decrease the further away (in pixels) the position is from the box center of gravity BBP, until it reaches "0" at the outer edge of the bounding box BB.

また、ボックス重心位置ＢＢＰにおけるガウス分布の最大値は当該人物Ｐについての密集度の高さに応じたものとされる。即ち、人物Ｐについての密集度が高いほど、当該人物ＰのバウンディングボックスＢＢにおけるガウス分布の最大値も高くされる。 The maximum value of the Gaussian distribution at the box center of gravity position BBP corresponds to the density of the person P. In other words, the higher the density of the person P, the higher the maximum value of the Gaussian distribution at the bounding box BB of the person P.

ガウス分布に応じたヒートマップ画像は、数値が高くなるほど所定の色（例えば黄色）が濃くなる画像とされる。 A heat map image based on a Gaussian distribution is an image in which the higher the numerical value, the darker the specified color (e.g., yellow) becomes.
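A heat map with these properties (maximum at the box center of gravity, decreasing with distance, and 0 on the outer edge of the bounding box) could be generated as in the sketch below. Because a plain Gaussian never reaches 0, the sketch shifts and rescales a one-dimensional Gaussian profile and multiplies the x and y profiles; this construction, the function name, and the `sigma` parameter are assumptions, not the method of the original text. The `peak` argument stands for the density-dependent maximum value.

```python
import math
from typing import List


def gaussian_heatmap(width: int, height: int, peak: float, sigma: float = 0.5) -> List[List[float]]:
    """Heat map over a box of width x height pixels (both at least 2).

    The value equals `peak` at the box center and falls to 0 on the outer
    edge. `sigma` is expressed in units of the half-width and half-height.
    """
    edge = math.exp(-1.0 / (2.0 * sigma * sigma))  # Gaussian value at the edge

    def profile(u: float) -> float:
        # 1-D Gaussian shifted and rescaled so that profile(0) = 1 and profile(+/-1) = 0.
        g = math.exp(-(u * u) / (2.0 * sigma * sigma))
        return (g - edge) / (1.0 - edge)

    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    return [
        [peak * profile((x - cx) / cx) * profile((y - cy) / cy) for x in range(width)]
        for y in range(height)
    ]
```

The returned values can then be mapped to the intensity of the chosen color and blended onto the captured image at the position of the bounding box.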

なお、図１５に示す例では、バウンディングボックスＢＢの外周縁で数値が「０」となるようにガウス分布を設定したが、ボックス重心位置ＢＢＰからの距離によって数値が一意に定まるようにしてもよい。即ち、バウンディングボックスＢＢが長方形とされている場合には、バウンディングボックスＢＢの長辺において数値が「０」となり、短辺において数値が「０」よりも大きな値とされていてもよい。 In the example shown in FIG. 15, the Gaussian distribution is set so that the numerical value is "0" at the outer edge of the bounding box BB, but the numerical value may be uniquely determined by the distance from the box center of gravity position BBP. That is, if the bounding box BB is rectangular, the numerical value may be "0" at the long side of the bounding box BB, and may be greater than "0" at the short side.

また、図１５の例では、バウンディングボックスＢＢのボックス重心位置ＢＢＰがガウス分布の最大値となるようにしたが、人物Ｐのキーポイント重心位置ＫＢＰがガウス分布の最大値となるようにしてもよい。 In the example of Figure 15, the box center of gravity position BBP of the bounding box BB is set to the maximum value of the Gaussian distribution, but the keypoint center of gravity position KBP of person P may also be set to the maximum value of the Gaussian distribution.

更に、図１５の例では、ガウス分布を用いる例を説明したが、ボックス重心位置ＢＢＰが最大値とされそこからの距離が離れるほど線形的に値が減少するように設定してもよい。 Furthermore, in the example of Figure 15, an example using a Gaussian distribution is described, but it may also be set so that the box center of gravity position BBP is set as the maximum value and the value decreases linearly as the distance from there increases.

また、図１５の例では、バウンディングボックスＢＢを利用して人物Ｐについての密集度を可視化するためのヒートマップを生成したが、人物Ｐのボックス重心位置ＢＢＰが最大値とされ、そこからの距離に応じて数値が減少するように設定することにより矩形ではなく円形の画像がヒートマップ画像として生成されてもよい（図１６参照）。
In the example of Figure 15, a heat map was generated to visualize the density of person P using a bounding box BB, but a circular image rather than a rectangular image may be generated as the heat map image by setting the box center of gravity position BBP of person P to a maximum value and setting the numerical value to decrease depending on the distance from there (see Figure 16).
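The linear and circular variants mentioned above can be combined in a small sketch: the value depends only on the distance from the center position and decreases linearly to 0 at a chosen radius. The function name and the `radius` parameter are assumptions introduced for illustration.

```python
import math


def radial_value(x: float, y: float, cx: float, cy: float, peak: float, radius: float) -> float:
    """Linear falloff from `peak` at (cx, cy) to 0 at distance `radius` and beyond."""
    d = math.hypot(x - cx, y - cy)
    return max(0.0, peak * (1.0 - d / radius))
```

Evaluating this function for every pixel around the box center of gravity position BBP yields a circular heat map image rather than a rectangular one.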

＜８．変形例＞
上述した例では、バウンディングボックスＢＢとキーポイントＫＰの双方を用いて密集度を算出したが、バウンディングボックスＢＢ同士の重なり度合いのみを用いて密集度を算出してもよい。 <8. Modifications>
In the above example, the density is calculated using both the bounding boxes BB and the key points KP, but the density may be calculated using only the degree of overlap between the bounding boxes BB.

具体的には、先ず、一人の人物Ｐを対象人物Ｐｔとして選択する。次に、対象人物Ｐｔの対象バウンディングボックスＢＢｔを特定し、当該対象バウンディングボックスＢＢｔと一部の領域が重複している重複バウンディングボックスＢＢｄを特定する。 Specifically, first, one person P is selected as the target person Pt. Next, the target bounding box BBt of the target person Pt is identified, and each overlapping bounding box BBd that partially overlaps the target bounding box BBt is identified.

そして、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄの組ごとに基礎ＩｏＵを算出する。この基礎ＩｏＵは、平均値を採るなどして対象人物Ｐｔについての密集度の算出に用いられる。従って、基礎ＩｏＵは基礎密集度と見なすことができる。 Then, a basic IoU is calculated for each pair of the target bounding box BBt and the overlapping bounding box BBd. This basic IoU is used to calculate the density for the target person Pt by taking the average value, for example. Therefore, the basic IoU can be regarded as the basic density.

これにより、キーポイントＫＰの未検出率を算出する上述の手法よりも簡易的な手法で密集度を算出することができるため、カメラ装置３やイメージセンサＩＳの処理負担の軽減を図ることができる。 This allows the density to be calculated using a simpler method than the above-mentioned method for calculating the non-detection rate of key points KP, thereby reducing the processing burden on the camera device 3 and the image sensor IS.

また、上述した例では、対象物として人物を挙げたが、これ以外の対象物であってもよい。例えば、動物などの生き物を対象として密集度を算出してもよいし、車両などの無機物を対象として密集度を算出してもよい。
In the above example, a person is given as an example of the object, but the object may be something other than a person. For example, the density may be calculated for a living thing such as an animal, or for an inorganic object such as a vehicle.

<9. Example as a Cloud Application>
The presentation of density information by the density calculation system 1 may be realized by a cloud application function of a cloud server 5 provided separately from the management device 2. However, the management device 2 may have the functions of the cloud server 5.

FIG. 17 shows an example of a configuration in which the density calculation system 1 includes the cloud server 5.

The information processing system shown in FIG. 17 functions as a system that calculates and presents the density of persons P as subjects based on captured images.

In the density calculation system 1, the cloud server 5, a user terminal 6, and the management device 2 are connected via a communication network 7 so that they can communicate with one another.

The management device 2 functions as a fog server.
The management device 2 as a fog server is assumed to be placed at each monitored site, for example in an amusement park or store to be monitored, together with the camera devices 3. Providing a management device 2 at each monitored site in this way eliminates the need for the cloud server 5 to directly receive the data transmitted from the multiple camera devices 3 at that site, which reduces the processing load on the cloud server 5.

Note that when there are multiple stores to be monitored and all of them belong to the same chain, a single management device 2 may be provided for those stores collectively rather than one for each store. In other words, the management device 2 is not limited to one per monitored site; one management device 2 may serve multiple monitored sites.

Note that if the functions of the management device 2 can be assigned to the cloud server 5 or to each camera device 3, for example because the cloud server 5 or each camera device 3 has sufficient processing capacity, the management device 2 may be omitted from the density calculation system 1. In that case, each camera device 3 may be connected directly to the communication network 7 so that the cloud server 5 directly receives the data transmitted from the multiple camera devices 3.

Note that the management device 2 may be an on-premises server.

The cloud server 5 is configured as an information processing device including a microcomputer having a CPU, ROM, RAM, and the like. It provides advanced application functions using the result information of the AI image processing executed in the camera device 3 as an edge-side information processing device (for example, density information and the non-detection rate of the key points KP transmitted along with the captured image).

The user terminal 6 is an information processing device assumed to be used by a user who receives the various services provided by the cloud server 5.

In the following description, the various devices above can be broadly classified into cloud-side information processing devices and edge-side information processing devices.
The cloud-side information processing devices are server devices such as the cloud server 5, and form a group of devices that provide cloud services assumed to be used by multiple users.

The edge-side information processing devices are the camera devices 3 and the management device 2, and can be regarded as a group of devices placed in an environment prepared by a user of the cloud services.
Note that the management device 2 may instead be a cloud-side information processing device.

Here, various methods are conceivable for registering application functions, such as a function for presenting density information, in the cloud server 5, which is a cloud-side information processing device.
One example will be described with reference to FIG. 18.
Note that although the management device 2 is not shown in FIG. 2, the configuration may include the management device 2. In that case, the management device 2 may take on part of the edge-side functions.

The cloud server 5 described above is an information processing device that constitutes the cloud-side environment.
The camera device 3 is an information processing device that constitutes the edge-side environment.

Note that not only the camera device 3 but also the image sensor IS can be regarded as an information processing device that constitutes the edge-side environment. That is, the image sensor IS, which is another edge-side information processing device, may be regarded as being mounted inside the camera device 3, which is itself an edge-side information processing device.

The user terminals 6 used by users of the various services provided by the cloud-side information processing device include an application developer terminal 6A used by a user who develops applications used for AI image processing, an application user terminal 6B used by a user who uses the applications, and an AI model developer terminal 6C used by a user who develops AI models used for AI image processing.
Of course, the application developer terminal 6A may also be used by a user who develops applications that do not use AI image processing.

The cloud-side information processing device provides training datasets for AI learning and AI models that serve as a basis for development. A user who develops an AI model uses the AI model developer terminal 6C to communicate with the cloud-side information processing device and download these training datasets and AI models. The training datasets may be provided for a fee. For example, the AI model developer may register personal information in a marketplace (electronic marketplace) provided as a cloud-side function, which enables the purchase of the various functions and materials registered in the marketplace, and then purchase a training dataset.

After developing an AI model using the training dataset, the AI model developer uses the AI model developer terminal 6C to register the developed AI model in the marketplace. An incentive may then be paid to the AI model developer when the AI model is downloaded.

A user who develops an application uses the application developer terminal 6A to download an AI model from the marketplace and develops an AI application that uses the AI model. At this time, as described above, an incentive may be paid to the AI model developer.

The application development user uses the application developer terminal 6A to register the developed AI application in the marketplace. An incentive may then be paid to the user who developed the AI application when the AI application is downloaded.

A user who uses an AI application uses the application user terminal 6B to perform an operation for deploying the AI application and the AI model from the marketplace to the camera device 3, which is an edge-side information processing device managed by that user. At this time, an incentive may be paid to the AI model developer.
This enables the camera device 3 to perform AI image processing using the AI application and the AI model. In addition to capturing images, the camera device 3 can then use AI image processing to detect customers or vehicles, or to calculate and present the density of persons P within the angle of view of each camera device 3 as described above.

Here, deployment of an AI application and an AI model means installing the AI application and the AI model on a target (device) serving as the execution entity so that the target can use them, in other words, so that it can execute at least part of the program constituting the AI application.

The camera device 3 may also be capable of using AI image processing not only to calculate the density from the captured image but also to extract attribute information of customers.
This attribute information is transmitted from the camera device 3 to the cloud-side information processing device via the communication network 7.

Cloud applications are deployed on the cloud-side information processing device, and each user can use them via the communication network 7. The cloud applications include, for example, an application for visually displaying the density and an application for analyzing customer flow lines using customer attribute information and captured images. Such cloud applications are uploaded by application development users and the like.

When an application user uses the application for visually displaying the density on the application user terminal 6B, an image such as that shown in FIG. 14 is presented to the user. By using the cloud application for flow line analysis, the user can analyze the flow lines of customers visiting the user's own store and view the analysis results. The analysis results are viewed, for example, as customer flow lines presented graphically on a map of the store.

AI models optimized for each user may be registered in the cloud-side marketplace. For example, images captured by a camera device 3 placed in a store managed by a certain user are uploaded to the cloud-side information processing device as appropriate and accumulated there.

In the cloud-side information processing device, each time a certain number of uploaded captured images have accumulated, the AI model is retrained, updated, and re-registered in the marketplace. The retraining is performed separately for each installation angle of the camera device 3 that captured the images, so that an AI model optimized for each installation angle is generated. This makes it possible to perform AI image processing using the AI model best suited to each camera device 3.
Note that the retraining of the AI model may be offered as an option that the user can select on the marketplace, for example.
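
The retraining trigger described above can be sketched as follows. This is an illustrative bookkeeping model only: the class name `RetrainScheduler`, the string keys for installation angles, and the integer version counter are assumptions, and the actual training step is represented by a comment.

```python
from collections import defaultdict


class RetrainScheduler:
    """Counts uploaded images per installation angle and fires retraining
    each time `threshold` images have accumulated for that angle."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pending = defaultdict(list)   # angle key -> image IDs awaiting training
        self.versions = defaultdict(int)   # angle key -> current model version

    def upload(self, angle_key, image_id):
        """Store one uploaded image.

        Returns the new model version if retraining fired, otherwise None.
        """
        bucket = self.pending[angle_key]
        bucket.append(image_id)
        if len(bucket) < self.threshold:
            return None
        # A real system would retrain on `bucket` here and then
        # re-register the updated model in the marketplace.
        bucket.clear()
        self.versions[angle_key] += 1
        return self.versions[angle_key]


scheduler = RetrainScheduler(threshold=3)
results = [
    scheduler.upload("overhead", "img1"),
    scheduler.upload("overhead", "img2"),
    scheduler.upload("side", "img3"),
    scheduler.upload("overhead", "img4"),
]
```

Keeping a separate bucket per installation angle mirrors the per-angle learning in the text: images from a side-mounted camera do not advance the counter for overhead cameras.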

As an example other than the installation angle of the camera device 3, an AI model retrained using dark images from a camera device 3 placed inside a store may be deployed to that camera device 3, which can improve the recognition rate and the like of image processing for images captured in dark places. Likewise, an AI model retrained using bright images from a camera device 3 placed outside the store may be deployed to that camera device 3, which can improve the recognition rate and the like of image processing for images captured in bright places.
That is, by redeploying the updated AI model to the camera device 3, the application user can always obtain optimized processing result information.

AI models optimized for each camera may also be registered in the cloud-side marketplace. Examples include an AI model applied to a camera device 3 capable of acquiring RGB images and an AI model applied to a camera device 3 equipped with a ranging sensor that generates distance images.
In addition, an AI model trained using images, such as images of vehicles, captured in a bright environment, to be used by the camera device 3 during bright hours, and an AI model trained using images captured in a dark environment, to be used by the camera device 3 during dark hours, may each be registered in the marketplace.
It is desirable that these AI models be updated as appropriate to AI models whose recognition rates have been improved through retraining.

If the information (such as captured images) uploaded from the camera device 3 to the cloud-side information processing device contains personal information, data from which privacy-related information has been removed may be uploaded for the sake of privacy protection, or data from which privacy-related information has been removed may be made available to AI model development users and application development users.

<10. Functional Overview of Cloud-Side Information Processing Device>
In this embodiment, the service provided by the server device is assumed to be one in which a user as a customer can select the type of function for the AI image processing of each camera device 3. Selecting the type of function can also be described as setting the purpose mentioned above. For example, the user may select between an image recognition function and an image detection function, or may select a finer-grained type so that the image recognition function or image detection function is performed for a specific subject.
For example, as a business model, the service provider sells camera devices 3 and management devices 2 having an AI-based image recognition function to users and has them installed at the locations to be monitored. The service provider then offers a service that provides the analysis information described above to the users.

Because the use (purpose) required of the system differs from customer to customer, such as calculating density, monitoring stores, or monitoring traffic, the AI image processing function of the camera device 3 can be set selectively so that analysis information corresponding to the use required by the customer is obtained.

The information to be acquired using the camera device 3 may also change when a disaster such as an earthquake occurs. Specifically, in normal times, AI image processing functions for detecting customers, calculating density, and identifying attributes are enabled so that the camera functions as a store surveillance camera. When a disaster occurs, the camera is switched to an AI image processing function for determining which products remain on the shelves. At the time of this switch, the AI model may be changed so that appropriate recognition results are obtained.

In this example, the cloud server 5 has the function of selectively setting the AI image processing function of the camera device 3 in this way.
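
One way to picture this selective setting is a lookup from the current situation to the set of AI image processing functions and the AI model to use. The sketch below is hypothetical throughout: the enum, the function names, and the model names are illustrative and do not appear in the source.

```python
from enum import Enum


class Situation(Enum):
    NORMAL = "normal"
    DISASTER = "disaster"


# Hypothetical function and model names, chosen only for illustration.
PROFILES = {
    Situation.NORMAL: {
        "functions": [
            "customer_detection",
            "density_calculation",
            "attribute_identification",
        ],
        "model": "store_monitoring_model",
    },
    Situation.DISASTER: {
        "functions": ["shelf_stock_detection"],
        "model": "shelf_stock_model",
    },
}


def select_profile(situation):
    """Return the AI image processing functions and AI model for a situation."""
    return PROFILES[situation]


normal = select_profile(Situation.NORMAL)
disaster = select_profile(Situation.DISASTER)
```

Pairing each function set with its own model reflects the remark that the AI model may be changed at the time of switching.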

Note that the management device 2 or another server device may have the functions of the cloud server 5.

Here, the connection between the cloud server 5, which is a cloud-side information processing device, and the camera device 3, which is an edge-side information processing device, will be described with reference to FIG. 19.

The cloud-side information processing device implements a retraining function, a device management function, and a marketplace function, all of which are available via a Hub.

The Hub performs secure and highly reliable communication with the edge-side information processing device. This makes it possible to provide various functions to the edge-side information processing device.

The retraining function performs retraining and provides a newly optimized AI model, so that an appropriate AI model based on new training material is provided.

The device management function manages the camera device 3 and other edge-side information processing devices. For example, it can provide functions such as management and monitoring of the AI model deployed on the camera device 3, problem detection, and troubleshooting.

The device management function also manages information about the camera device 3 and the management device 2. This information includes information on the chip used as the arithmetic processing unit, the memory capacity and storage capacity, the CPU and memory usage rates, and information on software such as the OS (Operating System) installed on each device.

Furthermore, the device management function protects secure access by authenticated users.

The marketplace function provides functions such as registering the AI models developed by the AI model developers and the AI applications developed by the application developers described above, and deploying those developed products to authorized edge-side information processing devices. The marketplace function also provides functions related to the payment of incentives according to the deployment of the developed products.

The camera device 3 as an edge-side information processing device includes an edge runtime, an AI application, an AI model, and the image sensor IS.

The edge runtime functions as embedded software for, among other things, managing the applications deployed on the camera device 3 and communicating with the cloud-side information processing device.

As described above, the AI model is a deployed copy of an AI model registered in the marketplace of the cloud-side information processing device. With it, the camera device 3 can use captured images to obtain AI image processing result information suited to the purpose.

An overview of the functions of the cloud-side information processing device will be described with reference to FIG. 20. Note that "cloud-side information processing device" is a collective term for devices such as the cloud server 5.
As illustrated, the cloud-side information processing device has a license authorization function F21, an account service function F22, a device monitoring function F23, a marketplace function F24, and a camera service function F25.

The license authorization function F21 performs processing related to various kinds of authentication. Specifically, it performs processing related to device authentication of each camera device 3 and processing related to authentication of each of the AI models, software, and firmware used in the camera device 3.

Here, the software mentioned above means the software required to properly realize AI image processing in the camera device 3.
For AI image processing based on captured images to be performed properly and for its results to be transmitted to the management device 2 or the cloud server 5 in an appropriate format, the input data to the AI model must be controlled and the output data of the AI model must be processed appropriately. The software mentioned above includes the peripheral processing required to properly realize the AI image processing. Such software realizes a desired function using an AI model and corresponds to the AI application described above.

Note that an AI application is not limited to one that uses only a single AI model; it may use two or more AI models. For example, there may be an AI application whose processing flow is such that the recognition result information (image data or the like; hereinafter "recognition result information") obtained by an AI model that performs AI image processing with a captured image as its input tensor is, in the form of image data, input as an input tensor to yet another AI model, which performs second AI image processing.
Alternatively, the AI application may use coordinate information obtained as the recognition result information of the first AI image processing to apply predetermined image processing, as second AI image processing, to the input tensor of the first AI image processing. The input tensor for each AI image processing may be a RAW image or may be, for example, an RGB image obtained by applying synchronization (demosaicing) processing to a RAW image.
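
The second flow, in which coordinate information from the first stage drives processing of the original input tensor, can be sketched as follows. Both stages are simple stand-ins rather than AI models: the first returns the bounding box of all non-zero pixels, and the second crops the original image to that box. All function names are hypothetical.

```python
def first_stage(image):
    """Stand-in for the first AI image processing.

    Returns coordinate information (top, left, bottom, right), with
    bottom/right exclusive, enclosing all non-zero pixels, or None.
    """
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def second_stage(image, box):
    """Stand-in for the second processing: crop the ORIGINAL input to the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def run_application(image):
    """Chain the two stages: coordinates from stage one feed stage two."""
    box = first_stage(image)
    return None if box is None else second_stage(image, box)


image = [
    [0, 0, 0, 0],
    [0, 5, 7, 0],
    [0, 0, 9, 0],
    [0, 0, 0, 0],
]
cropped = run_application(image)
```

The point of the sketch is the data flow: the second stage receives the same input tensor as the first stage plus the coordinates, rather than the first stage's output image.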

In the license authorization function F21, for authentication of the camera device 3, a device ID (Identification) is issued for each camera device 3 when a connection to the camera device 3 is established via the communication network 7.
For authentication of AI models and software, a unique ID (AI model ID or software ID) is issued for each AI model and AI application for which registration has been requested from the AI model developer terminal 6C.
The license authorization function F21 also issues various keys, certificates, and the like to the manufacturer of the camera device 3, AI model developers, and software developers so that secure communication takes place between the cloud server 5 and the camera device 3 or the AI model developer terminal 6C, and performs processing for renewing or suspending the validity of certificates.
Furthermore, when user registration (registration of account information accompanied by the issuance of a user ID) is performed by the account service function F22 described below, the license authorization function F21 also links the camera device 3 purchased by the user (the device ID above) with the user ID.
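
The ID issuance and linking steps can be pictured with the following sketch. It is an in-memory illustration under stated assumptions: the class name `LicenseAuthority`, the use of a camera serial number as the lookup key, and the UUID-based ID format are all hypothetical, and key and certificate handling is omitted.

```python
import uuid


class LicenseAuthority:
    """In-memory stand-in for device ID issuance and device-to-user linking."""

    def __init__(self):
        self.device_ids = {}  # camera serial number -> issued device ID
        self.owner_of = {}    # device ID -> user ID

    def register_device(self, serial):
        """Issue a device ID on first connection; reuse it afterwards."""
        if serial not in self.device_ids:
            self.device_ids[serial] = "dev-" + uuid.uuid4().hex
        return self.device_ids[serial]

    def link(self, device_id, user_id):
        """Associate a purchased camera (device ID) with a registered user ID."""
        if device_id not in self.device_ids.values():
            raise KeyError("unknown device ID: " + device_id)
        self.owner_of[device_id] = user_id


authority = LicenseAuthority()
first_id = authority.register_device("SN-001")
second_id = authority.register_device("SN-002")
authority.link(first_id, "user-42")
```

Rejecting unknown device IDs in `link` reflects the ordering in the text, where a device ID is issued first and only then tied to the user ID created by the account service function F22.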

The account service function F22 generates and manages user account information. It accepts input of user information and generates account information based on that input (account information including at least a user ID and password information).
The account service function F22 also performs registration processing (registration of account information) for AI model developers and developers of AI applications (hereinafter sometimes abbreviated as "software developers").

The device monitoring function F23 performs processing for monitoring the usage status of the camera device 3. For example, it monitors various elements related to the usage status of the camera device 3, such as the location where the camera device 3 is used, the output frequency of the AI image processing output data, the free capacity of the CPU and memory used for AI image processing, and the CPU and memory usage rates mentioned above.

The marketplace function F24 is a function for selling AI models and AI applications. For example, via a sales website (sales site) provided by the marketplace function F24, a user can purchase an AI application and the AI model used by that AI application. A software developer can also purchase an AI model via the sales site in order to create an AI application.

The camera service function F25 provides the user with services related to the use of the camera device 3.
One example of the camera service function F25 is the function related to generating the analysis information described above. That is, it generates analysis information about the subject based on the processing result information of the image processing in the camera device 3 and performs processing for allowing the user to view the information via the user terminal 6. This includes processing for visualizing and presenting the density.

また、カメラサービス機能Ｆ２５には、撮像設定探索機能が含まれてもよい。具体的に、この撮像設定探索機能は、カメラ装置３からＡＩ画像処理の認識結果情報を取得し、取得した認識結果情報に基づき、ＡＩを用いてカメラ装置３の撮像設定情報を探索する機能である。ここで、撮像設定情報とは、撮像画像を得るための撮像動作に係る設定情報を広く意味するものである。具体的には、フォーカスや絞り等といった光学的な設定や、フレームレート、露光時間、ゲイン等といった撮像画像信号の読み出し動作に係る設定、さらにはガンマ補正処理、ノイズリダクション処理、超解像処理等、読み出された撮像画像信号に対する画像信号処理に係る設定等を広く含むものである。また、カメラ装置３の光軸の方向についての設定、即ち、カメラ装置３の撮像方向の設定などが含まれてもよい。
撮像設定探索機能が適切に機能することにより、ユーザによって設定された目的に応じてカメラ装置３の撮像設定が最適化され、例えば、密集度などについての良好な推論結果を得ることができる。 The camera service function F25 may also include an imaging setting search function. Specifically, the imaging setting search function is a function that acquires recognition result information of the AI image processing from the camera device 3, and searches for imaging setting information of the camera device 3 using AI based on the acquired recognition result information. Here, the imaging setting information broadly means setting information related to imaging operations for obtaining an image. Specifically, it broadly includes optical settings such as focus and aperture, settings related to the readout operation of the imaging image signal such as frame rate, exposure time, gain, and further settings related to image signal processing for the readout imaging image signal such as gamma correction processing, noise reduction processing, super-resolution processing, etc. Also, settings regarding the direction of the optical axis of the camera device 3, i.e., settings of the imaging direction of the camera device 3, etc. may be included.
When the imaging setting search function operates properly, the imaging settings of the camera device 3 are optimized according to the purpose set by the user, and good inference results can be obtained regarding, for example, the density.

また、カメラサービス機能Ｆ２５には、ＡＩモデル探索機能も含まれる。このＡＩモデル探索機能は、カメラ装置３からＡＩ画像処理の認識結果情報を取得し、取得した認識結果情報に基づき、カメラ装置３におけるＡＩ画像処理に用いられる最適なＡＩモデルをＡＩで探索する機能である。ここで言うＡＩモデルの探索とは、例えば、ＡＩ画像処理が畳み込み演算を含むＣＮＮ（Convolutional Neural Network）等により実現される場合において、重み係数等の各種の処理パラメータやニューラルネットワーク構造に係る設定情報（例えば、カーネルサイズの情報等を含む）等を最適化する処理を意味する。 The camera service function F25 also includes an AI model search function. This AI model search function acquires recognition result information of AI image processing from the camera device 3, and searches for an optimal AI model to be used for AI image processing in the camera device 3 using AI based on the acquired recognition result information. The AI model search referred to here means a process of optimizing various processing parameters such as weighting coefficients and setting information related to the neural network structure (including, for example, kernel size information) when the AI image processing is realized by a CNN (Convolutional Neural Network) that includes a convolution operation.

なお、カメラサービス機能Ｆ２５は、処理分担を決定する機能を備えていてもよい。処理分担決定機能においては、ＡＩアプリケーションをエッジ側情報処理装置に展開する際に、ＳＷコンポーネント単位での展開先の装置を決定する処理を行う。なお、一部のＳＷコンポーネントは、クラウド側の装置において実行されるものとして決定してもよく、この場合には既にクラウド側の装置に展開済みであるとして展開処理が行われなくてもよい。 The camera service function F25 may also have a function for determining processing allocation. In the processing allocation determination function, when an AI application is deployed to an edge-side information processing device, a process is performed to determine the deployment-destination device for each SW component. Some SW components may be determined to be executed on a cloud-side device, in which case the deployment process need not be performed because those SW components are treated as already deployed on the cloud-side device.
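The per-component decision described above can be illustrated with a minimal sketch. The function name `plan_deployment`, the destination labels "edge" and "cloud", and the rule that any component not assigned to the cloud goes to the edge are assumptions made for illustration; the description does not specify how the destination is actually chosen.

```python
from typing import Dict, Iterable, Set, Tuple


def plan_deployment(
    components: Iterable[str], cloud_components: Set[str]
) -> Dict[str, Tuple[str, bool]]:
    """Map each SW component to (destination, needs_deployment).

    Hypothetical helper: the real decision criteria are not given in the text.
    """
    plan: Dict[str, Tuple[str, bool]] = {}
    for name in components:
        if name in cloud_components:
            # Treated as already deployed on the cloud side, so no deployment step.
            plan[name] = ("cloud", False)
        else:
            # Everything else is assumed to go to the edge-side device.
            plan[name] = ("edge", True)
    return plan


plan = plan_deployment(["B1", "B2", "B3"], cloud_components={"B3"})
```

Here `plan["B3"]` records that the component runs on the cloud side and that the deployment process is skipped for it.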

上記のような撮像設定探索機能及びＡＩモデル探索機能を有することで、ＡＩ画像処理の結果を良好とする撮像設定が行われるように図られると共に、実際の使用環境に応じた適切なＡＩモデルを用いてＡＩ画像処理が行われるように図ることができる。
そして、これに加えて処理分担決定機能を有することで、ＡＩ画像処理及びその解析処理が適切な装置において実行されるように図ることができる。 By having the above-mentioned imaging setting search function and AI model search function, imaging settings that yield good AI image processing results can be made, and AI image processing can be performed using an AI model appropriate to the actual usage environment.
In addition, by having a processing allocation determination function, it is possible to ensure that AI image processing and its analysis processing are executed in an appropriate device.

なお、カメラサービス機能Ｆ２５は、各ＳＷコンポーネントを展開するに先立って、アプリケーション設定機能を有する。アプリケーション設定機能は、ユーザの目的に応じて適切なＡＩアプリケーションを設定する機能である。 The camera service function F25 also has an application setting function that is used before each SW component is deployed. The application setting function is a function that sets an appropriate AI application according to the user's purpose.

例えば、ユーザが選択した目的に応じて、適切なＡＩアプリケーションを選択する。これにより、ＡＩアプリケーションを構成するＳＷコンポーネントについても自ずと決定される。なお、ＡＩアプリケーションを用いてユーザの目的を実現するためのＳＷコンポーネントの組み合わせが複数種類あってもよく、この場合には、エッジ側情報処理装置の情報やユーザの要求に応じて一つの組み合わせが選択される。 For example, an appropriate AI application is selected according to the purpose selected by the user. This automatically determines the SW components that make up the AI application. Note that there may be multiple combinations of SW components to achieve the user's purpose using the AI application, in which case one combination is selected according to the information on the edge-side information processing device and the user's request.

例えば、ユーザが店舗監視を目的とした場合に、ユーザの要求がプライバシー重視である場合と、速度重視である場合とで、ＳＷコンポーネントの組み合わせが異なってもよい。 For example, if a user wishes to monitor a store, the combination of SW components may be different depending on whether the user's requirements place a premium on privacy or speed.
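As a rough illustration of choosing one combination of SW components from several candidates, the following sketch looks up candidates by purpose and picks the one matching the user's stated priority. The catalogue contents, the component names, and the assignment of each combination to "privacy" or "speed" are all hypothetical; the description only states that the combination may differ between the two cases.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ComponentCombination:
    name: str
    components: Tuple[str, ...]
    priority: str  # user requirement this combination favours


# Hypothetical catalogue: purpose -> candidate combinations.
CATALOGUE: Dict[str, List[ComponentCombination]] = {
    "store_monitoring": [
        ComponentCombination(
            "on-sensor", ("detect_on_sensor", "count_people"), "privacy"
        ),
        ComponentCombination(
            "cloud-assisted",
            ("stream_frames", "detect_in_cloud", "count_people"),
            "speed",
        ),
    ],
}


def select_combination(purpose: str, priority: str) -> ComponentCombination:
    """Return the candidate matching the priority, else the first candidate."""
    candidates = CATALOGUE.get(purpose, [])
    if not candidates:
        raise KeyError(f"no AI application registered for purpose {purpose!r}")
    for combo in candidates:
        if combo.priority == priority:
            return combo
    return candidates[0]
```

A real selection would presumably also take the information on the edge-side information processing device into account, which this sketch omits.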

アプリケーション設定機能においては、ユーザ端末６（図１８におけるアプリケーション利用者端末６Ｂに相当）においてユーザが目的（アプリケーション）を選択する操作を受け付ける処理や、選択されたアプリケーションに応じて適切なＡＩアプリケーションを選択する処理等が行われる。 The application setting function involves processes such as accepting an operation by the user on the user terminal 6 (corresponding to the application user terminal 6B in FIG. 18) to select a purpose (application) and selecting an appropriate AI application according to the selected application.

ここで、上記では、クラウドサーバ５単体でライセンスオーソリ機能Ｆ２１、アカウントサービス機能Ｆ２２、デバイス監視機能Ｆ２３、マーケットプレイス機能Ｆ２４、及びカメラサービス機能Ｆ２５を実現する構成を例示したが、これらの機能を複数の情報処理装置が分担して実現する構成とすることも可能である。例えば、上記の機能をそれぞれ１台の情報処理装置が担う構成とすることが考えられる。或いは、上記した機能のうち単一の機能を複数の情報処理装置（例えば、クラウドサーバ５とそれ以外のサーバ装置）が分担して行うといったことも可能である。
The above describes a configuration in which the cloud server 5 alone realizes the license authorization function F21, the account service function F22, the device monitoring function F23, the marketplace function F24, and the camera service function F25, but these functions may instead be shared among a plurality of information processing devices. For example, each of the above functions may be handled by a separate information processing device. Alternatively, a single one of the above functions may be shared among a plurality of information processing devices (for example, the cloud server 5 and other server devices).

＜１１．ＡＩモデル及びＡＩアプリケーションの展開＞
上述したように、ＡＩアプリケーションやＡＩモデルの情報は、コンテナ技術を用いて、コンテナなどとしてカメラ装置３におけるイメージセンサＩＳ外のメモリ部３４などに展開した後、ＡＩモデルだけをイメージセンサＩＳ内のメモリ部４５に格納させてもよい。コンテナ技術について添付図を参照して説明する。 <11. Deployment of AI models and AI applications>
As described above, information on the AI application and the AI model may be deployed as a container or the like, using container technology, in the memory unit 34 or the like outside the image sensor IS in the camera device 3, and then only the AI model may be stored in the memory unit 45 in the image sensor IS. The container technology will be described with reference to the attached drawings.

カメラ装置３においては、図２に示す制御部３３としてのＣＰＵやＧＰＵ（Graphics Processing Unit）やＲＯＭやＲＡＭ等の各種のハードウェア５０の上にオペレーションシステム５１がインストールされている（図２１参照）。 In the camera device 3, an operation system 51 is installed on various hardware 50 such as a CPU, a GPU (Graphics Processing Unit), a ROM, and a RAM, which serve as the control unit 33 shown in FIG. 2 (see FIG. 21).

オペレーションシステム５１は、カメラ装置３における各種の機能を実現するためにカメラ装置３の全体制御を行う基本ソフトウェアである。 The operation system 51 is basic software that performs overall control of the camera device 3 to realize various functions in the camera device 3.

オペレーションシステム５１上には、汎用ミドルウェア５２がインストールされている。 General-purpose middleware 52 is installed on the operation system 51.

汎用ミドルウェア５２は、例えば、ハードウェア５０としての通信部３５を用いた通信機能や、ハードウェア５０としての表示部（モニタ等）を用いた表示機能などの基本的動作を実現するためのソフトウェアである。 The general-purpose middleware 52 is software for implementing basic operations such as a communication function using the communication unit 35 as the hardware 50 and a display function using a display unit (such as a monitor) as the hardware 50.

オペレーションシステム５１上には、汎用ミドルウェア５２だけでなくオーケストレーションツール５３及びコンテナエンジン５４がインストールされている。 In addition to the general-purpose middleware 52, an orchestration tool 53 and a container engine 54 are installed on the operation system 51.

オーケストレーションツール５３及びコンテナエンジン５４は、コンテナ５５の動作環境としてのクラスタ５６を構築することにより、コンテナ５５の展開や実行を行う。
なお、図１９に示すエッジランタイムは図２１に示すオーケストレーションツール５３及びコンテナエンジン５４に相当する。 The orchestration tool 53 and the container engine 54 deploy and execute the container 55 by constructing a cluster 56 as an operating environment for the container 55.
The edge runtime shown in FIG. 19 corresponds to the orchestration tool 53 and the container engine 54 shown in FIG. 21.

オーケストレーションツール５３は、コンテナエンジン５４に対して上述したハードウェア５０及びオペレーションシステム５１のリソースの割り当てを適切に行わせるための機能を有する。オーケストレーションツール５３によって各コンテナ５５が所定の単位（後述するポッド）にまとめられ、各ポッドが論理的に異なるエリアとされたワーカノード（後述）に展開される。 The orchestration tool 53 has a function for appropriately allocating the resources of the above-mentioned hardware 50 and operation system 51 to the container engine 54. The orchestration tool 53 groups each container 55 into a predetermined unit (a pod, described later), and each pod is deployed to a worker node (described later) that is a logically different area.

コンテナエンジン５４は、オペレーションシステム５１にインストールされるミドルウェアの一つであり、コンテナ５５を動作させるエンジンである。具体的には、コンテナエンジン５４は、コンテナ５５内のミドルウェアが備える設定ファイルなどに基づいてハードウェア５０及びオペレーションシステム５１のリソース（メモリや演算能力など）をコンテナ５５に割り当てる機能を持つ。 The container engine 54 is a piece of middleware installed in the operation system 51, and is an engine that runs the container 55. Specifically, the container engine 54 has a function of allocating the resources (memory, computing power, etc.) of the hardware 50 and the operation system 51 to the container 55 based on a configuration file or the like provided by the middleware in the container 55.

また、本実施の形態において割り当てられるリソースは、カメラ装置３が備える制御部３３等のリソースだけでなく、イメージセンサＩＳが備えるセンサ内制御部４３やメモリ部４５や通信Ｉ／Ｆ４６などのリソースも含まれる。 In addition, the resources allocated in this embodiment include not only resources such as the control unit 33 of the camera device 3, but also resources such as the sensor internal control unit 43, memory unit 45, and communication I/F 46 of the image sensor IS.

コンテナ５５は、所定の機能を実現するためのアプリケーションとライブラリなどのミドルウェアを含んで構成される。
コンテナ５５は、コンテナエンジン５４によって割り当てられたハードウェア５０及びオペレーションシステム５１のリソースを用いて所定の機能を実現するために動作する。 The container 55 includes an application for implementing a predetermined function and middleware such as libraries.
The container 55 operates to realize a specified function using the resources of the hardware 50 and the operation system 51 allocated by the container engine 54.

本実施の形態においては、図１９に示すＡＩアプリケーション及びＡＩモデルはコンテナ５５のうちの一つに相当する。即ち、カメラ装置３に展開された各種のコンテナ５５のうちの一つは、ＡＩアプリケーション及びＡＩモデルを用いた所定のＡＩ画像処理機能を実現する。 In this embodiment, the AI application and AI model shown in FIG. 19 correspond to one of the containers 55. That is, one of the various containers 55 deployed in the camera device 3 realizes a predetermined AI image processing function using the AI application and AI model.

コンテナエンジン５４及びオーケストレーションツール５３によって構築されるクラスタ５６の具体的な構成例について図２２を参照して説明する。なおクラスタ５６は、一つのカメラ装置３が備えるハードウェア５０だけでなく他の装置が備える他のハードウェアのリソースを利用して機能が実現するように複数の機器にまたがって構築されてもよい。 A specific configuration example of a cluster 56 constructed by the container engine 54 and the orchestration tool 53 will be described with reference to FIG. 22. Note that the cluster 56 may be constructed across multiple devices so that functions are realized using not only the hardware 50 of one camera device 3 but also the resources of other hardware of other devices.

オーケストレーションツール５３は、コンテナ５５の実行環境の管理をワーカノード５７単位で行う。また、オーケストレーションツール５３は、ワーカノード５７の全体を管理するマスタノード５８を構築する。 The orchestration tool 53 manages the execution environment of the container 55 on a worker node 57 basis. The orchestration tool 53 also constructs a master node 58 that manages all of the worker nodes 57.

ワーカノード５７においては、複数のポッド５９が展開される。ポッド５９は、１または複数のコンテナ５５を含んで構成され、所定の機能を実現する。ポッド５９は、オーケストレーションツール５３によってコンテナ５５を管理するための管理単位とされる。 In the worker node 57, multiple pods 59 are deployed. The pod 59 is configured to include one or more containers 55 and realizes a specified function. The pod 59 is treated as a management unit for managing the containers 55 by the orchestration tool 53.

ワーカノード５７におけるポッド５９の動作は、ポッド管理ライブラリ６０によって制御される。 The operation of the pod 59 on the worker node 57 is controlled by the pod management library 60.

ポッド管理ライブラリ６０は、論理的に割り当てられたハードウェア５０のリソースをポッド５９に利用させるためのコンテナランタイムやマスタノード５８から制御を受け付けるエージェントやポッド５９間の通信やマスタノード５８との通信を行うネットワークプロキシなどを有して構成されている。
即ち、各ポッド５９は、ポッド管理ライブラリ６０によって各リソースを用いた所定の機能を実現可能とされる。 The pod management library 60 is composed of a container runtime that allows the pods 59 to utilize the resources of the logically allocated hardware 50, an agent that accepts control from the master node 58, and a network proxy that communicates between the pods 59 and with the master node 58.
That is, each pod 59 is enabled to realize a predetermined function using each resource by the pod management library 60.

マスタノード５８は、ポッド５９の展開を行うアプリサーバ６１と、アプリサーバ６１によるコンテナ５５の展開状況を管理するマネージャ６２と、コンテナ５５を配置するワーカノード５７を決定するスケジューラ６３と、データ共有を行うデータ共有部６４を含んで構成されている。 The master node 58 includes an application server 61 that deploys the pod 59, a manager 62 that manages the deployment status of the container 55 by the application server 61, a scheduler 63 that determines the worker node 57 on which the container 55 is placed, and a data sharing unit 64 that shares data.
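The relationship between containers 55, pods 59, worker nodes 57, and the master node 58 can be sketched as a small data model. The class names mirror the terms in the description, but the scheduling policy shown here (place each pod on the worker node that currently holds the fewest pods) is an assumption for illustration; the description says only that the scheduler 63 determines the worker node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str


@dataclass
class Pod:
    # Management unit holding one or more containers.
    name: str
    containers: List[Container]


@dataclass
class WorkerNode:
    name: str
    pods: List[Pod] = field(default_factory=list)


@dataclass
class MasterNode:
    workers: List[WorkerNode]

    def schedule(self, pod: Pod) -> WorkerNode:
        """Place the pod on a worker node (assumed policy: fewest pods first)."""
        target = min(self.workers, key=lambda w: len(w.pods))
        target.pods.append(pod)
        return target


master = MasterNode([WorkerNode("camera-3"), WorkerNode("other-device")])
first = master.schedule(Pod("ai-image-processing", [Container("ai-app")]))
second = master.schedule(Pod("display", [Container("ui")]))
```

With two empty worker nodes, the first pod lands on the first node and the second pod on the other, which reflects a cluster spanning more than one device.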

図２１及び図２２に示す構成を利用することにより、コンテナ技術を用いて前述したＡＩアプリケーション及びＡＩモデルをカメラ装置３のイメージセンサＩＳに展開することが可能となる。
なお、前述したとおり、ＡＩモデルについて、図２の通信Ｉ／Ｆ４６を介してイメージセンサＩＳ内のメモリ部４５に格納させ、イメージセンサＩＳ内でＡＩ画像処理を実行させるようにしてもよいし、図２１及び図２２に示す構成をイメージセンサＩＳ内のメモリ部４５及びセンサ内制御部４３に展開し、イメージセンサＩＳ内でコンテナ技術を用いて前述したＡＩアプリケーション及びＡＩモデルを実行させてもよい。
また、後述するように、ＡＩアプリケーション及び／またはＡＩモデルを管理装置２やクラウド側情報処理装置に展開する場合でもコンテナ技術を用いることができる。
その際は、ＡＩアプリケーションやＡＩモデルの情報は、コンテナなどとして、後述する図２３の不揮発性メモリ部７４、記憶部７９またはＲＡＭ７３などのメモリに展開されて実行される。
By utilizing the configurations shown in FIGS. 21 and 22, it is possible to deploy the aforementioned AI application and AI model to the image sensor IS of the camera device 3 using container technology.
As mentioned above, the AI model may be stored in the memory unit 45 in the image sensor IS via the communication I/F 46 in FIG. 2, and AI image processing may be executed within the image sensor IS, or the configurations shown in FIGS. 21 and 22 may be deployed in the memory unit 45 and the sensor internal control unit 43 in the image sensor IS, and the aforementioned AI application and AI model may be executed within the image sensor IS using container technology.
In addition, as described below, container technology can also be used when deploying the AI application and/or the AI model to the management device 2 or a cloud-side information processing device.
In this case, information on the AI application or AI model is deployed as a container or the like in memory such as the non-volatile memory unit 74, storage unit 79, or RAM 73 in FIG. 23 described below, and executed.

＜１２．情報処理装置のハードウェア構成＞
密集度算出システム１が備えるクラウドサーバ５、ユーザ端末６、管理装置２などの情報処理装置のハードウェア構成について図２３を参照して説明する。 <12. Hardware configuration of information processing device>
The hardware configuration of information processing devices such as the cloud server 5, the user terminal 6, and the management device 2 included in the density calculation system 1 will be described with reference to FIG. 23.

情報処理装置はＣＰＵ７１を備えている。ＣＰＵ７１は、上述した各種の処理を行う演算処理部として機能し、ＲＯＭ７２や例えばＥＥＰ－ＲＯＭ（Electrically Erasable Programmable Read-Only Memory）などの不揮発性メモリ部７４に記憶されているプログラム、または記憶部７９からＲＡＭ７３にロードされたプログラムに従って各種の処理を実行する。ＲＡＭ７３にはまた、ＣＰＵ７１が各種の処理を実行する上において必要なデータなども適宜記憶される。 The information processing device includes a CPU 71. The CPU 71 functions as an arithmetic processing unit that performs the various processes described above, and executes the various processes according to a program stored in a ROM 72 or a non-volatile memory unit 74, such as an EEP-ROM (Electrically Erasable Programmable Read-Only Memory), or a program loaded from a storage unit 79 to a RAM 73. The RAM 73 also stores data necessary for the CPU 71 to execute the various processes, as appropriate.

なお、クラウドサーバ５としての情報処理装置が備えるＣＰＵ７１は、上述した各機能を実現するためにライセンスオーソリ部、アカウントサービス提供部、デバイス監視部、マーケットプレイス機能提供部、カメラサービス提供部として機能する。 The CPU 71 provided in the information processing device serving as the cloud server 5 functions as a license authorization unit, an account service providing unit, a device monitoring unit, a marketplace function providing unit, and a camera service providing unit to realize the above-mentioned functions.

ＣＰＵ７１、ＲＯＭ７２、ＲＡＭ７３、不揮発性メモリ部７４は、バス８３を介して相互に接続されている。このバス８３にはまた、入出力インタフェース（Ｉ／Ｆ）７５も接続されている。 The CPU 71, ROM 72, RAM 73, and non-volatile memory unit 74 are interconnected via a bus 83. An input/output interface (I/F) 75 is also connected to this bus 83.

入出力インタフェース７５には、操作子や操作デバイスよりなる入力部７６が接続される。
例えば入力部７６としては、キーボード、マウス、キー、ダイヤル、タッチパネル、タッチパッド、リモートコントローラ等の各種の操作子や操作デバイスが想定される。
入力部７６によりユーザの操作が検知され、入力された操作に応じた信号はＣＰＵ７１によって解釈される。 The input/output interface 75 is connected to an input unit 76 including an operator and an operating device.
For example, the input unit 76 may be various types of operators or operation devices such as a keyboard, a mouse, keys, a dial, a touch panel, a touch pad, or a remote controller.
An operation by the user is detected by the input unit 76, and a signal corresponding to the input operation is interpreted by the CPU 71.

また入出力インタフェース７５には、ＬＣＤ或いは有機ＥＬパネルなどよりなる表示部７７や、スピーカなどよりなる音声出力部７８が一体又は別体として接続される。
表示部７７は各種表示を行う表示部であり、例えばコンピュータ装置の筐体に設けられるディスプレイデバイスや、コンピュータ装置に接続される別体のディスプレイデバイス等により構成される。 Further, a display unit 77 such as an LCD or an organic EL panel, and an audio output unit 78 such as a speaker are connected to the input/output interface 75 either integrally or separately.
The display unit 77 is a display unit that performs various displays, and is configured, for example, by a display device provided in the housing of the computer device, or a separate display device connected to the computer device.

表示部７７は、ＣＰＵ７１の指示に基づいて表示画面上に各種の画像処理のための画像や処理対象の動画等の表示を実行する。また表示部７７はＣＰＵ７１の指示に基づいて、各種操作メニュー、アイコン、メッセージ等、即ちＧＵＩ（Graphical User Interface）としての表示を行う。 The display unit 77 displays images for various types of image processing, videos to be processed, etc., on the display screen based on instructions from the CPU 71. The display unit 77 also displays various operation menus, icons, messages, etc., i.e., a GUI (Graphical User Interface), based on instructions from the CPU 71.

入出力インタフェース７５には、ハードディスクや固体メモリなどより構成される記憶部７９や、モデムなどより構成される通信部８０が接続される場合もある。 The input/output interface 75 may also be connected to a storage unit 79 such as a hard disk or solid-state memory, or a communication unit 80 such as a modem.

通信部８０は、インターネット等の伝送路を介しての通信処理や、各種機器との有線／無線通信、バス通信などによる通信を行う。 The communication unit 80 performs communication processing via a transmission path such as the Internet, and communication with various devices via wired/wireless communication, bus communication, etc.

入出力インタフェース７５にはまた、必要に応じてドライブ８１が接続され、磁気ディスク、光ディスク、光磁気ディスク、或いは半導体メモリなどのリムーバブル記憶媒体８２が適宜装着される。 A drive 81 is also connected to the input/output interface 75 as necessary, and a removable storage medium 82 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory is appropriately attached.

ドライブ８１により、リムーバブル記憶媒体８２から各処理に用いられるプログラム等のデータファイルなどを読み出すことができる。読み出されたデータファイルは記憶部７９に記憶されたり、データファイルに含まれる画像や音声が表示部７７や音声出力部７８で出力されたりする。またリムーバブル記憶媒体８２から読み出されたコンピュータプログラム等は必要に応じて記憶部７９にインストールされる。 The drive 81 can read data files such as programs used for each process from the removable storage medium 82. The read data files are stored in the storage unit 79, and images and sounds contained in the data files are output on the display unit 77 and the audio output unit 78. In addition, computer programs and the like read from the removable storage medium 82 are installed in the storage unit 79 as necessary.

このコンピュータ装置では、例えば本実施の形態の処理のためのソフトウェアを、通信部８０によるネットワーク通信やリムーバブル記憶媒体８２を介してインストールすることができる。或いは当該ソフトウェアは予めＲＯＭ７２や記憶部７９等に記憶されていてもよい。
また、カメラ装置３において撮像された撮像画像やＡＩ画像処理による処理結果を受け取り、記憶部７９やドライブ８１を介してリムーバブル記憶媒体８２に記憶させてもよい。 In this computer device, for example, software for the processing of this embodiment can be installed via network communication by the communication unit 80 or via the removable storage medium 82. Alternatively, the software may be stored in advance in the ROM 72, the storage unit 79, etc.
In addition, images captured by the camera device 3 and the results of AI image processing may be received and stored in a removable storage medium 82 via the storage unit 79 or drive 81.

ＣＰＵ７１が各種のプログラムに基づいて処理動作を行うことで、上述した演算処理部を備えた情報処理装置であるクラウドサーバ５やユーザ端末６や管理装置２としての必要な情報処理や通信処理が実行される。
なお、クラウドサーバ５、ユーザ端末６、管理装置２は、それぞれが図２３のようなコンピュータ装置が単一で構成されることに限らず、複数のコンピュータ装置がシステム化されて構成されてもよい。複数のコンピュータ装置は、ＬＡＮ（Local Area Network）等によりシステム化されていてもよいし、インターネット等を利用したＶＰＮ（Virtual Private Network）等により遠隔地に配置されたものでもよい。複数のコンピュータ装置には、クラウドコンピューティングサービスによって利用可能なサーバ群（クラウド）としてのコンピュータ装置が含まれてもよい。
The CPU 71 performs processing operations based on various programs, thereby executing the necessary information processing and communication processing of the cloud server 5, user terminal 6, and management device 2, which are information processing devices equipped with the above-mentioned arithmetic processing unit.
Note that the cloud server 5, the user terminal 6, and the management device 2 are not limited to being configured as a single computer device as shown in Fig. 23, but may be configured as a system of multiple computer devices. The multiple computer devices may be systemized using a LAN (Local Area Network) or the like, or may be located in a remote location using a VPN (Virtual Private Network) using the Internet or the like. The multiple computer devices may include computer devices as a server group (cloud) available through a cloud computing service.

＜１３．その他＞
上述のように、ＡＩアプリケーションのＳＷコンポーネント及びＡＩモデルが展開された後、サービスの提供者や利用者（ユーザ）の操作をトリガとしてＡＩモデルの再学習と各カメラ装置３などに展開されたＡＩモデル（以降「エッジ側ＡＩモデル」と記載）やＡＩアプリケーションの更新を行うときの処理の流れについて、具体的に、図２４を参照して説明する。なお、図２４は複数のカメラ装置３の中の１台のカメラ装置３に着目して記載したものである。また、以下の説明において更新対象とされたエッジ側ＡＩモデルは、一例として、カメラ装置３が備えるイメージセンサＩＳに展開されているものであるが、もちろん、エッジ側ＡＩモデルはカメラ装置３におけるイメージセンサＩＳ外に展開されているものでもよい。 <13. Other>
The following describes, with reference to FIG. 24, the flow of processing performed after the SW components of the AI application and the AI model have been deployed as described above, when an operation by the service provider or a user triggers re-learning of the AI model and an update of the AI model deployed in each camera device 3 or the like (hereinafter referred to as the "edge-side AI model") and of the AI application. Note that FIG. 24 focuses on one camera device 3 among the multiple camera devices 3. In addition, the edge-side AI model to be updated in the following description is, as an example, deployed in the image sensor IS provided in the camera device 3, but the edge-side AI model may of course be deployed outside the image sensor IS in the camera device 3.

先ず、処理ステップＰＳ１において、サービスの提供者や利用者によるＡＩモデルの再学習指示が行われる。この指示は、クラウド側情報処理装置が備えるＡＰＩ（Application Programming Interface）モジュールが備えるＡＰＩ機能を利用して行われる。また、当該指示においては、学習に用いる画像量（例えば枚数）が指定される。以降、学習に用いる画像量を「所定枚数」とも記載する。 First, in processing step PS1, a service provider or user issues an instruction to retrain the AI model. This instruction is issued using an Application Programming Interface (API) function provided by an API module provided in the cloud-side information processing device. The instruction also specifies the amount of images (e.g., number) to be used for learning. Hereinafter, the amount of images to be used for learning will also be referred to as the "predetermined number."

ＡＰＩモジュールは、当該指示を受け、処理ステップＰＳ２でＨｕｂ（図１９に示したものと同様のもの）に対して再学習のリクエストと画像量の情報を送信する。 The API module receives this instruction and in processing step PS2 sends a re-learning request and image volume information to the Hub (similar to that shown in Figure 19).

Ｈｕｂは、処理ステップＰＳ３において、エッジ側情報処理装置としてのカメラ装置３に対してアップデート通知と画像量の情報を送信する。 In processing step PS3, the Hub sends an update notification and image volume information to the camera device 3, which serves as the edge-side information processing device.

カメラ装置３は、撮影を行うことにより得られた撮像画像データを処理ステップＰＳ４においてストレージ群の画像ＤＢ（Database）に送信する。この撮影処理と送信処理は、再学習に必要な所定枚数に達成するまで行われる。 The camera device 3 transmits the captured image data obtained by capturing an image to an image database (DB) in the storage group in processing step PS4. This capturing process and transmission process is repeated until a predetermined number of images required for re-learning is reached.

なお、カメラ装置３は、撮像画像データに対する推論処理を行うことにより推論結果を得た場合には、処理ステップＰＳ４において撮像画像データのメタデータとして推論結果等を画像ＤＢに記憶してもよい。 When the camera device 3 obtains an inference result by performing inference processing on the captured image data, the camera device 3 may store the inference result, etc. in the image DB as metadata for the captured image data in processing step PS4.

カメラ装置３における推論結果がメタデータとして画像ＤＢに記憶されることにより、クラウド側で実行されるＡＩモデルの再学習に必要なデータを厳選することができる。具体的には、カメラ装置３における推論結果とクラウド側情報処理装置において潤沢なコンピュータ資源を用いて実行される推論の結果が相違している画像データのみを用いて再学習を行うことができる。従って、再学習に要する時間を短縮することが可能となる。 By storing the inference results from the camera device 3 as metadata in the image DB, it is possible to carefully select the data necessary for re-learning the AI model executed on the cloud side. Specifically, re-learning can be performed using only the image data for which the inference result from the camera device 3 differs from the result of inference executed using abundant computer resources in the cloud-side information processing device. This makes it possible to shorten the time required for re-learning.
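The selection of re-learning data described above can be sketched as a simple filter. The record type `StoredImage`, the function `select_for_retraining`, and the use of a plain string as the inference result are illustrative assumptions; the description does not define the metadata format or how the cloud-side inference is invoked.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class StoredImage:
    image_id: str
    edge_result: str  # inference result stored as metadata by the camera device


def select_for_retraining(
    images: Iterable[StoredImage], cloud_infer: Callable[[str], str]
) -> List[StoredImage]:
    """Keep only images whose edge-side result differs from the cloud-side result."""
    return [img for img in images if cloud_infer(img.image_id) != img.edge_result]


# Hypothetical cloud-side results, standing in for a larger model.
cloud_results = {"img-1": "person", "img-2": "person", "img-3": "none"}
stored = [
    StoredImage("img-1", "person"),
    StoredImage("img-2", "none"),
    StoredImage("img-3", "none"),
]
selected = select_for_retraining(stored, cloud_results.__getitem__)
```

Only `img-2` is selected here, because it is the only image on which the two inference results disagree.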

所定枚数の撮影と送信を終えた後、カメラ装置３は処理ステップＰＳ５において、所定枚数の撮像画像データの送信が完了したことをＨｕｂに通知する。 After capturing and transmitting the specified number of images, the camera device 3 notifies the Hub in processing step PS5 that transmission of the specified number of captured image data has been completed.

Ｈｕｂは、該通知を受けて、処理ステップＰＳ６において、再学習用のデータの準備が完了したことをオーケストレーションツールに通知する。 Upon receiving this notification, in processing step PS6, the Hub notifies the orchestration tool that the re-learning data has been prepared.

オーケストレーションツールは、処理ステップＰＳ７において、ラベリング処理の実行指示をラベリングモジュールに対して送信する。 In processing step PS7, the orchestration tool sends an instruction to execute the labeling process to the labeling module.

ラベリングモジュールは、ラベリング処理の対象とされた画像データを画像ＤＢから取得し（処理ステップＰＳ８）、ラベリング処理を行う。 The labeling module retrieves the image data to be subjected to the labeling process from the image DB (processing step PS8) and performs the labeling process.

ここで言うラベリング処理とは、上述したクラス識別を行う処理であってもよいし、画像の被写体についての性別や年齢を推定してラベルを付与する処理であってもよいし、被写体についてのポーズを推定してラベルを付与する処理であってもよいし、被写体の行動を推定してラベルを付与する処理であってもよい。 The labeling process referred to here may be the process of classifying images as described above, or it may be a process of estimating the gender and age of the subject of an image and assigning a label to it, or it may be a process of estimating the pose of the subject and assigning a label to it, or it may be a process of estimating the behavior of the subject and assigning a label to it.

ラベリング処理は、人手で行われてもよいし、自動で行われてもよい。また、ラベリング処理はクラウド側の情報処理装置で完結してもよいし、他のサーバ装置が提供するサービスを利用することにより実現されてもよい。 The labeling process may be performed manually or automatically. The labeling process may be completed by an information processing device on the cloud side, or may be realized by using a service provided by another server device.

ラベリング処理を終えたラベリングモジュールは、処理ステップＰＳ９において、ラベル付けの結果情報をデータセットＤＢに記憶する。ここでデータセットＤＢに記憶される情報は、ラベル情報と画像データの組とされてもよいし、画像データそのものの代わりに画像データを特定するための画像ＩＤ（Identification）情報とされてもよい。 After completing the labeling process, the labeling module stores the labeling result information in the dataset DB in processing step PS9. Here, the information stored in the dataset DB may be a combination of label information and image data, or may be image ID (Identification) information for identifying the image data instead of the image data itself.

ラベル付けの結果情報が記憶されたことを検出したストレージ管理部は、処理ステップＰＳ１０でオーケストレーションツールに対する通知を行う。 When the storage management unit detects that the labeling result information has been stored, it notifies the orchestration tool in processing step PS10.

該通知を受信したオーケストレーションツールは、所定枚数の画像データに対するラベリング処理が終了したことを確認し、処理ステップＰＳ１１において、再学習モジュールに対する再学習指示を送信する。 The orchestration tool that receives this notification confirms that the labeling process for the predetermined number of images has been completed, and in processing step PS11, sends a re-learning instruction to the re-learning module.

再学習指示を受信した再学習モジュールは、処理ステップＰＳ１２でデータセットＤＢから学習に用いるデータセットを取得すると共に、処理ステップＰＳ１３で学習済ＡＩモデルＤＢからアップデート対象のＡＩモデルを取得する。 Upon receiving the re-learning instruction, the re-learning module obtains the dataset to be used for learning from the dataset DB in processing step PS12, and obtains the AI model to be updated from the trained AI model DB in processing step PS13.

再学習モジュールは、取得したデータセットとＡＩモデルを用いてＡＩモデルの再学習を行う。このようにして得られたアップデート済みのＡＩモデルは、処理ステップＰＳ１４において再度学習済ＡＩモデルＤＢに記憶される。 The re-learning module re-learns the AI model using the acquired dataset and AI model. The updated AI model obtained in this way is stored again in the trained AI model DB in processing step PS14.

アップデート済みのＡＩモデルが記憶されたことを検出したストレージ管理部は、処理ステップＰＳ１５でオーケストレーションツールに対する通知を行う。 When the storage management unit detects that an updated AI model has been stored, it notifies the orchestration tool in processing step PS15.

該通知を受信したオーケストレーションツールは、処理ステップＰＳ１６において、ＡＩモデルの変換指示を変換モジュールに対して送信する。 Upon receiving the notification, the orchestration tool sends an instruction to convert the AI model to the conversion module in processing step PS16.

変換指示を受信した変換モジュールは、処理ステップＰＳ１７において学習済みＡＩモデルＤＢからアップデート済みのＡＩモデルを取得し、ＡＩモデルの変換処理を行う。
該変換処理では、展開先の機器であるカメラ装置３のスペック情報等に合わせて変換する処理を行う。この処理では、ＡＩモデルの性能をできるだけ落とさないようにダウンサイジングを行うと共に、カメラ装置３上で動作可能なようにファイル形式の変換などが行われる。 The conversion module that receives the conversion instruction obtains the updated AI model from the trained AI model DB in processing step PS17, and performs conversion processing of the AI model.
In the conversion process, the AI model is converted to match the specification information and the like of the camera device 3, which is the device to which the model is to be deployed. In this process, downsizing is performed so as to degrade the performance of the AI model as little as possible, and file format conversion and the like are performed so that the AI model can run on the camera device 3.

変換モジュールによって変換済みのＡＩモデルは上述したエッジ側ＡＩモデルとされる。この変換済みのＡＩモデルは、処理ステップＰＳ１８において変換済ＡＩモデルＤＢに記憶される。 The AI model converted by the conversion module is the edge-side AI model described above. This converted AI model is stored in the converted AI model DB in processing step PS18.

変換済みのＡＩモデルが記憶されたことを検出したストレージ管理部は、処理ステップＰＳ１９でオーケストレーションツールに対する通知を行う。 When the storage management unit detects that the converted AI model has been stored, it notifies the orchestration tool in processing step PS19.

該通知を受信したオーケストレーションツールは、処理ステップＰＳ２０において、ＡＩモデルのアップデートを実行させるための通知をＨｕｂに対して送信する。この通知には、アップデートに用いるＡＩモデルが記憶されている場所を特定するための情報を含んでいる。 In process step PS20, the orchestration tool that receives the notification sends a notification to the Hub to execute an update of the AI model. This notification includes information for identifying the location where the AI model to be used for the update is stored.
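The pattern that recurs in processing steps PS10 to PS20, in which the storage management unit detects that something has been stored, notifies the orchestration tool, and the orchestration tool issues the next instruction, can be sketched as a lookup table. The event names, message fields, and function name are assumptions for illustration, and the sketch omits the check in PS11 that labeling has finished for the predetermined number of images.

```python
from typing import Dict, Optional, Tuple

# Storage event -> (recipient, instruction), following the order in the text.
NEXT_ACTION: Dict[str, Tuple[str, str]] = {
    "labels_stored": ("retraining_module", "retrain"),          # PS10 -> PS11
    "updated_model_stored": ("conversion_module", "convert"),   # PS15 -> PS16
    "converted_model_stored": ("hub", "update_model"),          # PS19 -> PS20
}


def on_storage_notification(
    event: str, model_location: Optional[str] = None
) -> Dict[str, Optional[str]]:
    """Build the message the orchestration tool would send next (hypothetical format)."""
    recipient, instruction = NEXT_ACTION[event]
    message: Dict[str, Optional[str]] = {"to": recipient, "instruction": instruction}
    if recipient == "hub":
        # The notification to the Hub carries where the converted model is stored.
        message["model_location"] = model_location
    return message
```

For example, `on_storage_notification("converted_model_stored", "converted-db/model-1")` yields a message addressed to the Hub that includes the storage location, whereas the other two events yield messages without a location.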

該通知を受信したＨｕｂは、カメラ装置３に対してＡＩモデルのアップデート指示を送信する。アップデート指示についても、ＡＩモデルが記憶されている場所を特定するための情報が含まれている。 The Hub, which receives the notification, sends an instruction to update the AI model to the camera device 3. The update instruction also includes information for identifying the location where the AI model is stored.

カメラ装置３は、処理ステップＰＳ２２において、変換済ＡＩモデルＤＢから対象の変換済みＡＩモデルを取得して展開する処理を行う。これにより、カメラ装置３のイメージセンサＩＳで利用されるＡＩモデルの更新が行われる。 In processing step PS22, the camera device 3 performs a process of acquiring the target converted AI model from the converted AI model DB and deploying it. This updates the AI model used by the image sensor IS of the camera device 3.

ＡＩモデルを展開することによりＡＩモデルの更新を終えたカメラ装置３は、処理ステップＰＳ２３でＨｕｂに対して更新完了通知を送信する。
該通知を受信したＨｕｂは、処理ステップＰＳ２４でオーケストレーションツールに対してカメラ装置３のＡＩモデル更新処理が完了したことを通知する。 After the camera device 3 has completed updating the AI model by deploying the AI model, it sends an update completion notification to the Hub in processing step PS23.
Upon receiving the notification, the Hub notifies the orchestration tool in processing step PS24 that the AI model update processing of the camera device 3 has been completed.

なお、ここではカメラ装置３のイメージセンサＩＳ内（例えば、図２に示すメモリ部４５）にＡＩモデルが展開されて利用される例について説明したが、カメラ装置３におけるイメージセンサ外（例えば、図２のメモリ部３４）や管理装置２内の記憶部にＡＩモデルが展開されて利用された場合であっても、同様にＡＩモデルの更新を行うことができる。
その場合には、ＡＩモデルが展開された際に当該ＡＩモデルが展開された装置（場所）をクラウド側のストレージ管理部などに記憶しておき、Ｈｕｂは、ストレージ管理部からＡＩモデルが展開された装置（場所）を読み出し、ＡＩモデルが展開された装置に対してＡＩモデルのアップデート指示を送信する。
アップデート指示を受けた装置は、処理ステップＰＳ２２において、変換済ＡＩモデルＤＢから対象の変換済みＡＩモデルを取得して展開する処理を行う。これにより、アップデート指示を受けた装置のＡＩモデルの更新が行われる。 Note that, although an example has been described here in which an AI model is deployed and used within the image sensor IS of the camera device 3 (e.g., the memory unit 45 shown in Figure 2), the AI model can be updated in the same manner even if the AI model is deployed and used outside the image sensor in the camera device 3 (e.g., the memory unit 34 in Figure 2) or in a memory unit within the management device 2.
In this case, when the AI model is deployed, the device (location) on which the AI model is deployed is stored in a storage management unit on the cloud side, and the Hub reads out the device (location) on which the AI model is deployed from the storage management unit and sends an instruction to update the AI model to the device on which the AI model is deployed.
In processing step PS22, the device that has received the update instruction acquires the target converted AI model from the converted AI model DB and deploys it. This updates the AI model of that device.

なお、ＡＩモデルの更新のみを行う場合は、ここまでの処理で完結する。
ＡＩモデルに加えてＡＩモデルを利用するＡＩアプリケーションの更新を行う場合には、後述する処理が更に実行される。 When only the AI model is to be updated, the processing is complete at this point.
When an AI application that uses the AI model is to be updated in addition to the AI model, the processing described below is also executed.

具体的に、オーケストレーションツールは処理ステップＰＳ２５において、展開制御モジュールに対してアップデートされたファームウェアなどのＡＩアプリケーションのダウンロード指示を送信する。 Specifically, in processing step PS25, the orchestration tool sends a download instruction for the AI application, such as updated firmware, to the deployment control module.

展開制御モジュールは、処理ステップＰＳ２６において、Ｈｕｂに対してＡＩアプリケーションの展開指示を送信する。この指示には、アップデートされたＡＩアプリケーションが記憶されている場所を特定するための情報が含まれている。 In processing step PS26, the deployment control module sends an instruction to the Hub to deploy the AI application. This instruction includes information identifying the location where the updated AI application is stored.

Ｈｕｂは、処理ステップＰＳ２７において、当該展開指示をカメラ装置３に対して送信する。 In processing step PS27, the Hub sends the deployment instruction to the camera device 3.

カメラ装置３は、処理ステップＰＳ２８において、展開制御モジュールのコンテナＤＢからアップデートされたＡＩアプリケーションをダウンロードして展開する。 In processing step PS28, the camera device 3 downloads the updated AI application from the container DB of the deployment control module and deploys it.

なお、上記の説明においては、カメラ装置３のイメージセンサＩＳ上で動作するＡＩモデルの更新とカメラ装置３におけるイメージセンサＩＳ外で動作するＡＩアプリケーションの更新をシーケンシャルで行う例を説明した。
また、ここでは説明の簡単のため、ＡＩアプリケーションとして説明したが、前述の通り、ＡＩアプリケーションはＳＷコンポーネントＢ１、Ｂ２、Ｂ３、・・・Ｂｎなど複数のＳＷコンポーネントで定義されており、ＡＩアプリケーションが展開された際に、各ＳＷコンポーネントがどこに展開されたかをクラウド側のストレージ管理部などに記憶しておき、Ｈｕｂは、処理ステップＰＳ２７を処理する際に、ストレージ管理部から各ＳＷコンポーネントの展開された装置（場所）を読み出し、その展開された装置に対して、展開指示を送信するようにされている。展開指示を受けた装置は、処理ステップＰＳ２８において、展開制御モジュールのコンテナＤＢからアップデートされたＳＷコンポーネントをダウンロードして展開する。
なお、ここで言及するＡＩアプリケーションとは、ＡＩモデル以外のＳＷコンポーネントである。 In the above explanation, an example was described in which the update of the AI model running on the image sensor IS of the camera device 3 and the update of the AI application running outside the image sensor IS in the camera device 3 are performed sequentially.
For simplicity, the description here has referred to an AI application as a single unit. As described above, however, an AI application is defined by a plurality of SW components, such as SW components B1, B2, B3, ..., Bn. When the AI application is deployed, the location where each SW component is deployed is stored in a storage management unit or the like on the cloud side. When performing processing step PS27, the Hub reads the device (location) where each SW component is deployed from the storage management unit and transmits a deployment instruction to that device. In processing step PS28, the device that has received the deployment instruction downloads the updated SW component from the container DB of the deployment control module and deploys it.
Note that the AI application referred to here is a SW component other than the AI model.
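The lookup-and-dispatch behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not the system's actual interface: the `StorageManager` and `Hub` classes, their method names, the device identifiers, and the container location string are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageManager:
    """Hypothetical cloud-side record of where each SW component was deployed."""
    locations: dict = field(default_factory=dict)  # component name -> device id

    def record(self, component: str, device_id: str) -> None:
        self.locations[component] = device_id

    def lookup(self, component: str) -> str:
        return self.locations[component]


@dataclass
class Hub:
    """Hypothetical Hub that forwards deployment instructions (step PS27)."""
    storage: StorageManager
    sent: list = field(default_factory=list)  # (device id, component, source)

    def dispatch(self, components: list, source: str) -> None:
        for component in components:
            device_id = self.storage.lookup(component)
            # A real Hub would transmit the instruction; here it is only recorded.
            self.sent.append((device_id, component, source))


storage = StorageManager()
storage.record("B1", "camera-3")       # recorded when B1 was first deployed
storage.record("B2", "management-2")   # recorded when B2 was first deployed

hub = Hub(storage)
hub.dispatch(["B1", "B2"], source="container-db/app-v2")
```

Keeping the deployment locations in one registry means the Hub does not need to know in advance which device runs which component; it resolves the destination at dispatch time.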

また、ＡＩモデルとＡＩアプリケーションの双方が一つの装置で動作するとなっていた場合には、ＡＩモデルとＡＩアプリケーションの双方を一つのコンテナとしてまとめて更新してもよい。その場合には、ＡＩモデルの更新とＡＩアプリケーションの更新がシーケンシャルではなく同時に行われてもよい。そして、処理ステップＰＳ２５、ＰＳ２６、ＰＳ２７、ＰＳ２８の各処理を実行することにより、実現可能である。 In addition, if both the AI model and the AI application are to be run on a single device, both the AI model and the AI application may be updated together as a single container. In that case, the update of the AI model and the update of the AI application may be performed simultaneously, not sequentially. This can be achieved by executing the processes of the processing steps PS25, PS26, PS27, and PS28.

例えば、カメラ装置３のイメージセンサＩＳにＡＩモデルとＡＩアプリケーションの双方のコンテナを展開することが可能な場合、上述のように処理ステップＰＳ２５、ＰＳ２６、ＰＳ２７、ＰＳ２８の各処理を実行することにより、ＡＩモデルやＡＩアプリケーションの更新を行うことができる。 For example, if it is possible to deploy containers of both an AI model and an AI application to the image sensor IS of the camera device 3, the AI model and the AI application can be updated by executing the processing steps PS25, PS26, PS27, and PS28 as described above.

上述した処理を行うことにより、ユーザの使用環境において撮像された撮像画像データを用いてＡＩモデルの再学習が行われる。従って、ユーザの使用環境において高精度の認識結果を出力できるエッジ側ＡＩモデルを生成することができる。 By performing the above-mentioned processing, the AI model is retrained using image data captured in the user's usage environment. Therefore, it is possible to generate an edge-side AI model that can output highly accurate recognition results in the user's usage environment.

また、車載カメラとしての３を搭載した車両がそれまでと異なる地域を走行している場合や、天候や時刻の変化により撮像装置に入射される入射光の光量が変化した場合など、カメラ装置３の撮像環境が変化したとしても、その都度適切にＡＩモデルの再学習を行うことができるため、ＡＩモデルによる認識精度を低下させずに維持することが可能となる。
なお、上述した各処理は、ＡＩモデルの再学習時だけでなく、ユーザの使用環境下においてシステムを初めて稼働させる際に実行してもよい。
Furthermore, even if the imaging environment of the camera device 3 changes, such as when a vehicle equipped with the camera device 3 as an onboard camera travels in a different area than before, or when the amount of light incident on the imaging device changes due to changes in weather or time of day, the AI model can be appropriately retrained each time, making it possible to maintain the recognition accuracy of the AI model without degradation.
In addition, each of the above-mentioned processes may be performed not only when retraining the AI model, but also when the system is operated for the first time in the user's usage environment.

＜１４．マーケットプレイスの画面例＞
マーケットプレイスに関してユーザに提示される画面の一例について、各図を参照して説明する。 <14. Marketplace screen example>
An example of a screen presented to a user regarding the marketplace will be described with reference to each drawing.

図２５は、ログイン画面Ｇ１の一例を示したものである。
ログイン画面Ｇ１には、ユーザＩＤを入力するためのＩＤ入力欄９１と、パスワードを入力するためのパスワード入力欄９２が設けられている。 FIG. 25 shows an example of the login screen G1.
The login screen G1 has an ID input field 91 for inputting a user ID and a password input field 92 for inputting a password.

パスワード入力欄９２の下方には、ログインを行うためのログインボタン９３と、ログインを取りやめるためのキャンセルボタン９４が配置されている。 Below the password input field 92 are a login button 93 for logging in and a cancel button 94 for cancelling the login.

また、更にその下方には、パスワードを忘れたユーザ向けのページへ遷移するための操作子や、新規にユーザ登録を行うためのページに遷移するための操作子等が適宜配置されている。 Further below are controls, arranged as appropriate, such as a control for moving to a page for users who have forgotten their password and a control for moving to a page for new user registration.

適切なユーザＩＤとパスワードを入力した後にログインボタン９３を押下すると、ユーザ固有のページに遷移する処理がクラウドサーバ５及びユーザ端末６のそれぞれにおいて実行される。 When the user presses the login button 93 after entering the appropriate user ID and password, a process to transition to a user-specific page is executed on both the cloud server 5 and the user terminal 6.

図２６は、例えば、アプリケーション開発者端末６Ａを利用するＡＩアプリケーション開発者や、ＡＩモデル開発者端末６Ｃを利用するＡＩモデル開発者に提示される画面の一例である。 Figure 26 is an example of a screen presented to, for example, an AI application developer using the application developer terminal 6A or an AI model developer using the AI model developer terminal 6C.

各開発者は、マーケットプレイスを通じて、開発のための学習用データセットやＡＩモデルやＡＩアプリケーションを購入することが可能とされている。また、自身で開発したＡＩアプリケーションやＡＩモデルをマーケットプレイスに登録することが可能とされている。 Through the marketplace, developers can purchase training datasets, AI models, and AI applications for development. They can also register AI applications and AI models that they have developed themselves on the marketplace.

図２６に示す開発者向け画面Ｇ２には、購入可能な学習用データセットやＡＩモデルやＡＩアプリケーションなど（以降、まとめて「データ」と記載）が左側に表示されている。
なお、図示していないが、学習用データセットの購入の際に、学習用データセットの画像をディスプレイ上に表示させ、マウス等の入力装置を用いて画像の所望の部分のみを枠で囲み、名前を入力するだけで、学習の準備をすることができる。
例えば、猫の画像でＡＩ学習を行いたい場合、画像上の猫の部分だけを枠で囲むと共に、テキスト入力として「猫」と入力することによって、猫のアノテーションが付加された画像をＡＩ学習用に準備することができる。
また、所望のデータを見つけやすいように、「交通監視」、「動線分析」、「来店客カウント」のような目的を選択可能とされていてもよい。即ち、選択された目的に適合するデータのみが表示されるような表示処理がクラウドサーバ５及びユーザ端末６のそれぞれにおいて実行される。 The developer screen G2 shown in FIG. 26 displays purchasable learning data sets, AI models, AI applications, etc. (hereinafter collectively referred to as "data") on the left side.
Although not shown in the figure, when purchasing a training data set, the user can prepare for training by simply displaying an image of the training data set on a display, using an input device such as a mouse to frame only the desired portion of the image, and entering a name.
For example, if you want to use an image of a cat for AI training, you can prepare an image with a cat annotation for AI training by surrounding only the cat part of the image with a frame and entering "cat" as the text input.
In addition, to make it easier to find the desired data, it may be possible to select a purpose such as "traffic monitoring," "flow line analysis," or "visitor counting." That is, display processing is executed in each of the cloud server 5 and the user terminal 6 so that only data matching the selected purpose is displayed.
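The cat annotation example above can be pictured as a small record that pairs a frame with a text label. The sketch below is an assumption for illustration only: the `Annotation` layout, the `annotate` helper, the image identifier, and the pixel values are hypothetical and are not taken from the marketplace described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One labeled frame drawn on a training image (hypothetical layout)."""
    image_id: str
    label: str
    x: int       # left edge of the frame, in pixels
    y: int       # top edge of the frame, in pixels
    width: int
    height: int


def annotate(image_id: str, label: str, frame: tuple) -> Annotation:
    """Build an annotation from a frame given as (x, y, width, height)."""
    x, y, width, height = frame
    if width <= 0 or height <= 0:
        raise ValueError("frame must have positive width and height")
    return Annotation(image_id, label, x, y, width, height)


# The user frames the cat with the mouse and types "cat" as the name.
cat = annotate("img_0001", "cat", (40, 60, 120, 90))
```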

なお、開発者向け画面Ｇ２においては、各データの購入価格が表示されていてもよい。 The purchase price of each piece of data may also be displayed on the developer screen G2.

また、開発者向け画面Ｇ２の右側には、開発者が収集または作成した学習用データセットや、開発者が開発したＡＩモデルやＡＩアプリケーションを登録するための入力欄９５が設けられている。 In addition, on the right side of the developer screen G2, there is an input field 95 for registering learning datasets collected or created by the developer, as well as AI models and AI applications developed by the developer.

データごとに、名称やデータの保存場所を入力するための入力欄９５が設けられている。また、ＡＩモデルについては、リトレーニングの要／不要を設定するためのチェックボックス９６が設けられている。 For each piece of data, an input field 95 is provided for inputting the name and where the data is saved. In addition, for AI models, a check box 96 is provided for setting whether or not retraining is required.

なお、登録対象のデータを購入する際に必要な価格を設定可能な価格設定欄（図中では入力欄９５として記載）などが設けられていてもよい。 In addition, a price setting field (shown as input field 95 in the figure) may be provided in which the price required to purchase the data to be registered can be set.

また、開発者向け画面Ｇ２の上部には、ユーザ情報の一部としてユーザ名や最終ログイン日などが表示されている。なお、これ以外にも、ユーザがデータ購入の際に使用可能な通貨量やポイント数などが表示されていてもよい。 In addition, the upper part of the developer screen G2 displays the user name, last login date, and so on as part of the user information. In addition to this, the amount of currency and points that the user can use when purchasing data may also be displayed.

図２７は、例えば、自身が管理するエッジ側の情報処理装置としてのカメラ装置３にＡＩアプリケーションやＡＩモデルを展開することにより、各種の分析等を行うユーザ（上述したアプリケーション利用ユーザ）に提示される利用者向け画面Ｇ３の一例である。 Figure 27 shows an example of a user screen G3 that is presented to a user (the application user described above) who performs various analyses, for example, by deploying an AI application or an AI model to a camera device 3 as an edge-side information processing device managed by the user.

ユーザは、マーケットプレイスを介して監視対象の空間に配置するカメラ装置３を購入可能とされている。従って、利用者向け画面Ｇ３の左側には、カメラ装置３に搭載されるイメージセンサＩＳの種類や性能、そしてカメラ装置３の性能等を選択可能なラジオボタン９７が配置されている。 Users can purchase camera devices 3 to be placed in the space to be monitored via the marketplace. Therefore, on the left side of the user screen G3, radio buttons 97 are provided that allow users to select the type and performance of the image sensor IS to be installed in the camera device 3, as well as the performance of the camera device 3.

また、ユーザは、マーケットプレイスを介して管理装置２としての情報処理装置を購入可能とされている。従って、利用者向け画面Ｇ３の左側には、管理装置２の各性能を選択するためのラジオボタン９７が配置されている。
また、既に管理装置２を有しているユーザは管理装置２の性能情報をここに入力することによって、管理装置２の性能を登録することができる。 Furthermore, a user can purchase an information processing device as the management device 2 via the marketplace. Therefore, radio buttons 97 for selecting each performance of the management device 2 are arranged on the left side of the user-oriented screen G3.
Furthermore, a user who already has a management device 2 can register the performance of the management device 2 by inputting performance information of the management device 2 here.

ユーザは、自身が経営する店舗などの任意の場所に購入したカメラ装置３（或いは、マーケットプレイスを介さずに購入したカメラ装置３でもよい）を設置することにより所望の機能を実現するが、マーケットプレイスでは、各カメラ装置３の機能を最大限に発揮させるために、カメラ装置３の設置場所についての情報を登録することが可能とされている。 A user can achieve the desired functions by installing the purchased camera device 3 (or a camera device 3 purchased without going through the marketplace) in a location of their choice, such as a store that they manage, and the marketplace allows users to register information about the installation location of each camera device 3 in order to maximize the functionality of each camera device 3.

利用者向け画面Ｇ３の右側には、カメラ装置３が設置される環境についての環境情報を選択可能なラジオボタン９８が配置されている。ユーザは、カメラ装置３が設置される環境についての環境情報を適切に選択することにより、上述した最適な撮像設定を対象のカメラ装置３に設定される。 Radio buttons 98 for selecting environmental information about the environment in which the camera device 3 is installed are arranged on the right side of the user screen G3. When the user selects the appropriate environmental information, the above-mentioned optimal imaging settings are applied to the target camera device 3.

なお、カメラ装置３を購入すると共に該購入予定のカメラ装置３の設置場所が決まっている場合には、利用者向け画面Ｇ３の左側の各項目と右側の各項目を選択することにより、設置予定場所に応じて最適な撮像設定が予め設定されたカメラ装置３を購入することができる。 When purchasing a camera device 3 and the installation location of the camera device 3 to be purchased has been decided, the user can purchase a camera device 3 with the optimal imaging settings pre-set according to the planned installation location by selecting each item on the left side and each item on the right side of the user screen G3.

利用者向け画面Ｇ３には実行ボタン９９が設けられている。実行ボタン９９を押下することにより、購入についての確認を行う確認画面や、環境情報の設定を確認するための確認画面へと遷移する。これにより、ユーザは、所望のカメラ装置３や管理装置２を購入することや、カメラ装置３についての環境情報の設定を行うことが可能とされる。 The user screen G3 has an execute button 99. Pressing the execute button 99 transitions to a confirmation screen for confirming the purchase or a confirmation screen for confirming the setting of environmental information. This allows the user to purchase the desired camera device 3 or management device 2, or to set environmental information for the camera device 3.

マーケットプレイスにおいては、カメラ装置３の設置場所を変更したときのために、各カメラ装置３の環境情報を変更することが可能とされている。図示しない変更画面においてカメラ装置３の設置場所についての環境情報を入力し直すことにより、カメラ装置３に最適な撮像設定を設定し直すことが可能となる。
In the marketplace, it is possible to change the environmental information of each camera device 3 in case the installation location of the camera device 3 is changed. By re-inputting the environmental information about the installation location of the camera device 3 on a change screen (not shown), it is possible to re-set the optimal imaging settings for the camera device 3.

＜１５．まとめ＞
上述したように、イメージセンサＩＳ或いはイメージセンサＩＳを搭載したカメラ装置３などの信号処理装置は、撮像画像において検出された人物Ｐ（対象物）に対応したバウンディングボックスＢＢを撮像画像上に設定し、バウンディングボックスＢＢ同士の重なり度合い（ＩｏＵ）に応じて人物Ｐについての密集度を算出する画像処理部（ＡＩ画像処理部４４）と、を備えている。
これにより、１枚の撮像画像を用いた簡易な処理で密集度を局所的に算出することができる。
従って、密集度に応じて人気のエリアを判定して推薦することや、混雑を避けるルートを提案することなどが可能となる。
なお、密集度を適切に算出することにより、密な状態を避けるためのアナウンス等が可能となり、感染症対策を行うことができる。
また、密集度の算出と共に、上述した人物Ｐについての動線分析を併用することにより、感染症に罹患した陽性者が感染前後に密集度の高いエリアを訪れたか否か、訪れていたとすればどのエリアかなどを特定することが可能となる。従って、陽性者と同一空間或いは密接した距離にいた他の人物Ｐを特定することやそのような人物Ｐの動線を確認することなどが可能となり、感染経路の解明や感染の疑いのある他の人物Ｐなどを特定することが可能となる。 <15. Summary>
As described above, a signal processing device such as an image sensor IS or a camera device 3 equipped with an image sensor IS includes an image processing unit (AI image processing unit 44) that sets, on a captured image, a bounding box BB corresponding to a person P (object) detected in the captured image, and calculates the density for the person P according to the degree of overlap (IoU) between the bounding boxes BB.
This makes it possible to locally calculate the density through simple processing using one captured image.
Therefore, it will be possible to determine and recommend popular areas based on their density, as well as suggest routes that avoid crowds.
Furthermore, by properly calculating the level of crowding, it will be possible to make announcements to avoid crowded conditions, thereby enabling measures to prevent the spread of infection.
In addition, by using the above-mentioned analysis of the movement of person P in combination with the calculation of the density, it becomes possible to determine whether a person infected with an infectious disease visited a highly crowded area before and after infection, and if so, which area. Therefore, it becomes possible to identify other people P who were in the same space as the positive person or in close proximity thereto, and to confirm the movement of such people P, making it possible to clarify the infection route and identify other people P who are suspected of infection.

図６や図１１等を参照して説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、人物Ｐごとに密集度を算出してもよい。
人物Ｐごとに密集度を算出することで、エリアごとに密集度を算出する手法と比較して密集度を細かく算出することができ、密集度の情報についての解像度を上げることができる。 As described with reference to Figures 6 and 11, an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate the density for each person P.
By calculating the density for each person P, the density can be calculated more precisely than in a method of calculating the density for each area, and the resolution of the density information can be increased.

変形例において説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、処理対象とされた対象人物Ｐｔに対応したバウンディングボックスＢＢである対象バウンディングボックスＢＢｔと、該対象バウンディングボックスＢＢｔと共通領域ＣＡ１を有したバウンディングボックスＢＢである重複バウンディングボックスＢＢｄと、の組ごとに密集度合いを示す値の平均値を用いて密集度を算出してもよい。
これにより、人物ＰについてのキーポイントＫＰを特定することなく複数のバウンディングボックスＢＢの重複度合いに基づいて適切に密集度を算出することができる。従って、イメージセンサＩＳの処理負担の軽減を図ることができる。 As described in the modified example, an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate the density using the average value of values indicating the degree of density for each pair of a target bounding box BBt, which is a bounding box BB corresponding to the target person Pt being processed, and an overlapping bounding box BBd, which is a bounding box BB having a common area CA1 with the target bounding box BBt.
This makes it possible to appropriately calculate the density based on the degree of overlap of a plurality of bounding boxes BB without identifying key points KP for the person P. This makes it possible to reduce the processing load on the image sensor IS.
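The averaging over pairs can be sketched as follows. This is a minimal illustration, assuming boxes are given as `(x1, y1, x2, y2)` tuples, that the per-pair value is the plain IoU, and that a target with no overlapping box gets a density of 0.0; the function names are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # no common area
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def density_for_target(target, others):
    """Average the per-pair IoU over boxes that share an area with the target."""
    values = [v for v in (iou(target, o) for o in others) if v > 0.0]
    # Assumption: a target with no overlapping box has density 0.0.
    return sum(values) / len(values) if values else 0.0


target = (0, 0, 4, 4)
others = [(2, 0, 6, 4), (0, 2, 4, 6), (10, 10, 12, 12)]
density = density_for_target(target, others)  # third box does not overlap
```

Only boxes with a common area enter the average, so distant persons do not dilute the value for the target person.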

図１０や変形例において説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、対象バウンディングボックスＢＢｔと重複バウンディングボックスＢＢｄの論理和領域ＤＡ１に対する共通領域ＣＡ１の占める割合に応じて組ごとの密集度合いを示す値を算出してもよい。
これにより、簡易な処理で二つのバウンディングボックスＢＢについての基礎密集度（基礎ＩｏＵ）を算出することができる。 As described with reference to Figure 10 and in the modified example, an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate a value indicating the degree of density for each pair according to the proportion of the logical sum (union) area DA1 of the target bounding box BBt and the overlapping bounding box BBd that is occupied by the common area CA1.
This makes it possible to calculate the basic density (basic IoU) for the two bounding boxes BB through simple processing.
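As a worked example of the ratio described above, the sketch below computes the common area CA1, the logical sum (union) area DA1, and their ratio for two axis-aligned boxes. The `Box` class, the function names, and the coordinates are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def common_area(a: Box, b: Box) -> float:
    """Area of the region shared by both boxes (CA1 in the text)."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return w * h if w > 0 and h > 0 else 0.0


def basic_iou(a: Box, b: Box) -> float:
    """Ratio of the common area to the union area (DA1 in the text)."""
    ca = common_area(a, b)
    da = a.area + b.area - ca
    return ca / da if da > 0 else 0.0


bbt = Box(0, 0, 10, 20)   # target bounding box
bbd = Box(5, 0, 15, 20)   # overlapping bounding box
value = basic_iou(bbt, bbd)  # CA1 = 100, DA1 = 300
```

The union is obtained as the sum of the two box areas minus the common area, so no polygon clipping is needed.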

図６や図１２等を参照して説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、撮像画像において人物Ｐの所定部位をキーポイントＫＰとして検出し、バウンディングボックスＢＢ同士の重なり度合いとキーポイントＫＰの未検出率とに応じて密集度を算出してもよい。
バウンディングボックスＢＢの重なり度合いだけでなく、人物Ｐの関節等についての他人の体によるオクルージョンが発生したか否かを考慮することにより、人物Ｐ同士の近さをより好適に数値化することができる。従って、適切な密集度を算出することが可能となる。 As explained with reference to Figures 6 and 12, an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may detect specific parts of a person P in a captured image as key points KP, and calculate the density based on the degree of overlap between bounding boxes BB and the rate at which key points KP go undetected.
By considering not only the degree of overlap of the bounding boxes BB but also whether or not occlusion of the joints of the person P by the body of another person has occurred, it is possible to more appropriately quantify the closeness of the persons P to each other. Therefore, it becomes possible to calculate an appropriate density.

図８等を参照して説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、バウンディングボックスＢＢの重心位置をボックス重心位置ＢＢＰとして算出し、同一人物と推定される複数のキーポイントＫＰの重心位置をキーポイント重心位置ＫＢＰとして算出し、撮像画像上における座標が近いボックス重心位置ＢＢＰとキーポイント重心位置ＫＢＰを同一人物についてのものとして紐付けてもよい。
これにより、バウンディングボックスＢＢを設定する処理と、キーポイントＫＰを検出する処理とが異なるＡＩモデルを用いて実現される場合に、人物ＰごとのバウンディングボックスＢＢとキーポイントＫＰを適切に紐付けることができる。従って、人物Ｐごとの密集度を適切に算出することができる。 As explained with reference to Figure 8, etc., an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate the center of gravity of each bounding box BB as a box center of gravity position BBP, calculate the center of gravity of a plurality of key points KP estimated to belong to the same person as a key point center of gravity position KBP, and link a box center of gravity position BBP and a key point center of gravity position KBP whose coordinates on the captured image are close to each other as belonging to the same person.
As a result, when the process of setting the bounding box BB and the process of detecting the key points KP are realized using different AI models, it is possible to appropriately link the bounding box BB and the key points KP for each person P. Therefore, it is possible to appropriately calculate the density for each person P.
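One simple way to realize this linking is a nearest-centroid match, sketched below. The text only says that positions with close coordinates are linked, so the Euclidean distance, the independent per-box matching, and all function names here are assumptions.

```python
import math


def box_centroid(box):
    """Center of a box given as (x1, y1, x2, y2)."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def keypoint_centroid(points):
    """Mean position of one person's detected key points, given as (x, y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def link(boxes, keypoint_sets):
    """Map each box index to the index of the nearest key point centroid."""
    kcs = [keypoint_centroid(k) for k in keypoint_sets]
    links = {}
    for i, box in enumerate(boxes):
        bc = box_centroid(box)
        links[i] = min(range(len(kcs)), key=lambda j: math.dist(bc, kcs[j]))
    return links


boxes = [(0, 0, 10, 20), (30, 0, 40, 20)]
keypoint_sets = [
    [(34, 4), (36, 10), (35, 16)],   # centroid (35, 10)
    [(4, 5), (6, 9), (5, 16)],       # centroid (5, 10)
]
links = link(boxes, keypoint_sets)
```

Because each box is matched independently, two boxes could be linked to the same key point set in a crowded scene; a one-to-one assignment would need an additional step that is not shown here.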

上述したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、キーポイントＫＰごとに尤度を算出してもよい。
キーポイントＫＰごとに尤度を算出することで、検出された不確かなキーポイントＫＰによって未検出率が低く算出されてしまうことを防止することができる。従って、密集度を高精度に算出することが可能となる。 As described above, an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate the likelihood for each key point KP.
By calculating the likelihood for each key point KP, it is possible to prevent the non-detection rate from being calculated low due to an uncertain key point KP being detected. Therefore, it is possible to calculate the density with high accuracy.

図１２等を参照して述べたように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、密集度の算出において尤度が所定閾値（尤度閾値Ｔｈ１）未満とされたキーポイントＫＰは未検出のキーポイントＫＰとして扱ってもよい。
これにより、人物Ｐの所定の部位である可能性が高いキーポイントＫＰのみを用いてキーポイント重心位置ＫＢＰを算出することや未検出率を算出することなどが可能となるため、密集度を高精度に算出することが可能となる。 As described with reference to Figure 12, etc., an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may treat a key point KP whose likelihood is less than a predetermined threshold (likelihood threshold Th1) in calculating the density as an undetected key point KP.
This makes it possible to calculate the key point center of gravity position KBP and the non-detection rate using only the key points KP that are likely to be specific parts of the person P, thereby making it possible to calculate the density with high accuracy.
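The thresholding can be sketched as follows. This is a minimal illustration, assuming one likelihood value per expected key point, with `None` standing for a key point that was not detected at all; the function name and the sample values, including the value used for Th1, are hypothetical.

```python
def non_detection_rate(likelihoods, th1):
    """Fraction of key points whose likelihood is missing or below th1.

    `likelihoods` holds one entry per expected key point; None means the
    key point was not detected at all.
    """
    if not likelihoods:
        raise ValueError("at least one expected key point is required")
    undetected = sum(1 for p in likelihoods if p is None or p < th1)
    return undetected / len(likelihoods)


# Four expected key points: two confident, one uncertain, one missing.
rate = non_detection_rate([0.9, 0.8, 0.2, None], th1=0.5)
```

The uncertain key point (likelihood 0.2) is counted together with the missing one, so it cannot lower the non-detection rate.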

図１１等を参照して説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、同一人物と推定されるキーポイントＫＰにおける未検出率が所定閾値（未検出率閾値Ｔｈ２）以上である場合に、当該キーポイントＫＰを除外して密集度の算出を行ってもよい。
即ち、キーポイントＫＰがほとんど検出できない場合や、尤度の低いキーポイントＫＰが多く検出されている場合に、当該人物Ｐを検出していないと判定することで、不適切な密集度が算出されてしまうことを防止することができる。 As explained with reference to Figure 11, etc., an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may exclude the key points KP estimated to belong to the same person from the density calculation when the non-detection rate of those key points KP is equal to or greater than a predetermined threshold (non-detection rate threshold Th2).
In other words, when almost no key points KP can be detected or when many key points KP with low likelihood are detected, it is possible to prevent an inappropriate density from being calculated by determining that the person P has not been detected.
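The exclusion rule can be sketched as a filter over per-person non-detection rates. The dictionary layout, the person identifiers, and the value chosen for Th2 are assumptions for illustration.

```python
def keep_reliable_persons(rates, th2):
    """Return ids of persons whose non-detection rate is below th2.

    `rates` maps a person id to that person's key point non-detection rate.
    Persons at or above th2 are treated as not detected.
    """
    return [pid for pid, rate in rates.items() if rate < th2]


rates = {"P1": 0.1, "P2": 0.8, "P3": 0.5}
kept = keep_reliable_persons(rates, th2=0.5)
```

A rate exactly equal to the threshold is excluded, matching the "equal to or greater than" condition in the text.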

図９等を参照して説明したように、イメージセンサＩＳなどの信号処理装置における画像処理部（ＡＩ画像処理部４４）は、撮像画像の外周縁部に位置し一部が見切れている人物Ｐを除外して密集度の算出を行ってもよい。
これにより、カメラ装置３の画角外に関節等の部位が位置して不当にキーポイントＫＰの未検出率が高く算出され、密集度が高く算出されてしまうことを防止することができる。即ち、密集度の精度を高めることができる。 As explained with reference to Figure 9, etc., an image processing unit (AI image processing unit 44) in a signal processing device such as an image sensor IS may calculate the density by excluding a person P that is located on the outer edge of the captured image and is partially cut off.
This makes it possible to prevent a situation in which a non-detection rate of key points KP is calculated to be unreasonably high and the density is calculated to be high due to a part such as a joint being located outside the angle of view of the camera device 3. In other words, it is possible to improve the accuracy of the density.
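A simple way to approximate this exclusion is to drop any bounding box that reaches the image border. The criterion actually used to decide that a person is partially cut off is not stated in this passage, so the border test, the optional `margin` parameter, and the function names below are assumptions.

```python
def touches_border(box, width, height, margin=0):
    """True if a box (x1, y1, x2, y2) reaches within `margin` pixels of the edge."""
    x1, y1, x2, y2 = box
    return (x1 <= margin or y1 <= margin
            or x2 >= width - margin or y2 >= height - margin)


def drop_cut_off(boxes, width, height, margin=0):
    """Keep only boxes that lie fully inside the image."""
    return [b for b in boxes if not touches_border(b, width, height, margin)]


boxes = [(0, 50, 40, 200), (100, 60, 160, 220), (600, 80, 640, 300)]
inside = drop_cut_off(boxes, width=640, height=480)
```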

図２等を参照して説明したように、イメージセンサＩＳなどの信号処理装置は、密集度の情報をログとして出力する出力部（通信Ｉ／Ｆ４６）を備えていてもよい。
これにより、密集度に応じた後段の処理が容易になる。例えば、複数のカメラから得られた複数の撮像画像とその密集度のログ情報に基づいて、人が密集しているエリアや閑散としているエリアを判定する処理や、人が少ないエリアを推奨エリアとして提示する処理などを容易に実現することができる。 As described with reference to FIG. 2 etc., a signal processing device such as an image sensor IS may include an output unit (communication I/F 46) that outputs information on the density as a log.
This makes it easier to carry out downstream processing according to the degree of crowding. For example, it is easy to realize a process of determining whether an area is crowded or quiet, or a process of presenting areas with few people as recommended areas, based on multiple captured images obtained from multiple cameras and log information on the degree of crowding.

また図２等を参照して説明したように、イメージセンサＩＳなどの信号処理装置は、密集度と未検出率の情報をログとして出力する出力部（通信Ｉ／Ｆ４６）を備えていてもよい。
密集度やキーポイントＫＰの未検出率の情報をログとして出力することにより、それらを用いた後段の処理が容易となる。 As described with reference to FIG. 2 etc., a signal processing device such as an image sensor IS may include an output unit (communication I/F 46) that outputs information on the density and non-detection rate as a log.
By outputting information on the density and the rate of non-detection of key points KP as a log, it becomes easier to use them in subsequent processing.

更に、イメージセンサＩＳなどの信号処理装置における出力部（通信Ｉ／Ｆ４６）は、撮像画像のメタデータとしてログを出力してもよい。
これにより、例えばＭＩＰＩ規格に準拠した一般的なデータとして密集度や未検出率のデータを出力することができる。従って、規格化されたデータを受信可能な各種の装置に対して不要なデータ加工をせずにデータ出力することができる。 Furthermore, an output section (communication I/F 46) in a signal processing device such as an image sensor IS may output a log as metadata of a captured image.
This makes it possible to output data on the density and the non-detection rate as general data conforming to the MIPI standard, for example, and therefore to output the data to various devices capable of receiving standardized data without unnecessary data processing.
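The kind of log record that could accompany a captured image is sketched below. The text does not define the log format, so the JSON encoding and the field names are assumptions, and the sketch does not model MIPI packaging.

```python
import json


def build_log(frame_id, persons):
    """Serialize per-person density and non-detection rate as a JSON string."""
    record = {
        "frame_id": frame_id,
        "persons": [
            {"id": pid, "density": density, "non_detection_rate": rate}
            for pid, density, rate in persons
        ],
    }
    return json.dumps(record, sort_keys=True)


log = build_log(42, [("P1", 0.25, 0.0), ("P2", 0.5, 0.125)])
parsed = json.loads(log)  # a downstream device can recover the values as-is
```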

図２等を参照して説明したように、イメージセンサＩＳなどの信号処理装置において、撮像画像はＲＧＢ画像とされてもよい。
これにより、ＲＧＢ画像に基づいてキーポイントＫＰの検出を行う既存のプログラム等を利用することができ、処理効率を向上させることができる。 As described with reference to FIG. 2 etc., in a signal processing device such as an image sensor IS, a captured image may be an RGB image.
This makes it possible to use existing programs that detect key points KP based on RGB images, thereby improving processing efficiency.

管理装置２などの信号処理装置は、密集度に基づく画像表示を実行させる表示制御部（表示制御機能Ｆ１２）を備えたものである。
このような表示制御部による表示制御によってユーザ端末などの情報処理装置において所定の表示が実現されることにより、ユーザは視覚的に密集度を把握することができる。 A signal processing device such as the management device 2 includes a display control section (display control function F12) that executes image display based on the density.
Such display control by the display control unit realizes a predetermined display on an information processing device such as a user terminal, thereby enabling the user to visually grasp the density.

また、管理装置２などの信号処理装置における表示制御部（表示制御機能Ｆ１２）は、密集度に応じて重み付けされた二次元のガウス分布を用いたヒートマップを生成し撮像画像に重畳させてもよい。
これにより、ユーザは、密集度の数値の多寡を反映した画像に基づいて密集度を視覚的に把握することができる。 In addition, a display control unit (display control function F12) in a signal processing device such as the management device 2 may generate a heat map using a two-dimensional Gaussian distribution weighted according to the density and superimpose it on the captured image.
This allows the user to visually grasp the density based on the image that reflects the density value.
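A density-weighted Gaussian heat map can be sketched in plain Python as follows. This is an illustration under stated assumptions: one isotropic Gaussian per person centered at a hypothetical position, a fixed `sigma`, and a small grid; superimposing the result on the captured image is not shown.

```python
import math


def heatmap(width, height, persons, sigma=1.0):
    """Sum one density-weighted 2D Gaussian per person over a grid.

    `persons` is a list of (cx, cy, density) tuples in grid coordinates.
    """
    grid = [[0.0] * width for _ in range(height)]
    for cx, cy, density in persons:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                grid[y][x] += density * math.exp(-d2 / (2 * sigma ** 2))
    return grid


# Two persons: a dense one at (1, 1) and a sparse one at (3, 3).
grid = heatmap(5, 5, [(1, 1, 0.8), (3, 3, 0.2)])
```

Weighting each Gaussian by the density makes the peak around a crowded person higher than the peak around an isolated one, which is what lets the map reflect the magnitude of the density.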

イメージセンサＩＳなどの信号処理装置が実行する信号処理方法は、撮像画像において検出された人物Ｐに対応したバウンディングボックスＢＢを撮像画像上に設定する処理と、バウンディングボックスＢＢ同士の重なり度合いに応じて人物Ｐについての密集度を算出する処理と、を含むものである。 The signal processing method executed by a signal processing device such as an image sensor IS includes a process of setting a bounding box BB corresponding to a person P detected in a captured image on the captured image, and a process of calculating the density of the person P according to the degree of overlap between the bounding boxes BB.

本技術におけるプログラムは、撮像画像において検出された人物Ｐに対応したバウンディングボックスＢＢを撮像画像上に設定する機能と、バウンディングボックスＢＢ同士の重なり度合いに応じて人物Ｐについての密集度を算出する機能と、を演算処理装置に実行させるものであり、本技術の記憶媒体は、このようなプログラムが記憶されたコンピュータ装置が読み取り可能なものである。
このようなプログラムにより上述した信号処理装置としてのイメージセンサＩＳにおいて、複数の撮像画像を用いること無く密集度を局所的に推定することができる。 The program in the present technology causes a computing device to execute the functions of setting a bounding box BB on the captured image corresponding to a person P detected in the captured image, and calculating the density of the person P based on the degree of overlap between the bounding boxes BB, and the storage medium of the present technology is readable by a computer device in which such a program is stored.
By using such a program, the image sensor IS serving as the above-mentioned signal processing device can locally estimate the density without using a plurality of captured images.

これらのプログラムはコンピュータ装置等の機器に内蔵されている記録媒体としてのＨＤＤ（Hard Disk Drive）や、ＣＰＵを有するマイクロコンピュータ内のＲＯＭ等に予め記録しておくことができる。あるいはまたプログラムは、フレキシブルディスク、ＣＤ－ＲＯＭ（Compact Disk Read Only Memory）、ＭＯ(Magneto Optical)ディスク、ＤＶＤ(Digital Versatile Disc)、ブルーレイディスク（Blu-ray Disc（登録商標））、磁気ディスク、半導体メモリ、メモリカードなどのリムーバブル記録媒体に、一時的あるいは永続的に格納（記録）しておくことができる。このようなリムーバブル記録媒体は、いわゆるパッケージソフトウェアとして提供することができる。
また、このようなプログラムは、リムーバブル記録媒体からパーソナルコンピュータ等にインストールする他、ダウンロードサイトから、ＬＡＮ(Local Area Network)、インターネットなどのネットワークを介してダウンロードすることもできる。 These programs can be pre-recorded in a HDD (Hard Disk Drive) as a recording medium built into a device such as a computer device, or in a ROM in a microcomputer having a CPU. Alternatively, the programs can be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, a CD-ROM (Compact Disk Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a Blu-ray Disc (registered trademark), a magnetic disk, a semiconductor memory, or a memory card. Such removable recording media can be provided as so-called package software.
Such a program can be installed in a personal computer or the like from a removable recording medium, or can be downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.

なお、本明細書に記載された効果はあくまでも例示であって限定されるものではなく、また他の効果があってもよい。 Note that the effects described in this specification are merely examples and are not limiting, and other effects may also be present.

また、上述した各例はいかように組み合わせてもよく、各種の組み合わせを用いた場合であっても上述した種々の作用効果を得ることが可能である。
Furthermore, the above-mentioned examples may be combined in any manner, and even when various combinations are used, the above-mentioned various operational effects can be obtained.

＜１６．本技術＞
本技術は以下のような構成を採ることも可能である。
（１）
撮像画像において検出された対象物に対応したバウンディングボックスを前記撮像画像上に設定し、前記バウンディングボックス同士の重なり度合いに応じて前記対象物についての密集度を算出する画像処理部と、を備えた
信号処理装置。
（２）
前記画像処理部は、前記対象物ごとに前記密集度を算出する
上記（１）に記載の信号処理装置。
（３）
前記画像処理部は、
処理対象とされた対象物に対応した前記バウンディングボックスである対象バウンディングボックスと、該対象バウンディングボックスと共通領域を有した前記バウンディングボックスである重複バウンディングボックスと、の組ごとの密集度合いを示す値の平均値を用いて前記密集度を算出する
上記（２）に記載の信号処理装置。
（４）
前記画像処理部は、前記対象バウンディングボックスと前記重複バウンディングボックスの論理和領域に対する前記共通領域の占める割合に応じて前記組ごとの密集度合いを示す値を算出する
上記（３）に記載の信号処理装置。
（５）
前記画像処理部は、
前記撮像画像において前記対象物の所定部位をキーポイントとして検出し、
前記バウンディングボックス同士の重なり度合いと前記キーポイントの未検出率とに応じて前記密集度を算出する
上記（１）から上記（４）の何れかに記載の信号処理装置。
（６）
前記画像処理部は、
前記バウンディングボックスの重心位置をボックス重心位置として算出し、
同一対象物と推定される複数の前記キーポイントの重心位置をキーポイント重心位置として算出し、
前記撮像画像上における座標が近い前記ボックス重心位置と前記キーポイント重心位置を同一対象物についてのものとして紐付ける
上記（５）に記載の信号処理装置。
（７）
前記画像処理部は、前記キーポイントごとに尤度を算出する
上記（５）から上記（６）の何れかに記載の信号処理装置。
（８）
前記画像処理部は、前記密集度の算出において前記尤度が所定閾値未満とされた前記キーポイントは未検出の前記キーポイントとして扱う
上記（７）に記載の信号処理装置。
（９）
前記画像処理部は、同一対象物と推定される前記キーポイントの未検出率が所定閾値以上である場合に、当該キーポイントを除外して前記密集度の算出を行う
上記（８）に記載の信号処理装置。
（１０）
前記画像処理部は、前記撮像画像の外周縁部に位置し一部が見切れている対象物を除外して前記密集度の算出を行う
上記（５）から上記（９）の何れかに記載の信号処理装置。
（１１）
前記密集度の情報をログとして出力する出力部を備えた
上記（１）から上記（１０）の何れかに記載の信号処理装置。
（１２）
前記密集度と前記未検出率の情報をログとして出力する出力部を備えた
上記（５）から上記（１０）の何れかに記載の信号処理装置。
（１３）
前記出力部は、前記撮像画像のメタデータとして前記ログを出力する
上記（１１）から上記（１２）の何れかに記載の信号処理装置。
（１４）
前記撮像画像はＲＧＢ画像とされた
上記（１）から上記（１３）の何れかに記載の信号処理装置。
（１５）
撮像画像において検出された対象物に対応したバウンディングボックスを前記撮像画像上に設定する処理と、
前記バウンディングボックス同士の重なり度合いに応じて前記対象物についての密集度を算出する処理と、を信号処理装置が実行する
信号処理方法。
（１６）
撮像画像において検出された対象物に対応したバウンディングボックスを前記撮像画像上に設定する機能と、
前記バウンディングボックス同士の重なり度合いに応じて前記対象物についての密集度を算出する機能と、を演算処理装置に実行させるプログラムが記憶された、コンピュータ装置が読み取り可能な
記憶媒体。 <16. This Technology>
The present technology can also be configured as follows.
(1)
A signal processing device comprising an image processing unit that sets, on a captured image, a bounding box corresponding to an object detected in the captured image, and calculates a density for the object according to a degree of overlap between the bounding boxes.
(2)
The signal processing device according to (1) above, wherein the image processing unit calculates the density for each of the objects.
(3)
The signal processing device according to (2) above, wherein the image processing unit calculates the density using an average of values indicating a degree of density for each pair of a target bounding box, which is the bounding box corresponding to the object being processed, and an overlapping bounding box, which is a bounding box having a common area with the target bounding box.
(4)
The signal processing device according to (3) above, wherein the image processing unit calculates the value indicating the degree of density for each pair according to a ratio of the common area to a logical sum area of the target bounding box and the overlapping bounding box.
(5)
The signal processing device according to any one of (1) to (4) above, wherein the image processing unit
detects a predetermined part of the object in the captured image as a keypoint, and
calculates the density according to the degree of overlap between the bounding boxes and a non-detection rate of the keypoints.
(6)
The signal processing device according to (5) above, wherein the image processing unit
calculates a centroid position of the bounding box as a box centroid position,
calculates a centroid position of a plurality of the keypoints estimated to belong to the same object as a keypoint centroid position, and
associates a box centroid position and a keypoint centroid position whose coordinates on the captured image are close to each other as relating to the same object.
(7)
The signal processing device according to (5) or (6) above, wherein the image processing unit calculates a likelihood for each of the keypoints.
(8)
The signal processing device according to (7) above, wherein the image processing unit treats a keypoint whose likelihood is less than a predetermined threshold as an undetected keypoint when calculating the density.
(9)
The signal processing device according to (8) above, wherein, when the non-detection rate of the keypoints estimated to belong to the same object is equal to or higher than a predetermined threshold, the image processing unit excludes those keypoints when calculating the density.
(10)
The signal processing device according to any one of (5) to (9) above, wherein the image processing unit calculates the density while excluding objects that are located at the outer periphery of the captured image and are partially cut off.
(11)
The signal processing device according to any one of (1) to (10) above, further comprising an output unit that outputs information on the density as a log.
(12)
The signal processing device according to any one of (5) to (10) above, further comprising an output unit that outputs information on the density and the non-detection rate as a log.
(13)
The signal processing device according to (11) or (12) above, wherein the output unit outputs the log as metadata of the captured image.
(14)
The signal processing device according to any one of (1) to (13) above, wherein the captured image is an RGB image.
(15)
A signal processing method in which a signal processing device executes:
a process of setting, on a captured image, a bounding box corresponding to an object detected in the captured image; and
a process of calculating a density for the object according to a degree of overlap between the bounding boxes.
(16)
A storage medium readable by a computer device, storing a program that causes an arithmetic processing device to execute:
a function of setting, on a captured image, a bounding box corresponding to an object detected in the captured image; and
a function of calculating a density for the object according to a degree of overlap between the bounding boxes.
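
As an illustration of configurations (2) to (4), the following is a minimal Python sketch, not the implementation of the disclosure. It assumes axis-aligned bounding boxes given as corner coordinates, and takes the ratio of the common area to the logical sum area (that is, the union of the two boxes) as the value for each pair. The names Box, common_area, pair_value, and density_per_object are hypothetical, and returning 0.0 for an object with no overlapping bounding box is an assumption made here for the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (x1 < x2, y1 < y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return (self.x2 - self.x1) * (self.y2 - self.y1)


def common_area(a: Box, b: Box) -> float:
    """Area shared by two boxes (0.0 when they do not overlap)."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return w * h if w > 0 and h > 0 else 0.0


def pair_value(target: Box, other: Box) -> float:
    """Ratio of the common area to the logical sum (union) area of a pair."""
    inter = common_area(target, other)
    union = target.area + other.area - inter
    return inter / union if union > 0 else 0.0


def density_per_object(boxes: List[Box]) -> List[float]:
    """For each target box, average the pair values over its overlapping boxes.

    Assumption of this sketch: a box with no overlapping box gets density 0.0.
    """
    densities = []
    for i, target in enumerate(boxes):
        values = [
            pair_value(target, other)
            for j, other in enumerate(boxes)
            if j != i and common_area(target, other) > 0
        ]
        densities.append(sum(values) / len(values) if values else 0.0)
    return densities
```

For example, two 2x2 boxes offset horizontally by one pixel share an area of 2 out of a union of 6, so each receives a density of 1/3, while an isolated third box receives 0.0 under the assumption stated above.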

3 Camera device (signal processing device)
46 Communication I/F (output unit)
IS Image sensor (signal processing device)
BB Bounding box
BBt Target bounding box
BBd1, BBd2 Overlapping bounding boxes
KP Keypoint
BBP Box centroid position
KBP Keypoint centroid position
P Person (object)
Pt Target person (object)
CA1 Common area
DA1 Logical sum area
Th1 Likelihood threshold (predetermined threshold)
Th2 Non-detection rate threshold (predetermined threshold)
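
The roles of the two thresholds Th1 and Th2 in configurations (8) and (9) can be sketched in Python as follows. This is a hedged illustration under stated assumptions rather than the disclosed implementation: the threshold values, the function names, and the use of None for a keypoint that the detector did not return are all hypothetical. The sketch only computes the non-detection rate and the exclusion decision. It does not show how the rate is combined with the bounding box overlap, since that combination is not specified in this section.

```python
from typing import Optional, Sequence

# Hypothetical values; the actual thresholds are not given in this section.
TH1_LIKELIHOOD = 0.5        # Th1: likelihood threshold
TH2_NON_DETECTION = 0.75    # Th2: non-detection rate threshold


def non_detection_rate(
    likelihoods: Sequence[Optional[float]],
    th1: float = TH1_LIKELIHOOD,
) -> float:
    """Fraction of one object's expected keypoints treated as undetected.

    `likelihoods` holds one entry per expected keypoint of the object.
    None means the keypoint was not returned at all (an assumption of this
    sketch). A keypoint whose likelihood is below Th1 is also counted as
    undetected, as in configuration (8).
    """
    if not likelihoods:
        raise ValueError("at least one expected keypoint is required")
    undetected = sum(1 for p in likelihoods if p is None or p < th1)
    return undetected / len(likelihoods)


def keypoints_usable(
    likelihoods: Sequence[Optional[float]],
    th1: float = TH1_LIKELIHOOD,
    th2: float = TH2_NON_DETECTION,
) -> bool:
    """False when the rate is at or above Th2, in which case the object's
    keypoints are excluded from the density calculation, as in (9)."""
    return non_detection_rate(likelihoods, th1) < th2
```

With these hypothetical thresholds, an object with likelihoods [0.9, 0.4, None, 0.7] has a non-detection rate of 0.5 and its keypoints remain usable, whereas [0.1, None, 0.2, 0.9] gives 0.75, which reaches Th2 and leads to exclusion.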

Claims

1. A signal processing device comprising an image processing unit that sets, on a captured image, a bounding box corresponding to an object detected in the captured image, and calculates a density for the object according to a degree of overlap between the bounding boxes.

2. The signal processing device according to claim 1, wherein the image processing unit calculates the density for each of the objects.

3. The signal processing device according to claim 2, wherein the image processing unit calculates the density using an average of values indicating a degree of density for each pair of a target bounding box, which is the bounding box corresponding to the object being processed, and an overlapping bounding box, which is a bounding box having a common area with the target bounding box.

4. The signal processing device according to claim 3, wherein the image processing unit calculates the value indicating the degree of density for each pair according to a ratio of the common area to a logical sum area of the target bounding box and the overlapping bounding box.

5. The signal processing device according to claim 1, wherein the image processing unit
detects a predetermined part of the object in the captured image as a keypoint, and
calculates the density according to the degree of overlap between the bounding boxes and a non-detection rate of the keypoints.

6. The signal processing device according to claim 5, wherein the image processing unit
calculates a centroid position of the bounding box as a box centroid position,
calculates a centroid position of a plurality of the keypoints estimated to belong to the same object as a keypoint centroid position, and
associates a box centroid position and a keypoint centroid position whose coordinates on the captured image are close to each other as relating to the same object.

7. The signal processing device according to claim 5, wherein the image processing unit calculates a likelihood for each of the keypoints.

8. The signal processing device according to claim 7, wherein the image processing unit treats a keypoint whose likelihood is less than a predetermined threshold as an undetected keypoint when calculating the density.

9. The signal processing device according to claim 8, wherein, when the non-detection rate of the keypoints estimated to belong to the same object is equal to or higher than a predetermined threshold, the image processing unit excludes those keypoints when calculating the density.

10. The signal processing device according to claim 5, wherein the image processing unit calculates the density while excluding objects that are located at the outer periphery of the captured image and are partially cut off.

11. The signal processing device according to claim 1, further comprising an output unit that outputs information on the density as a log.

12. The signal processing device according to claim 5, further comprising an output unit that outputs information on the density and the non-detection rate as a log.

13. The signal processing device according to claim 11, wherein the output unit outputs the log as metadata of the captured image.

14. The signal processing device according to claim 1, wherein the captured image is an RGB image.

15. A signal processing method in which a signal processing device executes:
a process of setting, on a captured image, a bounding box corresponding to an object detected in the captured image; and
a process of calculating a density for the object according to a degree of overlap between the bounding boxes.

16. A storage medium readable by a computer device, storing a program that causes an arithmetic processing device to execute:
a function of setting, on a captured image, a bounding box corresponding to an object detected in the captured image; and
a function of calculating a density for the object according to a degree of overlap between the bounding boxes.