JP7124281B2

JP7124281B2 - Program, information processing device, image processing system

Info

Publication number: JP7124281B2
Application number: JP2017181828A
Authority: JP
Inventors: 陽子杉浦; 禎史荒木
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2017-09-21
Filing date: 2017-09-21
Publication date: 2022-08-24
Anticipated expiration: 2037-09-21
Also published as: JP2019057849A

Description

本発明は、プログラム、情報処理装置、及び、画像処理システムに関する。 The present invention relates to a program, an information processing device, and an image processing system.

同じ場所から同じ方向を撮像装置が周期的に撮像して、撮像範囲の現在の状況をリアルタイムに提供するサービスがある。監視カメラや消費者の行動調査などでは人物などの動体の動向が観測対象になるが、人物などの動体以外の背景が観測対象となる場合も少なくない。 There is a service in which an imaging device periodically captures images from the same location in the same direction and provides the current situation of the imaging range in real time. Surveillance cameras and consumer behavior surveys target the movement of moving bodies such as people, but there are many cases in which backgrounds other than moving bodies such as people become targets of observation.

背景が観測対象となるケースの一例として、ＥＣ（Electronic Commerce）サイト向けに実店舗の商品棚の画像をリアルタイムに、ユーザである商品の購入者等の端末装置に配信することが検討されている。ＥＣサイトのユーザは実際の商品棚を端末装置で見ることができるため、通信販売でありながら臨場感のある買い物を楽しむことができる。しかし、実店舗の内部が撮像された画像には商品棚の前にいる来客者が写っている場合が少なくない。来客者が写っている画像をそのまま提供することはプライバシーの保護に欠け、また、商品が来客者で隠れた画像が端末装置に提供されるとユーザが商品を閲覧できなくなる。 As one example of a case in which the background is the observation target, real-time distribution of images of product shelves in a physical store, for EC (Electronic Commerce) sites, to the terminal devices of users such as product purchasers is being considered. Because a user of the EC site can see the actual product shelf on the terminal device, the user can enjoy shopping with a sense of presence even though it is mail-order shopping. However, images captured inside a physical store often show a visitor standing in front of a product shelf. Providing an image that shows a visitor as-is fails to protect privacy, and if an image in which products are hidden by the visitor is provided to the terminal device, the user cannot view the products.

そこで、撮像された画像から背景以外の動体を除去する技術が知られている（例えば、特許文献１参照。）。特許文献１には、動体が映されたフレームを検出する動体検出部と、動体が検出されなくなった検出後フレームと検出後フレームより前の動体が検出された各検出フレームとを比較していくことによって、検出フレームから画情報が描画された検出フレームを更新フレームとして特定し、描画された画情報の領域を特定する描画フレーム特定部と、検出フレームを検出フレームの前のフレームで置き換え、検出フレームが画情報の描画された更新フレームである場合、対応する画情報の領域を検出後フレームの対応する領域の画像に重畳する合成処理部と、を備えた映像編集装置が開示されている。 Therefore, a technique for removing moving objects other than the background from a captured image is known (see, for example, Patent Document 1). Patent Document 1 discloses a video editing apparatus comprising: a moving object detection unit that detects frames in which a moving object appears; a drawing frame identification unit that compares a post-detection frame, in which the moving object is no longer detected, with each earlier detection frame in which the moving object was detected, thereby identifying as an update frame a detection frame in which image information was drawn and identifying the area of the drawn image information; and a synthesis processing unit that replaces a detection frame with the frame preceding it and, when the detection frame is an update frame in which image information was drawn, superimposes the area of the corresponding image information on the image of the corresponding area of the post-detection frame.

しかながら、従来の技術は、ある程度の期間の時系列の画像から動体を除去する技術であるため、撮像しながら出力する画像を更新することができないという問題があった。例えば、従来の技術は、動体が検出されなくなった検出後フレームが検出された時点から過去にさかのぼって動体の消去や画情報の描画の合成処理を行う方式であるため、人物がいなくなるまでホワイトボードなどに手書きされた画情報が表示されない。 However, because the conventional technique removes moving objects from a time series of images covering a certain period, it has the problem that the output image cannot be updated while imaging is in progress. For example, the conventional technique works backward in time from the point at which a post-detection frame (one in which the moving object is no longer detected) is found, erasing the moving object and compositing the drawn image information. Consequently, image information handwritten on a whiteboard or the like is not displayed until the person has left.

本発明は、上記課題に鑑み、撮像しながら出力する画像を更新することができるプログラムを提供することを目的とする。 In view of the above problem, an object of the present invention is to provide a program capable of updating the output image while imaging is in progress.

本発明は、情報処理装置を、撮像装置が撮像した画像を入力画像として取得する画像取得手段と、前記画像取得手段が取得した前記入力画像が被写体の画像か否かを判断する被写体画像判断手段と、前記入力画像が被写体の画像であると判断した場合に、前記入力画像を出力すると判断し、被写体の画像でないと判断した場合に外部に出力された出力画像を出力すると判断する判断手段と、前記判断手段の判断結果に応じて、前記入力画像又は前記出力画像を出力する出力手段と、前記画像取得手段が取得した前記入力画像と外部に出力された前記出力画像の第一の類似度を算出する類似度算出手段、として機能させ、前記第一の類似度が第一の閾値以上の場合、前記被写体画像判断手段は、前記入力画像が前記被写体の画像であると判断し、前記第一の類似度が前記第一の閾値未満の場合、前記入力画像を被写体の画像でないと判断し、前記第一の類似度が前記第一の閾値以上の場合、前記類似度算出手段は、前記画像取得手段が取得した現在の入力画像と、前記現在の入力画像よりも過去の入力画像との第二の類似度を算出し、前記判断手段は、前記第二の類似度が第二の閾値以上の場合、現在の前記入力画像を出力すると判断し、前記第二の類似度が前記第二の閾値未満の場合、外部に出力した前記出力画像を出力すると判断するプログラムを提供する。
The present invention provides a program that causes an information processing device to function as: image acquisition means for acquiring an image captured by an imaging device as an input image; subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject; determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image that was output to the outside when the input image is determined not to be an image of the subject; output means for outputting the input image or the output image according to the determination result of the determination means; and similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image that was output to the outside. When the first similarity is equal to or greater than a first threshold, the subject image determination means determines that the input image is an image of the subject; when the first similarity is less than the first threshold, it determines that the input image is not an image of the subject. When the first similarity is equal to or greater than the first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image earlier than the current input image. The determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image that was output to the outside when the second similarity is less than the second threshold.

撮像しながら出力する画像を更新することができるプログラムを提供することができる。 It is possible to provide a program capable of updating an image to be output while imaging.
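The two-threshold decision in the summary above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `similarity` measure (1 minus the mean absolute pixel difference), the flat-list image representation, the function names, and the threshold values 0.8 and 0.9 are all assumptions introduced for the example.

```python
from typing import Sequence

Image = Sequence[float]  # assumption: flattened grayscale pixels in [0, 1]


def similarity(a: Image, b: Image) -> float:
    """Hypothetical similarity in [0, 1]: 1 minus the mean absolute difference."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def choose_output(current: Image, earlier_input: Image, last_output: Image,
                  th1: float = 0.8, th2: float = 0.9) -> Image:
    """Return the image to output for the current input image."""
    # First similarity: current input image vs. the image already output.
    if similarity(current, last_output) < th1:
        return last_output  # not judged to be an image of the subject
    # Second similarity: current input image vs. an earlier input image.
    if similarity(current, earlier_input) < th2:
        return last_output  # input still differs from the earlier input
    return current          # both checks passed: update the output


# Hypothetical 10-pixel frames.
shelf = [0.5] * 10
person = [0.0] * 5 + [0.5] * 5        # a person covers half of the shelf
shelf_less = [0.5] * 9 + [0.9]        # one product has been taken
```

With these sample frames, a frame showing the person is rejected by the first check, and a frame of the changed shelf is accepted only once the earlier input also shows the changed shelf.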

時系列に撮像された商品棚の画像と動体の除去処理を説明する図の一例である。 An example of a diagram explaining images of a product shelf captured in time series and moving object removal processing.
周期的に撮像された画像がどのように変化するかを説明する図の一例である。 An example of a diagram explaining how periodically captured images change.
時系列の入力画像とその中から出力される出力画像を説明する図の一例である。 An example of a diagram explaining time-series input images and the output images output from among them.
画像処理システムの概略構成図の一例である。 An example of a schematic configuration diagram of the image processing system.
店舗のイメージと撮像装置の配置例の一例を説明する図である。 A diagram explaining an example of a store image and an arrangement of imaging devices.
撮像装置のハードウェア構成図の一例である。 An example of a hardware configuration diagram of the imaging device.
無線通信機能を有したクレードルの場合の通信端末のハードウェア構成図の一例である。 An example of a hardware configuration diagram of the communication terminal in the case of a cradle having a wireless communication function.
画像管理装置及び端末装置のハードウェア構成図の一例である。 An example of a hardware configuration diagram of the image management device and the terminal device.
画像処理システムが有する、撮像装置、通信端末、画像管理装置、及び端末装置の各機能ブロック図の一例である。 An example of functional block diagrams of the imaging device, communication terminal, image management device, and terminal device included in the image processing system.
分割後の画像の大きさを説明する図の一例である。 An example of a diagram explaining the size of an image after division.
類似度に基づく出力画像の決定方法について説明する図の一例である。 An example of a diagram explaining a method of determining the output image based on similarity.
画像処理システムが画像を提供する全体的な手順を示すシーケンス図の一例である。 An example of a sequence diagram showing the overall procedure by which the image processing system provides images.
端末装置が表示した店舗一覧画面の一例を示す図である。 A diagram showing an example of a store list screen displayed by the terminal device.
端末装置が表示した商品画面の一例を示す図である。 A diagram showing an example of a product screen displayed by the terminal device.
画像管理装置が出力画像を決定する手順を示すフローチャート図の一例である。 An example of a flowchart showing the procedure by which the image management device determines the output image.
画像処理システムが解決する不都合を説明する図の一例である。 An example of a diagram explaining a problem solved by the image processing system.
類似度に基づく出力画像の決定方法について説明する図の一例である（実施例２）。 An example of a diagram explaining a method of determining the output image based on similarity (Example 2).
類似度に基づく出力画像の決定方法について説明する図の一例である（実施例２）。 An example of a diagram explaining a method of determining the output image based on similarity (Example 2).
画像処理システムが出力画像を更新する手順を説明するフローチャート図の一例である（実施例２）。 An example of a flowchart explaining the procedure by which the image processing system updates the output image (Example 2).
画像処理システムが有する、撮像装置、通信端末、画像管理装置、及び端末装置の各機能ブロック図の一例である（実施例３）。 An example of functional block diagrams of the imaging device, communication terminal, image management device, and terminal device included in the image processing system (Example 3).
拡張処理を説明する図の一例である。 An example of a diagram explaining expansion processing.
８近傍拡張処理の好適例を説明する図の一例である。 An example of a diagram explaining a preferred example of 8-neighbor expansion processing.
拡張処理部が分割後の画像に対し拡張処理を行うフローチャート図の一例である。 An example of a flowchart in which the expansion processing unit performs expansion processing on the divided image.
画像処理システムが有する、撮像装置、通信端末、画像管理装置、及び端末装置の各機能ブロック図の一例である（実施例４）。 An example of functional block diagrams of the imaging device, communication terminal, image management device, and terminal device included in the image processing system (Example 4).
学習モデルを用いた出力画像の決定方法について説明する図の一例である。 An example of a diagram explaining a method of determining the output image using a learning model.
学習モデルが入力画像を破棄する手順を示すフローチャート図の一例である。 An example of a flowchart showing the procedure by which the learning model discards an input image.
学習モデルを用いた出力画像の決定方法について説明する図の一例である。 An example of a diagram explaining a method of determining the output image using a learning model.
画像管理装置が出力画像を決定する手順を示すフローチャート図の一例である（実施例４）。 An example of a flowchart showing the procedure by which the image management device determines the output image (Example 4).

以下、本発明を実施するための形態について図面を参照しながら実施例を挙げて説明する。 Hereinafter, modes for carrying out the present invention will be described by way of examples with reference to the drawings.

＜画像から動体を除去する画像処理の比較例＞ <Comparative example of image processing for removing a moving object from an image>
本実施形態を説明するに当たって、画像から動体を除去する動体除去の比較例を説明する。 Before describing the present embodiment, a comparative example of moving object removal, in which a moving object is removed from an image, will be described.

図１は、時系列に撮像された商品棚８１の画像と動体の除去処理を説明する図の一例である。撮像装置は商品棚の少なくとも一部を撮像する位置に配置され、周期的に商品棚を撮像している。このように同じ場所で撮像装置１が撮像することを定点観測という場合がある。撮像の時間間隔は、商品棚が変化した場合にいつまでも古い画像を提供することなく早期に画像を提供できる時間間隔であり、商品が売れる早さなどを考慮して決定される。このような映像配信をリアルタイム配信又はライブ配信などという。また、時間間隔の決定に当たっては、ネットワークの帯域や画像処理等の処理負荷が考慮される。時間間隔は短い方が、リアルタイム性が増すが、それほど短時間に商品棚は変化しない場合、処理負荷も増大してしまう。以上から、あくまで一例として本実施形態では時間間隔を１秒とするが、３０フレーム/secのように短い時間間隔で撮像してもよいし、２～１０フレーム/secで撮像してもよいし、１分以上を時間間隔としてもよい。 FIG. 1 is an example of a diagram explaining images of a product shelf 81 captured in time series and moving object removal processing. The imaging device is placed where it can image at least part of the product shelf and images the shelf periodically. Imaging by the imaging device 1 from the same place in this way is sometimes called fixed-point observation. The imaging interval is chosen so that, when the product shelf changes, an up-to-date image can be provided promptly rather than an old image being provided indefinitely, and it is determined in consideration of factors such as how quickly products sell. Such video distribution is called real-time distribution or live distribution. In determining the interval, network bandwidth and processing load such as image processing are also considered. A shorter interval improves real-time performance, but if the product shelf does not change that quickly, it merely increases the processing load. For these reasons, the interval is 1 second in this embodiment purely as an example; images may instead be captured at a short interval such as 30 frames/sec, at 2 to 10 frames/sec, or at an interval of 1 minute or more.

図１（ａ）は時系列に撮像された商品棚８１の画像を示す。画像８３_１、画像８３_２、画像８３_４及び画像８３_５には変化がないが、画像８３_３には人物８２が写っている。画像処理システムが画像８３_３を提供する際、人物８２を除去するが、人物８２を除去する方法としては、過去の所定時間の画像の画素ごとに平均を取って平均画像を作成する方法が考えられる。人物８２の滞留時間を１秒、所定時間を例えば３０秒とすると画像８３_３が平均画像に与える影響は１／３０になるので、人物８２が写っていない平均画像が得られる。 FIG. 1(a) shows images of the product shelf 81 captured in time series. Images 83_1, 83_2, 83_4, and 83_5 show no change, but a person 82 appears in image 83_3. When the image processing system provides image 83_3, it removes the person 82. One conceivable removal method is to average, pixel by pixel, the images from a predetermined past period to create an average image. If the person 82 stays for 1 second and the predetermined period is, for example, 30 seconds, image 83_3 contributes only 1/30 to the average image, so an average image in which the person 82 effectively does not appear is obtained.

しかしながら、店舗内の人物８２は単に通過する場合よりも商品を物色するために同じ商品棚８１の前で少なくとも数秒は滞留する。このため、平均画像に半透明の人物８２が写ってしまう。図１（ｂ）はこの半透明の人物８２が写っている平均画像８４を模式的に示す図である。より具体的には、人物８２が写っている画像の比率が多くなるにつれて時間と共に徐々に半透明の人物８２が濃くなり、また、比率が少なくなるにつれて時間と共に徐々に半透明の人物８２が薄くなっていく。平均に用いる画像の数を多くすれば人物８２を除去できるが、商品の変化が画像に反映されるのも遅くなってしまう。 However, a person 82 in a store does not simply pass by but typically stays in front of the same product shelf 81 for at least several seconds to browse the products. As a result, a semi-transparent person 82 appears in the average image. FIG. 1(b) schematically shows an average image 84 in which this semi-transparent person 82 appears. More specifically, the semi-transparent person 82 gradually becomes darker over time as the proportion of images showing the person 82 increases, and gradually becomes fainter as that proportion decreases. Increasing the number of images used for the average can remove the person 82, but it also delays the reflection of product changes in the image.

このように、平均画像による動体の除去は、平均画像の母数となる画像数に比べ人物８２が写っている画像の数が少なければ有効な場合があるが、商品を物色する人物８２が写る店舗内では好適でない場合がある。 Thus, removing a moving object by averaging can be effective when the number of images showing the person 82 is small relative to the total number of images averaged, but it may be unsuitable inside a store, where the images show a person 82 browsing products.
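The limitation of the averaging approach can be checked with a small numeric sketch. The two-pixel frames and brightness values below are hypothetical and chosen only to make the arithmetic visible; the 30-frame window follows the 30-second example in the text, assuming one frame per second.

```python
from typing import List, Sequence


def average_image(frames: Sequence[Sequence[float]]) -> List[float]:
    """Per-pixel mean over a window of equally sized frames."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]


SHELF = [1.0, 1.0]    # hypothetical 2-pixel frame showing only the shelf
PERSON = [0.0, 1.0]   # same frame with a person covering the first pixel

# Person present in 1 of 30 frames: the covered pixel keeps 29/30 of its value.
passing = average_image([SHELF] * 29 + [PERSON])

# Person present in 10 of 30 frames: the covered pixel drops to 20/30,
# which corresponds to the visible semi-transparent person described above.
browsing = average_image([SHELF] * 20 + [PERSON] * 10)
```

The person's weight in the average is the number of frames containing the person divided by the window length, so enlarging the window fades the ghost but also slows the appearance of real shelf changes by the same factor.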

＜本実施形態が着目する画像の変化例＞ <Example of image changes on which the present embodiment focuses>
そこで、本実施形態では実際に出力した出力画像と撮像された入力画像を比較して人物８２のような動体を検出し、動体が検出された場合は動体が検出されていない過去の出力画像を出力する動体除去方法で動体を除去する。出力画像とはユーザに提供される画像又は表示される画像であり、入力画像とは撮像された画像である。 Therefore, the present embodiment detects a moving object such as the person 82 by comparing the output image that was actually output with the captured input image, and removes the moving object by a removal method that, when a moving object is detected, outputs a past output image in which no moving object was detected. An output image is an image provided or displayed to the user, and an input image is a captured image.

図２は、周期的に撮像された画像がどのように変化するかを説明する図の一例であり、図３は、時系列の入力画像とその中から出力される出力画像を説明する図の一例である。図２では、ドリンク類が置かれた商品棚８１が周期的に撮像された画像の変化を示す。なお、画像はブロックに分割された状態で処理されるが、ブロックへの分割については後述する。 FIG. 2 is an example of a diagram explaining how periodically captured images change, and FIG. 3 is an example of a diagram explaining time-series input images and the output images output from among them. FIG. 2 shows changes in periodically captured images of a product shelf 81 on which drinks are placed. Note that images are processed after being divided into blocks; the division into blocks is described later.

図２では時間の経過に対する出力画像と入力画像の類似度（０～１の値を取り値が大きいほど類似度が高い）８５とその時の画像例を示す。 FIG. 2 shows the similarity 85 between the output image and the input image over time (it takes a value from 0 to 1, and a larger value means higher similarity), together with example images at each time.

時刻ｔ_０：商品棚８１が撮像され、商品が写った入力画像８６（以下、時刻ｔ_０～ｔ_ｎを付して区別する）が得られる。出力画像は入力画像８６_ｔ_０である。 Time t_0: The product shelf 81 is imaged, and an input image 86 showing the products is obtained (hereinafter, input images are distinguished by appending the times t_0 to t_n). The output image is the input image 86_t_0.

時刻ｔ_１：商品が写った入力画像８６_ｔ_１が得られるが、直前の入力画像８６_ｔ_０に対し変化がなく類似度はほぼ１である。出力画像は入力画像８６_ｔ_１である。 Time t_1: An input image 86_t_1 showing the products is obtained; there is no change from the immediately preceding input image 86_t_0, and the similarity is approximately 1. The output image is the input image 86_t_1.

時刻ｔ_２：人物８２が撮像範囲に入っため、人物８２が写った入力画像８６_ｔ_２が得られる。また、類似度が大きく低下したため、時刻ｔ１の出力画像は更新されない。この処理については図３で説明する。 Time t_2: Because the person 82 has entered the imaging range, an input image 86_t_2 showing the person 82 is obtained. Because the similarity has dropped sharply, the output image from time t_1 is not updated. This processing is described with reference to FIG. 3.

時刻ｔ_３：引き続き人物８２が撮像されているため人物８２が写った入力画像８６_ｔ_３が得られる。また、類似度が大きく低下したため、時刻ｔ１の出力画像は更新されない。 Time t_3: Because the person 82 is still being imaged, an input image 86_t_3 showing the person 82 is obtained. Because the similarity has dropped sharply, the output image from time t_1 is not updated.

時刻ｔ_４：引き続き人物８２が撮像されているため人物８２が写った入力画像８６_ｔ_４が得られる。また、類似度が大きく低下したため、時刻ｔ_１の出力画像は更新されない。 Time t_4: Because the person 82 is still being imaged, an input image 86_t_4 showing the person 82 is obtained. Because the similarity has dropped sharply, the output image from time t_1 is not updated.

時刻ｔ_５：撮像される人物８２の部位が変動しても引き続き人物８２が撮像されているため、人物８２が写った入力画像８６_ｔ_５が得られる。類似度は変動するが低下したままであり、時刻ｔ_１の出力画像は更新されない。 Time t_5: Although the imaged part of the person 82 changes, the person 82 is still being imaged, so an input image 86_t_5 showing the person 82 is obtained. The similarity fluctuates but remains low, and the output image from time t_1 is not updated.

時刻ｔ_６：人物８２が撮像範囲から立ち去ると商品棚８１が写った入力画像８６_ｔ_６が撮像される。類似度が高くなるため、出力画像は入力画像８６_ｔ_６に更新される。しかし、時刻ｔ_６で商品が１つなくなっているため、類似度はわずかに１より小さくなる。 Time t_6: When the person 82 leaves the imaging range, an input image 86_t_6 showing the product shelf 81 is captured. Because the similarity becomes high, the output image is updated to the input image 86_t_6. However, because one product is gone at time t_6, the similarity is slightly less than 1.

図３（ａ）は時間に対する類似度８５を示し、図３（ｂ）は入力画像８６_ｔ_０～８６_ｔ_６を示し、図３（ｃ）は出力画像８７（以下、時刻ｔ_０～ｔ_ｎを付して区別する）を示す。本実施形態の画像処理システムが提供したい画像は商品棚８１であるため、人物８２が写っている入力画像８６_ｔ_２～８６_ｔ_４は除去すべきである。除去すべき画像であることは類似度８５が閾値未満であることから判断される。 FIG. 3(a) shows the similarity 85 over time, FIG. 3(b) shows the input images 86_t_0 to 86_t_6, and FIG. 3(c) shows the output images 87 (hereinafter distinguished by appending the times t_0 to t_n). Because the image that the image processing system of this embodiment is meant to provide is that of the product shelf 81, the input images 86_t_2 to 86_t_4, in which the person 82 appears, should be removed. An image is judged to be one that should be removed when the similarity 85 is less than a threshold.

画像処理システムは除去した入力画像８６_ｔ_２～８６_ｔ_４の代わりに過去の出力画像を提供する。過去の出力画像は一例として最も新しい出力画像である（類似度が閾値以上の最も新しい出力画像８７_ｔ_１である。）。 In place of the removed input images 86_t_2 to 86_t_4, the image processing system provides a past output image. As one example, the past output image is the most recent output image (that is, the most recent output image 87_t_1 whose similarity was equal to or greater than the threshold).

このように本実施形態の画像処理システムはある入力画像に人物８２が写っていたら人物８２が写っていない過去の出力画像を出力する。したがって、人物８２などの動体が写っている入力画像を除去して背景である商品棚８１の画像を提供することができる。動体が写っている間は動体が検出される前の出力画像が表示されるので、閲覧者が商品を閲覧できないということがない。商品棚８１の商品が取り出されて商品棚８１の画像に変化が生じても類似度の低下はわずかなので、商品が減った商品棚の入力画像で出力画像を更新できる。 As described above, the image processing system of this embodiment outputs a past output image in which the person 82 is not shown when the person 82 is shown in an input image. Therefore, it is possible to provide an image of the product shelf 81 as the background by removing the input image including a moving object such as the person 82 . Since the output image before the detection of the moving object is displayed while the moving object is captured, the viewer will not be unable to browse the product. Even if the product on the product shelf 81 is taken out and the image of the product shelf 81 changes, the similarity is only slightly reduced, so the output image can be updated with the input image of the product shelf with fewer products.
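The timeline of FIG. 2 and FIG. 3 can be replayed with a short sketch. The similarity values and the threshold of 0.8 below are illustrative numbers, not values taken from the patent, and `replay` is a hypothetical helper; in a real system each similarity would be computed against the output image held at that moment.

```python
from typing import List, Optional


def replay(similarities: List[Optional[float]], threshold: float) -> List[int]:
    """For each time step, return the index of the input image that is shown.

    similarities[t] is the similarity between input image t and the output
    image held at that time; the first entry is None because nothing has
    been output yet.
    """
    shown: List[int] = []
    for t, s in enumerate(similarities):
        if t == 0 or (s is not None and s >= threshold):
            shown.append(t)          # update the output to the current input
        else:
            shown.append(shown[-1])  # keep the most recent output image
    return shown


# Illustrative values for t_0 .. t_6: the person is present from t_2 to t_5,
# and one product is missing at t_6 (similarity slightly below 1).
timeline = [None, 0.99, 0.30, 0.35, 0.30, 0.40, 0.95]
outputs = replay(timeline, threshold=0.8)
```

With these values the output stays at the image from t_1 while the person is in view and switches to the image from t_6 once the person leaves, which matches the behaviour described for times t_2 through t_6.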

＜用語について＞ <Terms>
入力画像とは、出力されるか否かに関係なく情報処理装置に入力された画像をいう。撮像装置から受信した画像だけでなく、記憶媒体に記憶された画像でもよい。 An input image is an image input to the information processing device, regardless of whether it is output. It may be an image stored in a storage medium as well as an image received from an imaging device.

出力画像とは外部に出力された画像であり、入力画像の全部又は一部が出力画像となる。出力先はディスプレイなどの表示装置の他、ネットワークを介して接続された情報処理装置、又は、記憶媒体でもよい。出力手段が出力することには、表示装置への出力、外部の装置への送信、及び、記憶媒体への記憶が含まれる。 An output image is an image that is output to the outside, and all or part of the input image is the output image. The output destination may be a display device such as a display, an information processing device connected via a network, or a storage medium. Output by the output means includes output to a display device, transmission to an external device, and storage in a storage medium.

動体とは、移動する機能を備えるものをいう。例えば、人間、動物、車・自転車移動体、カート（手押し車）、などがある。また、人間の移動に伴い移動するものも動体である。例えば、手提げ鞄も動体であり、人間が把持するドリンクも動体である。一方、商品としてのドリンクは移動する機能がなく人間が把持していないので動体でない。 A moving object is an object that has the ability to move. For example, there are humans, animals, cars/bicycles, carts (wheelbarrows), and the like. Objects that move along with human movements are also moving objects. For example, a handbag is a moving object, and a drink held by a person is also a moving object. On the other hand, a drink as a product is not a moving object because it does not have a function to move and is not held by a person.

被写体とは写し取られる対象をいう。何が被写体であるかは撮像の目的によって決まり、変更され得る。被写体と撮像装置の間にあるものを動体と称してもよい。また、被写体は撮影対象とも呼ばれる。 A subject is an object to be photographed. What is the subject depends on the purpose of the imaging and can be changed. An object between the object and the imaging device may be called a moving object. A subject is also called an object to be photographed.

背景画像とは、商品棚の画像又は動体が写っていない画像をいう。また、被写体は商品棚といった背景画像に含まれるものでもよく、背景画像が被写体を写した画像となってもよい。 A background image is an image of a product shelf or an image that does not include a moving object. Also, the subject may be included in the background image such as a product shelf, or the background image may be an image of the subject.

＜システム構成例＞ <System configuration example>
図４は、画像処理システム２００の概略構成図の一例である。画像処理システム２００は、通信ネットワーク９を介して接続された画像管理装置５、撮像装置１、通信端末３、及び、端末装置７を有している。撮像装置１は設置者Ｘにより店舗内に設置されている。端末装置７は閲覧者Ｙにより操作される。 FIG. 4 is an example of a schematic configuration diagram of the image processing system 200. The image processing system 200 has an image management device 5, an imaging device 1, a communication terminal 3, and a terminal device 7, which are connected via a communication network 9. The imaging device 1 is installed in the store by an installer X. The terminal device 7 is operated by a viewer Y.

通信ネットワーク９は、店舗内や閲覧者Ｙの所属先の企業のＬＡＮ、ＬＡＮをインターネットに接続するプロバイダのプロバイダネットワーク、及び、回線事業者が提供する回線等の少なくとも１つを含んで構築されている。通信端末３や端末装置７がＬＡＮを介さずに直接、回線電話網や携帯電話網に接続する場合は、ＬＡＮを介さずにプロバイダネットワークに接続することができる。また、通信ネットワークにはＷＡＮやインターネットが含まれる。通信ネットワークは有線又は無線のどちらで構築されてもよく、また、有線と無線が組み合わされていてもよい。 The communication network 9 is constructed to include at least one of a LAN in the store or at the company to which the viewer Y belongs, the provider network of a provider that connects the LAN to the Internet, and a line provided by a line operator. When the communication terminal 3 or the terminal device 7 connects directly to a fixed-line telephone network or a mobile phone network without going through a LAN, it can connect to the provider network without going through a LAN. The communication network also includes WANs and the Internet. The communication network may be wired or wireless, or a combination of the two.

撮像装置１は、一般的な画角のカメラ（例えば、焦点距離で３５ｍｍ）でもよいし一度の撮像で周囲３６０度を撮像し全天球画像を作成するカメラでもよい。一般的な画角と周囲３６０度との間の画角を撮像するカメラでもよい。ただし、画角が大きいほど少ない数の撮像装置１で広い範囲をカバーできる。 The imaging device 1 may be a camera with an ordinary angle of view (for example, a focal length of 35 mm) or a camera that captures the full 360-degree surroundings in a single shot to create an omnidirectional image. It may also be a camera with an angle of view between an ordinary one and 360 degrees. Note that the larger the angle of view, the fewer imaging devices 1 are needed to cover a wide range.

撮像装置１はデジタルスチルカメラ又はデジタルビデオカメラと呼ばれる場合がある。また、通信端末３にカメラが付いている場合は、通信端末３がデジタルカメラとなりうる。本実施形態では、説明を分かりやすくするために撮像装置１は全天球画像を得るためのデジタルカメラとして説明を行う。撮像装置１は周期的に周囲３６０度を撮像する。必ずしも周期的である必要はなく、不定期に撮像してもよいし、設置者Ｘの操作により撮像してもよいし、閲覧者Ｙが画像管理装置５に要求することで画像管理装置５からの命令で撮像してもよい。また、時間帯によって撮像する時間間隔を変更してもよい。例えば、商品が売れやすい時間帯（来客者が多い時間帯）は短くして、商品が売れにくい時間帯（来客者が少ない時間帯）は長くする。 The imaging device 1 is sometimes called a digital still camera or a digital video camera. When the communication terminal 3 has a camera, the communication terminal 3 can serve as the digital camera. In this embodiment, for ease of explanation, the imaging device 1 is described as a digital camera for obtaining omnidirectional images. The imaging device 1 periodically captures the full 360-degree surroundings. Imaging need not be periodic: images may be captured at irregular intervals, in response to an operation by the installer X, or on a command from the image management device 5 issued when the viewer Y makes a request to the image management device 5. The imaging interval may also be changed according to the time of day. For example, the interval is shortened during periods when products sell easily (periods with many visitors) and lengthened during periods when products sell slowly (periods with few visitors).
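Changing the imaging interval by time of day could look like the following sketch. The busy periods, the 1-second and 60-second intervals, and the names `BUSY_PERIODS` and `capture_interval_seconds` are all hypothetical; the text only states that the interval is shorter when there are many visitors and longer when there are few.

```python
from datetime import time

# Hypothetical busy periods (many visitors); real values would come from
# the store's own sales or visitor statistics.
BUSY_PERIODS = [
    (time(11, 30), time(13, 30)),
    (time(17, 0), time(20, 0)),
]


def capture_interval_seconds(now: time, busy: float = 1.0,
                             quiet: float = 60.0) -> float:
    """Return a short interval during busy periods and a long one otherwise."""
    for start, end in BUSY_PERIODS:
        if start <= now < end:
            return busy
    return quiet
```

For example, `capture_interval_seconds(time(12, 0))` falls inside the first busy period and returns the short interval, while a call at 03:00 returns the long one.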

なお、撮像装置１は、視線方向が異なる何枚かの風景を自動的に撮像し、複数の画像データを合成することで全天球画像を作成してもよい。 Note that the imaging device 1 may automatically capture a number of landscapes in different line-of-sight directions and synthesize a plurality of image data to create an omnidirectional image.

撮像装置１は定点観測の対象となる被写体がある場所に配置される。図４では店舗内に配置されているが一例に過ぎない。ＥＣサイト以外の動体除去の利用シーンについては後述する。撮像装置１の数は配置場所によって様々である。一般的に、見通しのよい配置場所では少ない数でよいし、見通しの悪い配置場所では多くなる。 The imaging device 1 is placed where the subject of fixed-point observation is located. In FIG. 4 it is placed in a store, but this is only an example. Use cases for moving object removal other than EC sites are described later. The number of imaging devices 1 varies with the location. In general, few are needed where visibility is good, and more are needed where visibility is poor.

通信端末３は、撮像装置１の代わりに通信ネットワーク９に接続する通信機能を有している。通信端末３は、撮像装置１への電力供給や店舗への固定を行うためのクレードル(Cradle)である。クレードルとは、撮像装置１の機能を拡張する拡張機器をいう。通信端末３は撮像装置１と接続するためのインタフェースを有し、これにより撮像装置１は通信端末３の機能を利用できる。通信端末３は、このインタフェースを介して撮像装置１とデータ通信を行なう。そして、無線ルータ９ａ及び通信ネットワーク９を介して画像管理装置５とデータ通信を行なう。 The communication terminal 3 has a communication function for connecting to the communication network 9 on behalf of the imaging device 1. The communication terminal 3 is a cradle that supplies power to the imaging device 1 and fixes it in place in the store. A cradle is an expansion device that extends the functions of the imaging device 1. The communication terminal 3 has an interface for connecting to the imaging device 1, which allows the imaging device 1 to use the functions of the communication terminal 3. The communication terminal 3 performs data communication with the imaging device 1 via this interface, and with the image management device 5 via the wireless router 9a and the communication network 9.

なお、撮像装置１が無線ルータ９ａや通信ネットワーク９と直接、データ通信する機能を有する場合、通信端末３はなくてもよい。あるいは、撮像装置１と通信端末３が一体に構成されていてもよい。 If the imaging device 1 has a function of performing data communication directly with the wireless router 9a or the communication network 9, the communication terminal 3 may be omitted. Alternatively, the imaging device 1 and the communication terminal 3 may be configured integrally.

画像管理装置５は、例えば、サーバとして機能する情報処理装置であり、通信ネットワーク９を介して、通信端末３及び端末装置７とデータ通信を行なうことができる。画像管理装置５は画像から動体を除去する画像処理を行い、この画像をＷｅｂページとして端末装置７に提供する。したがって、画像管理装置５はＷｅｂサーバとして動作する。このＷｅｂページは、例えば店舗の商品を販売するＥＣサイトのＷｅｂページとして利用されてよい。Ｗｅｂページに店舗のライブ映像が含まれるため臨場感のあるショッピングが可能になる。また、Ｗｅｂページは店舗内の商品棚８１を監視する監視用のＷｅｂページでもよい。管理者は商品棚８１の商品が少ないことを確認して補充したり、商品棚８１の商品が乱れていることを確認して整列させたりすることができる。 The image management device 5 is, for example, an information processing device that functions as a server, and can perform data communication with the communication terminal 3 and the terminal device 7 via the communication network 9 . The image management device 5 performs image processing for removing a moving object from the image, and provides this image to the terminal device 7 as a web page. Therefore, the image management device 5 operates as a web server. This web page may be used, for example, as a web page of an EC site that sells products in a store. Since the web page includes a live image of the store, shopping with a sense of presence becomes possible. Also, the Web page may be a monitoring Web page for monitoring the product shelf 81 in the store. The manager can confirm that the merchandise on the merchandise shelf 81 is low and replenish the merchandise, or confirm that the merchandise on the merchandise shelf 81 is disordered and arrange the merchandise.

全天球画像が撮像される場合、画像管理装置５には、OpenGL ES（3Dグラフィックス用のＡＰＩ：Application Interface）がインストールされている。OpenGL ESを呼び出すことで全天球画像から正距円筒画像を作成したり、全天球画像の一部の画像（所定領域画像）のサムネイル画像を作成したりすることができる。 When an omnidirectional image is captured, OpenGL ES (API for 3D graphics: Application Interface) is installed in the image management device 5 . By calling OpenGL ES, it is possible to create an equirectangular image from an omnidirectional image, or create a thumbnail image of a part of the omnidirectional image (predetermined area image).

なお、画像管理装置５にはクラウドコンピューティングが適用されていてよい。クラウドコンピューティングの物理的な構成に厳密な定義はないが、情報処理装置を構成するＣＰＵ、ＲＡＭ、ストレージなどのリソースが負荷に応じて動的に接続・切断されることで情報処理装置の構成や設置場所が柔軟に変更される構成が知られている。また、クラウドコンピューティングでは、画像管理装置５が仮想化されることが一般的である。１台の情報処理装置が仮想化によって複数の画像管理装置５としての機能を提供することや、複数の情報処理装置が仮想化によって一台の画像管理装置５としての機能を提供することができる。なお、画像管理装置５がクラウドコンピューティングとしてではなく単独の情報処理装置により提供されることも可能である。 Note that cloud computing may be applied to the image management device 5. There is no strict definition of the physical configuration of cloud computing, but a known configuration is one in which resources such as the CPU, RAM, and storage that make up an information processing device are dynamically connected and disconnected according to load, so that the configuration and location of the information processing device can change flexibly. In cloud computing, the image management device 5 is generally virtualized. Through virtualization, one information processing device can provide the functions of multiple image management devices 5, and multiple information processing devices can provide the functions of a single image management device 5. The image management device 5 may also be provided by a single information processing device rather than through cloud computing.

端末装置７は、画像管理装置５から動体が除去された画像を含むＷｅｂページを取得して表示する情報処理装置である。例えば、ＰＣ(Personal Computer)であり、通信ネットワーク９を介して、画像管理装置５とデータ通信を行う。端末装置７は、ノートＰＣの他、タブレット端末、ＰＣ、ＰＤＡ（Personal Digital Assistant）、電子黒板、テレビ会議端末、ウェアラブルＰＣ、ゲーム機、携帯電話、カーナビゲーションシステム、スマートフォンなどでもよい。また、これらに限られるものではない。 The terminal device 7 is an information processing device that acquires, from the image management device 5, a web page containing an image from which moving objects have been removed, and displays it. It is, for example, a PC (Personal Computer), and performs data communication with the image management device 5 via the communication network 9. Besides a notebook PC, the terminal device 7 may be a tablet terminal, a PC, a PDA (Personal Digital Assistant), an electronic blackboard, a video conference terminal, a wearable PC, a game machine, a mobile phone, a car navigation system, a smartphone, or the like. The terminal device 7 is not limited to these examples.

撮像装置１、通信端末３、及び無線ルータ９ａは、店舗等の各販売拠点で設置者Ｘによって所定の位置に設置される。ＥＣサイトと通信する端末装置７は、一般の消費者が生活する場所に配置されるか、又は、一般の消費者により携帯されてもよい。店舗を監視する端末装置７は、各店舗を統括する本社、店長の自宅、事務室、商品の配送者等に配置されるか、又は、管理者により携帯されてもよい。 The imaging device 1, the communication terminal 3, and the wireless router 9a are installed at predetermined positions by an installer X at each sales site such as a store. The terminal device 7 that communicates with the EC site may be placed where general consumers live, or may be carried by general consumers. The terminal device 7 that monitors a store may be placed at the head office that supervises the stores, the store manager's home, an office, a product delivery operator, or the like, or may be carried by an administrator.

＜店舗のイメージと撮像装置の配置例＞
図５は、店舗２のイメージと撮像装置１の配置例の一例である。一般の店舗２では通路を挟んで商品棚８１が並べられている。壁際には冷蔵庫の機能がある商品棚８１にドリンク類が陳列されている場合が多い。撮像装置１は全ての商品棚８１の全ての商品が撮像範囲に入るように数及び配置が決定されてもよいし、所定の１つ以上の商品が撮像範囲に入るように数及び配置が決定されてもよい。撮像装置１が全天球画像を撮像する場合、一般的な画角の撮像装置１よりも少ない数で所望の商品を撮像できる。図５では壁から１つ手前の商品棚８１の上部に撮像装置１が配置されている。撮像装置１は天井に配置されてもよいし、商品棚８１の中段に配置されてもよい。 <Image of store and example of arrangement of imaging device>
FIG. 5 shows an example of the store 2 and of an arrangement of the imaging devices 1. In a typical store 2, product shelves 81 are lined up on both sides of the aisles. Drinks are often displayed along the wall on product shelves 81 that function as refrigerators. The number and arrangement of the imaging devices 1 may be determined so that all products on all product shelves 81 fall within the imaging range, or so that one or more predetermined products fall within the imaging range. When the imaging device 1 captures omnidirectional images, the desired products can be captured with fewer devices than when imaging devices 1 with a general angle of view are used. In FIG. 5, the imaging device 1 is placed on the upper part of the product shelf 81 that is one shelf in front of the wall. The imaging device 1 may instead be placed on the ceiling or in the middle tier of a product shelf 81.

＜実施形態のハードウェア構成＞
次に、図６～図８を用いて、本実施形態の撮像装置１、通信端末３，端末装置７及び画像管理装置５のハードウェア構成を説明する。 <Hardware Configuration of Embodiment>
Next, the hardware configurations of the imaging device 1, the communication terminal 3, the terminal device 7, and the image management device 5 of this embodiment will be described with reference to FIGS. 6 to 8.

<<撮像装置>>
図６は、撮像装置１のハードウェア構成図の一例である。以下では、撮像装置１は、２つの撮像素子を使用した撮像装置とするが、撮像素子は３つ以上いくつでもよい。また、必ずしも全方位撮像専用の装置である必要はなく、通常のデジタルカメラやスマートフォン等に後付けの全方位撮像ユニットを取り付けることで、実質的に撮像装置１と同じ機能を有するようにしてもよい。 <<Imaging device>>
FIG. 6 is an example of a hardware configuration diagram of the imaging device 1. In the following description, the imaging device 1 is assumed to use two imaging elements, but it may have any number of imaging elements, three or more. Further, the device need not be dedicated to omnidirectional imaging; an ordinary digital camera, smartphone, or the like may be given substantially the same functions as the imaging device 1 by attaching a retrofit omnidirectional imaging unit.

図６に示されているように、撮像装置１は、撮像ユニット１０１、画像処理ユニット１０４、撮像制御ユニット１０５、マイク１０８、音処理ユニット１０９、ＣＰＵ(Central Processing Unit)１１１、ＲＯＭ(Read Only Memory)１１２、ＳＲＡＭ(Static Random Access Memory)１１３、ＤＲＡＭ(Dynamic Random Access Memory)１１４、操作部１１５、ネットワークＩ／Ｆ１１６、通信部１１７、及びアンテナ１１７ａによって構成されている。 As shown in FIG. 6, the imaging device 1 includes an imaging unit 101, an image processing unit 104, an imaging control unit 105, a microphone 108, a sound processing unit 109, a CPU (Central Processing Unit) 111, a ROM (Read Only Memory) 112, an SRAM (Static Random Access Memory) 113, a DRAM (Dynamic Random Access Memory) 114, an operation unit 115, a network I/F 116, a communication unit 117, and an antenna 117a.

このうち、撮像ユニット１０１は、各々半球画像を結像するための１８０°以上の画角を有する広角レンズ（いわゆる魚眼レンズ）１０２ａ，１０２ｂと、各広角レンズに対応させて設けられている２つの撮像素子１０３ａ，１０３ｂを備えている。撮像素子１０３ａ，１０３ｂは、魚眼レンズによる光学像を電気信号の画像データに変換して出力するＣＭＯＳ(Complementary Metal Oxide Semiconductor)センサやＣＣＤ(Charge Coupled Device)センサなどの画像センサ、この画像センサの水平又は垂直同期信号や画素クロックなどを生成するタイミング生成回路、この撮像素子の動作に必要な種々のコマンドやパラメータなどが設定されるレジスタ群などを有している。 Of these, the imaging unit 101 includes wide-angle lenses (so-called fisheye lenses) 102a and 102b, each having an angle of view of 180° or more for forming a hemispherical image, and two imaging elements 103a and 103b provided in correspondence with the respective wide-angle lenses. Each of the imaging elements 103a and 103b has an image sensor, such as a CMOS (Complementary Metal Oxide Semiconductor) sensor or a CCD (Charge Coupled Device) sensor, that converts the optical image formed by the fisheye lens into image data of an electrical signal and outputs it; a timing generation circuit that generates horizontal or vertical synchronization signals, pixel clocks, and the like for the image sensor; and a group of registers in which various commands and parameters required for the operation of the imaging element are set.

撮像ユニット１０１の撮像素子１０３ａ，１０３ｂは、各々、画像処理ユニット１０４とはパラレルＩ／Ｆバスで接続されている。一方、撮像ユニット１０１の撮像素子１０３ａ，１０３ｂは、撮像制御ユニット１０５とは別に、シリアルＩ／Ｆバス（Ｉ２Ｃバス等）で接続されている。画像処理ユニット１０４及び撮像制御ユニット１０５は、バス１１０を介してＣＰＵ１１１と接続される。更に、バス１１０には、ＲＯＭ１１２、ＳＲＡＭ１１３、ＤＲＡＭ１１４、操作部１１５、ネットワークＩ／Ｆ１１６、通信部１１７、及び電子コンパス１１８なども接続される。 The imaging elements 103a and 103b of the imaging unit 101 are each connected to the image processing unit 104 via a parallel I/F bus. Separately, the imaging elements 103a and 103b are connected to the imaging control unit 105 via a serial I/F bus (such as an I2C bus). The image processing unit 104 and the imaging control unit 105 are connected to the CPU 111 via a bus 110. The ROM 112, SRAM 113, DRAM 114, operation unit 115, network I/F 116, communication unit 117, electronic compass 118, and the like are also connected to the bus 110.

画像処理ユニット１０４は、撮像素子１０３ａ，１０３ｂから出力される画像データをパラレルＩ／Ｆバスを通して取り込み、それぞれの画像データに対して所定の処理を施した後、これらの画像データを合成処理して、正距円筒図のデータを作成する。 The image processing unit 104 takes in the image data output from the imaging elements 103a and 103b through the parallel I/F bus, applies predetermined processing to each set of image data, and then combines them to create equirectangular image data.

撮像制御ユニット１０５は、一般に撮像制御ユニット１０５をマスタデバイス、撮像素子１０３ａ，１０３ｂをスレーブデバイスとして、Ｉ２Ｃバスを利用して、撮像素子１０３ａ，１０３ｂのレジスタ群にコマンド等を設定する。必要なコマンド等は、ＣＰＵ１１１から受け取る。また、該撮像制御ユニット１０５は、同じくＩ２Ｃバスを利用して、撮像素子１０３ａ，１０３ｂのレジスタ群のステータスデータ等を取り込み、ＣＰＵ１１１に送る。 The imaging control unit 105 generally uses the I2C bus with the imaging control unit 105 as a master device and the imaging elements 103a and 103b as slave devices to set commands and the like in registers of the imaging elements 103a and 103b. Necessary commands and the like are received from the CPU 111 . The imaging control unit 105 also uses the I2C bus to take in status data and the like of the registers of the imaging elements 103 a and 103 b and send them to the CPU 111 .

また、撮像制御ユニット１０５は、操作部１１５のシャッターボタンが押下されたタイミングで、撮像素子１０３ａ，１０３ｂに画像データの出力を指示する。撮像装置１によっては、ディスプレイによるプレビュー表示機能や動画表示に対応する機能を持つ場合もある。この場合は、撮像素子１０３ａ，１０３ｂからの画像データの出力は、所定のフレームレート（フレーム／分）によって連続して行われる。 The imaging control unit 105 also instructs the imaging elements 103a and 103b to output image data at the timing when the shutter button of the operation unit 115 is pressed. Some imaging devices 1 have a preview display function using a display or a function supporting moving-image display. In that case, image data is output continuously from the imaging elements 103a and 103b at a predetermined frame rate (frames per minute).

また、撮像制御ユニット１０５は、後述するように、ＣＰＵ１１１と協働して撮像素子１０３ａ，１０３ｂの画像データの出力タイミングの同期をとる同期制御手段としても機能する。なお、本実施形態では、撮像装置１には表示部が設けられていないが、表示部を設けてもよい。 As described later, the imaging control unit 105 also functions as synchronization control means that, in cooperation with the CPU 111, synchronizes the output timing of the image data from the imaging elements 103a and 103b. In the present embodiment the imaging device 1 has no display unit, but one may be provided.

マイク１０８は、音を音（信号）データに変換する。音処理ユニット１０９は、マイク１０８から出力される音データをＩ／Ｆバスを通して取り込み、音データに対して所定の処理を施す。 The microphone 108 converts sound into sound (signal) data. The sound processing unit 109 takes in sound data output from the microphone 108 through the I/F bus and performs predetermined processing on the sound data.

ＣＰＵ１１１は、撮像装置１の全体の動作を制御すると共に必要な処理を実行する。ＲＯＭ１１２は、ＣＰＵ１１１のための種々のプログラムを記憶している。ＳＲＡＭ１１３及びＤＲＡＭ１１４はワークメモリであり、ＣＰＵ１１１で実行するプログラムや処理途中のデータ等を記憶する。特にＤＲＡＭ１１４は、画像処理ユニット１０４での処理途中の画像データや処理済みの正距円筒図のデータを記憶する。 The CPU 111 controls the overall operation of the imaging device 1 and executes necessary processing. The ROM 112 stores various programs for the CPU 111. The SRAM 113 and the DRAM 114 are work memories that store programs executed by the CPU 111, data being processed, and the like. In particular, the DRAM 114 stores image data being processed by the image processing unit 104 and the processed equirectangular image data.

操作部１１５は、種々の操作ボタンや電源スイッチ、シャッターボタン、表示と操作の機能を兼ねたタッチパネルなどの総称である。ユーザは操作ボタンを操作することで、種々の撮像モードや撮像条件などを入力する。 The operation unit 115 is a general term for various operation buttons, a power switch, a shutter button, a touch panel having both display and operation functions, and the like. A user inputs various imaging modes, imaging conditions, and the like by operating operation buttons.

ネットワークＩ／Ｆ１１６は、ＳＤカード等の外付けのメディアやパーソナルコンピュータなどとのインタフェース回路（ＵＳＢＩ／Ｆ等）の総称である。また、ネットワークＩ／Ｆ１１６としては、無線、有線を問わずにネットワークインタフェースである場合も考えられる。ＤＲＡＭ１１４に記憶された正距円筒図のデータは、このネットワークＩ／Ｆ１１６を介して外付けのメディアに記録されたり、必要に応じてネットワークＩ／ＦとなるネットワークＩ／Ｆ１１６を介して通信端末３等の外部装置に送信されたりする。 The network I/F 116 is a general term for interface circuits (such as a USB I/F) to external media such as SD cards and to personal computers. The network I/F 116 may also be a network interface, whether wireless or wired. The equirectangular image data stored in the DRAM 114 is recorded on external media via the network I/F 116 or, as necessary, transmitted via the network I/F 116 acting as a network interface to an external device such as the communication terminal 3.

通信部１１７は、撮像装置１に設けられたアンテナ１１７ａを介して、Wi-Fi(wireless fidelity)、ＮＦＣ、又はＬＴＥ（Long Term Evolution）等の離無線技術によって、通信端末３等の外部装置と通信を行う。この通信部１１７によっても、正距円筒図のデータを通信端末３の外部装置に送信することができる。 The communication unit 117 communicates with an external device such as the communication terminal 3 via the antenna 117a provided in the imaging device 1, using a wireless technology such as Wi-Fi (wireless fidelity), NFC, or LTE (Long Term Evolution). The communication unit 117 can also transmit the equirectangular image data to an external device such as the communication terminal 3.

電子コンパス１１８は、地球の磁気から撮像装置１の方位及び傾き(Roll回転角)を算出し、方位・傾き情報を出力する。この方位・傾き情報はExifに沿った関連情報（メタデータ）の一例であり、撮像画像の画像補正等の画像処理に利用される。なお、関連情報には、画像の撮像日時、及び画像データのデータ容量の各データも含まれている。 The electronic compass 118 calculates the azimuth and tilt (Roll rotation angle) of the imaging device 1 from the magnetism of the earth, and outputs the azimuth and tilt information. This azimuth/tilt information is an example of related information (metadata) according to Exif, and is used for image processing such as image correction of captured images. The related information also includes data such as the date and time when the image was captured and the data volume of the image data.

<<通信端末>>
次に、図７を用いて、通信端末３のハードウェア構成を説明する。なお、図７は、無線通信機能を有したクレードルの場合の通信端末３のハードウェア構成図である。 <<communication terminal>>
Next, the hardware configuration of the communication terminal 3 will be described with reference to FIG. 7. FIG. 7 is a hardware configuration diagram of the communication terminal 3 in the case where it is a cradle having a wireless communication function.

図７に示されているように、通信端末３は、通信端末３全体の動作を制御するＣＰＵ３０１、基本入出力プログラムを記憶したＲＯＭ３０２、ＣＰＵ３０１のワークエリアとして使用されるＲＡＭ(Random Access Memory)３０４、Wi-Fi（登録商標）、ＮＦＣ、ＬＴＥ等でデータ通信する通信部３０５、撮像装置１と有線で通信するためのＵＳＢ I/F３０３、カレンダーや時間情報を保持するＲＴＣ（Real Time Clock）３０６を有している。 As shown in FIG. 7, the communication terminal 3 has a CPU 301 that controls the overall operation of the communication terminal 3, a ROM 302 that stores a basic input/output program, a RAM (Random Access Memory) 304 used as a work area for the CPU 301, a communication unit 305 that performs data communication using Wi-Fi (registered trademark), NFC, LTE, or the like, a USB I/F 303 for wired communication with the imaging device 1, and an RTC (Real Time Clock) 306 that holds calendar and time information.

また、上記各部を電気的に接続するためのアドレスバスやデータバス等のバスライン３１０を備えている。 It also has a bus line 310 such as an address bus and a data bus for electrically connecting the above units.

なお、ＲＯＭ３０２には、ＣＰＵ３０１が実行するオペレーティングシステム(OS)、その他のプログラム、及び、種々データが記憶されている。 The ROM 302 stores an operating system (OS) executed by the CPU 301, other programs, and various data.

通信部３０５は、アンテナ３０５ａを利用して無線通信信号により、無線ルータ９ａ等と通信を行う。 The communication unit 305 uses an antenna 305a to communicate with the wireless router 9a or the like using a wireless communication signal.

図示する他、ＧＰＳ（Global Positioning Systems）衛星又は屋内ＧＰＳとしてのＩＭＥＳ(Indoor MEssaging System）によって通信端末３の位置情報（緯度、経度、及び高度）を含んだＧＰＳ信号を受信するＧＰＳ受信部を備えていてもよい。 In addition to the illustrated components, the communication terminal 3 may include a GPS receiver that receives GPS signals containing position information (latitude, longitude, and altitude) of the communication terminal 3 from GPS (Global Positioning System) satellites or from an IMES (Indoor MEssaging System) serving as indoor GPS.

<<画像管理装置、端末装置>>
図８を用いて、画像管理装置５及びノートＰＣの場合の端末装置７のハードウェア構成を説明する。なお、図８は、画像管理装置５及び端末装置７のハードウェア構成図である。画像管理装置５及び端末装置７はともにコンピュータであるため、以下では、画像管理装置５の構成について説明する。端末装置７の構成は画像管理装置５と同様であるとし、相違があるとしても本実施形態の説明に関し支障がないものとする。 <<Image management device, terminal device>>
The hardware configuration of the image management device 5, and of the terminal device 7 in the case where it is a notebook PC, will be described with reference to FIG. 8. FIG. 8 is a hardware configuration diagram of the image management device 5 and the terminal device 7. Since both the image management device 5 and the terminal device 7 are computers, the configuration of the image management device 5 is described below. The configuration of the terminal device 7 is assumed to be the same as that of the image management device 5; any differences do not affect the description of the present embodiment.

画像管理装置５は、画像管理装置５全体の動作を制御するＣＰＵ５０１、ＩＰＬ等のＣＰＵ５０１の駆動に用いられるプログラムを記憶したＲＯＭ５０２、ＣＰＵ５０１のワークエリアとして使用されるＲＡＭ５０３を有する。また、画像管理装置５用のプログラム等の各種データを記憶するＨＤ５０４、ＣＰＵ５０１の制御にしたがってＨＤ５０４に対する各種データの読み出し又は書き込みを制御するＨＤＤ(Hard Disk Drive)５０５を有する。また、フラッシュメモリ等の記録メディア５０６に対するデータの読み出し又は書き込み（記憶）を制御するメディアドライブ５０７、カーソル、メニュー、ウィンドウ、文字、又は画像などの各種情報を表示するディスプレイ５０８を有する。ディスプレイ５０８にはタッチパネルが装着されていることが好ましい。また、通信ネットワーク９を利用してデータ通信するためのネットワークＩ／Ｆ５０９、文字、数値、各種指示などの入力のための複数のキーを備えたキーボード５１１、各種指示の選択や実行、処理対象の選択、カーソルの移動などを行うマウス５１２を有する。また、着脱可能な記録媒体の一例としてのＣＤ－ＲＯＭ(Compact Disc Read Only Memory)５１３に対する各種データの読み出し又は書き込みを制御するＣＤ－ＲＯＭドライブ５１４を有する。また、上記各構成要素を図８に示されているように電気的に接続するためのアドレスバスやデータバス等のバスライン５１０を備えている。 The image management device 5 has a CPU 501 that controls the overall operation of the image management device 5, a ROM 502 that stores programs used to drive the CPU 501 such as an IPL, and a RAM 503 used as a work area for the CPU 501. It also has an HD 504 that stores various data such as programs for the image management device 5, and an HDD (Hard Disk Drive) 505 that controls reading and writing of various data from and to the HD 504 under the control of the CPU 501. It also has a media drive 507 that controls reading and writing (storage) of data from and to a recording medium 506 such as a flash memory, and a display 508 that displays various information such as a cursor, menus, windows, characters, and images. The display 508 is preferably equipped with a touch panel. It further has a network I/F 509 for data communication over the communication network 9, a keyboard 511 with a plurality of keys for entering characters, numerical values, various instructions, and the like, and a mouse 512 for selecting and executing various instructions, selecting a processing target, moving the cursor, and the like. It also has a CD-ROM drive 514 that controls reading and writing of various data from and to a CD-ROM (Compact Disc Read Only Memory) 513, an example of a removable recording medium. Finally, it has a bus line 510, such as an address bus and a data bus, for electrically connecting the above components as shown in FIG. 8.

＜画像処理システムの機能について＞
図９は、本実施形態の画像処理システム２００が有する、撮像装置１、通信端末３、画像管理装置５、及び端末装置７の各機能ブロック図である。 <Functions of the image processing system>
FIG. 9 is a functional block diagram of the imaging device 1, communication terminal 3, image management device 5, and terminal device 7 included in the image processing system 200 of this embodiment.

<<撮像装置の機能構成>>
撮像装置１は、受付部１２、撮像部１３、集音部１４、接続部１５、及び記憶・読出部１９を有している。これら各部は、図６に示されている各構成要素のいずれかが、ＳＲＡＭ１１３からＤＲＡＭ１１４上に展開された撮像装置１用のプログラムに従ったＣＰＵ１１１からの命令によって動作することで実現される機能又は手段である。 <<Function configuration of imaging device>>
The imaging device 1 has a reception unit 12, an imaging unit 13, a sound collection unit 14, a connection unit 15, and a storage/readout unit 19. Each of these units is a function or means realized when any of the components shown in FIG. 6 operates on instructions from the CPU 111 in accordance with the program for the imaging device 1 loaded from the SRAM 113 onto the DRAM 114.

また、撮像装置１は、図６に示されているＲＯＭ１１２、ＳＲＡＭ１１３、及びＤＲＡＭ１１４の１つ以上によって構築される記憶部１０００を有している。記憶部１０００には撮像装置１用のプログラム及び端末ＩＤが記憶されている。 The imaging apparatus 1 also has a storage unit 1000 configured by one or more of the ROM 112, SRAM 113, and DRAM 114 shown in FIG. A program for the imaging apparatus 1 and a terminal ID are stored in the storage unit 1000 .

撮像装置１の受付部１２は、ユーザ（図４では、設置者Ｘ）から撮像装置１に対する操作入力を受け付ける。なお、撮像装置１は設置者Ｘによる撮像のための操作がなくても自動的かつ周期的に周囲を撮像する。周期の間隔は、設置者Ｘが撮像装置１に設定する。あるいは、本社の管理者である閲覧者Ｙが画像管理装置５を介して設定してもよい。 The reception unit 12 of the imaging device 1 receives operation input to the imaging device 1 from a user (the installer X in FIG. 4). Note that the imaging device 1 captures its surroundings automatically and periodically even without an imaging operation by the installer X. The installer X sets the capture interval in the imaging device 1. Alternatively, a viewer Y who is an administrator at the head office may set it via the image management device 5.

撮像部１３は、周囲を撮像して画像データを作成する。本実施形態では周囲３６０度が写っている全天球画像の画像データを作成するが、一般的な画角の画像データであってもよい。 The image capturing unit 13 captures an image of the surroundings and creates image data. In this embodiment, the image data of the omnidirectional image showing the surrounding 360 degrees is created, but the image data of a general angle of view may be used.

集音部１４は、撮像装置１の周囲の音を集音して音声データに変換する。音声データが端末装置７に送信される場合、閲覧者Ｙはより臨場感がある状態でＥＣサイトから商品を購入できる。なお、音声データにプライバシー性が高い情報が含まれる可能性があるため、音声データが送信されなくてもよい。 The sound collector 14 collects sounds around the imaging device 1 and converts them into audio data. When the audio data is transmitted to the terminal device 7, the viewer Y can purchase the product from the EC site in a more realistic state. In addition, since there is a possibility that information with high privacy is included in the voice data, the voice data does not have to be transmitted.

接続部１５は、通信端末３からの電力供給を受けると共に、通信端末３とデータ通信を行う。電力はＡＣアダプターなど別の電源から供給されてもよい。また、撮像装置１と通信端末３は無線で通信してもよい。 The connection unit 15 receives power supply from the communication terminal 3 and performs data communication with the communication terminal 3 . Power may be supplied from another power source such as an AC adapter. Also, the imaging device 1 and the communication terminal 3 may communicate wirelessly.

記憶・読出部１９は、記憶部１０００に各種データを記憶したり、記憶部１０００から各種データを読み出したりする。なお、以下では、撮像装置１が記憶部１０００から読み書きする場合でも「記憶・読出部１９を介して」という記載を省略する場合がある。 The storage/readout unit 19 stores various data in the storage unit 1000 and reads out various data from the storage unit 1000 . Note that, hereinafter, even when the imaging apparatus 1 reads and writes from the storage unit 1000, the description “via the storage/readout unit 19” may be omitted.

<<通信端末の機能構成>>
通信端末３は、送受信部３１、受付部３２、接続部３３、及び記憶・読出部３９を有している。これら各部は、図７に示されている各構成要素のいずれかが、ＲＯＭ３０２からＲＡＭ３０４上に展開された通信端末３用のプログラムに従ったＣＰＵ３０１からの命令によって動作することで実現される機能又は手段である。 <<Function configuration of communication terminal>>
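The communication terminal 3 has a transmission/reception unit 31, a reception unit 32, a connection unit 33, and a storage/readout unit 39. Each of these units is a function or means realized when any of the components shown in FIG. 7 operates on instructions from the CPU 301 in accordance with the program for the communication terminal 3 loaded from the ROM 302 onto the RAM 304.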

また、通信端末３は、図７に示されているＲＯＭ３０２及びＲＡＭ３０４によって構築される記憶部３０００を有している。記憶部３０００には通信端末３用のプログラムが記憶されている。 The communication terminal 3 also has a storage unit 3000 constructed by the ROM 302 and RAM 304 shown in FIG. A program for the communication terminal 3 is stored in the storage unit 3000 .

（通信端末の各機能構成）
通信端末３の送受信部３１は、無線ルータ９ａ及び通信ネットワーク９を介して、画像管理装置５と各種データの送受信を行う。なお、以下では、通信端末３が画像管理装置５と通信する場合でも、「送受信部３１を介して」という記載を省略する場合がある。 (Each functional configuration of the communication terminal)
The transmission/reception unit 31 of the communication terminal 3 transmits and receives various data to and from the image management device 5 via the wireless router 9a and the communication network 9. In the following description, the phrase "via the transmission/reception unit 31" may be omitted even when the communication terminal 3 communicates with the image management device 5.

接続部３３は、撮像装置１に電力供給すると共に、データ通信を行う。撮像装置１と通信端末３は無線で通信してもよい。通信端末３にはＡＣアダプターなどから電力が供給される。 The connection unit 33 supplies power to the imaging device 1 and performs data communication. The imaging device 1 and the communication terminal 3 may communicate wirelessly. Power is supplied to the communication terminal 3 from an AC adapter or the like.

記憶・読出部３９は、記憶部３０００に各種データを記憶したり、記憶部３０００から各種データを読み出したりする。なお、以下では、通信端末３が記憶部３０００から読み書きする場合でも「記憶・読出部３９を介して」という記載を省略する場合がある。 The storage/readout unit 39 stores various data in the storage unit 3000 and reads out various data from the storage unit 3000 . In addition, hereinafter, even when the communication terminal 3 reads and writes from the storage unit 3000, the description “via the storage/readout unit 39” may be omitted.

<<画像管理装置の機能構成>>
画像管理装置５は、送受信部５１、画像分割部５２、類似度算出部５３、画面作成部５４、判断部５５、画像合成部５６、及び記憶・読出部５９を有している。これら各部は、図８に示されている各構成要素のいずれかが、ＨＤ５０４からＲＡＭ５０３上に展開された画像管理装置５用のプログラムに従ったＣＰＵ５０１からの命令によって動作することで実現される機能又は手段である。 <<Function configuration of the image management device>>
The image management device 5 has a transmission/reception unit 51, an image dividing unit 52, a similarity calculation unit 53, a screen creation unit 54, a determination unit 55, an image synthesizing unit 56, and a storage/readout unit 59. Each of these units is a function or means realized when any of the components shown in FIG. 8 operates on instructions from the CPU 501 in accordance with the program for the image management device 5 loaded from the HD 504 onto the RAM 503.

また、画像管理装置５は、図８に示されているＲＡＭ５０３、及びＨＤ５０４によって構築される記憶部５０００を有している。この記憶部５０００には、店舗管理ＤＢ５００１及び分割数ＤＢ５００２が構築されている。以下、各データベースについて説明する。 The image management apparatus 5 also has a storage unit 5000 constructed by the RAM 503 and the HD 504 shown in FIG. Store management DB 5001 and division number DB 5002 are constructed in this storage unit 5000 . Each database will be described below.

表１は、店舗管理ＤＢ５００１に記憶される各情報をテーブル状に示す店舗管理テーブルを示す。店舗管理テーブルには、店舗ＩＤ、店舗名、住所、拠点レイアウトマップ、端末ＩＤと商品、及び、画角の各項目が対応付けて記憶されている。また、店舗管理テーブルの１つの行をレコードという場合がある。

Table 1 shows a store management table that presents, in table form, the information stored in the store management DB 5001. The store management table stores the following items in association with one another: store ID, store name, address, site layout map (called the store layout map below), terminal ID and product, and angle of view. One row of the store management table may be referred to as a record.

店舗ＩＤは、店舗２を識別するための識別情報の一例である。店舗ＩＤは店舗２に対し重複しないように付与される。店舗ＩＤの一例としては重複しない番号とアルファベットの組み合わせが挙げられる。なお、ＩＤはIdentificationの略であり識別子や識別情報という意味である。ＩＤは複数の対象から、ある特定の対象を一意的に区別するために用いられる名称、符号、文字列、数値又はこれらのうち１つ以上の組み合わせをいう。以下のＩＤについても同様である。 The store ID is an example of identification information for identifying the store 2. Store IDs are assigned so that no two stores 2 share the same ID. An example of a store ID is a unique combination of numbers and letters. Note that ID is an abbreviation of Identification and means an identifier or identification information. An ID is a name, a code, a character string, a numerical value, or a combination of one or more of these, used to uniquely distinguish a specific object from a plurality of objects. The same applies to the IDs described below.

店舗名は、店舗２の名称であり、主に閲覧者Ｙが店舗２を判別するために使用される。住所は店舗２の所在を示す。住所は画像処理システム２００が地図上に店舗２の位置を表示する際に使用される。必要に応じて住所は緯度と経度に変換される。 The store name is the name of the store 2 and is mainly used by the viewer Y to identify the store 2 . The address indicates the location of the store 2 . The address is used when the image processing system 200 displays the location of the store 2 on the map. Addresses are converted to latitude and longitude if necessary.

店舗レイアウトマップには、各店舗のレイアウトを示す画像データなどのファイル名が登録される。店舗レイアウトマップにより店舗２における撮像装置１の位置、及び、商品などの位置が２次元座標で特定される。 File names of image data indicating the layout of each store are registered in the store layout map. The position of the imaging device 1 and the positions of the products in the store 2 are specified by two-dimensional coordinates based on the store layout map.

端末ＩＤと商品の項目は撮像装置１と商品を対応付ける項目である。店舗レイアウトマップで商品が選択された場合に、商品に対応付けられている店舗内の撮像装置１を画像管理装置５が特定するために使用される。端末ＩＤは、撮像装置１を識別するための識別情報である。端末ＩＤは、例えば、撮像装置１の例えばシリアル番号、製造番号、型番と重複しない数値、ＩＰアドレス、又は、ＭＡＣアドレスなどであるがこれらには限定されない。表１に示すように、１つの店舗２には１つ以上の撮像装置１（端末ＩＤ）が設置されており、商品の位置が店舗レイアウトマップに登録されている。商品と端末ＩＤは１対１に対応するとは限らず、１つの端末ＩＤに複数の商品が対応付けられる場合がある。 The item of terminal ID and product is an item that associates the imaging device 1 with a product. This is used by the image management device 5 to identify the imaging device 1 in the store associated with the product when the product is selected on the store layout map. The terminal ID is identification information for identifying the imaging device 1 . The terminal ID is, for example, a serial number, manufacturing number, numerical value that does not overlap with the model number, IP address, or MAC address of the imaging device 1, but is not limited to these. As shown in Table 1, one or more imaging devices 1 (terminal IDs) are installed in one store 2, and product positions are registered in the store layout map. Products and terminal IDs do not always correspond one-to-one, and a plurality of products may be associated with one terminal ID.

画角は、被写体である商品が写っている画角である。全天球画像には周囲の３６０度が写っているため、同一種の商品が写っている範囲は全天球画像の一部である。このため、商品に対応付けて画角が登録されている。画角のうちＨ１、Ｖ１は緯度と経度であり、同一種の商品の領域の中央の座標を示す。α１は同一種の商品の領域を画角で指定する。画角は同じ種類の商品の領域を指定する座標情報である。なお、１対の対角頂点で同一種の商品の領域を指定してもよい。画像処理システム２００は全天球画像のこの画角の範囲を指定して端末装置７に送信するので、端末装置７は全天球画像を受信した直後から所定の商品の商品棚の画像を表示できる。 The angle of view is the angle of view in which the product that is the subject appears. Because an omnidirectional image captures the full 360 degrees of the surroundings, the range in which products of the same type appear is only part of the omnidirectional image. For this reason, an angle of view is registered in association with each product. Within the angle of view, H1 and V1 are a latitude and a longitude and give the coordinates of the center of the area occupied by products of the same type. α1 specifies the extent of that area as an angle of view. The angle of view is thus coordinate information that specifies the area of products of the same type. The area may instead be specified by a pair of diagonal vertices. Because the image processing system 200 designates this angle-of-view range of the omnidirectional image when transmitting it to the terminal device 7, the terminal device 7 can display the image of the product shelf for the given product immediately after receiving the omnidirectional image.
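
As a rough illustration of how an angle-of-view entry could select a region of an equirectangular image, the following Python sketch converts a center direction and an angular extent into a pixel rectangle. The function name, the use of degrees, the reading of H1 as the horizontal angle and V1 as the vertical angle, the square region, and the linear angle-to-pixel mapping are all assumptions made for illustration; the source does not specify them.

```python
def view_to_pixel_rect(h_deg, v_deg, alpha_deg, width, height):
    """Map an angle-of-view entry to a pixel rectangle (left, top, right, bottom).

    Assumptions (not from the source): h_deg in [-180, 180) is the horizontal
    angle, v_deg in [-90, 90] is the vertical angle with up as positive, and
    alpha_deg is the full angular width and height of the region.
    """
    px_per_deg_x = width / 360.0   # equirectangular: 360 degrees across the width
    px_per_deg_y = height / 180.0  # and 180 degrees down the height
    cx = (h_deg + 180.0) * px_per_deg_x
    cy = (90.0 - v_deg) * px_per_deg_y
    half_w = alpha_deg / 2.0 * px_per_deg_x
    half_h = alpha_deg / 2.0 * px_per_deg_y
    # Clamp to the image; wrap-around at the left/right seam is not handled.
    left = max(0, int(round(cx - half_w)))
    right = min(width, int(round(cx + half_w)))
    top = max(0, int(round(cy - half_h)))
    bottom = min(height, int(round(cy + half_h)))
    return left, top, right, bottom
```

For a 3600 x 1800 image, an entry centered straight ahead with a 40-degree extent maps to a 400 x 400 pixel square around the image center. A region that straddles the seam of the equirectangular image would need extra handling that this sketch omits.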

表２は、分割数ＤＢ５００２に記憶される各情報をテーブル状に示す分割数テーブルを示す。分割数テーブルには全天球画像が分割される際、どのくらいの大きさに分割されるかが登録されている。分割数テーブルは画角、分割数、及び、商品の各項目を有する。画角と商品は表１の店舗管理テーブルと同じものである。分割数は画角で指定される領域を縦と横それぞれ何個に分割するかを示す。表２では縦と横の分割数は同じであるが異ならせてもよい。分割数が多いほど分割後の画像の大きさは小さくなる。分割数で分割後の画像の大きさを指定するのでなく分割後の画像の大きさそのものを指定してもよい。

Table 2 shows a division number table that presents, in table form, the information stored in the division number DB 5002. The division number table registers the size of the pieces into which the omnidirectional image is divided. It has the items angle of view, number of divisions, and product. The angle of view and product are the same as in the store management table of Table 1. The number of divisions indicates how many pieces the area specified by the angle of view is divided into, vertically and horizontally. In Table 2 the vertical and horizontal numbers of divisions are the same, but they may differ. The larger the number of divisions, the smaller each divided image. Instead of specifying the size of the divided images through the number of divisions, the size of the divided images may be specified directly.

商品ごとに分割後の画像の大きさが決定されることで、動体及び商品棚の変化を画像管理装置５が区別しやすくなり、全天球画像から動体を除去しやすくなる。詳細を図１０にて説明する。なお、必ずしも商品ごとに分割後の画像の大きさを変更する必要はなく、分割後の画像の大きさは商品の種類が異なっても同じでよい。 Determining the size of the image after division for each product makes it easier for the image management device 5 to distinguish between a moving object and changes in product shelves, and to remove the moving object from the omnidirectional image. Details will be described with reference to FIG. Note that it is not necessary to change the size of the image after division for each product, and the size of the image after division may be the same even if the types of products are different.
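
The relationship between the number of divisions and the size of each divided image can be sketched as a simple lookup. The table contents, product names, and function name below are hypothetical placeholders, since the actual values of Table 2 are not reproduced here.

```python
# Hypothetical division-number table: product -> angle of view and divisions.
# The real Table 2 values are not available; these entries are placeholders.
DIVISION_TABLE = {
    "drinks": {"alpha_deg": 40, "divisions": 4},
    "snacks": {"alpha_deg": 30, "divisions": 6},
}


def block_size(product, region_width, region_height, table=DIVISION_TABLE):
    """Return (block_width, block_height) in pixels for a product's region.

    The same number of divisions is applied vertically and horizontally,
    as in the Table 2 example; any remainder pixels are ignored here.
    """
    n = table[product]["divisions"]
    return region_width // n, region_height // n
```

With a 400 x 400 pixel region, the placeholder "drinks" entry yields 100 x 100 pixel blocks and the "snacks" entry yields 66 x 66 pixel blocks, which shows that a larger number of divisions gives smaller blocks.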

（画像管理装置の各機能構成）
画像管理装置５の送受信部５１は、通信ネットワーク９を介して通信端末３、又は端末装置７と各種データの送受信を行う。なお、以下では、画像管理装置５が端末装置７と通信する場合でも、「送受信部５１を介して」という記載を省略する場合がある。 (Each functional configuration of the image management device)
The transmission/reception unit 51 of the image management device 5 transmits and receives various data to and from the communication terminal 3 or the terminal device 7 via the communication network 9. In the following, the phrase "via the transmission/reception unit 51" may be omitted even when the image management device 5 communicates with the terminal device 7.

画像分割部５２は、全天球画像を所定の大きさのブロックに分割する。ブロックに分割することで１枚の全天球画像に動体が全く写らないことがなくても動体を除去できる。したがって、来客者数が少ない店舗２の全天球画像は分割されなくてもよい場合がある。また、撮像装置１が撮像する画像が全天球画像でなく一般的な画角である場合、分割されなくてもよい場合がある。ブロックへの分割については図１０にて説明する。 The image dividing unit 52 divides the omnidirectional image into blocks of a predetermined size. Dividing the image into blocks makes it possible to remove moving objects even if there is never a moment when the whole omnidirectional image is free of moving objects. Consequently, the omnidirectional image of a store 2 with few visitors may not need to be divided. Likewise, if the image captured by the imaging device 1 has a general angle of view rather than being an omnidirectional image, it may not need to be divided. Division into blocks is described with reference to FIG. 10.
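
The block division described above can be sketched in plain Python as follows. The representation of an image as a list of pixel rows and the function name are assumptions for illustration, and the sketch requires the image size to be an exact multiple of the division numbers.

```python
def split_into_blocks(image, rows, cols):
    """Split a 2D list of pixel values into rows x cols blocks.

    Returns a dict mapping (block_row, block_col) to a 2D list. The image
    height and width must be divisible by rows and cols respectively.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the division numbers")
    bh, bw = height // rows, width // cols
    blocks = {}
    for r in range(rows):
        for c in range(cols):
            # Take the pixel rows of this block, then slice out its columns.
            blocks[(r, c)] = [row[c * bw:(c + 1) * bw]
                              for row in image[r * bh:(r + 1) * bh]]
    return blocks
```

Keying the blocks by their grid position keeps each block independent, so a later step can decide per block whether to refresh it.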

類似度算出部５３はブロックに分割された各画像について、上記した出力画像と入力画像の類似度を算出する。詳細は後述する。なお、類似度を算出するため、類似度算出部５３は端末装置７に送信された出力画像を保持しておく。 The similarity calculation unit 53 calculates the similarity between the output image and the input image described above for each image divided into blocks. Details will be described later. In order to calculate the similarity, the similarity calculator 53 holds the output image transmitted to the terminal device 7 .

判断部５５は、類似度が閾値以上か否かを判断することで、出力画像を更新するか否かを決定する。すなわち、入力画像が被写体の画像かどうかを判断する。類似度が閾値以上の場合、出力画像を現在の入力画像で更新すると決定し、類似度が閾値未満の場合、１周期前の出力画像をそのまま出力すると決定する。 The determination unit 55 decides whether to update the output image by determining whether the similarity is greater than or equal to a threshold. In other words, it determines whether the input image is an image of the subject. If the similarity is greater than or equal to the threshold, it decides to update the output image with the current input image; if the similarity is less than the threshold, it decides to output the output image from one cycle earlier as is.
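
The per-block update rule can be sketched as below. The similarity measure used here (one minus the normalized mean absolute pixel difference), the default threshold of 0.9, and the function names are assumptions chosen only to make the example runnable; the similarity calculation actually used by the similarity calculation unit 53 is described elsewhere in the document.

```python
def similarity(block_a, block_b, max_value=255):
    """Hypothetical similarity in [0, 1]: 1 minus the normalized mean
    absolute difference between two equally sized grayscale blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return 1.0 - total / (count * max_value)


def update_output(prev_output, input_block, threshold=0.9):
    """Return the block to output for the current cycle."""
    if similarity(prev_output, input_block) >= threshold:
        # Similar enough to the last output: treat the input as the subject
        # (background) and refresh the output with it.
        return input_block
    # Too different: likely a moving object, so keep the previous output.
    return prev_output
```

A block that changes only slightly between cycles is refreshed, while a block that is suddenly covered by something else keeps showing the previous output.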

The image synthesizing unit 56 combines the divided images that the determination unit 55 has determined as output images to regenerate the original omnidirectional image. When image data is transmitted to the terminal device 7, the screen creation unit 54 creates a web page with which the terminal device 7 displays the omnidirectional image, using HTML data (or XHTML data), JavaScript (registered trademark), CSS (Cascading Style Sheets), and the like. Because the terminal device 7 parses the web page and displays a screen, the web page serves as screen information. In the present embodiment, the term “web page” includes a web page constructed as needed by a web application. A web application is software, or a mechanism thereof, that is used on a web browser and that operates through cooperation between a program written in a script language (for example, JavaScript (registered trademark)) running on the web browser and a program on the web server side.

記憶・読出部５９は、記憶部５０００に各種データを記憶したり、記憶部５０００から各種データを読み出したりする。なお、以下では、画像管理装置５が記憶部５０００から読み書きする場合でも「記憶・読出部５９を介して」という記載を省略する場合がある。 The storage/readout unit 59 stores various data in the storage unit 5000 and reads out various data from the storage unit 5000 . In the following, even when the image management device 5 reads and writes from the storage unit 5000, the description “via the storage/readout unit 59” may be omitted.

<Functional configuration of terminal device>
The terminal device 7 has a transmission/reception unit 71, a reception unit 72, a display control unit 73, and a storage/readout unit 79. Each of these units is a function or means realized when any of the components shown in FIG. 8 operates according to instructions from the CPU 501 in accordance with the program for the terminal device 7 loaded from the HD 504 onto the RAM 503.

The terminal device 7 also has a storage unit 7000 constructed from the RAM 503 and the HD 504 shown in FIG. 8. The storage unit 7000 stores a program for the terminal device 7. The program for the terminal device 7 is, for example, browser software, but may be any application software that, like browser software, has a communication function.

(Each functional configuration of the terminal device)
The transmission/reception unit 71 of the terminal device 7 transmits and receives various data to and from the image management device 5 via the communication network 9. In the following, even when the terminal device 7 communicates with the image management device 5, the description “via the transmission/reception unit 71” may be omitted.

受付部７２は、ユーザ（図４では、閲覧者Ｙ）からの操作入力を受け付ける。本実施形態では、商品棚８１（撮像装置１）の選択、閲覧者Ｙが購入する商品の選択、個数の受け付け、画像の回転、及び、商品が写っている画像の拡大や縮小などを受け付ける。 The receiving unit 72 receives an operation input from a user (viewer Y in FIG. 4). In this embodiment, selection of the product shelf 81 (imaging device 1), selection of products to be purchased by the viewer Y, acceptance of the number of products, rotation of the image, enlargement or reduction of the image showing the product, and the like are accepted.

表示制御部７３は、画像管理装置５から送信された画面情報を解釈して端末装置７のディスプレイ５０８に各種画面を表示させるための制御を行なう。 The display control unit 73 interprets the screen information transmitted from the image management device 5 and performs control for displaying various screens on the display 508 of the terminal device 7 .

記憶・読出部７９は、記憶部７０００に各種データを記憶したり、記憶部７０００から各種データを読み出したりする。なお、以下では、端末装置７が記憶部７０００から読み書きする場合でも「記憶・読出部７９を介して」という記載を省略する場合がある。 The storage/readout unit 79 stores various data in the storage unit 7000 and reads out various data from the storage unit 7000 . In the following description, even when the terminal device 7 reads and writes from the storage unit 7000, the description “via the storage/readout unit 79” may be omitted.

<Division of image>
Because an omnidirectional image has a wide angle of view, any moving object in the store is likely to be captured. When customers visit the store 2 frequently, omnidirectional images containing a moving object also become frequent. With a moving-object removal method that outputs a past omnidirectional image without the person 82 whenever the person 82 appears in an image, the most recent omnidirectional image that contains no person 82 at all lies further in the past as visits become more frequent, so an outdated image of the product shelf 81 is displayed. For this reason, in the present embodiment, moving objects are removed after the omnidirectional image is divided into blocks. With the image divided into blocks, a moving object appears in any one block less frequently, so a reasonably recent image of the product shelf 81 can be output.

The size of the divided images is described with reference to FIG. 10. FIG. 10 is an example of a diagram illustrating the size of the divided images. The size of a divided image is determined in consideration of the size of a product (a drink in FIG. 10) and the size of a moving object.

図１０（ａ）に示すように分割後の画像の大きさを商品と同程度にした場合を考える。分割後の画像の大きさを点線９１で示す。本実施形態では動体が写っている場合には過去の出力画像を出力し、商品棚８１の商品が減少又は増大した場合には入力画像を出力したい。しかし、分割後の画像の大きさと商品の大きさが同程度であると、動体が写っている場合も商品が減少又は増大した場合も類似度が低くなるおそれがあり区別がつかない。 Consider a case where the size of the image after division is set to the same size as the product, as shown in FIG. 10(a). A dotted line 91 indicates the size of the image after division. In this embodiment, it is desired to output the past output image when a moving object is captured, and to output the input image when the number of products on the product shelf 81 has decreased or increased. However, if the size of the divided image and the size of the product are about the same, there is a possibility that the degree of similarity will be low when a moving object is shown and when the number of products is reduced or increased, making it impossible to distinguish between them.

次に、図１０（ｂ）に示すように分割後の画像の大きさを動体よりも大きくした場合を考える。分割後の画像の大きさを点線９１で示す。分割後の画像に動体が写っている場合、類似度は低下するが、商品の数が減少又は増大した場合も類似度が低くなるため区別がつかないおそれがある。例えば、分割後の画像の一部に人体が重なると、商品の減少時の類似度と近くなり、区別がつかない。 Next, as shown in FIG. 10B, consider the case where the size of the divided image is larger than that of the moving object. A dotted line 91 indicates the size of the image after division. If the divided image contains a moving object, the degree of similarity decreases, but if the number of products decreases or increases, the degree of similarity also decreases, and there is a possibility that the products cannot be distinguished from each other. For example, if a human body overlaps a part of the image after division, the similarity is close to that when the product is reduced, and it cannot be distinguished.

以上から、分割後の画像の大きさは少なくとも商品よりも大きく、動体の大きさ以下であることが好ましい。更に、好ましくは、分割後の画像の大きさは商品の大きさの数倍以上であり、かつ、動体により完全に覆い隠される程度である。こうすることで、動体が写っている分割後の画像の類似度は十分に低くなり、商品の数が減少又は増大した場合の類似度はそれよりも高くなるので、両者を判別しやすくなる。表２の分割数テーブルは両者を判別しやすくなるように分割数が決定されている。本実施形態ではあくまで一例であるが、商品が４つ程度入る大きさに全天球画像を分割する。 From the above, it is preferable that the size of the image after division is at least larger than the product and equal to or smaller than the size of the moving object. Furthermore, preferably, the size of the image after division is several times or more the size of the product, and is completely covered by the moving object. By doing this, the similarity of the divided images showing the moving object is sufficiently low, and the similarity is higher when the number of products is decreased or increased, so that both can be easily distinguished. In the division number table of Table 2, division numbers are determined so that both can be easily distinguished. Although this embodiment is merely an example, the omnidirectional image is divided into a size that can contain about four products.
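The sizing guideline above can be sketched as follows. This is a minimal illustration, assuming that product and moving-object sizes are known in pixels; the function name `choose_block_size` and the parameter `products_per_side` are hypothetical and do not come from the embodiment or from the division number table.

```python
def choose_block_size(product_w, product_h, object_w, object_h, products_per_side=2):
    """Pick a block size (width, height) in pixels.

    The block spans products_per_side products in each direction
    (2 x 2 = about four products) but is capped at the moving-object
    size so that a moving object can cover the block completely.
    """
    block_w = min(product_w * products_per_side, object_w)
    block_h = min(product_h * products_per_side, object_h)
    # The guideline asks for a block larger than one product; reject sizes that are not.
    if block_w <= product_w or block_h <= product_h:
        raise ValueError("moving object is not larger than a single product")
    return block_w, block_h


print(choose_block_size(40, 100, 200, 600))  # (80, 200)
```

With a 40 x 100 pixel drink and a 200 x 600 pixel person, the block holds about four products and is still small enough to be hidden entirely by the person.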

<Similarity>
The similarity is an index of how closely one divided image resembles another. It may also be called the degree of matching or the degree of correlation.

R in Equation (1) is a cross-correlation coefficient, I is, for example, the input image, T is the output image, and (i, j) indicates a pixel position. N is the number of pixels in the horizontal direction and M is the number of pixels in the vertical direction. The cross-correlation coefficient R has a maximum of 1 and a minimum of 0. A larger cross-correlation coefficient R means that the images are more similar.
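Equation (1) itself does not appear in this text. One commonly used normalized cross-correlation that is consistent with the symbol definitions above, and that lies between 0 and 1 for non-negative pixel values, is shown here as an assumption; it is not confirmed to be the exact form used in the embodiment.

```latex
R = \frac{\displaystyle\sum_{j=0}^{M-1}\sum_{i=0}^{N-1} I(i,j)\,T(i,j)}
         {\sqrt{\displaystyle\sum_{j=0}^{M-1}\sum_{i=0}^{N-1} I(i,j)^{2}
                \times \sum_{j=0}^{M-1}\sum_{i=0}^{N-1} T(i,j)^{2}}}
```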

更に、入力画像の画素値の平均を入力画像から減じ、出力画像の画素値の平均を出力画像から減じて式（１）と同様の計算を行ってもよい。これにより、入力画像の明るさが変動しても類似度を安定的に計算することができる。 Furthermore, the average of the pixel values of the input image may be subtracted from the input image, and the average of the pixel values of the output image may be subtracted from the output image to perform calculations similar to Equation (1). As a result, the similarity can be stably calculated even if the brightness of the input image fluctuates.
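The mean-subtracted variant can be sketched as below. This is a minimal illustration, assuming blocks are given as lists of rows of grayscale values; the function name `zero_mean_ncc` is hypothetical, and the handling of a perfectly flat block is a choice made for this sketch only.

```python
import math


def zero_mean_ncc(input_block, output_block):
    """Cross-correlation after subtracting each block's mean pixel value."""
    xs = [p for row in input_block for p in row]
    ys = [p for row in output_block for p in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A flat block has zero variance; returning 0.0 is an assumption of this sketch.
    return num / den if den else 0.0


shelf = [[10, 20], [30, 40]]
brighter = [[p + 50 for p in row] for row in shelf]  # same scene, brighter lighting
print(zero_mean_ncc(shelf, brighter))
```

A uniform brightness offset leaves the result at 1, which is why this variant is stable under lighting changes. Note that after mean subtraction the value can become negative, so the range is -1 to 1 rather than 0 to 1.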

また、画素値の差の絶対値の合計であるＳＡＤ（Sum of Absolute Difference）、輝度値の差の二乗の合計であるＳＳＤ(Sum of Squared Difference)、を用いてもよい。しかしながら、ＳＡＤ又はＳＳＤでは商品の数に変化がないのに照明や日差しによる明るさの変化で値が大きくなる傾向がある。このため、類似度は相互相関係数により求めることが好適である。ただし、店舗内では照明が常時点灯されているため明るさの変化は少なく、ＳＡＤ又はＳＳＤが利用されてもよい。 Alternatively, SAD (Sum of Absolute Difference), which is the sum of absolute values of differences in pixel values, and SSD (Sum of Squared Difference), which is the sum of squares of differences in luminance values, may be used. However, with SAD or SSD, even though the number of products does not change, the value tends to increase due to changes in brightness due to lighting or sunlight. Therefore, it is preferable to obtain the degree of similarity from the cross-correlation coefficient. However, since the lights are always on in the store, there is little change in brightness, so SAD or SSD may be used.
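The sensitivity of SAD and SSD to brightness can be illustrated with a short sketch. The function names `sad` and `ssd` and the sample pixel values are illustrative assumptions.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


shelf = [[10, 20], [30, 40]]
brighter = [[p + 5 for p in row] for row in shelf]  # same products, slightly brighter
print(sad(shelf, brighter), ssd(shelf, brighter))  # 20 100
```

Both values are nonzero even though the products are unchanged, and they grow with the brightness offset. Unlike the cross-correlation coefficient, a larger SAD or SSD means less similarity, so the threshold comparison would have to be reversed.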

なお、人の顔を認識する技術が知られており、動体が人物８２であるとすると、画像管理装置５がこの技術を使って動体を検出することも考えられる。しかし、撮像装置１が商品棚８１を撮像する際、人は背後から撮像されるので顔が写らない。このため、人物８２の顔の検出は困難であり、本実施形態の動体の検出には向いていない。 A technique for recognizing a person's face is known, and if the moving object is the person 82, it is conceivable that the image management device 5 uses this technique to detect the moving object. However, when the imaging device 1 captures an image of the product shelf 81, the face of the person is not captured because the image is captured from behind. For this reason, it is difficult to detect the face of the person 82, and is not suitable for detecting a moving object in this embodiment.

<Determination of output image>
A method of determining the output image based on the similarity is described with reference to FIG. 11. FIG. 11(a) shows the similarity, FIG. 11(b) shows the input images, and FIG. 11(c) shows the output images. The determination unit 55 repeatedly determines whether the similarity between the output image and the input image is greater than or equal to the threshold.

Time t_0: When the imaging device 1 captures an omnidirectional image for the first time, the input image 86_t_0 is used as the output image as it is.

Time t_1: Since the products on the product shelf 81 have not changed, the similarity between the input image 86_t_1 and the output image 87_t_0 at time t_0 is high and is greater than or equal to the threshold. The determination unit 55 determines the input image 86_t_1 at time t_1 as the output image 87_t_1.

Time t_2: Since the person 82 is captured, the similarity between the input image 86_t_2 and the output image 87_t_1 at time t_1 is less than the threshold. The determination unit 55 determines the output image 87_t_1 at time t_1 as the output image unchanged.

Time t_3 to t_5: Since the person 82 is still captured, the similarity between the input images 86_t_3 to 86_t_5 and the output images 87_t_2 to 87_t_4 of one cycle earlier is less than the threshold. The determination unit 55 determines the output images 87_t_2 to 87_t_4 of one cycle earlier as the output images unchanged.

Time t_6: Since the person 82 has taken a product and left the front of the product shelf 81, the input image 86_t_6 shows the product shelf 81 with one product fewer. Consequently, the similarity between the input image at time t_6 and the output image at time t_5 is lower than when the products are entirely unchanged and higher than when the person 82 is captured. Hereinafter, this similarity is referred to as a “moderate similarity.” Since the moderate similarity is greater than or equal to the threshold, the determination unit 55 determines the input image 86_t_6 at time t_6 as the output image.

Time t_n: If the number of products does not change thereafter, the similarity between the output image 87_t_6 at time t_6 and the input image 86_t_n at time t_n is high (approximately 1), so the determination unit 55 determines the input image 86_t_n at time t_n as the output image.

As the output images at times t_2 to t_5 show, when a moving object is detected in the input image, the past output image 87_t_1 is output unchanged as the output images 87_t_2 to 87_t_5, so the viewer Y can browse the products displayed on the terminal device 7 in near real time (with a delay of a few seconds). The privacy of the person 82 in the store 2 is also protected.
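The sequence from t_0 to t_6 can be replayed with a short sketch. The similarity values and the threshold of 0.8 are made-up illustrative numbers, not values from the embodiment, and the function name `run_cycles` is hypothetical. In the actual system each similarity would be computed between the current input image and the previous output image rather than supplied in advance.

```python
def run_cycles(input_frames, similarities, threshold):
    """Return the output frame chosen in each cycle.

    similarities[k] is the similarity between input frame k + 1 and the
    output of the previous cycle (supplied here instead of computed).
    """
    outputs = [input_frames[0]]  # first capture is output as it is
    for frame, similarity in zip(input_frames[1:], similarities):
        if similarity >= threshold:
            outputs.append(frame)        # update with the current input
        else:
            outputs.append(outputs[-1])  # keep the previous output
    return outputs


frames = ["86_t_0", "86_t_1", "86_t_2", "86_t_3", "86_t_4", "86_t_5", "86_t_6"]
# unchanged shelf, person present four times, then one product removed
similarities = [0.99, 0.40, 0.40, 0.40, 0.40, 0.85]
print(run_cycles(frames, similarities, threshold=0.8))
# ['86_t_0', '86_t_1', '86_t_1', '86_t_1', '86_t_1', '86_t_1', '86_t_6']
```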

類似度算出部５３と判断部５５は分割後の画像ごとに類似度の算出と出力画像の決定を行う。そして、全ての分割後の画像で出力画像を決定すると、画像合成部５６が分割された画像を１つの全天球画像に合成する。 The similarity calculation unit 53 and the determination unit 55 calculate the similarity and determine the output image for each divided image. Then, when the output image is determined for all divided images, the image synthesizing unit 56 synthesizes the divided images into one omnidirectional image.

<Operation procedure>
FIG. 12 is an example of a sequence diagram showing the overall procedure by which the image processing system 200 provides images. FIG. 12 assumes a scene in which the viewer Y uses the image management device 5 as an EC site.

S12-1: The terminal device 7 logs in to the image management device 5. On a typical EC site, viewers are registered as members, and logging in identifies information about the viewer Y (address, telephone number, e-mail address, and the like). The viewer Y can also purchase products without logging in, but in that case the viewer Y needs to enter an address and the like.

S12-2: For the logged-in terminal device 7, the screen creation unit 54 of the image management device 5 creates a store list screen that lists the stores 2 near the viewer Y, based on the address of the viewer Y. The screen creation unit 54 compares the pre-registered address of the viewer Y, or the position information of the terminal device 7 transmitted from the terminal device 7, with the addresses in the store management table and identifies, for example, about 10 stores including the nearest store 2. It then creates a store list screen that shows the locations of the stores 2 on a map. The transmission/reception unit 51 of the image management device 5 transmits the store list screen to the terminal device 7. An example of the store list screen is shown in FIG. 13.
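One way to pick the nearby stores in S12-2 is sketched below. It assumes that the viewer's address and each store's address have already been converted to latitude and longitude, which the text does not state; the field names `lat` and `lon`, the function names, and the use of the haversine great-circle distance are all assumptions of this sketch.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_stores(viewer, stores, limit=10):
    """Return up to `limit` stores ordered by distance from the viewer (lat, lon)."""
    return sorted(
        stores,
        key=lambda s: haversine_km(viewer[0], viewer[1], s["lat"], s["lon"]),
    )[:limit]


stores = [
    {"id": "C", "lat": 36.0, "lon": 140.0},
    {"id": "A", "lat": 35.0, "lon": 139.01},
    {"id": "B", "lat": 35.5, "lon": 139.0},
]
print([s["id"] for s in nearest_stores((35.0, 139.0), stores, limit=2)])  # ['A', 'B']
```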

S12-3: The transmission/reception unit 71 of the terminal device 7 receives the store list screen, and the display control unit 73 displays the store list screen on the display 508. When the viewer Y selects the store 2 from which to purchase products, the reception unit 72 accepts the selection.

S12-4: The transmission/reception unit 71 of the terminal device 7 transmits a store ID for identifying the store 2 to the image management device 5. The transmission/reception unit 51 of the image management device 5 receives the store ID. Since the store 2 has thus been identified, the image management device 5 can transmit the product screen of this store 2 to the terminal device 7.

S12-5：店舗２の撮像装置１は一例として周期的に周囲を撮像する。 S12-5: As an example, the imaging device 1 of the store 2 periodically images the surroundings.

S12-6: The imaging device 1 transmits the omnidirectional image, the store ID, and the device ID to the image management device 5 via the communication terminal 3. For convenience of explanation, the communication terminal 3 is omitted in FIG. 12. All the imaging devices 1 in the store each transmit an omnidirectional image.

S12-7: The transmission/reception unit 51 of the image management device 5 receives the omnidirectional image, the store ID, and the device ID, and the similarity calculation unit 53 and the determination unit 55 determine the output image. This processing is described with reference to FIG. 15.

S12-8: The transmission/reception unit 51 of the image management device 5 transmits a product screen including the omnidirectional image to the terminal device 7. An example of the product screen is shown in FIG. 14.

S12-9: The transmission/reception unit 71 of the terminal device 7 receives the product screen, and the display control unit 73 displays the product screen on the display 508. When the viewer Y selects, on the product screen, the product shelf 81 holding the product to be purchased, the reception unit 72 accepts the selection of the product shelf 81.

S12-10：端末装置７の送受信部７１は商品棚８１に対応付けられた端末ＩＤを画像管理装置５に送信する。これにより、画像管理装置５は撮像装置１を特定してこの撮像装置１が撮像した全天球画像を端末装置７に送信できるようになる。 S12-10: The transmitting/receiving section 71 of the terminal device 7 transmits the terminal ID associated with the product shelf 81 to the image management device 5 . As a result, the image management device 5 can specify the imaging device 1 and transmit the omnidirectional image captured by the imaging device 1 to the terminal device 7 .

S12-11：画像管理装置５の送受信部５１は端末ＩＤで特定される撮像装置１が送信した全天球画像を画角と共に端末装置７に送信する。全天球画像には周囲３６０度が撮像されているので、この画角は商品が撮像されている範囲を指定するために使用される。端末装置７の表示制御部７３はこの画角で指定される範囲をディスプレイ５０８に表示する。 S12-11: The transmission/reception unit 51 of the image management device 5 transmits the omnidirectional image transmitted by the imaging device 1 identified by the terminal ID to the terminal device 7 together with the angle of view. Since the omnidirectional image captures the surrounding 360 degrees, this angle of view is used to specify the range in which the product is captured. The display control unit 73 of the terminal device 7 displays the range designated by this angle of view on the display 508 .
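How an angle of view could select part of the omnidirectional image is sketched below. The sketch assumes an equirectangular image in which column 0 corresponds to a yaw of 0 degrees and the columns span 360 degrees; the text does not specify the projection, and the function name `viewport_columns` and its parameters are hypothetical.

```python
def viewport_columns(image_width, yaw_center_deg, h_fov_deg):
    """Return the pixel columns covered by a horizontal angle of view.

    Columns wrap around at the right edge because the image covers 360 degrees.
    """
    start_deg = (yaw_center_deg - h_fov_deg / 2) % 360
    first = round(start_deg * image_width / 360) % image_width
    count = round(h_fov_deg * image_width / 360)
    return [(first + k) % image_width for k in range(count)]


columns = viewport_columns(image_width=360, yaw_center_deg=0, h_fov_deg=90)
print(columns[0], columns[-1], len(columns))  # 315 44 90
```

A 90-degree view centred on yaw 0 therefore starts at column 315, wraps past the image edge, and ends at column 44.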

The transmission timing of the omnidirectional image in step S12-10 is an example; regardless of whether the terminal ID has been transmitted from the terminal device 7, the image management device 5 may transmit each omnidirectional image to the terminal device 7 without delay once its output image has been determined. When only the omnidirectional image of the imaging device 1 specified by the terminal ID is transmitted, as in step S12-10, the communication load can be reduced. When the image management device 5 transmits omnidirectional images to the terminal device 7 without delay once their output images have been determined, the omnidirectional image can be switched quickly when the viewer Y selects a product shelf 81 (omnidirectional image).

<Screen examples>
FIG. 13 shows an example of the store list screen 601 displayed by the terminal device 7. The store list screen 601 has a map display column 602, a search column 603, a store information column 604, and a decision button 605. The map display column 602 displays a map centered on the viewer Y and shows the stores 2 around the viewer. The number of stores 2 shown may be determined in advance or may be set by the viewer Y. The viewer Y can enlarge or reduce the map to display more stores 2 or to check a more detailed route to a store 2.

The search column 603 further has an input field 603a and a search button 603b. The input field 603a is a field in which the viewer Y enters a store name or location, and the search button 603b is a button with which the terminal device 7 transmits a search request to the image management device 5. When a store 2 is searched for, the map display column 602 is also updated to display a map centered on the store 2 that matched the search.

店舗情報欄６０４は、選択中の店舗２の詳細な情報を表示するための欄である。送信直後の店舗一覧画面６０１では最寄り又は最期に利用した店舗２の詳細な情報が表示される。閲覧者Ｙはマウスやタッチパネルなどのポインティングデバイスで地図表示欄６０２から任意の店舗２を選択できる。決定ボタン６０５は、選択中の店舗２の画像を表示する旨を端末装置７が画像管理装置５に送信するためのボタンである。このように閲覧者Ｙは任意の店舗２を選択して、商品を選択できる。 A store information column 604 is a column for displaying detailed information on the selected store 2 . The store list screen 601 immediately after the transmission displays detailed information on the nearest or last used store 2 . The viewer Y can select an arbitrary store 2 from the map display field 602 with a pointing device such as a mouse or a touch panel. The decision button 605 is a button for the terminal device 7 to transmit to the image management device 5 that the image of the shop 2 being selected is to be displayed. In this way, the viewer Y can select an arbitrary store 2 and select a product.

地図表示欄６０２に代わって又は地図表示欄６０２と共に店舗２のリストをテキストデータで表示してもよい。 Instead of the map display column 602 or together with the map display column 602, the list of stores 2 may be displayed as text data.

FIG. 14 shows an example of the product screen 611 displayed by the terminal device 7. The product screen 611 has a store layout map column 612, a product image column 613, a cart 614, and a checkout button 615. The store layout map column 612 displays a store layout map, which is a simplified map showing which products are located where in the store. For example, each product shelf icon 612a displays a product category and also serves as a button that accepts selection with the pointing device 618. As shown in the store management table of Table 1, each product shelf icon 612a is associated with (linked to or embedded with) the terminal ID of an imaging device 1, and selecting a product shelf icon 612a identifies the terminal ID of that imaging device 1. The selected product shelf icon 612a is highlighted.

商品画像欄６１３には、店舗レイアウトマップで選択された商品棚アイコン６１２ａに対応付けられた撮像装置１の全天球画像が表示される。表示された直後の画角は画像管理装置５から指定されるが、閲覧者Ｙは全天球画像を回転させ任意の方向を表示させることができる。全天球画像には、１つの商品（商品のカテゴリーでなく個別の１商品）が占める領域ごとに領域に写っている商品の商品名とその価格が対応付けられている。領域は画角により特定される。端末装置７の受付部７２はポインティングデバイス６１８のクリック（又はタップ）を受け付けると、受け付けた領域に対応付けられた商品とその価格を表示制御部７３がポップアップ画像６１７で表示する。単なるマウスオーバーによりポップアップ画像６１７を表示してもよい。 The product image column 613 displays the omnidirectional image of the imaging device 1 associated with the product shelf icon 612a selected on the store layout map. The angle of view immediately after being displayed is designated by the image management device 5, but the viewer Y can rotate the omnidirectional image to display any direction. In the omnidirectional image, for each area occupied by one product (one individual product, not a product category), the product name and the price of the product appearing in the area are associated with each other. A region is specified by an angle of view. When the accepting unit 72 of the terminal device 7 accepts the click (or tap) of the pointing device 618 , the display control unit 73 displays the product and its price associated with the accepted area in a pop-up image 617 . A pop-up image 617 may be displayed by a simple mouseover.
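The association between regions and products can be sketched as a lookup by viewing direction. The data layout is an assumption: each region is modelled as a yaw and pitch range in degrees, and the class `ProductRegion`, the function `find_product`, and the sample products and prices are hypothetical. Regions that straddle the 0/360-degree seam are not handled in this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProductRegion:
    yaw_min: float
    yaw_max: float
    pitch_min: float
    pitch_max: float
    name: str
    price: int


def find_product(regions: List[ProductRegion], yaw: float, pitch: float) -> Optional[ProductRegion]:
    """Return the region containing the clicked direction, or None if there is none."""
    for region in regions:
        if region.yaw_min <= yaw < region.yaw_max and region.pitch_min <= pitch < region.pitch_max:
            return region
    return None


regions = [
    ProductRegion(10, 20, -5, 10, "green tea", 150),
    ProductRegion(20, 30, -5, 10, "cola", 130),
]
hit = find_product(regions, yaw=22.0, pitch=0.0)
print(hit.name, hit.price)  # cola 130
```

The name and price of the returned region are what the pop-up image 617 would show for the clicked position.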

The viewer Y checks whether the product is the desired one and, if so, drags and drops it onto the cart 614 with the pointing device 618. The reception unit 72 detects that the product has been dragged and dropped onto the cart 614 and associates the product with the cart 614. In response to an operation on the cart 614 such as a click (or tap), the display control unit 73 displays a list of the products in the cart 614, their prices, the total amount, and the like.
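The cart behaviour can be sketched as a small data structure. The class `Cart`, its methods, and the prices are illustrative assumptions; the text does not describe how the cart is stored.

```python
from collections import Counter


class Cart:
    """Holds dropped products and reports line items and the total amount."""

    def __init__(self, price_table):
        self.price_table = price_table  # product name -> unit price
        self.quantities = Counter()

    def drop(self, product_name, quantity=1):
        """Associate a product with the cart, as on a drag and drop."""
        if product_name not in self.price_table:
            raise KeyError(product_name)
        self.quantities[product_name] += quantity

    def summary(self):
        """Return ([(name, quantity, subtotal), ...], total), sorted by name."""
        lines = [
            (name, qty, self.price_table[name] * qty)
            for name, qty in sorted(self.quantities.items())
        ]
        return lines, sum(subtotal for _, _, subtotal in lines)


cart = Cart({"green tea": 150, "cola": 130})
cart.drop("green tea", 2)
cart.drop("cola")
print(cart.summary())  # ([('cola', 1, 130), ('green tea', 2, 300)], 430)
```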

レジに進むボタン６１５はカートの商品を決済する画面に遷移するためのボタンである。閲覧者Ｙはこの画面で商品の配送先を設定したり合計金額を支払ったりすることができる。 A checkout button 615 is a button for transitioning to a screen for settling the items in the cart. The viewer Y can set the delivery destination of the product and pay the total amount on this screen.

<Determination of output image>
FIG. 15 is an example of a flowchart showing the procedure by which the image management device 5 determines the output image. The processing in FIG. 15 starts each time the imaging device 1 captures one omnidirectional image and transmits it to the image management device 5.

まず、画像分割部５２が全天球画像を分割する（ステップＳ１５－１）。画像分割部５２は分割数テーブルを参照し、画角に対応付けられた分割数で全天球画像を分割する。分割数テーブルに登録されていない画角については予め決まった大きさに分割する。また、全天球画像の全体を分割する必要はない。店舗２の天井や撮像装置１の真下付近には動体が存在しないと考えてよいためである。したがって、画像分割部５２は天井や撮像装置１の真下付近の画角を分割しなくてもよい。 First, the image dividing unit 52 divides the omnidirectional image (step S15-1). The image dividing unit 52 refers to the division number table and divides the omnidirectional image by the division number associated with the angle of view. Angles of view not registered in the division number table are divided into predetermined sizes. Also, it is not necessary to divide the entire omnidirectional image. This is because it can be assumed that there is no moving object on the ceiling of the store 2 or in the vicinity directly below the imaging device 1 . Therefore, the image dividing unit 52 does not need to divide the angle of view near the ceiling or directly below the imaging device 1 .

次に、類似度算出部５３は分割後の画像ごとに１周期前の出力画像と現在の入力画像の類似度を算出する（ステップＳ１５－２）。 Next, the similarity calculator 53 calculates the similarity between the output image of one period before and the current input image for each divided image (step S15-2).

そして、判断部５５が１周期前の出力画像と現在の入力画像の類似度が閾値以上か否かを判断する（ステップＳ１５－３）。 Then, the determination unit 55 determines whether or not the degree of similarity between the output image one period before and the current input image is equal to or greater than a threshold value (step S15-3).

類似度が閾値以上の場合、判断部５５は出力画像を入力画像で更新する（ステップＳ１５－４）。類似度が閾値未満の場合、判断部５５は１周期前の出力画像をそのまま出力画像として出力すると決定する（ステップＳ１５－５）。 If the degree of similarity is greater than or equal to the threshold, the determination unit 55 updates the output image with the input image (step S15-4). If the degree of similarity is less than the threshold, the judgment unit 55 determines to output the output image of one period before as it is (step S15-5).

そして、判断部５５は全ての分割後の画像の処理が終わったか否かを判断する（ステップＳ１５－６）。動体が写る可能性がない画角に対応する分割後の画像については出力画像を決定することなく、出力画像を入力画像で更新すればよい。これにより、出力画像の決定に要する時間を短縮できる。 Then, the determination unit 55 determines whether or not the processing of all divided images has been completed (step S15-6). The output image may be updated with the input image without determining the output image for the divided image corresponding to the angle of view in which there is no possibility of capturing a moving object. As a result, the time required for determining the output image can be shortened.

全ての分割後の画像の処理が終わると、判断部５５によるブロックごとの判断結果に応じて、画像合成部５６は入力画像又は出力画像を合成して１つの全天球画像を生成する（ステップＳ１５－７）。画面作成部５４は全天球画像を含むＷｅｂページを、送受信部５１を介して端末装置７に送信する。あるいは全天球画像のみを端末装置７に送信する。端末装置７の表示制御部７３は動体が除去された全天球画像をディスプレイ５０８に表示することができる。 When all divided images have been processed, the image composition unit 56 synthesizes the input image or the output image according to the determination result of each block by the determination unit 55 to generate one omnidirectional image (step S15-7). The screen creation unit 54 transmits the web page including the omnidirectional image to the terminal device 7 via the transmission/reception unit 51 . Alternatively, only the omnidirectional image is transmitted to the terminal device 7 . The display control unit 73 of the terminal device 7 can display the omnidirectional image from which the moving object is removed on the display 508 .
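Steps S15-1 to S15-7 can be sketched end to end as follows. This is a minimal illustration, assuming grayscale images stored as lists of rows and a plain normalized cross-correlation as the similarity; the function names, the block size, and the threshold of 0.9 are assumptions, and the skipping of ceiling and floor blocks is omitted.

```python
import math


def split_blocks(image, block_h, block_w):
    """Step S15-1: map each block's (top, left) corner to its pixel rows."""
    return {
        (top, left): [row[left:left + block_w] for row in image[top:top + block_h]]
        for top in range(0, len(image), block_h)
        for left in range(0, len(image[0]), block_w)
    }


def similarity(a, b):
    """Step S15-2: normalized cross-correlation of two equally sized blocks."""
    num = sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    den = math.sqrt(sum(x * x for r in a for x in r) * sum(y * y for r in b for y in r))
    if den == 0:  # an all-zero block; treated as similar only if both blocks are identical
        return 1.0 if a == b else 0.0
    return num / den


def determine_output(previous_output, current_input, block_h, block_w, threshold):
    """Steps S15-3 to S15-7: update qualifying blocks and return the merged image."""
    previous_blocks = split_blocks(previous_output, block_h, block_w)
    result = [row[:] for row in previous_output]  # default: keep previous output (S15-5)
    for (top, left), block in split_blocks(current_input, block_h, block_w).items():
        if similarity(previous_blocks[(top, left)], block) >= threshold:
            for dy, row in enumerate(block):  # S15-4: overwrite with the input block
                result[top + dy][left:left + len(row)] = row
    return result


previous = [[100, 100, 100, 100], [100, 100, 100, 100]]
current = [[100, 90, 0, 0], [100, 100, 0, 5]]  # left: small change, right: occluded
print(determine_output(previous, current, block_h=2, block_w=2, threshold=0.9))
# [[100, 90, 100, 100], [100, 100, 100, 100]]
```

The left block, whose similarity is about 0.999, is updated with the input, while the right block, whose similarity is 0.5, keeps the previous output.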

＜まとめ＞ <Summary>
以上説明したように、本実施形態の画像処理システム２００は、ある時刻の画像に人物８２が写っていたら人物８２が写っていない過去の全天球画像を出力するので、人物８２などの動体を除去して商品棚８１の画像を提供することができる。動体が写っている間は動体が検出される前の出力画像が表示されるので、閲覧者が商品を閲覧できないということがない。商品棚８１の商品が取り出されて商品棚８１の画像に変化が生じても類似度の低下はわずかなので、商品棚８１の入力画像で出力画像を更新できる。 As described above, when the person 82 appears in the image at a given time, the image processing system 200 of the present embodiment outputs a past omnidirectional image in which the person 82 does not appear, so it can remove moving objects such as the person 82 and provide an image of the product shelf 81. While a moving object is in the frame, the output image from before the moving object was detected is displayed, so the viewer is never prevented from browsing the products. Even if a product is taken from the product shelf 81 and the image of the product shelf 81 changes, the similarity drops only slightly, so the output image can be updated with the input image of the product shelf 81.

本実施例では動体の一部が分割後の画像に写っていてもこれを検出することで、入力画像で出力画像を更新しない画像処理システム２００について説明する。本実施例において、実施例１において同一の符号を付した構成要素は同様の機能を果たすので、主に本実施例の主要な構成要素についてのみ説明する場合がある。 In the present embodiment, an image processing system 200 that does not update the output image with the input image by detecting a part of the moving object even if it appears in the divided image will be described. In this embodiment, since the constituent elements denoted by the same reference numerals as in Embodiment 1 perform the same functions, only the main constituent elements of this embodiment may be mainly described.

図１６は、本実施例の画像処理システム２００が解決する不都合を説明する図の一例である。図１６（ａ）はある周期の出力画像であり、図１６（ｂ）と図１６（ｃ）は時間的に後の入力画像である。 FIG. 16 is an example of a diagram for explaining the problem solved by the image processing system 200 of this embodiment. FIG. 16(a) is an output image in a certain cycle, and FIGS. 16(b) and 16(c) are input images later in time.

図１６（ｂ）では商品が１つ減っているため、実施例１で説明したように類似度が中程度になる。類似度が中程度であるため、判断部５５は図１６（ａ）の出力画像を図１６（ｂ）の入力画像で更新すると判断する。これに対し、図１６（ｃ）では人物８２の一部（例えば指）が写っているため、同じように類似度が中程度になる。類似度が中程度であるため、判断部５５は図１６（ａ）の出力画像を図１６（ｃ）の入力画像で置き換えると判断する。しかし、人物８２の一部が写っているため閲覧者には違和感を与えてしまう。 In FIG. 16B, the number of products is reduced by one, so the degree of similarity is medium as described in the first embodiment. Since the degree of similarity is medium, the determination unit 55 determines to update the output image of FIG. 16(a) with the input image of FIG. 16(b). On the other hand, in FIG. 16C, part of the person 82 (for example, a finger) is shown, so the degree of similarity is also medium. Since the degree of similarity is medium, the determination unit 55 determines to replace the output image of FIG. 16(a) with the input image of FIG. 16(c). However, since a part of the person 82 is shown, the viewer feels uncomfortable.

このような不都合に対応するため、本実施例では図１６（ｃ）のような入力画像を人物８２が写っていると判断できる画像処理システム２００について説明する。 In order to deal with such an inconvenience, this embodiment will explain an image processing system 200 that can determine that a person 82 is shown in an input image as shown in FIG. 16(c).

＜出力画像の決定＞ <Determination of output image>
図１７を用いて、本実施例における類似度に基づく出力画像の決定方法について説明する。図１７（ａ）は類似度を示し、図１７（ｂ）は入力画像を示し、図１７（ｃ）は出力画像を示す。判断部５５は、出力画像と入力画像の類似度が閾値以上かどうかの判断を繰り返す。 A method of determining an output image based on the degree of similarity in this embodiment will be described with reference to FIG. 17. FIG. 17(a) shows the degree of similarity, FIG. 17(b) shows the input image, and FIG. 17(c) shows the output image. The determination unit 55 repeatedly determines whether or not the degree of similarity between the output image and the input image is equal to or greater than the threshold.

時刻ｔ_１：商品棚８１の商品に変化がないので、入力画像８６_ｔ_１と時刻ｔ_０の出力画像８７_ｔ_０の類似度は高く閾値以上となる。本実施例では類似度が中程度以上の場合、判断部５５が所定数の周期前の入力画像と現在の入力画像の類似度を算出する。図１７では２周期前と４周期前の入力画像と、現在の入力画像８６_ｔ_１との類似度がそれぞれ算出されている。そして類似度が高い場合、商品棚８１の変化であると判断して、判断部５５は時刻ｔ_１の入力画像８６_ｔ_１を出力画像８７_ｔ_１に決定する。 Time t1: Since the products on the product shelf 81 have not changed, the similarity between the input image 86_t1 and the output image 87_t0 at time t0 is high and is equal to or greater than the threshold. In this embodiment, when the similarity is medium or higher, the determination unit 55 calculates the similarity between the current input image and the input images from a predetermined number of cycles earlier. In FIG. 17, the similarities between the current input image 86_t1 and the input images from two cycles and four cycles earlier are calculated. When these similarities are high, the determination unit 55 judges the difference to be a change in the product shelf 81 and determines the input image 86_t1 at time t1 as the output image 87_t1.

時刻ｔ_６：人物８２が商品を取って商品棚８１の前から立ち去ったため、入力画像には商品棚８１が写るが商品が１つ少なくなっている。このため、時刻ｔ_６の入力画像８６_ｔ_６と時刻ｔ_５の出力画像８７_ｔ_５の類似度は、中程度となる。類似度が中程度以上の場合、類似度算出部５３は２周期前と４周期前の入力画像８６_ｔ_２、８６_ｔ_４と、現在の入力画像８６_ｔ_６との類似度をそれぞれ算出する。出力画像と入力画像の類似度が閾値以上なのに、過去の入力画像と現在の入力画像との類似度が低い場合、商品棚８１の変化か人体の一部が写っているのか判断できないと考え、判断部５５は１周期前の時刻ｔ５の出力画像８７_ｔ_５をそのまま出力画像に決定する。 Time t6: The person 82 has taken a product and left the front of the product shelf 81, so the input image shows the product shelf 81 with one product fewer. The similarity between the input image 86_t6 at time t6 and the output image 87_t5 at time t5 is therefore medium. When the similarity is medium or higher, the similarity calculation unit 53 calculates the similarities between the current input image 86_t6 and the input images 86_t2 and 86_t4 from two and four cycles earlier. If the similarity between the output image and the input image is equal to or greater than the threshold but the similarity between the past input images and the current input image is low, the determination unit 55 considers that it cannot tell whether the difference is a change in the product shelf 81 or a part of a human body in the frame, and determines the output image 87_t5 at time t5, one cycle earlier, as the output image without change.

時刻ｔｎ：その後、商品の数に変更がないとすると、時刻ｔ６の出力画像８７_ｔ_６と時刻ｔｎの入力画像８６_ｔ_ｎの類似度が中程度以上になるので、判断部５５は２周期前と４周期前の入力画像と、現在の入力画像８６_ｔ_ｎとの類似度をそれぞれ算出する。人物８２が立ち去った後のタイミングになると、時刻ｔｎの入力画像８６_ｔ_ｎと２周期前と４周期前の入力画像との類似度が高いので（ほぼ一致するので）、判断部５５は現在の入力画像８６_ｔ_ｎを出力画像に決定する。したがって、入力画像がしばらく変化しなかった場合には、出力画像と入力画像の中程度以上の変化を背景の変化と判断して、出力画像を入力画像８６_ｔ_ｎで更新できる。 Time tn: If the number of products does not change afterwards, the similarity between the output image 87_t6 at time t6 and the input image 86_tn at time tn is medium or higher, so the determination unit 55 calculates the similarities between the current input image 86_tn and the input images from two and four cycles earlier. Once enough time has passed since the person 82 left, the input image 86_tn at time tn is highly similar to (almost identical to) the input images from two and four cycles earlier, so the determination unit 55 determines the current input image 86_tn as the output image. Thus, when the input image has not changed for a while, a difference of medium degree or more between the output image and the input image is judged to be a change in the background, and the output image can be updated with the input image 86_tn.

本実施例の動体除去方法によれば、時刻ｔ６で出力画像８７_ｔ_５が現在の入力画像８６_ｔ_６で更新されないため、商品棚が実際に変化していても端末装置７に送信されるタイミングが遅れてしまう。しかし、商品棚８１が変化していないのに人体の一部が写っているために商品棚８１の変化であると誤判断して、人体の一部が写っている現在の入力画像で出力画像を更新することを抑制できる。 According to the moving object removal method of this embodiment, the output image 87_t5 is not updated with the current input image 86_t6 at time t6, so even when the product shelf has actually changed, the updated image reaches the terminal device 7 late. However, the method suppresses the case in which the product shelf 81 has not changed but a part of a human body in the frame is misjudged as a change in the product shelf 81, and the output image is updated with the current input image showing that body part.

図１８を用いて補足する。図１８は図１７と同様の図であるが、時刻ｔ_６の入力画像８６_ｔ_６が図１７と異なっている。 A supplementary explanation is given with reference to FIG. 18. FIG. 18 is similar to FIG. 17, but the input image 86_t6 at time t6 differs from that in FIG. 17.

時刻ｔ_６：人物８２が商品棚８１の前から立ち去ったが、商品を取らなかった。しかし、現在の入力画像８６_ｔ_６には人体の一部が写っている。このため、時刻ｔ_６の入力画像８６_ｔ_６と時刻ｔ_５の出力画像８７_ｔ_５の類似度は、中程度となる。類似度が中程度以上の場合、類似度算出部５３は２周期前と４周期前の入力画像８６_ｔ_２、８６_ｔ_４と、現在の入力画像８６_ｔ_６との類似度をそれぞれ算出する。 Time t6: The person 82 has left the front of the product shelf 81 without taking a product. However, a part of a human body appears in the current input image 86_t6. The similarity between the input image 86_t6 at time t6 and the output image 87_t5 at time t5 is therefore medium. When the similarity is medium or higher, the similarity calculation unit 53 calculates the similarities between the current input image 86_t6 and the input images 86_t2 and 86_t4 from two and four cycles earlier.

類似度が低いので、判断部５５は１周期前の時刻ｔ５の出力画像８７_ｔ_５をそのまま出力画像に決定する。したがって、人体の一部が写っているが入力画像と１周期前の出力画像との類似度が中程度になっても、人体の一部が写っている現在の入力画像で出力画像を更新することを抑制できる。 Since these similarities are low, the determination unit 55 determines the output image 87_t5 at time t5, one cycle earlier, as the output image without change. Thus, even when a part of a human body is in the frame and the similarity between the input image and the output image of one cycle earlier is medium, updating the output image with the current input image showing that body part can be suppressed.

図１８から理解されるように、２周期前と４周期前の入力画像と現在の入力画像とが比較されるのは、現在の入力画像に人体の一部が写っている場合は、人間の移動速度から考えて数秒前（所定時間前）の入力画像にも人体が写っている可能性が高いためである。毎秒１枚の画像が撮像される場合、２周期前と４周期前は２秒前と４秒前に相当する。撮像装置１の撮像周期が変わった場合、一例として２秒前と４秒前の入力画像が抽出されればよい。したがって、２周期前と４周期前というタイミングは一例であり、人間の移動速度から考えられる所定時間前の入力画像と現在の入力画像が比較されればよい。例えば、１周期前と３周期前、３秒前と５秒前等でもよい。 As can be understood from FIG. 18, the current input image is compared with the input images from two and four cycles earlier because, if a part of a human body appears in the current input image, it is highly likely, given how fast people move, that a human body also appears in the input images from a few seconds (a predetermined time) earlier. When one image is captured per second, two and four cycles earlier correspond to two and four seconds earlier. If the imaging cycle of the imaging device 1 changes, the input images from, for example, two and four seconds earlier may be extracted. The timings of two and four cycles earlier are therefore only an example; it suffices to compare the current input image with input images from a predetermined time earlier chosen in view of human movement speed. For example, one and three cycles earlier, or three and five seconds earlier, may be used.
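
As a small worked illustration of the relation between look-back time and cycles described above, the following sketch converts look-back times in seconds into cycle offsets for a given imaging cycle. The function name and the rounding rule are assumptions made for this example; the description only states that, at one image per second, two and four cycles correspond to two and four seconds.

```python
from typing import List, Sequence


def lookback_cycles(lookback_seconds: Sequence[float],
                    period_seconds: float) -> List[int]:
    """Convert look-back times into whole numbers of imaging cycles (at least 1)."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    # Rounding to the nearest cycle is an assumption of this sketch.
    return [max(1, round(s / period_seconds)) for s in lookback_seconds]
```

With a one-second cycle, look-backs of 2 s and 4 s map to 2 and 4 cycles; with a 0.5-second cycle the same look-backs map to 4 and 8 cycles.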

２周期前と４周期前の２つのタイミングで比較するのは、過去に人体が写っていなかったことをより確実に確認するためであり、２周期前又は４周期前のいずれか一方のタイミングで比較してもよい。どちらかの入力画像と現在の入力画像との類似度が低ければ、現在の入力画像に人体の一部が写っている可能性があるためである。 The comparison uses two timings, two and four cycles earlier, to confirm more reliably that no human body appeared in the past; the comparison may instead use only one of the two timings. This is because, if the similarity between either of those input images and the current input image is low, a part of a human body may appear in the current input image.

１周期前～５周期前の入力画像のうち１つと現在の入力画像との類似度を算出してもよいし、１周期前～５周期前の入力画像のうち２つ以上（１秒周期の場合は最大で５つ）の入力画像と現在の入力画像との類似度を算出してもよい。複数の入力画像と現在の入力画像の類似度が算出された場合、すべての類似度が閾値以上かどうか判断される。 The similarity between the current input image and one of the input images from one to five cycles earlier may be calculated, or the similarities between the current input image and two or more of the input images from one to five cycles earlier (at most five for a one-second cycle) may be calculated. When similarities between the current input image and a plurality of input images are calculated, it is determined whether all of the similarities are equal to or greater than the threshold.

＜動作手順＞ <Operation procedure>
図１９は、本実施例の画像処理システム２００が出力画像を更新する手順を説明するフローチャート図の一例である。図１９の説明では図１５との相違を説明する。まず、ステップＳ１９－１とＳ１９－２の処理は図１５のステップＳ１５－１、Ｓ１５－２と同様でよい。 FIG. 19 is an example of a flowchart illustrating the procedure by which the image processing system 200 of this embodiment updates the output image. The description of FIG. 19 covers the differences from FIG. 15. First, the processing of steps S19-1 and S19-2 may be the same as steps S15-1 and S15-2 of FIG. 15.

ステップＳ１９－３で判断部５５は類似度が閾値ｘ以上か否かを判断する（ステップＳ１９－３）。閾値ｘは商品棚８１に変化がある場合又は人体の一部が写っている場合の類似度よりも低い閾値である。閾値ｘは図１５のステップＳ１５－３の閾値と同程度になる。 At step S19-3, the determination unit 55 determines whether or not the degree of similarity is equal to or greater than the threshold value x (step S19-3). The threshold x is a threshold lower than the degree of similarity when there is a change in the product shelf 81 or when a part of the human body is shown. The threshold x becomes approximately the same as the threshold in step S15-3 of FIG.

ステップＳ１９－３の判断がＹｅｓの場合、類似度算出部５３は２周期前と４周期前の入力画像と現在の入力画像の類似度を算出する（ステップＳ１９－４）。すなわち、数秒前の動体が写っている可能性がある入力画像と現在の入力画像との類似度が算出される。 If the determination in step S19-3 is Yes, the similarity calculator 53 calculates the similarity between the input images two cycles ago and four cycles ago and the current input image (step S19-4). That is, the degree of similarity between the current input image and the input image that may include a moving object several seconds ago is calculated.

判断部５５は２周期前の入力画像と現在の入力画像との類似度が閾値ｙ以上かどうか、及び、４周期前の入力画像と現在の入力画像との類似度が閾値ｙ以上かどうかを判断する（ステップＳ１９－５）。 The determination unit 55 determines whether the degree of similarity between the input image two cycles ago and the current input image is equal to or greater than a threshold y, and whether the degree of similarity between the input image four cycles ago and the current input image is equal to or greater than the threshold y. decision (step S19-5).

ステップＳ１９－５でＹｅｓと判断された場合、過去の入力画像に動体が写っていないので、判断部５５は出力画像を現在の入力画像で更新すると判断する（Ｓ１９－６）。これにより、商品棚８１に変化が合った場合には変化後の入力画像で出力画像を更新できる。 If it is determined Yes in step S19-5, the past input image does not include a moving object, so the determining unit 55 determines to update the output image with the current input image (S19-6). As a result, when the product shelf 81 is changed, the output image can be updated with the changed input image.

ステップＳ１９－５でＮｏと判断された場合、過去の入力画像に動体が写っている可能性があるため、現在の入力画像に人体の一部が写っているのか商品棚の変化なのかを判断できないとして、判断部５５は１周期前の出力画像をそのまま出力画像に決定する（Ｓ１９－７）。これにより、現在の入力画像に人体の一部が写っている場合には、出力画像を更新することを回避できる。 If it is determined No in step S19-5, there is a possibility that the past input image contains a moving object, so it is determined whether the current input image contains a part of the human body or a change in the product shelf. Assuming that it is not possible, the determination unit 55 determines the output image of one period before as the output image as it is (S19-7). This makes it possible to avoid updating the output image when a part of the human body is shown in the current input image.

なお、閾値ｙは数秒前の入力画像と一致することを検出したいため（商品棚８１が写っており変化がない）、最大の類似度である１に近い値となる。したがって、閾値ｙは閾値ｘよりも大きい。これにより、入力画像が変化しない場合には出力画像を入力画像で更新できる。 Note that the threshold value y is a value close to 1, which is the maximum degree of similarity, because it is desired to detect a match with the input image several seconds before (the product shelf 81 is shown and there is no change). Therefore, threshold y is greater than threshold x. This allows the output image to be updated with the input image when the input image does not change.
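
The two-threshold decision of steps S19-3 to S19-7 can be sketched as follows. The function and parameter names are hypothetical, the similarities are taken as already computed, and the handling of the "No" branch of step S19-3 (keeping the previous output, by analogy with step S15-5) is an assumption, since the description of FIG. 19 covers only the differences from FIG. 15.

```python
from typing import Sequence


def decide_update(sim_to_prev_output: float,
                  sims_to_past_inputs: Sequence[float],
                  threshold_x: float,
                  threshold_y: float) -> bool:
    """Return True to update the output with the current input, False to keep it."""
    if sim_to_prev_output < threshold_x:
        # S19-3 is No: assumed to keep the previous output, as in S15-5.
        return False
    # S19-5: every past input (e.g. two and four cycles earlier) must be
    # nearly identical to the current input, i.e. at or above threshold y.
    # Note that all() is True for an empty sequence, so callers should pass
    # at least one past similarity.
    return all(s >= threshold_y for s in sims_to_past_inputs)
```

For example, with the hypothetical values x = 0.6 and y = 0.95, a medium similarity of 0.7 to the previous output combined with past-input similarities of 0.5 and 0.96 yields no update (S19-7), whereas past-input similarities of 0.98 and 0.97 yield an update (S19-6).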

ステップＳ１９－８とＳ１９－９の処理は図１５のステップＳ１５－６とＳ１５－７と同じでよい。 The processing of steps S19-8 and S19-9 may be the same as steps S15-6 and S15-7 of FIG.

以上説明したように、本実施例の画像処理システム２００は、現在の入力画像と過去の入力画像を比較することで、動体の一部が分割後の画像に写っていてもこれを検出して出力画像を更新することを抑制できる。 As described above, the image processing system 200 according to the present embodiment compares the current input image and the past input image, thereby detecting a part of the moving object even if it appears in the divided image. It is possible to suppress updating the output image.

本実施例では、分割後の画像に動体が写っているか否かの判断の後、出力画像を更新しないと判断した旨を周りに拡張する拡張処理を施すことで動体が一部だけ写った入力画像で出力画像を更新することをより確実に抑制する画像処理システム２００について説明する。 This embodiment describes an image processing system 200 that, after determining whether a moving object appears in each divided image, applies expansion processing that propagates a decision not to update the output image to the surrounding divided images, and thereby more reliably suppresses updating the output image with an input image in which only a part of a moving object appears.

図２０は、本実施例の画像処理システム２００が有する、撮像装置１、通信端末３、画像管理装置５、及び端末装置７の各機能ブロック図である。なお、図２０の説明において、図９と同一の符号を付した構成要素は同様の機能を果たすので、主に本実施例の主要な構成要素についてのみ説明する場合がある。 FIG. 20 is a functional block diagram of the imaging device 1, communication terminal 3, image management device 5, and terminal device 7 included in the image processing system 200 of this embodiment. In the description of FIG. 20, since the components denoted by the same reference numerals as in FIG. 9 perform the same functions, only the main components of this embodiment may be mainly described.

本実施例の画像管理装置５は拡張処理部５７を有している。拡張処理部５７は動体が検出された分割後の画像に隣接した分割後の画像でも動体が検出されたとみなす処理を行う。すなわち、動体が検出された分割後の画像の範囲を拡張する。 The image management device 5 of this embodiment has an extension processing unit 57. The extension processing unit 57 treats divided images adjacent to a divided image in which a moving object is detected as if a moving object were detected in them as well. In other words, it expands the range of divided images in which a moving object is regarded as detected.

＜拡張処理＞ <Expansion processing>
図２１は拡張処理を説明する図の一例である。図２１においてマスは分割後の画像を現し、「１」の画像は動体が検出された分割後の画像を現し、「０」は動体が検出されていない分割後の画像を現す。まず、図２１（ａ）は実施例１，２により動体の検出処理が行われた９つの分割後の画像を示す。図２１（ａ）では中央の画像でのみ動体が検出されている。 FIG. 21 is an example of a diagram illustrating the expansion processing. In FIG. 21, each square represents a divided image; "1" marks a divided image in which a moving object is detected, and "0" marks a divided image in which no moving object is detected. First, FIG. 21(a) shows nine divided images on which the moving object detection processing of Embodiments 1 and 2 has been performed. In FIG. 21(a), a moving object is detected only in the central image.

図２１（ｂ）は動体が検出された分割後の画像の上下左右に隣接した画像への動体が検出された旨の拡張を説明する図である。図２１（ｂ）では中央の画像の上下左右に隣接した画像に「１」が設定されている。このような拡張処理を４近傍拡張処理という。 FIG. 21(b) is a diagram for explaining extension of detection of a moving object to images adjacent to each other vertically and horizontally after the divided image in which the moving object is detected. In FIG. 21B, "1" is set for the images adjacent to the upper, lower, left, and right sides of the central image. Such expansion processing is called 4-neighbor expansion processing.

図２１（ｃ）は動体が検出された分割後の画像の上下左右及び斜めに隣接した画像への動体が検出された旨の拡張を説明する図である。図２１（ｃ）では中央の画像の上下左右及び斜めに隣接した画像に「１」が設定されている。このような拡張処理を８近傍拡張処理という。 FIG. 21(c) is a diagram for explaining extension of detection of a moving object to an image adjacent to each other vertically, horizontally, and obliquely after the divided image in which the moving object is detected. In FIG. 21(c), "1" is set for the images adjacent to the center image in the upper, lower, left, right, and oblique directions. Such expansion processing is called 8-neighbor expansion processing.

図２２は８近傍拡張処理の好適例を説明する図の一例である。図２２（ａ）に示すように、例えば、人体の頭部が９つの分割後の画像の中央に写っているものとする。中央、左下、下中央及び右下の画像からは動体が検出されると考えられる。図２２（ｂ）に示すように、４近傍拡張処理では上下左右の画像に人体が写っていると見なすことができるが、斜め上の画像にも人体の一部が写っている。右上及び左上の画像にわずかにだけ人体が写っている場合、実施例２の判断方法でも出力画像が更新されるおそれがある。 FIG. 22 is an example of a diagram for explaining a preferred example of the 8-neighbor expansion process. As shown in FIG. 22A, for example, it is assumed that the head of a human body appears in the center of nine divided images. A moving object is considered to be detected from the center, lower left, lower center, and lower right images. As shown in FIG. 22(b), in the 4-neighbor expansion process, it can be considered that the human body is shown in the upper, lower, left, and right images, but a part of the human body is also shown in the diagonally upper image. If the upper right and upper left images only slightly show a human body, there is a possibility that the output image will be updated even with the determination method of the second embodiment.

これに対し、図２２（ｃ）に示すように、８近傍拡張処理では右上及び左上の画像に人体の一部が写っていたと見なすことができるので、動体が写っている入力画像で出力画像を更新することを抑制できる。 In contrast, as shown in FIG. 22(c), the 8-neighbor expansion process treats the upper-right and upper-left images as also showing a part of the human body, so updating the output image with an input image in which a moving object appears can be suppressed.

したがって、８近傍拡張処理が有効であることが分かるが、動体の形状等によっては４近傍拡張処理を採用してもよい。 Therefore, it can be seen that the 8-neighbor expansion process is effective, but the 4-neighbor expansion process may be adopted depending on the shape of the moving object.

図２３は、拡張処理部５７が分割後の画像に対し拡張処理を行うフローチャート図の一例である。図２３の処理では図１９との相違を主に説明する。 FIG. 23 is an example of a flow chart showing how the extension processing unit 57 performs extension processing on the divided image. In the processing of FIG. 23, mainly the differences from FIG. 19 will be explained.

図２３の処理ではステップＳ２３－１～Ｓ２３－４が追加されている。ステップＳ２３－１では、拡張処理部５７が、分割後の画像を１つ決定する（ステップＳ２３－１）。分割後の画像の決定方法は任意でよい。 Steps S23-1 to S23-4 are added to the process of FIG. At step S23-1, the extension processing unit 57 determines one image after division (step S23-1). Any method may be used to determine images after division.

次に、着目している分割後の画像に動体が検出されているか否かを判断する（ステップＳ２３－２）。 Next, it is determined whether or not a moving object is detected in the divided image of interest (step S23-2).

ステップＳ２３－２の判断がＹｅｓの場合、拡張処理部５７は上下左右及び斜めの画像で動体が検出されたとみなす（ステップＳ２３－３）。ステップＳ２３－２の判断がＮｏの場合、拡張処理部５７は拡張処理を行わない。 If the determination in step S23-2 is Yes, the extension processing unit 57 considers that a moving object has been detected in the up/down/left/right and oblique images (step S23-3). If the determination in step S23-2 is No, the extension processing section 57 does not perform extension processing.

拡張処理部５７は全ての分割後の画像で終了したか否かを判断し（ステップＳ２３－４）、終了した場合は分割後の画像を合成する（Ｓ１９－９）。 The extension processing unit 57 determines whether or not all divided images are finished (step S23-4), and if finished, synthesizes the divided images (S19-9).

なお、ステップＳ２３－２において、動体が検出されたとみなされた分割後の画像はＹｅｓと判断されないことに注意されたい。これにより、動体が検出されたとみなされた分割後の画像と隣接する分割後の画像が連鎖的に、動体が検出されたとみなされることを回避できる。 It should be noted that, in step S23-2, it is not judged as Yes for the divided image in which the moving object is considered to be detected. As a result, it is possible to prevent a divided image in which a moving object has been detected and an adjacent divided image from being regarded as having detected a moving object in a chain reaction.
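
The 4-neighbor and 8-neighbor expansion, including the rule that blocks merely regarded as detected do not trigger further expansion, can be sketched as follows. The grid representation (a list of rows of 0/1 flags), the function name, and the choice not to wrap around at the image edges are assumptions of this sketch.

```python
from typing import List

NEIGHBORS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEIGHBORS_8 = NEIGHBORS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def expand(detected: List[List[int]], eight: bool = True) -> List[List[int]]:
    """Mark the neighbors of every originally detected block as detected."""
    rows, cols = len(detected), len(detected[0])
    offsets = NEIGHBORS_8 if eight else NEIGHBORS_4
    result = [row[:] for row in detected]
    for r in range(rows):
        for c in range(cols):
            # Read only the original flags so that expanded blocks do not
            # expand again in a chain (the note on step S23-2).
            if detected[r][c] != 1:
                continue
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    result[nr][nc] = 1
    return result
```

Applied to a 3x3 grid with only the center set, the 4-neighbor variant marks the cross-shaped pattern of FIG. 21(b) and the 8-neighbor variant marks all nine blocks as in FIG. 21(c).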

本実施例によれば、出力画像を更新しないと判断した分割後の画像を近傍に拡張することで、人物８２の一部が写っているが過去の入力画像との類似度が閾値ｙ以上となった場合でも出力画像が更新されないので、動体が一部だけ写った入力画像で出力画像を更新することを抑制することができる。 According to the present embodiment, by expanding the divided image for which it is determined that the output image is not to be updated to the vicinity, the person 82 is partly shown, but the similarity to the past input image is equal to or greater than the threshold value y. Even if this happens, the output image is not updated, so it is possible to suppress updating the output image with an input image in which only a part of the moving object is captured.

本実施例では人体が写っている画像を機械的に学習しておき学習モデル５８を構築し、学習モデル５８により入力画像に人物が写っているか否かを判断する画像処理システム２００について説明する。 In the present embodiment, an image processing system 200 that mechanically learns an image containing a human body, builds a learning model 58, and determines whether or not a person is shown in an input image based on the learning model 58 will be described.

図２４は、本実施例の画像処理システム２００が有する、撮像装置１、通信端末３、画像管理装置５、及び端末装置７の各機能ブロック図である。なお、図２４の説明において、図９と同一の符号を付した構成要素は同様の機能を果たすので、主に本実施例の主要な構成要素についてのみ説明する場合がある。 FIG. 24 is a functional block diagram of the imaging device 1, the communication terminal 3, the image management device 5, and the terminal device 7 included in the image processing system 200 of this embodiment. In the description of FIG. 24, components denoted by the same reference numerals as in FIG. 9 perform the same functions, so only the main components of this embodiment may be described.

本実施例の画像管理装置５は学習モデル５８を有している。学習モデル５８は、動体が写っている画像又は写っていない画像の学習結果を保持しており、入力された画像に対し動体の有無を出力する。学習モデル５８は入力画像に動体が写っているか否か（又は、確度、確率、又は確からしさ等）を出力する。入力画像から抽出した特徴量を使用してもよい。学習装置は予め、ディープラーニング、サポートベクターマシン、ニューラルネットワーク、ランダムフォレストなどのアルゴリズムでトレーニング用の入力画像に人物が写っているか否を学習することで学習モデル５８を構築する。学習モデル５８にはこの学習結果が保持されており、入力画像に動体が写っているか否か（被写体の画像かどうか）を判断する判断部として機能する。 The image management device 5 of this embodiment has a learning model 58 . The learning model 58 holds learning results of images with or without a moving object, and outputs the presence/absence of a moving object in an input image. The learning model 58 outputs whether or not a moving object appears in the input image (or accuracy, probability, certainty, etc.). You may use the feature-value extracted from the input image. The learning device builds a learning model 58 by learning in advance whether or not a person appears in an input image for training using an algorithm such as deep learning, support vector machine, neural network, or random forest. This learning result is held in the learning model 58, and functions as a judgment unit that judges whether or not the input image contains a moving object (whether or not the image is a subject).

＜学習モデルを用いた判断について＞ <Judgment using the learning model>
１．学習モデルのみで入力画像を破棄 1. Discarding input images using the learning model alone
図２５を用いて、学習モデル５８を用いた出力画像の決定方法について説明する。図２５（ａ）は入力画像を示し、図２５（ｂ）は出力画像を示す。判断部５５は、学習モデル５８を単独で使って動体が写っているか否かに基づいて出力画像を決定する。 A method of determining an output image using the learning model 58 will be described with reference to FIG. 25. FIG. 25(a) shows an input image, and FIG. 25(b) shows an output image. The determination unit 55 uses the learning model 58 alone and determines the output image based on whether or not a moving object is captured.

時刻ｔ_０：入力画像８６_ｔ_０が撮像されると、学習モデル５８に入力される。学習モデル５８は人物が写っていないと判断する。時刻ｔ_０では出力画像がないため判断部５５は入力画像８６_ｔ_０を出力すると判断する。 Time t0: When the input image 86_t0 is captured, it is input to the learning model 58. The learning model 58 determines that no person appears. Since there is no output image yet at time t0, the determination unit 55 determines to output the input image 86_t0.

時刻ｔ_１：入力画像８６_ｔ_１が撮像されると、学習モデル５８に入力される。学習モデル５８は人物が写っていないと判断するので、時刻ｔ_１の入力画像８６_ｔ_１を出力画像８７_ｔ_１に決定する。 Time t1: When the input image 86_t1 is captured, it is input to the learning model 58. Since the learning model 58 determines that no person appears, the input image 86_t1 at time t1 is determined as the output image 87_t1.

時刻ｔ_２：入力画像８６_ｔ_２には人物８２が写ったため、学習モデル５８は人物が写っていると判断し、入力画像８６_ｔ_２を破棄し、時刻ｔ_１の出力画像８７_ｔ_１をそのまま出力画像に決定する。 Time t2: Since the person 82 appears in the input image 86_t2, the learning model 58 determines that a person appears, the input image 86_t2 is discarded, and the output image 87_t1 at time t1 is determined as the output image without change.

時刻ｔ_３～ｔ_５：引き続き人物８２が写っているため、学習モデル５８は人物が写っていると判断する。このため、１周期前の出力画像８７_ｔ_２～８７_ｔ_４をそのまま出力画像に決定する。 Times t3 to t5: Since the person 82 is still in the frame, the learning model 58 determines that a person appears. The output images 87_t2 to 87_t4 of one cycle earlier are therefore determined as the output images without change.

時刻ｔ_６：人物８２が商品を取って商品棚８１の前から立ち去ったため、入力画像には商品棚８１が写るが商品が１つ少なくなっている。学習モデル５８は人物が写っていないと判断するので、時刻ｔ_６の入力画像８６_ｔ_６を出力画像８７_ｔ_６に決定する。 Time t6: The person 82 has taken a product and left the front of the product shelf 81, so the input image shows the product shelf 81 with one product fewer. Since the learning model 58 determines that no person appears, the input image 86_t6 at time t6 is determined as the output image 87_t6.

このように、学習モデル５８を使用することで学習モデル５８の精度で人物が写っている入力画像を破棄できる。 In this way, by using the learning model 58, input images in which a person appears can be discarded with the accuracy of the learning model 58.

図２６は、学習モデルが入力画像を破棄する手順を示すフローチャート図の一例である。図２６の処理は入力画像が撮像されるごとに繰り返し実行される。まず、入力画像がブロックに分割される（Ｓ１５－１）。 FIG. 26 is an example of a flowchart showing a procedure for discarding an input image by the learning model. The processing in FIG. 26 is repeatedly executed each time an input image is captured. First, an input image is divided into blocks (S15-1).

学習モデルは入力画像に動体が写っているか否かを判断する（Ｓ２６－１）。入力画像に動体が写っていない場合、学習モデルは出力画像を入力画像で更新する（Ｓ２６－２）。 The learning model determines whether or not the input image contains a moving object (S26-1). If the input image does not include a moving object, the learning model updates the output image with the input image (S26-2).

入力画像に動体が写っている場合、学習モデルは１周期前の出力画像を出力する（Ｓ２６－３）。 If the input image contains a moving object, the learning model outputs the output image of one period before (S26-3).

そして、学習モデル５８は全ての分割後の画像の処理が終わったか否かを判断する（Ｓ１５－６）。全ての分割後の画像の処理が終わっていなければステップＳ２６－１からの処理を繰り返し、全ての分割後の画像の処理が終わった場合は分割後の画像を合成する（Ｓ１５－７）。 Then, the learning model 58 determines whether or not the processing of all divided images has been completed (S15-6). If all divided images have not been processed, the processing from step S26-1 is repeated, and if all divided images have been processed, the divided images are synthesized (S15-7).
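
The flow of FIG. 26 can be sketched as follows. The learning model 58 is represented here by a caller-supplied predicate `contains_moving_object`; that name, the generic block type, and the list-based representation of the divided images are assumptions of this sketch, and no particular machine-learning library is implied.

```python
from typing import Callable, List, TypeVar

B = TypeVar("B")  # hypothetical type of one divided image


def filter_with_model(prev_output: List[B],
                      current_input: List[B],
                      contains_moving_object: Callable[[B], bool]) -> List[B]:
    """Per block: keep the previous output if the model reports a moving object."""
    merged = []
    for out_block, in_block in zip(prev_output, current_input):
        if contains_moving_object(in_block):
            merged.append(out_block)  # S26-3: output of one cycle earlier
        else:
            merged.append(in_block)   # S26-2: update with the input
    return merged  # stands in for the composed image of S15-7
```

Any classifier that returns a boolean, or a probability compared against a cut-off, could be passed as the predicate.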

２．学習モデルで破棄されなかった入力画像と出力画像の類似度により入力画像を破棄 2. Discarding input images based on the similarity between the output image and input images not discarded by the learning model
図２７を用いて、学習モデル５８を用いた出力画像の決定方法について説明する。図２７（ａ）は入力画像を示し、図２７（ｂ）は出力画像を示す。判断部５５は、学習モデル５８を使うと共に、出力画像と入力画像の類似度が閾値以上かどうかに基づいて出力画像を決定する。 A method of determining an output image using the learning model 58 will be described with reference to FIG. 27. FIG. 27(a) shows an input image, and FIG. 27(b) shows an output image. The determination unit 55 uses the learning model 58 and also determines the output image based on whether the similarity between the output image and the input image is equal to or greater than a threshold.

時刻ｔ_１：入力画像８６_ｔ_１が撮像されると、学習モデル５８に入力される。学習モデル５８は人物が写っていないと判断する。次に、類似度算出部５３は出力画像８７_ｔ_０と入力画像８６_ｔ_１の類似度を算出する。商品棚８１の商品に変化がないので、入力画像８６_ｔ_１と出力画像８７_ｔ_０の類似度は高く閾値以上となる。類似度が閾値以上の場合、判断部５５は時刻ｔ_１の入力画像８６_ｔ_１を出力画像８７_ｔ_１に決定する。 Time t ₁ : When the input image 86 — t ₁ is captured, it is input to the learning model 58 . The learning model 58 determines that no person is shown. Next, the similarity calculator 53 calculates the similarity between the output image _{87_t0} and the input image _{86_t1} . Since there is no change in the products on the product shelf 81, the similarity between the input image _{86_t1} and the output image _{87_t0} is high and exceeds the threshold. When the degree of similarity is equal to or greater _than the threshold, the determination unit 55 determines the input image _{86_t1} at time t1 as the output image _{87_t1} .

時刻ｔ_２：入力画像８６_ｔ_２には人物８２が写ったため、学習モデル５８は人物が写っていると判断する。判断部５５は入力画像８６_ｔ_２を破棄し、時刻ｔ_１の出力画像８７_ｔ_１をそのまま出力画像に決定する。 Time t ₂ : Since the person 82 appears in the input image 86_t ₂ , the learning model 58 determines that the person appears. _The determination unit 55 discards the input image _{86_t2} and directly determines the output image _{87_t1} at time t1 as the output image.

時刻ｔ_３～ｔ_５：引き続き人物８２が写っているため、学習モデル５８は人物が写っていると判断する。判断部５５は１周期前の出力画像８７_ｔ_２～８７_ｔ_４をそのまま出力画像に決定する。 Time t ₃ to t ₅ : The learning model 58 determines that the person is still photographed because the person 82 is still photographed. The determination unit 55 determines the output images 87 — t ₂ to 87 — t ₄ of one period before as the output images as they are.

時刻ｔ_６：人物８２が商品を取って商品棚８１の前から立ち去ったため、入力画像には商品棚８１が写るが商品が１つ少なくっている。まず、学習モデル５８は人物が写っていないと判断する。時刻ｔ_６の入力画像８６_ｔ_６と時刻ｔ_５の出力画像８７_ｔ_５の類似度は、中程度となる。類似度が中程度（閾値以上）の場合、判断部５５は時刻ｔ_６の入力画像８６_ｔ_６を出力画像８７_ｔ_６に決定する。 Time t ₆ : Since the person 82 picked up the product and left the front of the product shelf 81, the product shelf 81 is shown in the input image, but the number of products is one less. First, the learning model 58 determines that no person is in the image. The degree of similarity between the input image _{86_t6} at time _t6 and the output image _{87_t5} at time _t5 is moderate. When the degree of similarity is medium (greater than or equal to the threshold), the determination unit 55 determines the input image _{86_t6} at time _t6 as the output image _{87_t6} .

In this way, by using the learning model 58, input images in which a person appears can be discarded with the accuracy of the learning model 58. In addition, because the similarity calculation unit 53 calculates the similarity for input images that the learning model 58 has judged to contain no person, input images in which a person actually does appear can still be discarded.
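
The sequence from time t1 to time t6 can be illustrated by the following minimal Python sketch. It is illustrative only: the function names `similarity` and `decide_output`, the pixel-difference similarity measure, and the callable `contains_person` that stands in for the learning model 58 are assumptions made for this example and are not specified in this description.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale pixel rows, values 0..255 (assumed format)


def similarity(a: Image, b: Image) -> float:
    """Assumed placeholder measure in [0, 1]: 1 minus the mean absolute pixel difference."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    count = sum(len(row) for row in a)
    return 1.0 - total / (255.0 * count)


def decide_output(input_image: Image,
                  prev_output: Image,
                  contains_person: Callable[[Image], bool],
                  threshold: float) -> Image:
    """Return the image to output for the current cycle."""
    # The learning model is consulted first (times t2 to t5 in the example).
    if contains_person(input_image):
        return prev_output  # discard the input and keep the previous output
    # No person reported: compare the input with the previous output.
    if similarity(input_image, prev_output) >= threshold:
        return input_image  # high or medium similarity (times t1 and t6)
    # Low similarity: a person missed by the model is suspected.
    return prev_output
```

Calling `decide_output` once per imaging cycle, with the returned image passed back as `prev_output` in the next cycle, reproduces the behavior described for times t1 to t6 under these assumptions.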

<Operation procedure>
FIG. 28 is an example of a flowchart showing a procedure by which the image management device 5 determines an output image. The description of FIG. 28 mainly covers the differences from FIG. 15.

First, the image dividing unit 52 divides the omnidirectional image (step S15-1).

Next, the learning model 58 determines whether a person appears in the divided images (step S26-1). If it is determined that a person appears, the process proceeds to step S15-5, and the determination unit 55 determines to output the output image of one cycle earlier as it is (step S15-5).

If it is determined that no person appears, the process proceeds to step S26-1, and the learning model 58 determines, for each of the divided images, whether a moving object appears (step S26-1). The subsequent processing may be the same as in FIG. 15.
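
The per-block flow of FIG. 28 might be sketched as follows. This is a hedged illustration: `split_into_blocks` and `choose_blocks` are hypothetical names, square blocks of a fixed size are assumed, and the no-person branch is assumed to use a single similarity threshold as in the time-series example, because the steps of FIG. 15 are not reproduced in this section. The callables `contains_person` and `similarity` stand in for the learning model 58 and the similarity calculation unit 53.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]   # grayscale pixel rows (assumed format)
Key = Tuple[int, int]     # (block row, block column)


def split_into_blocks(image: Image, size: int) -> Dict[Key, Image]:
    """Divide an image into size x size blocks (corresponds to step S15-1)."""
    blocks: Dict[Key, Image] = {}
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            blocks[(top // size, left // size)] = [
                row[left:left + size] for row in image[top:top + size]
            ]
    return blocks


def choose_blocks(input_image: Image,
                  prev_output: Image,
                  size: int,
                  contains_person: Callable[[Image], bool],
                  similarity: Callable[[Image, Image], float],
                  threshold: float) -> Dict[Key, Image]:
    """Pick, for each block, either the input block or the previous output block."""
    in_blocks = split_into_blocks(input_image, size)
    out_blocks = split_into_blocks(prev_output, size)
    chosen: Dict[Key, Image] = {}
    for key, block in in_blocks.items():
        if contains_person(block):
            # Person detected (S26-1): keep the previous output (S15-5).
            chosen[key] = out_blocks[key]
        elif similarity(block, out_blocks[key]) >= threshold:
            chosen[key] = block            # update with the input block
        else:
            chosen[key] = out_blocks[key]  # low similarity: keep previous output
    return chosen
```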

<Summary>
As described above, the image processing system 200 of this embodiment uses the learning model 58, so if the accuracy of the learning model 58 is high, input images in which a person appears can be detected with high accuracy. The images that the learning model 58 judges to contain a person do not include images in which the product shelf has been updated, so the images that the learning model 58 judges to contain no person are either images in which the product shelf has been updated or images in which it has not. When it is desired to update the output image with the input image promptly after the product shelf is updated, reducing the threshold allows the output image to be updated early in a situation where a human body is unlikely to appear in the image.

The processing using the learning model 58 can be applied in combination not only with Embodiment 1 but also with either of Embodiments 2 and 3. Whichever embodiment it is combined with, the learning model 58 only needs to determine first whether a person appears in the input image. In Embodiment 3, when the learning model 58 determines that a person appears, the expansion processing unit 57 performs the expansion processing centered on the block in which the person is present.
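
As a minimal sketch of such expansion processing, the following Python function marks the eight blocks adjacent to each flagged block, in line with the eight-neighbor expansion recited in the claims. The function name `expand_flagged` and the representation of blocks as (row, column) grid coordinates are assumptions made for this illustration.

```python
from typing import Set, Tuple

Key = Tuple[int, int]  # (block row, block column), an assumed representation


def expand_flagged(flagged: Set[Key], rows: int, cols: int) -> Set[Key]:
    """Return the flagged blocks plus their eight neighbors inside the grid.

    Blocks in the returned set would keep the previous output image
    instead of being updated with the input image.
    """
    expanded = set(flagged)
    for r, c in flagged:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                # Neighbors outside the block grid are ignored.
                if 0 <= nr < rows and 0 <= nc < cols:
                    expanded.add((nr, nc))
    return expanded
```

Expanding around the flagged block is intended to cover parts of a person, such as an arm, that extend into neighboring blocks without being detected there.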

In this embodiment, the learning model 58 determines whether a person appears before the similarity between the input image and the output image is calculated; however, the learning model 58 may instead make this determination after the similarity is calculated. In that case, even if the determination unit 55 determines to update the output image with the input image because the similarity is higher than the threshold, the input image can be discarded when part of a human body appears in it. That is, because the output image must be updated with the input image when the product shelf is updated, there is a limit to how large the threshold (threshold x in Embodiment 2) can be made. In this embodiment, even if the threshold is lowered so that the output image is updated early and an input image containing a human body passes the threshold, the learning model 58 detects this and the input image containing the person can be discarded.

This embodiment describes modified examples of the image processing system 200.

<Device that performs moving object removal>
In Embodiments 1 to 3, the image management device 5 performs the moving object removal, but the removal may be performed by any of the imaging device 1, the communication terminal 3, the wireless router 9a, or the terminal device 7. It may also be performed by an information processing device on the store side, such as a computer or a microserver. Further, although not shown in FIG. 4, an information processing device dedicated to moving object removal and connected via the communication network 9 may remove the moving object.

In FIG. 4, the terminal device 7 acquires the omnidirectional image with the moving object removed from the image management device 5, but the terminal device 7 may acquire the omnidirectional image directly from the imaging device 1. In this case as well, any of the imaging device 1, the communication terminal 3, the wireless router, or the terminal device 7 may perform the moving object removal. The imaging device 1 and the terminal device 7 may be directly connected, or the terminal device 7 and the imaging device 1 may be configured as a single unit.

When the terminal device 7 and the imaging device 1 are integrated, the terminal device 7 may transmit the omnidirectional image to the image management device 5 and acquire from the image management device 5 the omnidirectional image from which the moving object has been removed.

<Usage scenes other than stores>
To share information displayed on a whiteboard or an electronic blackboard used in a meeting, lecture, or the like among a plurality of terminal devices 7, the imaging device 1 may capture images of the whiteboard or electronic blackboard. The person writing may appear in the captured image, but the moving object removal described in this embodiment can exclude that person while detecting changes in the information displayed on the whiteboard or electronic blackboard and updating the output image.

The method is also well suited to live cameras. A live camera is an imaging device 1 that is installed at a remote location so that the location can be monitored from where the user is, and that periodically captures images. For example, when the imaging device 1 is installed at a weather observation site or a tourist spot and tourists or the like appear in the omnidirectional image, the image management device 5 can remove the moving objects. The imaging device 1 may be placed not only outdoors but also indoors, for example in a museum.

The moving object removal method of this embodiment can also be applied to inventory management in which the image management device 5 analyzes images from an imaging device 1 placed in a distribution warehouse or the like and replenishes products. Workers who pick up and replenish products work in distribution warehouses and may therefore appear in the omnidirectional image, but this embodiment can remove the workers from the omnidirectional image and, further, output the change in the omnidirectional image caused by a decrease in products as the output image.

The moving object removal method of this embodiment can also be applied to abnormality monitoring in which the imaging device 1 captures images of a factory line and the image management device 5 detects abnormalities on the line. Workers perform assembly and other tasks on a factory line, but this embodiment can remove the workers from the omnidirectional image and, further, output the change in the omnidirectional image caused by changes in the products on the line as the output image.

<Other application examples>
The best mode for carrying out the present invention has been described above using embodiments, but the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

For example, this embodiment has been described using fixed-point observation in which the 360-degree surroundings are imaged from the same place (which has the same effect as imaging in the same direction), but the imaging device 1 does not have to be fixed. For example, imaging is possible while the user is holding the imaging device 1. When the imaging device 1 moves slowly, the similarity resulting from the change in background is medium or higher, so the output image can be updated with the input image.

The omnidirectional images displayed by the terminal device 7 may also be stored in chronological order. This makes it possible to play back how the products on the product shelf have changed, showing only the changes in the shelf. Besides product shelves, this is also effective for recording changes in the weather, for example.

The configuration examples of FIG. 9, FIG. 20, and the like shown in the above embodiments are divided according to main functions in order to facilitate understanding of the processing of the image processing system 200. However, the present invention is not limited by the way the processing units are divided or by their names. The image processing system 200 can be divided into a larger number of processing units according to the processing content. The division can also be made so that one processing unit includes more processes.

The functions of the image management device 5 may be distributed among a plurality of server devices, and the image processing system 200 may have a plurality of image management devices 5.

The transmission/reception unit 51 is an example of image acquisition means, the similarity calculation unit 53 is an example of similarity calculation means, the determination unit 55 and the learning model 58 are an example of determination means, the determination unit 55 and the learning model 58 are an example of subject image determination means, the image dividing unit 52 is an example of image dividing means, the expansion processing unit 57 is an example of processing means, and the display control unit 73 and the transmission/reception unit 51 are an example of output means. The threshold x is an example of the first threshold, the threshold y is an example of the second threshold, the similarity between the input image and the output image is an example of the first similarity, and the similarity between the current input image and a past input image is an example of the second similarity.
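
How the first and second thresholds interact can be illustrated with the following Python sketch of the two-stage decision recited in the claims. It is a sketch under stated assumptions: `decide_two_stage` is a hypothetical name, the returned strings are illustrative labels, and the second similarity is passed as a callable so that it is evaluated only when the first similarity reaches the threshold x.

```python
from typing import Callable


def decide_two_stage(first_similarity: float,
                     second_similarity: Callable[[], float],
                     threshold_x: float,
                     threshold_y: float) -> str:
    """Decide which image to output from the first and second similarities.

    first_similarity: input image vs. the output image already output.
    second_similarity: current input image vs. an older input image,
        evaluated lazily (only when the first test passes).
    """
    if first_similarity < threshold_x:
        # The input is not treated as an image of the subject.
        return "previous_output"
    if second_similarity() >= threshold_y:
        # The scene is stable between input images: output the current input.
        return "current_input"
    # The input images differ: a moving object is suspected.
    return "previous_output"
```

For example, with the illustrative values x = 0.7 and y = 0.9 (chosen so that the second threshold is greater than the first, as in one of the dependent claims), a first similarity of 0.8 and a second similarity of 0.95 lead to the current input image being output.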

1 imaging device
3 communication terminal
5 image management device
7 terminal device
81 product shelf
82 person
86 input image
87 output image
200 image processing system

Japanese Patent No. 5493709

Claims

A program that causes an information processing apparatus to function as:
image acquisition means for acquiring an image captured by an imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image output to the outside,
wherein the subject image determination means determines that the input image is an image of the subject when the first similarity is equal to or greater than a first threshold, and determines that the input image is not an image of the subject when the first similarity is less than the first threshold,
when the first similarity is equal to or greater than the first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.

A program that causes an information processing apparatus to function as:
image acquisition means for acquiring an image captured by an imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
a learning model that has learned in advance images in which a moving object appears or does not appear and that outputs whether a moving object appears in an image,
wherein the subject image determination means determines that the input image is an image of the subject when the learning model does not determine that a moving object appears in the input image acquired by the image acquisition means, and determines that the input image is not an image of the subject when the learning model determines that a moving object appears in the input image,
when the learning model does not determine that a moving object appears in the input image acquired by the image acquisition means, the program causes the information processing apparatus to function as similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image output to the outside,
when the first similarity is equal to or greater than a first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.

3. The program according to claim 1 or 2, wherein, when the first similarity is less than the first threshold, or when the second similarity is less than the second threshold, it is determined that a moving object, which appears in front of the subject when present between the subject and the imaging device, appears in the current input image acquired by the image acquisition means.

4. The program according to any one of claims 1 to 3, wherein the similarity calculation means calculates the second similarity between an input image from a predetermined time earlier, the predetermined time being determined based on the moving speed of the moving object, and the current input image acquired by the image acquisition means.

5. The program according to any one of claims 1 to 4, wherein the second threshold is greater than the first threshold.

6. The program according to any one of claims 1 to 5, wherein the program causes the information processing apparatus to function as image dividing means for dividing the input image acquired by the image acquisition means into blocks,
the similarity calculation means calculates the first similarity for each block of the input image divided into blocks and the output image, and
the determination means determines, for each block, whether to output the current input image or the output image output to the outside.

7. The program according to claim 6, wherein the image dividing means divides the input image acquired by the image acquisition means into blocks by referring to size information in which information on the size of the blocks is associated with coordinate information of the image captured by the imaging device.

8. The program according to claim 6 or 7, wherein the size of the block is larger than the subject imaged by the imaging device and smaller than a moving object that appears in front of the subject when present between the subject and the imaging device.

9. The program according to any one of claims 6 to 8, wherein the program further causes the information processing apparatus to function as processing means for determining that the output image output to the outside is to be output for each block adjacent to a block for which the determination means has determined that the output image output to the outside is to be output.

10. The program according to claim 9, wherein the processing means determines that the output image output to the outside is to be output for each of the eight blocks adjacent to the block for which the determination means has determined that the output image output to the outside is to be output.

An information processing apparatus comprising:
image acquisition means for acquiring an image captured by an imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image output to the outside,
wherein the subject image determination means determines that the input image is an image of the subject when the first similarity is equal to or greater than a first threshold, and determines that the input image is not an image of the subject when the first similarity is less than the first threshold,
when the first similarity is equal to or greater than the first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.

An information processing apparatus comprising:
image acquisition means for acquiring an image captured by an imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
a learning model function that has learned in advance images in which a moving object appears or does not appear and that outputs whether a moving object appears in an image,
wherein the subject image determination means determines that the input image is an image of the subject when the learning model function does not determine that a moving object appears in the input image acquired by the image acquisition means, and determines that the input image is not an image of the subject when the learning model function determines that a moving object appears in the input image,
when the learning model function does not determine that a moving object appears in the input image acquired by the image acquisition means, a first similarity between the input image acquired by the image acquisition means and the output image output to the outside is calculated,
when the first similarity is equal to or greater than a first threshold, a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image is calculated, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.

An image processing system having an imaging device that periodically images its surroundings and one or more information processing devices, the image processing system comprising:
image acquisition means for acquiring an image captured by the imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image output to the outside,
wherein the subject image determination means determines that the input image is an image of the subject when the first similarity is equal to or greater than a first threshold, and determines that the input image is not an image of the subject when the first similarity is less than the first threshold,
when the first similarity is equal to or greater than the first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.

An image processing system having an imaging device that periodically images its surroundings and one or more information processing devices, the image processing system comprising:
image acquisition means for acquiring an image captured by the imaging device as an input image;
subject image determination means for determining whether the input image acquired by the image acquisition means is an image of a subject;
determination means for determining to output the input image when the input image is determined to be an image of the subject, and determining to output the output image already output to the outside when the input image is determined not to be an image of the subject;
output means for outputting the input image or the output image according to the determination result of the determination means; and
a learning model function that has learned in advance images in which a moving object appears or does not appear and that outputs whether a moving object appears in an image,
wherein the subject image determination means determines that the input image is an image of the subject when the learning model function does not determine that a moving object appears in the input image acquired by the image acquisition means, and determines that the input image is not an image of the subject when the learning model function determines that a moving object appears in the input image,
when the learning model function does not determine that a moving object appears in the input image acquired by the image acquisition means, the information processing device functions as similarity calculation means for calculating a first similarity between the input image acquired by the image acquisition means and the output image output to the outside,
when the first similarity is equal to or greater than a first threshold, the similarity calculation means calculates a second similarity between the current input image acquired by the image acquisition means and an input image older than the current input image, and
the determination means determines to output the current input image when the second similarity is equal to or greater than a second threshold, and determines to output the output image output to the outside when the second similarity is less than the second threshold.