JP7393270B2

JP7393270B2 - Information processing device, information processing method, and information processing program

Info

Publication number: JP7393270B2
Application number: JP2020054674A
Authority: JP
Inventors: 崇泰八尋; 総中下; 友弥天利
Original assignee: Core Corp
Current assignee: Core Corp
Priority date: 2020-03-25
Filing date: 2020-03-25
Publication date: 2023-12-06
Anticipated expiration: 2040-03-25
Also published as: JP2021157299A

Description

The present invention relates to an information processing device, an information processing method, and an information processing program.

Techniques for tracking a subject recorded in an image captured by a camera have long existed. The technique described in Patent Document 1 detects a subject in frames captured at the same time by each of a plurality of cameras and calculates the position of the object in real space from the position of the object within each frame. It then predicts the position of the subject in the next frame based on the calculated position.

Japanese Unexamined Patent Application Publication No. 2019-109765

The technique described in Patent Document 1 tracks a subject recorded in a frame and cannot acquire any information other than the tracking result. It also requires a plurality of cameras, which adds the cost and effort of installing them. Moreover, depending on the installation site, installing a plurality of cameras may not be possible.

An object of the present invention is to provide an information processing device, an information processing method, and an information processing program capable of acquiring information about a subject recorded in an image.

An information processing device according to one aspect includes: a fixed-point imaging unit that images a subject continuously over time to generate image data; a recognition unit that recognizes the subject in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging unit and generates, for each time, a first recognition frame surrounding the subject; a prediction unit that predicts, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured by the fixed-point imaging unit, a second recognition frame surrounding the subject at a predetermined time later than the recording time of the images used by the recognition unit to generate the first recognition frames, together with the center position of the second recognition frame; a generation unit that generates a front frame corresponding to the outer shape of the front of the subject and a back frame corresponding to the outer shape of the back of the subject so as to be partly in contact with the second recognition frame generated by the prediction unit; and a size acquisition unit that acquires the size of the subject based on the distance from the fixed-point imaging unit to the subject, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated by the generation unit.

The information processing device according to one aspect may include a center-of-gravity acquisition unit that acquires the center-of-gravity position of the subject based on the front frame and the back frame generated by the generation unit, and a position acquisition unit that acquires the position of the subject by converting the center-of-gravity position acquired by the center-of-gravity acquisition unit into a predetermined coordinate system.

The information processing device according to one aspect may include a speed acquisition unit that acquires the speed of the subject based on the change over time in a plurality of center-of-gravity positions of the subject acquired by the center-of-gravity acquisition unit.
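The speed acquisition described above can be illustrated with a minimal Python sketch. It assumes the center-of-gravity positions have already been converted to real-world coordinates in meters and that each position carries a timestamp in seconds. The function name `speed_from_centroids` and the averaging over the whole path are illustrative assumptions, not details given in the specification.

```python
import math


def speed_from_centroids(positions, timestamps):
    """Average speed over a sequence of center-of-gravity positions.

    positions  : list of (x, y) tuples, assumed to be in meters
    timestamps : list of capture times in seconds, one per position
    (Both the units and the averaging scheme are assumptions.)
    """
    if len(positions) != len(timestamps) or len(positions) < 2:
        raise ValueError("need at least two positions with matching timestamps")
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # Sum the straight-line distance between consecutive positions.
    path_length = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    return path_length / elapsed
```

For example, positions (0, 0), (3, 4), (6, 8) recorded one second apart give a path length of 10 m over 2 s.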

The information processing device according to one aspect may include an output unit that outputs at least one of the size of the subject acquired by the size acquisition unit, the position of the subject acquired by the position acquisition unit, and the speed of the subject acquired by the speed acquisition unit.

In the information processing device according to one aspect, the recognition unit may generate, as the first recognition frame, a rectangular frame surrounding the subject on the image.

In the information processing device according to one aspect, the prediction unit may generate, as the second recognition frame, a rectangular frame surrounding the subject on the image.

In the information processing device according to one aspect, the prediction unit may set, as the reference line, a line extending in the horizontal direction of the image based on the image data, set a predetermined position on the reference line as an end point, and generate the second recognition frame with its center position on a center line that extends from the end point through the center positions of the plurality of first recognition frames.
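One way to picture placing the second recognition frame's center on the center line is the Python sketch below. The specification does not state here how far along the center line the predicted center lies, so the sketch makes two loud assumptions: the line direction is taken from the end point toward the mean of the observed centers, and the next center is found by linear extrapolation of the last step in image coordinates (perspective is ignored). The name `predict_center` and the parameter `steps_ahead` are hypothetical.

```python
import math


def predict_center(end_point, centers, steps_ahead=1.0):
    """Extrapolate a frame center along the line from end_point through the centers.

    end_point   : (x, y) of the predetermined position on the reference line
    centers     : centers of the first recognition frames, oldest first
    steps_ahead : how many "last steps" to extrapolate (assumption)
    """
    if len(centers) < 2:
        raise ValueError("need at least two first-recognition-frame centers")
    ex, ey = end_point
    mean_x = sum(x for x, _ in centers) / len(centers)
    mean_y = sum(y for _, y in centers) / len(centers)
    dx, dy = mean_x - ex, mean_y - ey
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("centers coincide with the end point")
    dx, dy = dx / norm, dy / norm
    # Signed distance of each center along the center line.
    t = [(x - ex) * dx + (y - ey) * dy for x, y in centers]
    # Linear extrapolation of the most recent step (assumption).
    t_next = t[-1] + steps_ahead * (t[-1] - t[-2])
    return (ex + t_next * dx, ey + t_next * dy)
```

With the end point at (0, 0) and centers (3, 4) and (6, 8), the sketch returns a predicted center near (9, 12), which stays on the same line through the end point.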

In the information processing device according to one aspect, when, taking the predetermined position as a reference, the subject is located to the left with respect to the vertical direction of the image based on the image data, the generation unit may generate a rectangular front frame in contact with the lower-left vertex of the second recognition frame as viewed in the image, and a rectangular back frame in contact with the upper-right vertex of the second recognition frame as viewed in the image.

In the information processing device according to one aspect, when, taking the predetermined position as a reference, the subject is located to the right with respect to the vertical direction of the image based on the image data, the generation unit may generate a rectangular front frame in contact with the lower-right vertex of the second recognition frame as viewed in the image, and a rectangular back frame in contact with the upper-left vertex of the second recognition frame as viewed in the image.
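The left and right cases for the front frame and the back frame can be summarized in a short Python sketch. It assumes the usual image coordinate convention (origin at the top left, y increasing downward) and a frame given as `(x_min, y_min, x_max, y_max)`. The function name `anchor_vertices` and the `"left"` / `"right"` labels are hypothetical.

```python
def anchor_vertices(second_frame, subject_side):
    """Return the vertices of the second recognition frame that the
    front frame and the back frame touch.

    second_frame : (x_min, y_min, x_max, y_max), y increasing downward (assumption)
    subject_side : "left" or "right" of the predetermined position
    """
    x_min, y_min, x_max, y_max = second_frame
    if subject_side == "left":
        # Front frame at the lower-left vertex, back frame at the upper-right.
        return {"front": (x_min, y_max), "back": (x_max, y_min)}
    if subject_side == "right":
        # Front frame at the lower-right vertex, back frame at the upper-left.
        return {"front": (x_max, y_max), "back": (x_min, y_min)}
    raise ValueError("subject_side must be 'left' or 'right'")
```

In both cases the two vertices are diagonally opposite corners of the second recognition frame.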

In the information processing device according to one aspect, the size acquisition unit may acquire the distance from the fixed-point imaging unit to the subject based on a result of associating, in advance, positions in the image based on the image data generated by the fixed-point imaging unit with the distances from the fixed-point imaging unit at those positions.
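A pre-built association between image positions and distances can be used as a lookup table. The Python sketch below is one possible reading: it assumes the table is keyed by the image Y coordinate in pixels and that distances between calibrated rows can be linearly interpolated. Both the interpolation and the sample values in the usage note are assumptions made for illustration.

```python
from bisect import bisect_left


def distance_at(y_pixel, table):
    """Look up the distance from the camera for an image Y coordinate.

    table : list of (y_pixel, distance_m) pairs sorted by y_pixel,
            prepared in advance (values are hypothetical calibration data)
    """
    ys = [y for y, _ in table]
    if not ys[0] <= y_pixel <= ys[-1]:
        raise ValueError("y_pixel is outside the calibrated range")
    i = bisect_left(ys, y_pixel)
    if ys[i] == y_pixel:
        return table[i][1]
    (y0, d0), (y1, d1) = table[i - 1], table[i]
    # Linear interpolation between neighboring calibrated rows (assumption).
    return d0 + (d1 - d0) * (y_pixel - y0) / (y1 - y0)
```

With a hypothetical table `[(400, 30.0), (600, 10.0), (900, 4.0)]`, a subject whose frame sits at Y = 500 would be assigned a distance halfway between 30 m and 10 m.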

In an information processing method according to one aspect, a computer executes: a fixed-point imaging step of imaging a subject continuously over time to generate image data; a recognition step of recognizing the subject in each of at least two temporally different images recorded in the image data captured in the fixed-point imaging step and generating, for each time, a first recognition frame surrounding the subject; a prediction step of predicting, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured in the fixed-point imaging step, a second recognition frame surrounding the subject at a predetermined time later than the recording time of the images used in the recognition step to generate the first recognition frames, together with the center position of the second recognition frame; a generation step of generating a front frame corresponding to the outer shape of the front of the subject and a back frame corresponding to the outer shape of the back of the subject so as to be partly in contact with the second recognition frame generated in the prediction step; and a size acquisition step of acquiring the size of the subject based on the distance to the subject imaged in the fixed-point imaging step, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated in the generation step.

An information processing program according to one aspect causes a computer to implement: a fixed-point imaging function of imaging a subject continuously over time to generate image data; a recognition function of recognizing the subject in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging function and generating, for each time, a first recognition frame surrounding the subject; a prediction function of predicting, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured by the fixed-point imaging function, a second recognition frame surrounding the subject at a predetermined time later than the recording time of the images used by the recognition function to generate the first recognition frames, together with the center position of the second recognition frame; a generation function of generating a front frame corresponding to the outer shape of the front of the subject and a back frame corresponding to the outer shape of the back of the subject so as to be partly in contact with the second recognition frame generated by the prediction function; and a size acquisition function of acquiring the size of the subject based on the distance to the subject imaged by the fixed-point imaging function, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated by the generation function.

The information processing device according to one aspect includes: a recognition unit that generates a first recognition frame surrounding a subject based on each of at least two temporally different images; a prediction unit that predicts, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured by the fixed-point imaging unit, a second recognition frame surrounding the subject at a predetermined time later than the recording time of the images used by the recognition unit to generate the first recognition frames, together with the center position of the second recognition frame; a generation unit that generates, according to the second recognition frame, a front frame corresponding to the outer shape of the front of the subject and a back frame; and a size acquisition unit that acquires the size of the subject based on the distance from the fixed-point imaging unit to the subject, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated by the generation unit. The device can therefore acquire information about a subject recorded in an image.
The information processing method and the information processing program according to one aspect achieve the same effects as the information processing device according to one aspect.

FIG. 1 is a block diagram for explaining an information processing device according to an embodiment.
FIG. 2 is a diagram used to explain the acquisition of image position information.
FIG. 3 is a diagram for explaining an example of an image captured by the fixed-point imaging unit.
FIG. 4 is a graph showing the relationship between the Y coordinate and the number of pixels.
FIG. 5 is a graph for explaining an example of image position information.
FIG. 6 is a graph showing the relationship between the Y coordinate and the ratio of the image width.
FIG. 7 is a block diagram for explaining the first to ninth setting units.
FIG. 8 is a diagram for explaining an example of the first recognition frame and the second recognition frame.
FIG. 9 is a diagram for explaining the center positions of the first recognition frame and the second recognition frame.
FIG. 10 is a diagram for explaining an example of the front frame and the back frame.
FIG. 11 is a diagram for explaining the center-of-gravity position.
FIG. 12 is a diagram for explaining the change in the center-of-gravity position when the speed of the subject is acquired.
FIG. 13 is a flowchart for explaining a method of acquiring image position information in an information processing method according to an embodiment.
FIG. 14 is a flowchart for explaining a method of acquiring information about a subject in an information processing method according to an embodiment.

An embodiment of the present invention is described below.
In this specification, the term "information" may be read as "data," and the term "data" may be read as "information."

FIG. 1 is a block diagram for explaining an information processing device 1 according to an embodiment.
The information processing device 1 images a subject 201 (see FIGS. 8 and 10) based on image data generated when the fixed-point imaging unit 21 captures a moving image or temporally continuous still images. The subject 201 is, for example, an object that moves in a constant manner. Objects that move in a constant manner include objects that move in a substantially constant manner, for example vehicles, people, and animals that move in a straight (or substantially straight) line. As a specific example, the subject 201 may be an object moving along a straight road or the like. The subject 201 is not limited to a moving object and may be a stationary object (for example, a vehicle, a person, or an animal).

For example, the information processing device 1 performs object recognition on the same subject 201 imaged at different times by the fixed-point imaging unit 21 and generates, for each time, a recognition frame surrounding the subject 201 (a first recognition frame 210 (see FIG. 8); ROI: Region of Interest), so that a plurality of such frames is obtained. Using the plurality of first recognition frames 210, the information processing device 1 predicts the position of the subject 201 at a time (for example, the current time or a future time) later than the time (for example, a past time) at which the images (frames or still images) underlying those first recognition frames 210 were captured, and generates a recognition frame (a second recognition frame 220 (see FIG. 8)) at the predicted position. Using the first recognition frames 210 and the second recognition frame 220, the information processing device 1 acquires the size of the subject 201 (for example, its lengths in the vertical, horizontal, and depth directions), the position of the subject 201 (for example, its current and past positions), the distance from the fixed-point imaging unit 21 to the subject 201, and the speed of the subject 201.

The information processing device 1 is described in detail below.
The information processing device 1 includes a fixed-point imaging unit 21, a storage unit 22, a setting unit 12, a recognition unit 13, a prediction unit 14, a generation unit 15, a size acquisition unit 16, a center-of-gravity acquisition unit 17, a position acquisition unit 18, a speed acquisition unit 19, and an output control unit 20. The setting unit 12, the recognition unit 13, the prediction unit 14, the generation unit 15, the size acquisition unit 16, the center-of-gravity acquisition unit 17, the position acquisition unit 18, the speed acquisition unit 19, and the output control unit 20 may be implemented as functions of a control unit 11 (for example, an arithmetic processing device) of the information processing device 1. In addition to the storage unit 22 described above, a communication unit 23 and a display unit 24, which are described later, may constitute an embodiment of the "output unit" of the present invention.

The fixed-point imaging unit 21 images a subject continuously over time and generates image data. The fixed-point imaging unit 21 captures a moving image, or captures still images continuously at predetermined time intervals, to generate moving image data. The fixed-point imaging unit 21 is placed at a predetermined location and images a predetermined direction.
The fixed-point imaging unit 21 is installed on a pole or the like fixed to the ground, or on a movable tripod or the like. When installed on a movable tripod or the like, the fixed-point imaging unit 21 can be moved together with the tripod. The fixed-point imaging unit 21 can therefore be placed easily at a location the user desires, and only for the period the user requires.

The fixed-point imaging unit 21 is supplied with power from, for example, at least one of a power generation panel (not shown) and a power source (not shown).
The power generation panel generates electricity when it receives light. The power generation panel may be a device that uses energy harvesting technology, that is, a device that harvests energy from the surrounding environment and converts it into electric power. As a specific example, the power generation panel may be a solar panel. The power generated by the power generation panel may, for example, be stored in a battery and supplied from that battery to the fixed-point imaging unit 21.
The power source is, for example, a primary battery, a secondary battery, or a commercial power supply. When a primary battery or a secondary battery is used as the power source, it is arranged in the fixed-point imaging unit 21 so as to be replaceable. When a secondary battery is used as the power source, it may be arranged in the fixed-point imaging unit 21 so as to be rechargeable.
The power generation panel or the power source may also supply power to the control unit and other components in addition to the fixed-point imaging unit 21.

The storage unit 22 is a device that stores various kinds of information, programs, and the like. The storage unit 22 stores a correspondence (learning model) that associates a subject with the features of that subject. The correspondence is used when the recognition unit 13 recognizes a subject. The correspondence is generated by, for example, deep learning. That is, the features of the subject are learned through deep learning or the like, and the correspondence is generated by associating the subject with those features. The learning model may be generated by the control unit 11 based on the image data generated by the fixed-point imaging unit 21, or may be acquired from an external device (for example, a server) (not shown) via the communication unit 23 described later.
The storage unit 22 also stores a relationship (image position information) in which positions in an image (frame or still image) based on the image data generated by the fixed-point imaging unit 21 are associated in advance with the distances from the fixed-point imaging unit 21 at those positions. A method for generating the image position information is described later.

The communication unit 23 is a device capable of transmitting and receiving information to and from an external device (for example, a server) (not shown).

The setting unit 12 performs the preliminary settings that the units (functions) described later need in order to acquire the size of the subject 201, the position of the subject 201, the distance from the fixed-point imaging unit 21 to the subject 201, and the speed of the subject 201. That is, it sets a relationship in which positions in an image (frame or still image) based on the image data generated by the fixed-point imaging unit 21 are associated in advance with the distances from the fixed-point imaging unit 21 at those positions.

First, a configuration for acquiring the image position information is described. By acquiring the image position information in advance, the information processing device 1 can acquire subject information about the subject 201 as described later, namely the size of the subject 201, the position of the subject 201, the distance from the fixed-point imaging unit 21 to the subject 201, and the speed of the subject 201.

First, an object (for example, a person) is moved within the imaging range of the fixed-point imaging unit 21. In the following, a "person" is used as an example of the object when the setting unit 12 performs the settings. The setting unit 12 sets the image position information within the image data based on the height of the person recorded in the image data generated by the fixed-point imaging unit 21 and the angle of view of the fixed-point imaging unit 21. That is, the setting unit 12 acquires information about positions within the image (image position information) based on the image captured by the fixed-point imaging unit 21. More specifically, when the recognition unit 13 described later recognizes a person from the image data captured by the fixed-point imaging unit 21, the setting unit 12 acquires, based on the height of the person and the angle of view of the fixed-point imaging unit 21, position information within the image captured by the fixed-point imaging unit 21 (the image position information, that is, information about the distance from the fixed-point imaging unit 21 to the person).

FIG. 2 is a diagram used to explain the acquisition of image position information.
FIG. 3 is a diagram for explaining an example of an image captured by the fixed-point imaging unit 21.
FIG. 4 is a graph showing the relationship between the Y coordinate and the number of pixels. The horizontal axis in FIG. 4 indicates the Y coordinate, and the vertical axis indicates the number of pixels per 1 m. Here, the depth direction in the image captured by the fixed-point imaging unit 21 is the Y coordinate, and the width direction of the image is the X coordinate.
FIG. 5 is a graph for explaining an example of image position information. The horizontal axis in FIG. 5 indicates the Y coordinate, and the vertical axis indicates the distance from the ground position at which the fixed-point imaging unit 21 is placed to the position of an arbitrary Y coordinate.
FIG. 6 is a graph showing the relationship between the Y coordinate and the ratio of the image width. The horizontal axis in FIG. 6 indicates the Y coordinate, and the vertical axis indicates the ratio of the image width.
FIG. 7 is a block diagram for explaining the first to ninth setting units 121 to 129.

In FIG. 2, the Y-axis direction (Y-coordinate direction) is the depth direction. When a person 110 at a predetermined position in the Y-axis direction is imaged by the fixed-point imaging unit 21, the image at that position is denoted by reference sign 100a, and the width of the image 100a (its size in the X-axis direction) is denoted by w. Let Pc be the center position of the image 100a; the distance from the fixed-point imaging unit 21 to the center position Pc is a first distance L1, and the distance from the center position Pc to the ground is a second distance L2. Let θ be the angle between the ground and the straight line that runs along the Y-axis direction from the fixed-point imaging unit 21 to the position in the image 100a corresponding to the ground. Further, let H be the height of the fixed-point imaging unit 21 above the ground, P0 the ground position at which the fixed-point imaging unit 21 is placed, and a third distance L3 the distance along the Y-axis direction from the ground position P0 to the image 100a.
The setting unit 12 acquires the third distance L3 and obtains the image position information by associating positions in the image captured by the fixed-point imaging unit 21 with the third distance L3.
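As a geometric illustration of the quantities defined for FIG. 2, the height H, the third distance L3, and the line of sight to the ground position form a right triangle if the ground is assumed flat, so that tan θ = H / L3. The Python sketch below computes L3 from H and θ on that assumption. This passage of the specification does not state the formula the setting unit 12 actually uses, so the relation and the function name `ground_distance` are illustrative only.

```python
import math


def ground_distance(camera_height_m, theta_rad):
    """Third distance L3 from camera height H and angle theta.

    Assumes flat ground, so that tan(theta) = H / L3.
    theta_rad must lie strictly between 0 and pi/2.
    """
    if not 0 < theta_rad < math.pi / 2:
        raise ValueError("theta must be between 0 and 90 degrees (exclusive)")
    return camera_height_m / math.tan(theta_rad)
```

For a camera mounted 3 m above the ground and a line of sight at 45 degrees to the ground, the sketch gives an L3 of about 3 m; a shallower angle gives a larger L3.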

Specifically, as shown in FIG. 7, the setting unit 12 includes a first setting unit 121, a second setting unit 122, a third setting unit 123, a fourth setting unit 124, a fifth setting unit 125, a sixth setting unit 126, a seventh setting unit 127, an eighth setting unit 128, and a ninth setting unit 129.

The first setting unit 121 obtains in advance the height of the subject, the angle of view of the fixed-point imaging unit 21, and the width of the image captured by the fixed-point imaging unit 21.
To obtain the image position information, the person 110 is first made to move within the imaging range of the fixed-point imaging unit 21. In this case, the height of the person is input using, for example, an input device (not shown), and the first setting unit 121 obtains the input height.
The first setting unit 121 also obtains the angle of view θ2 of the fixed-point imaging unit 21. The angle of view θ2 is input using, for example, an input device (not shown). It is determined by, among other things, the planar size of the image sensor (not shown) and the focal length of the imaging lens (not shown) of the fixed-point imaging unit 21.
The first setting unit 121 also obtains the width of the image produced by the fixed-point imaging unit 21 (for example, as a number of pixels). The image width is input using, for example, an input device (not shown).
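As a hedged illustration of how the angle of view θ2 follows from the sensor size and focal length, the sketch below uses the standard relation for a rectilinear lens focused at infinity. The function name and the numeric values are assumptions for illustration only; the description itself only states that θ2 is input.

```python
import math

def horizontal_angle_of_view(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Return the horizontal angle of view in radians.

    Assumes an ideal rectilinear lens focused at infinity:
    theta2 = 2 * atan(sensor_width / (2 * focal_length)).
    """
    if sensor_width_mm <= 0 or focal_length_mm <= 0:
        raise ValueError("sensor width and focal length must be positive")
    return 2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm))

# Hypothetical sensor and lens values, not taken from the description.
theta2 = horizontal_angle_of_view(sensor_width_mm=4.8, focal_length_mm=4.0)
```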

The second setting unit 122 obtains the number of pixels of the image data per meter, based on the height of the person 110 obtained by the first setting unit 121.
By having the person 110 walk within the imaging range, the second setting unit 122 obtains the number of pixels per meter at arbitrary positions in the image captured by the fixed-point imaging unit 21. That is, the second setting unit 122 uses the person's height obtained by the first setting unit 121 to obtain the size (length) of 1 m in the image 100a, and then obtains the number of image pixels spanned by that 1 m.
In the case illustrated in FIG. 3, the size of the image sensor used in the fixed-point imaging unit 21 (that is, of the image it produces) is 1280 pixels × 960 pixels. When the person 110 (not shown in FIG. 3) is recorded in the image illustrated in FIG. 3, the second setting unit 122 obtains the number of pixels per meter based on the height of the person 110 and the number of pixels of the image sensor (image).
Note that when the person 110 moves away from the fixed-point imaging unit 21, there is an infinite point P1 (see FIG. 3) in the image beyond which the person does not recede. The horizontal line in the image that contains the infinite point P1 is the reference line LS.
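The pixels-per-meter step can be sketched as follows, assuming the person 110 is detected as a vertical span in the image (top and bottom rows in pixels) and that the span corresponds to the input height. The function name and the sample numbers are hypothetical.

```python
def pixels_per_meter(person_top_y: float, person_bottom_y: float,
                     person_height_m: float) -> float:
    """Pixels spanned by 1 m at the image position where the person stands.

    Assumes the detected vertical span of the person in the image
    corresponds to the person's real height.
    """
    if person_height_m <= 0:
        raise ValueError("person height must be positive")
    return abs(person_bottom_y - person_top_y) / person_height_m

# Hypothetical detection: a 1.7 m person spanning rows 300 to 640.
ppm = pixels_per_meter(300.0, 640.0, 1.7)
```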

The third setting unit 123 obtains the relationship between the Y coordinate, which is the image coordinate corresponding to the depth direction, and the number of pixels per meter of the subject obtained by the second setting unit 122.
In the example of FIG. 3, the vertical direction of the image corresponds to the depth direction of real space, and the vertical image coordinate is the Y coordinate. Likewise, the horizontal direction of the image corresponds to the width direction of real space, and the horizontal image coordinate is the X coordinate. Taking the upper-left pixel in FIG. 3 as the origin (0, 0), the upper-right pixel has coordinates (1280, 0), the center pixel (640, 480), the lower-left pixel (0, 960), and the lower-right pixel (1280, 960).

Within the imaging range of the fixed-point imaging unit 21 (the range bounded by coordinates (0, 0), (1280, 0), (1280, 960), and (0, 960)), the subject (person 110) is made to walk back and forth and from side to side, for example, so that the third setting unit 123 obtains the number of pixels per meter at a plurality of Y-coordinate positions.

As shown by way of example in FIG. 4, the relationship between the Y coordinate and the number of pixels per meter obtained by the third setting unit 123 is linear. Here, as shown in FIG. 2, the value of the Y coordinate increases toward the fixed-point imaging unit 21. That is, as the Y coordinate increases (as the subject approaches the fixed-point imaging unit 21), the number of pixels per meter increases. Conversely, as the Y coordinate decreases (as the subject moves away from the fixed-point imaging unit 21), the number of pixels per meter decreases.
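Because the relationship is stated to be linear, one way to capture it is an ordinary least-squares line through the measured (Y, pixels-per-meter) pairs. The sketch below is an assumption about how such a fit could be done; the description does not name a fitting method, and the sample values are hypothetical.

```python
def fit_line(ys, ppms):
    """Least-squares fit ppm = slope * y + intercept."""
    n = len(ys)
    if n < 2 or n != len(ppms):
        raise ValueError("need at least two matching samples")
    mean_y = sum(ys) / n
    mean_p = sum(ppms) / n
    sxx = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0:
        raise ValueError("all Y coordinates are identical")
    sxy = sum((y - mean_y) * (p - mean_p) for y, p in zip(ys, ppms))
    slope = sxy / sxx
    return slope, mean_p - slope * mean_y

# Hypothetical samples gathered while the person walks through the scene.
slope, intercept = fit_line([500, 600, 700, 800], [50, 100, 150, 200])
ppm_at_650 = slope * 650 + intercept  # pixels per meter at an unvisited row
```

With these sample values the fitted pixels-per-meter reaches zero at Y = 400, which is how a row such as the reference line LS would appear under this model.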

The fourth setting unit 124 obtains the actual width w of the image 100a in which the person 110 is detected at a predetermined position, based on the image width obtained by the first setting unit 121 and the number of pixels per meter obtained by the second setting unit 122.
As illustrated in FIG. 2, consider the image 100a at the Y-coordinate position where the person 110 is located (a predetermined position in the Y-coordinate direction). Because the number of pixels in the width direction of the image captured by the fixed-point imaging unit 21 (1280 pixels in the case of FIG. 3) has been obtained by the first setting unit 121, the fourth setting unit 124 uses the number of pixels per meter to obtain the actual length of the width w of the image 100a at that Y-coordinate position.

The fifth setting unit 125 obtains a first distance L1, which is the distance from the fixed-point imaging unit 21 to the center position Pc in the image 100a, and a second distance L2, which is the distance from the ground to the center position Pc, based on the actual image width w obtained by the fourth setting unit 124 and the angle of view θ2 obtained by the first setting unit 121.
Specifically, the fifth setting unit 125 obtains the actual distance from the fixed-point imaging unit 21 to the center position Pc of the image 100a (the first distance L1) based on the actual length of the width w of the image 100a at the Y-coordinate position described above and the angle of view θ2 of the fixed-point imaging unit 21. The fifth setting unit 125 also obtains the actual distance from the center position Pc of the image 100a to the ground (the second distance L2) based on the number of pixels of the image, the number of pixels per meter, the actual length of the width w of the image 100a at that Y-coordinate position, and the angle of view of the fixed-point imaging unit 21.
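The width w and the first distance L1 can be sketched as below. The formulas are assumptions: w is taken as the image width in pixels divided by the pixels per meter, and L1 is derived by treating θ2 as the horizontal angle of view and the image 100a as a plane perpendicular to the optical axis. The description does not give explicit formulas, and the second distance L2 is left out because its derivation is not specified.

```python
import math

def actual_width_m(image_width_px: float, ppm: float) -> float:
    """Real-world width w covered by the full image width at a given row."""
    if ppm <= 0:
        raise ValueError("pixels per meter must be positive")
    return image_width_px / ppm

def first_distance_m(width_m: float, theta2_rad: float) -> float:
    """Distance L1 to the center Pc, assuming tan(theta2 / 2) = (w / 2) / L1."""
    if not 0 < theta2_rad < math.pi:
        raise ValueError("angle of view must lie in (0, pi)")
    return (width_m / 2.0) / math.tan(theta2_rad / 2.0)

# Hypothetical values: 1280 px wide image, 128 px per meter, 90 degree view.
w = actual_width_m(1280, 128.0)
l1 = first_distance_m(w, math.pi / 2)
```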

The sixth setting unit 126 obtains the angle θ between the ground and the center line of the fixed-point imaging unit 21 in the imaging direction, based on the first distance L1 and the second distance L2 obtained by the fifth setting unit 125. That is, the sixth setting unit 126 uses trigonometric functions to obtain the angle θ between the ground and the fixed-point imaging unit 21 from the first distance L1 and the second distance L2.

The seventh setting unit 127 obtains the height H from the ground to the fixed-point imaging unit 21, based on the first distance L1 and the second distance L2 obtained by the fifth setting unit 125 and the angle θ obtained by the sixth setting unit 126. That is, the seventh setting unit 127 uses trigonometric functions to obtain the height H from the first distance L1, the second distance L2, and the angle θ.

The eighth setting unit 128 obtains the position of the subject relative to the imaging position at which the fixed-point imaging unit 21 is arranged, based on the height H from the ground to the fixed-point imaging unit 21 obtained by the seventh setting unit 127 and the angle θ obtained by the sixth setting unit 126, and obtains the image position information by obtaining a plurality of such subject positions.
That is, the eighth setting unit 128 obtains the distance L3 from the ground position P0, at which the fixed-point imaging unit 21 is arranged, to the predetermined Y-coordinate position (image 100a), based on the height H and the angle θ between the fixed-point imaging unit 21 and the ground. The eighth setting unit 128 obtains information on positions within the image (image position information) by associating the distance L3 with the Y coordinate of the predetermined position in the image. The person 110 is made to move within the imaging range of the fixed-point imaging unit 21, and the eighth setting unit 128 obtains the image position information by obtaining the distance from the ground position P0 to the subject (person) 110 at arbitrary positions within the imaging range. The eighth setting unit 128 stores the image position information in the storage unit 22.
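As a minimal sketch of this last step, assume θ is the angle at the ground between the ground plane and the line from the fixed-point imaging unit 21 to the ground position in the image 100a, as defined with reference to FIG. 2. The camera, the ground position P0, and that ground point then form a right triangle, so L3 = H / tan θ. The function name and sample values are hypothetical.

```python
import math

def third_distance_m(camera_height_m: float, theta_rad: float) -> float:
    """Ground distance L3 from P0, assuming tan(theta) = H / L3."""
    if camera_height_m <= 0:
        raise ValueError("camera height must be positive")
    if not 0 < theta_rad < math.pi / 2:
        raise ValueError("theta must lie in (0, pi/2)")
    return camera_height_m / math.tan(theta_rad)

# Hypothetical camera mounted 5 m high, line of sight at 45 degrees.
l3 = third_distance_m(5.0, math.pi / 4)
```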

The image position information indicates how many meters the position (coordinates) of an arbitrary pixel in the image captured by the fixed-point imaging unit 21 is from the fixed-point imaging unit 21. As a specific example, the image position information states that the pixel at coordinates (640, 480) in FIG. 3 is 10 m from the fixed-point imaging unit 21, the pixel at coordinates (640, 720) is 15 m away, and the pixel at coordinates (640, 840) is 20 m away.

As shown in FIG. 5, the image position information indicates that the distance from the ground position P0, at which the fixed-point imaging unit 21 is arranged, decreases as the Y coordinate increases. Conversely, the distance from the ground position P0 increases as the Y coordinate decreases.
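One possible representation of the image position information is a table of (Y coordinate, distance) pairs with linear interpolation between entries. The sketch below is an assumption: the description does not specify a storage format, the table values are hypothetical and chosen only to follow the trend of FIG. 5, and clamping outside the calibrated range is a design choice made here.

```python
from bisect import bisect_left

def distance_at_y(table, y):
    """Interpolate the distance in meters at image row y.

    table: list of (y_px, distance_m) pairs sorted by y_px.
    Rows outside the calibrated range are clamped to the nearest entry.
    """
    ys = [row[0] for row in table]
    if y <= ys[0]:
        return table[0][1]
    if y >= ys[-1]:
        return table[-1][1]
    i = bisect_left(ys, y)
    (y0, d0), (y1, d1) = table[i - 1], table[i]
    return d0 + (d1 - d0) * (y - y0) / (y1 - y0)

# Hypothetical calibration: distance shrinks as Y grows, as in FIG. 5.
position_table = [(480, 20.0), (720, 15.0), (840, 10.0)]
d = distance_at_y(position_table, 600)
```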

Here, the ninth setting unit 129 obtains a width ratio with respect to the Y coordinate, based on the width at a reference position in the Y-coordinate direction and the width at a predetermined position in the Y-coordinate direction, from among the actual widths w of the plurality of images obtained by the fourth setting unit 124.
That is, the fourth setting unit 124 obtains the actual width w of the image, but it cannot do so at positions where the person 110 is absent (Y coordinates corresponding to positions where no one has walked). Therefore, when the fourth setting unit 124 has obtained the image width w at a plurality of Y-coordinate positions, the ninth setting unit 129 calculates the image width at an arbitrary Y-coordinate position from those widths. For example, the ninth setting unit 129 sets one of the image widths obtained by the fourth setting unit 124 as the image width at a Y-coordinate reference position (the reference width), and obtains the ratio (width ratio) of each of the other image widths to the reference width. The ninth setting unit 129 obtains this ratio to the reference width for each of the plurality of images obtained by the fourth setting unit 124. Plotting the width ratio yields the graph in FIG. 6.

As illustrated in FIG. 6, the width ratio of the image decreases as the Y coordinate increases (as the distance from the fixed-point imaging unit 21 decreases). Conversely, the width ratio increases as the Y coordinate decreases (as the distance from the fixed-point imaging unit 21 increases).
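The width-ratio calculation can be sketched as a simple normalization against the reference width. The function name, the dictionary layout, and the sample widths are assumptions for illustration; the values are chosen only so that the ratio falls as Y grows, matching the trend of FIG. 6.

```python
def width_ratios(widths_by_y, reference_y):
    """Return {y: width / reference width} for each measured row."""
    reference_width = widths_by_y[reference_y]
    if reference_width <= 0:
        raise ValueError("reference width must be positive")
    return {y: w / reference_width for y, w in widths_by_y.items()}

# Hypothetical real-world widths (in meters) measured at three rows.
measured = {480: 20.0, 720: 10.0, 840: 8.0}
ratios = width_ratios(measured, reference_y=720)
```

A width at an unmeasured row could then be estimated by interpolating the ratio at that row and multiplying it by the reference width.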

Even at positions where the fourth setting unit 124 has not obtained the image width w, the fifth setting unit 125 described above can use the width ratio obtained by the ninth setting unit 129 to obtain the image width at an arbitrary Y coordinate, and can therefore obtain the first distance L1 and the second distance L2.
This completes the acquisition of the image position information.

Next, a configuration for obtaining subject information about a subject 201 using the image position information described above will be described.
FIG. 8 is a diagram illustrating an example of the first recognition frame 210 and the second recognition frame 220.
FIG. 9 is a diagram illustrating the center positions 210a and 220a of the first recognition frame 210 and the second recognition frame 220, respectively.
FIG. 10 is a diagram illustrating an example of the front frame 220b and the back frame 220c.
FIG. 11 is a diagram illustrating the center-of-gravity position PG.
FIG. 12 is a diagram illustrating the change in the center-of-gravity position PG when the speed of the subject 201 is obtained.

When the subject 201 is recorded in the image data generated by the fixed-point imaging unit 21, the recognition unit 13 recognizes the subject 201 based on the correspondence relationship (learning model) stored in the storage unit 22. That is, using the correspondence relationship (learning model), the recognition unit 13 extracts features of the subject 201 from the image based on the image data generated by the fixed-point imaging unit 21 and recognizes the subject 201 from the extracted features. The recognition unit 13 is not limited to using the learning model; it may also recognize the subject 201 by, for example, pattern matching.

The recognition unit 13 recognizes the subject 201 (a vehicle in the cases shown in FIGS. 8 and 10) in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging unit 21, and generates a plurality of first recognition frames 210 (see FIG. 8), each surrounding the subject 201 at the respective time. In this case, the recognition unit 13 may generate, as the first recognition frame 210, a rectangular frame surrounding the subject 201 on the image. When the fixed-point imaging unit 21 captures a moving image, the recognition unit 13 recognizes the subject 201 using, as the temporally different images, for example temporally consecutive frames or frames taken at predetermined time intervals. When the fixed-point imaging unit 21 captures still images, the recognition unit 13 recognizes the subject 201 using, as the temporally different images, still images captured consecutively in time or still images captured at predetermined time intervals.

The first recognition frame 210 may be, for example, a rectangular frame surrounding the subject 201, or a rectangular frame touching the outline of the subject 201. The recognition unit 13 generates a first recognition frame 210 for each temporally different image (frame or still image). When a plurality of subjects 201 are recorded in an image, the recognition unit 13 generates a first recognition frame 210 for each subject 201.

Based on the plurality of first recognition frames 210, their center positions 210a, and the reference line LS of the image based on the image data captured by the fixed-point imaging unit 21, the prediction unit 14 predicts a second recognition frame 220 (see FIG. 8) surrounding the subject 201 at a predetermined time later than the recording times of the images used by the recognition unit 13 to generate the first recognition frames 210, together with the center position 220a of the second recognition frame 220 (see FIG. 9). That is, as shown in FIG. 9, the prediction unit 14 may set, as the reference line LS, a line extending in the horizontal direction of the image, set a predetermined position on the reference line LS as an end point (intersection PS), and generate the second recognition frame 220 with its center position 220a on the center line that runs from the end point (intersection PS) through the center positions 210a of the plurality of first recognition frames 210. In this case, the prediction unit 14 may generate, as the second recognition frame 220, a rectangular frame surrounding the subject 201 on the image.

The prediction unit 14 generates a line passing through the center positions 210a of the plurality of first recognition frames 210 (first extension line 230) and extends it toward the reference line LS. The prediction unit 14 may obtain the intersection PS, on the image, of the first extension line 230 and the reference line LS. The prediction unit 14 also extends the first extension line 230 in the direction away from the reference line LS and generates the second recognition frame 220 so that its center position 220a lies on the first extension line 230. As an example, the prediction unit 14 generates a second recognition frame 220 that is a predetermined time ahead, based on four first recognition frames 210 taken at predetermined time intervals. The second recognition frame 220 may be the expected position of the subject 201 at the current time. This makes it possible to recognize the subject 201 (predict its position) even when the subject 201 is temporarily hidden, wholly or partly, by another object.
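The geometric part of this prediction can be sketched as below. The sketch makes several assumptions that the description does not state: the first extension line 230 is taken as the line through the first and last center positions 210a, the reference line LS is the horizontal row y = ls_y, and the next center position 220a is extrapolated with a constant per-step displacement in image coordinates (ignoring perspective). All names and sample values are hypothetical.

```python
def intersection_with_reference_line(p_first, p_last, ls_y):
    """Intersection PS of the line through two centers with the row y = ls_y."""
    (x0, y0), (x1, y1) = p_first, p_last
    if y1 == y0:
        raise ValueError("line is parallel to the reference line")
    t = (ls_y - y0) / (y1 - y0)
    return (x0 + t * (x1 - x0), float(ls_y))

def predict_next_center(centers):
    """Extrapolate one step beyond the last center along the same line."""
    n = len(centers)
    if n < 2:
        raise ValueError("need at least two center positions")
    (x0, y0), (x1, y1) = centers[0], centers[-1]
    dx = (x1 - x0) / (n - 1)  # mean displacement per time step
    dy = (y1 - y0) / (n - 1)
    return (x1 + dx, y1 + dy)

# Four hypothetical centers 210a of a subject moving toward the lower left.
centers = [(600, 500), (580, 540), (560, 580), (540, 620)]
ps = intersection_with_reference_line(centers[0], centers[-1], ls_y=400)
next_center = predict_next_center(centers)
```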

The prediction unit 14 may also, taking the intersection PS of the first extension line 230 and the reference line LS as a reference, place the corresponding corner of the second recognition frame 220 on a second extension line 240 that passes through at least one of the four corners of each of the plurality of first recognition frames 210. As an example, when the subject 201 moves toward the lower left on the image relative to the intersection PS, the prediction unit 14 may generate two second extension lines 240 passing through the upper-left and lower-right corners, respectively, of the plurality of first recognition frames 210, and place the upper-left corner of the second recognition frame 220 on one second extension line 240 and its lower-right corner on the other.

Similarly, when the subject 201 moves toward the lower right from the intersection PS, the prediction unit 14 may, as an example, generate two second extension lines 240 passing through the upper-right and lower-left corners of the plurality of first recognition frames 210, and place the upper-right corner of the second recognition frame 220 on one second extension line 240 and its lower-left corner on the other.

Similarly, when the subject 201 moves straight down from the intersection PS, the prediction unit 14 may, as an example, generate two second extension lines 240 passing through the upper-right and upper-left corners of the plurality of first recognition frames 210, and place the upper-left corner of the second recognition frame 220 on one second extension line 240 and its upper-right corner on the other.

As shown in FIG. 10, the generation unit 15 generates a front frame 220b corresponding to the front outline of the subject 201 and a back frame 220c corresponding to the rear outline of the subject 201, such that each partly touches the second recognition frame 220 generated by the prediction unit 14. By recognizing the subject 201, the recognition unit 13 described above can also recognize its shape. The generation unit 15 can use this shape recognition to obtain the outline of the subject 201 in the width and height directions, and generates a rectangular front frame 220b matching that outline. The generation unit 15 also sets the back frame 220c so that its corners coincide with the points at which the second recognition frame 220 intersects the connection lines 250, which connect the four corners of the front frame 220b to the intersection PS of the first and second extension lines 230, 240 and the reference line LS.

Taking as a reference the intersection serving as the predetermined position (the intersection PS of the first and second extension lines 230, 240 and the reference line LS), when the subject 201 is located on the left side with respect to the vertical direction of the image, the generation unit 15 generates the rectangular front frame 220b so as to touch the lower-left vertex of the second recognition frame 220, and generates the rectangular back frame 220c so as to touch the upper-right vertex of the second recognition frame 220. That is, the generation unit 15 makes the lower-left corner of the second recognition frame 220 coincide with the lower-left corner of the front frame 220b, so that the two sides meeting at the lower-left corner of the second recognition frame 220 overlap the two sides meeting at the lower-left corner of the front frame 220b. Likewise, the generation unit 15 makes the upper-right corner of the second recognition frame 220 coincide with the upper-right corner of the back frame 220c, so that the two sides meeting at the upper-right corner of the second recognition frame 220 overlap the two sides meeting at the upper-right corner of the back frame 220c.

Taking the predetermined position as a reference, when the subject 201 is located on the right side with respect to the vertical direction of the image, the generation unit 15 generates the rectangular front frame 220b so as to touch the lower-right vertex of the second recognition frame 220, and generates the rectangular back frame 220c so as to touch the upper-left vertex of the second recognition frame 220. That is, the generation unit 15 makes the lower-right corner of the second recognition frame 220 coincide with the lower-right corner of the front frame 220b, so that the two sides meeting at the lower-right corner of the second recognition frame 220 overlap the two sides meeting at the lower-right corner of the front frame 220b. Likewise, the generation unit 15 makes the upper-left corner of the second recognition frame 220 coincide with the upper-left corner of the back frame 220c, so that the two sides meeting at the upper-left corner of the second recognition frame 220 overlap the two sides meeting at the upper-left corner of the back frame 220c.
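The corner-anchoring rule for the two cases can be sketched as follows. This is only an illustration under assumptions: image coordinates grow downward, the frame sizes are passed in as inputs (the description derives the front frame from the recognized outline and the back frame from the connection lines 250), and the `Box` type and function name are hypothetical.

```python
from typing import NamedTuple

class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float  # image Y grows downward, so bottom > top

def place_front_and_back(second: Box, front_w: float, front_h: float,
                         back_w: float, back_h: float,
                         subject_left_of_ps: bool):
    """Anchor the front and back frames at opposite corners of the second frame."""
    if subject_left_of_ps:
        # Front frame shares the lower-left corner, back frame the upper-right.
        front = Box(second.left, second.bottom - front_h,
                    second.left + front_w, second.bottom)
        back = Box(second.right - back_w, second.top,
                   second.right, second.top + back_h)
    else:
        # Front frame shares the lower-right corner, back frame the upper-left.
        front = Box(second.right - front_w, second.bottom - front_h,
                    second.right, second.bottom)
        back = Box(second.left, second.top,
                   second.left + back_w, second.top + back_h)
    return front, back

# Hypothetical second recognition frame 220 and frame sizes in pixels.
second_frame = Box(100, 200, 300, 400)
front, back = place_front_and_back(second_frame, 120, 100, 80, 60, True)
```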

The size acquisition unit 16 obtains the size of the subject 201 based on the distance from the fixed-point imaging unit 21 to the subject 201, the size of the image based on the image data, and the size of at least one of the front frame 220b and the back frame 220c generated by the generation unit 15. Because the apparent width and height of the subject 201 vary with the position of the intersection PS of the first and second extension lines 230, 240 and the reference line LS, the size acquisition unit 16 obtains the size of the subject 201 by performing appropriate calculations. Any calculation method may be used; for example, the size acquisition unit 16 may obtain the size of the subject 201 as follows.

When determining the width-direction size of the subject 201 on the image, the size acquisition unit 16 acquires a first value based on the on-image coordinate (X coordinate) of the intersection point PS between the first and second extension lines 230, 240 and the reference line LS, the on-image coordinate (X coordinate) of the center position 220a of the second recognition frame 220, and the image size (for example, the number of pixels in the horizontal direction of the image). The size acquisition unit 16 acquires a second value by adding a predetermined first coefficient to the first value and then adding a second coefficient to the result. The size acquisition unit 16 multiplies the second value by the width-direction size of the second recognition frame 220 on the image, thereby acquiring the width-direction size of the subject 201 on the image (the width-direction size of the front frame 220b).
The size acquisition unit 16 can acquire the height of the subject 201 on the image in the same manner as the width-direction size described above.
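
As a minimal sketch of the width calculation described above, the following Python function combines the three inputs into a first value, adds the two coefficients, and multiplies by the width of the second recognition frame 220. The description lists only the inputs of the first value, so the normalization used here (horizontal offset between PS and the frame center divided by the image width) is an assumption, and the function name, parameter names, and coefficient values are hypothetical.

```python
def on_image_width(x_ps, x_center, image_width_px, frame_width_px, c1, c2):
    """Estimate the on-image width of the front frame, in pixels.

    ASSUMPTION: the "first value" is the horizontal offset between the
    intersection point PS and the center of the second recognition frame,
    normalized by the image width. The source only names the inputs.
    """
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    first_value = abs(x_ps - x_center) / image_width_px
    # The description adds a first coefficient and then a second coefficient.
    second_value = first_value + c1 + c2
    # Scale the width of the second recognition frame by the second value.
    return second_value * frame_width_px
```

The height could be handled the same way with Y coordinates and the vertical pixel count, again under the same assumption.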

Here, the size acquisition unit 16 acquires the distance from the fixed-point imaging unit 21 to the subject 201 based on the result of associating, in advance, positions in the image based on the image data generated by the fixed-point imaging unit 21 with the distances from the fixed-point imaging unit 21 at those positions (the result obtained by the setting unit 12). The size acquisition unit 16 acquires the width-direction and height-direction sizes of the subject 201 based on the acquired distance and the on-image size of the subject 201 described above. The size acquisition unit 16 also acquires the depth-direction size of the subject 201 based on the result obtained by the setting unit 12.
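
The conversion from an on-image size to a real size can be sketched as follows, assuming the result of the setting unit 12 is available as a table of (Y coordinate, pixels per meter) pairs sorted by Y coordinate. The table layout, the linear interpolation between entries, and the function names are assumptions made for illustration; the description does not fix the form of the stored result.

```python
from bisect import bisect_left


def pixels_per_meter(y, table):
    """Interpolate pixels-per-meter at image row y.

    `table` is a list of (row, pixels_per_meter) pairs sorted by row
    (hypothetical layout for the pre-associated calibration result).
    """
    rows = [row for row, _ in table]
    if y <= rows[0]:
        return table[0][1]
    if y >= rows[-1]:
        return table[-1][1]
    i = bisect_left(rows, y)
    (y0, p0), (y1, p1) = table[i - 1], table[i]
    return p0 + (p1 - p0) * (y - y0) / (y1 - y0)


def real_size(width_px, height_px, y, table):
    """Convert an on-image width and height (pixels) to meters at row y."""
    ppm = pixels_per_meter(y, table)
    return width_px / ppm, height_px / ppm
```

For example, with the table `[(100, 20.0), (300, 60.0)]`, a frame 80 pixels wide and 60 pixels high at row 200 would correspond to 2.0 m by 1.5 m.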

The center-of-gravity acquisition unit 17 acquires the center of gravity position PG of the subject 201 (see FIG. 11) based on the front frame 220b and the back frame 220c generated by the generation unit 15. As one example, the center-of-gravity acquisition unit 17 acquires the center of gravity position PG (coordinates on the image) of the subject 201 using the two ground-side corners among the four corners of the front frame 220b and the two ground-side corners among the four corners of the back frame 220c (for example, the on-image coordinates of those corners). In this case, the center-of-gravity acquisition unit 17 may, as one example, acquire the distance from the fixed-point imaging unit 21 to the center of gravity position PG (for example, the distances in the horizontal direction (X direction) and the vertical direction (Y direction) on the image) based on the result of associating, in advance, positions in the image based on the image data generated by the fixed-point imaging unit 21 with the distances from the fixed-point imaging unit 21 at those positions (the result obtained by the setting unit 12, that is, the image position information). The distance in the X direction may be the distance from the center line in the horizontal direction on the image.
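
One simple way to obtain an on-image center of gravity from the four ground-side corners is to take their arithmetic mean, as sketched below. The description states only that these corners are used, so the averaging rule and the function name are assumptions.

```python
def ground_centroid(front_bottom, back_bottom):
    """Return the mean (x, y) of the ground-side corners of both frames.

    `front_bottom` and `back_bottom` each hold the two ground-side corners
    of the front frame and the back frame as (x, y) image coordinates.
    ASSUMPTION: the center of gravity is the plain average of the corners.
    """
    points = list(front_bottom) + list(back_bottom)
    if len(points) != 4:
        raise ValueError("expected two ground-side corners per frame")
    cx = sum(x for x, _ in points) / 4
    cy = sum(y for _, y in points) / 4
    return cx, cy
```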

The position acquisition unit 18 acquires the position of the subject 201 by converting the center of gravity position PG acquired by the center-of-gravity acquisition unit 17 into a predetermined coordinate system. For example, the position acquisition unit 18 converts the center of gravity position PG (the center of gravity position on the image) into a global coordinate system. The position acquisition unit 18 acquires the position of the subject 201 in the global coordinate system by applying trigonometric functions to, for example, the imaging direction of the fixed-point imaging unit 21 (its azimuth and its angle toward the ground surface, that is, the angle between the vertical and the optical axis of the fixed-point imaging unit 21) and the position of the fixed-point imaging unit 21 in the global coordinate system.
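
The trigonometric conversion can be sketched as below under explicit assumptions: the global coordinate system is treated as a flat east/north plane in meters, the azimuth is measured clockwise from north, and the subject's offset from the camera has already been obtained in meters along and across the imaging direction. None of these conventions, nor the function names, are specified in the description.

```python
import math


def ground_distance(camera_height_m, tilt_deg):
    """Horizontal distance to where the optical axis meets the ground.

    `tilt_deg` is the angle between the vertical and the optical axis.
    """
    return camera_height_m * math.tan(math.radians(tilt_deg))


def to_global(camera_east, camera_north, azimuth_deg, forward_m, right_m):
    """Rotate a camera-relative ground offset by the azimuth and translate it.

    ASSUMPTION: azimuth is clockwise from north; output is (east, north).
    """
    a = math.radians(azimuth_deg)
    east = camera_east + forward_m * math.sin(a) + right_m * math.cos(a)
    north = camera_north + forward_m * math.cos(a) - right_m * math.sin(a)
    return east, north
```

With a camera at (100, 200) facing north, a subject 10 m ahead and 2 m to the right would be placed at about (102, 210).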

The speed acquisition unit 19 acquires the speed of the subject 201 based on changes in the plurality of center of gravity positions PG (PG1, PG2) of the subject 201 acquired at respective times by the center-of-gravity acquisition unit 17. As illustrated in FIG. 12, the speed acquisition unit 19 can acquire the speed of the subject 201 by calculating, from at least two center of gravity positions PG1, PG2 at different times, how far the center of gravity position PG has moved during a predetermined time dt. In this case, the speed acquisition unit 19 may acquire the speed Vx in the X direction and the speed Vy in the Y direction based on the displacement dx in the X direction and the displacement dy in the Y direction during the predetermined time dt. The speed acquisition unit 19 may also acquire the average speed of the subject 201 based on three or more center of gravity positions PG of the subject 201 at different times.
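
The speed calculation amounts to Vx = dx / dt and Vy = dy / dt. A minimal sketch follows; the function names are hypothetical, and the average over three or more positions assumes samples taken at a constant interval dt, which the description does not state.

```python
def velocity(pg1, pg2, dt):
    """Return (Vx, Vy) from two center of gravity positions taken dt apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    dx = pg2[0] - pg1[0]
    dy = pg2[1] - pg1[1]
    return dx / dt, dy / dt


def average_velocity(positions, dt):
    """Average (Vx, Vy) over positions sampled every dt (assumed constant)."""
    if len(positions) < 2:
        raise ValueError("at least two positions are required")
    total_time = (len(positions) - 1) * dt
    # The mean of the per-interval velocities equals total displacement
    # divided by total elapsed time.
    return velocity(positions[0], positions[-1], total_time)
```

If the positions are expressed in meters (for example after conversion to the global coordinate system), the result is in meters per unit of dt.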

The output control unit 20 controls the output unit. Under the control of the output control unit 20, the output unit outputs at least one of the size of the subject 201 acquired by the size acquisition unit 16, the position of the subject 201 acquired by the position acquisition unit 18, and the speed of the subject 201 acquired by the speed acquisition unit 19. The output unit may be, for example, the communication unit 23 and the display unit 24 in addition to the storage unit 22 described above. The output control unit 20 stores, in the storage unit 22, subject information on the size of the subject 201 acquired by the size acquisition unit 16, the center of gravity position PG of the subject 201 acquired by the center-of-gravity acquisition unit 17, the position of the subject 201 acquired by the position acquisition unit 18, and the speed of the subject 201 acquired by the speed acquisition unit 19. The output control unit 20 may also, for example, control the communication unit 23 to transmit the subject information to an external device (for example, a server). The output control unit 20 may, for example, display the subject information on the display unit 24.

Next, an information processing method according to an embodiment will be described.
First, a method of acquiring image position information will be described.
FIG. 13 is a flowchart for explaining a method of acquiring image position information in the information processing method according to an embodiment.

In step ST11, the fixed-point imaging unit 21 images the subject.

In step ST12, the recognition unit 13 extracts features of the subject imaged in step ST11 based on previously learned results, and recognizes the subject based on the extracted features.

In step ST13, the setting unit 12 acquires image position information based on the height of the subject imaged in step ST11, that is, the subject recognized in step ST12, and the angle of view of the fixed-point imaging unit 21.
Specifically, the setting unit 12 performs the following processing using the first to ninth setting units 121 to 129.
The first setting unit 121 acquires in advance the height of the subject, the angle of view of the fixed-point imaging unit 21, and the width of the image captured by the fixed-point imaging unit 21.
When the subject moves, the second setting unit 122 acquires the number of pixels per meter based on the height of the subject acquired by the first setting unit 121.
The third setting unit 123 acquires the relationship between the Y coordinate, which is the coordinate in the depth direction of the image, and the number of pixels per meter of the subject acquired by the second setting unit 122.
The fourth setting unit 124 acquires the actual width w of the image at a predetermined position based on the image width acquired by the first setting unit 121 and the number of pixels per meter acquired by the second setting unit 122.
The fifth setting unit 125 acquires a first distance L1, which is the distance from the fixed-point imaging unit 21 to the center position Pc in the image, and a second distance L2, which is the distance from the ground to the center position Pc, based on the actual width w of the image acquired by the fourth setting unit 124 and the angle of view θ2 acquired by the first setting unit 121.
The sixth setting unit 126 obtains the angle θ formed between the ground and the center line of the fixed-point imaging unit 21 in the imaging direction (the line along the Y-axis direction), based on the first distance L1 and the second distance L2 acquired by the fifth setting unit 125.
The seventh setting unit 127 acquires the height H from the ground to the fixed-point imaging unit 21 based on the first distance L1 and the second distance L2 acquired by the fifth setting unit 125 and the angle θ acquired by the sixth setting unit 126.
The eighth setting unit 128 acquires the position of the subject relative to the imaging position at which the fixed-point imaging unit 21 is placed, based on the height H acquired by the seventh setting unit 127 and the angle θ acquired by the sixth setting unit 126, and thereby acquires the image position information.
Here, the ninth setting unit 129 may acquire a width ratio with respect to the Y coordinate based on, among the image widths acquired by the fourth setting unit 124, the width at a reference position in the Y-coordinate direction and the width at a predetermined position in the Y-coordinate direction.
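
Part of the calibration chain above can be sketched as follows. The actual width w follows directly from the image width in pixels and the pixels-per-meter value, and the width ratio is a simple quotient. The relation used for the first distance L1 is an assumption: it treats θ2 as the full horizontal angle of view of an ideal pinhole camera, which the description does not state. The second distance L2, the angle θ, and the height H are omitted because their geometry is not given in this passage. All function names are hypothetical.

```python
import math


def actual_width_m(image_width_px, pixels_per_meter):
    """Actual width w (meters) covered by the image at a given position."""
    return image_width_px / pixels_per_meter


def distance_to_center(width_m, view_angle_deg):
    """First distance L1 from the camera to the center position Pc.

    ASSUMPTION: pinhole model in which width_m spans the full horizontal
    angle of view, so L1 = (w / 2) / tan(theta2 / 2).
    """
    return (width_m / 2) / math.tan(math.radians(view_angle_deg) / 2)


def width_ratio(width_at_position_m, width_at_reference_m):
    """Ratio of the width at a given Y position to the width at the reference."""
    return width_at_position_m / width_at_reference_m
```

For a 1920-pixel-wide image at 96 pixels per meter, w would be 20 m, and with an assumed 90 degree angle of view L1 would be about 10 m.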

In step ST14, the setting unit 12 stores the image position information acquired in step ST13 in the storage unit 22.

Next, a method of acquiring information regarding the subject 201 will be described.
FIG. 14 is a flowchart for explaining a method of acquiring information regarding the subject 201 in the information processing method according to an embodiment.

In step ST21, the fixed-point imaging unit 21 images the subject 201 continuously over time and generates image data. The fixed-point imaging unit 21 captures a moving image, or captures still images continuously at predetermined time intervals, to generate moving image data. The subject 201 is, for example, an object that moves in a constant (substantially constant) manner. One example of such an object is an object that moves linearly (substantially linearly). The object may also be stationary. Specific examples of the object include vehicles, people, and animals.

In step ST22, the recognition unit 13 recognizes the subject 201 in at least two temporally different images (frames or still images) based on the image data generated in step ST21, and generates a plurality of first recognition frames 210 each surrounding the subject 201 at the respective time. In this case, the recognition unit 13 may generate, as the first recognition frame 210, a rectangular frame that surrounds the subject 201 on the image (for example, a frame that touches the outline of the subject 201).

In step ST23, the prediction unit 14 predicts, based on the plurality of first recognition frames 210 generated in step ST22, a second recognition frame 220 surrounding the subject 201 at a predetermined time later than the recording time of the images used to generate the first recognition frames 210. Specifically, the prediction unit 14 predicts the second recognition frame 220 and its center position 220a based on the plurality of first recognition frames 210, the center position 210a of each of the plurality of first recognition frames 210, and the reference line LS of the image based on the image data generated in step ST21. In this case, the prediction unit 14 may generate a rectangular frame surrounding the subject 201 on the image as the second recognition frame 220.
The predetermined time mentioned above may be the current time. That is, the second recognition frame 220 may represent the predicted position of the subject 201 at the current time. This makes it possible to recognize the subject 201 (to predict the position of the subject 201) even when the subject 201 is temporarily hidden, in whole or in part, by another object.
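
A minimal sketch of such a prediction is linear extrapolation of the frame center and size from two first recognition frames 210 to the target time. This constant-velocity model in image space is an assumption chosen for illustration; the description specifies the inputs of the prediction (the first recognition frames, their center positions, and the reference line LS) but not the extrapolation rule, and the reference line LS is not used here. The function name and the (cx, cy, w, h) frame layout are hypothetical.

```python
def predict_frame(frame_a, t_a, frame_b, t_b, t_target):
    """Extrapolate a recognition frame to t_target.

    Frames are (cx, cy, w, h): center coordinates, width, and height in
    pixels. ASSUMPTION: each component changes linearly with time.
    """
    if t_b == t_a:
        raise ValueError("the two frames must have different timestamps")
    s = (t_target - t_a) / (t_b - t_a)
    return tuple(a + (b - a) * s for a, b in zip(frame_a, frame_b))
```

Because the extrapolated center lies on the line through the two observed centers, the predicted frame follows the observed movement direction of the subject.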

In step ST24, the generation unit 15 generates a front frame 220b corresponding to the outline of the front of the subject 201 and a back frame 220c corresponding to the outline of the back of the subject 201 such that each partially touches the second recognition frame 220 generated in step ST23.
When the subject 201 is located on the left side, relative to the vertical direction of the image, of the intersection point PS between the first and second extension lines 230, 240 and the reference line LS, the generation unit 15 generates a rectangular front frame 220b that touches the lower-left vertex of the second recognition frame 220 on the image, and generates a rectangular back frame 220c that touches the upper-right vertex of the second recognition frame 220 on the image.
Alternatively, when the subject 201 is located on the right side, relative to the vertical direction of the image, of the intersection point PS between the first and second extension lines 230, 240 and the reference line LS, the generation unit 15 generates a rectangular front frame 220b that touches the lower-right vertex of the second recognition frame 220 on the image, and generates a rectangular back frame 220c that touches the upper-left vertex of the second recognition frame 220 on the image.
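
The corner rule in step ST24 can be sketched as follows. Frames are represented as (left, top, right, bottom) in image coordinates with Y increasing downward, and "touches the vertex" is interpreted as sharing that corner of the second recognition frame 220; both the representation and that interpretation are assumptions, as are the function name and the way the front and back frame sizes are passed in.

```python
def front_and_back_frames(frame, front_size, back_size, subject_is_left):
    """Place the front and back frames at opposite corners of `frame`.

    `frame` is (left, top, right, bottom); sizes are (width, height).
    `subject_is_left` is True when the subject lies to the left of PS.
    """
    left, top, right, bottom = frame
    fw, fh = front_size
    bw, bh = back_size
    if subject_is_left:
        # Front frame shares the lower-left corner, back frame the upper-right.
        front = (left, bottom - fh, left + fw, bottom)
        back = (right - bw, top, right, top + bh)
    else:
        # Front frame shares the lower-right corner, back frame the upper-left.
        front = (right - fw, bottom - fh, right, bottom)
        back = (left, top, left + bw, top + bh)
    return front, back
```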

In step ST25, the size acquisition unit 16 acquires the size of the subject 201 based on the distance from the fixed-point imaging unit 21 to the subject 201, the size of the image based on the image data (for example, the number of pixels), and the size of at least one of the front frame 220b and the back frame 220c generated by the generation unit 15.

In step ST26, the center-of-gravity acquisition unit 17 acquires the center of gravity position PG of the subject 201 based on the front frame 220b and the back frame 220c generated in step ST24.
In this case, the center-of-gravity acquisition unit 17 may, as one example, acquire the distance from the fixed-point imaging unit 21 to the center of gravity position PG (for example, the distances in the horizontal direction (X direction) and the vertical direction (Y direction) on the image) based on the result of associating, in advance, positions in the image based on the image data generated by the fixed-point imaging unit 21 with the distances from the fixed-point imaging unit 21 at those positions (the result obtained by the setting unit 12, that is, the image position information). The distance in the X direction may be the distance from the center line in the horizontal direction on the image.

In step ST27, the position acquisition unit 18 acquires the position of the subject 201 by converting the center of gravity position PG acquired in step ST26 into a predetermined coordinate system. The predetermined coordinate system is, for example, a global coordinate system.

In step ST28, the speed acquisition unit 19 acquires the speed of the subject 201 based on changes in the plurality of center of gravity positions PG of the subject 201 acquired in step ST26 or step ST27.

The embodiment described above is an example in which information regarding the subject 201 is acquired based on the first recognition frame 210 and the second recognition frame 220. However, in the present invention, when the subject 201 can be recognized at the current time, the first recognition frame 210 at the current time may be used as the second recognition frame 220 to acquire information regarding the subject 201.

Next, the effects of this embodiment will be described.
The information processing device 1 includes: a fixed-point imaging unit 21 that images the subject 201 continuously over time to generate image data; a recognition unit 13 that recognizes the subject 201 in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging unit 21, and generates a plurality of first recognition frames 210 each surrounding the subject 201 at the respective time; a prediction unit 14 that, based on the plurality of first recognition frames 210, the center positions 210a of the plurality of first recognition frames 210, and a reference line LS of the image based on the image data captured by the fixed-point imaging unit 21, predicts a second recognition frame 220 surrounding the subject 201 at a predetermined time later than the recording time of the images from which the recognition unit 13 generated the first recognition frames 210, together with the center position 220a of the second recognition frame 220; a generation unit 15 that generates a front frame 220b corresponding to the outline of the front of the subject 201 and a back frame 220c corresponding to the outline of the back of the subject 201 such that each partially touches the second recognition frame 220 generated by the prediction unit 14; and a size acquisition unit 16 that acquires the size of the subject 201 based on the distance from the fixed-point imaging unit 21 to the subject 201, the size of the image based on the image data, and the size of at least one of the front frame 220b and the back frame 220c generated by the generation unit 15.
The information processing device 1 can thereby acquire information regarding the subject 201 recorded in the image.

The information processing device 1 may include a center-of-gravity acquisition unit 17 that acquires the center of gravity position PG of the subject 201 based on the front frame 220b and the back frame 220c generated by the generation unit 15, and a position acquisition unit 18 that acquires the position of the subject 201 by converting the center of gravity position PG acquired by the center-of-gravity acquisition unit 17 into a predetermined coordinate system.
The predetermined coordinate system may be, for example, a global coordinate system. The information processing device 1 can thereby acquire the position (coordinates) of the subject 201.

The information processing device 1 may include a speed acquisition unit 19 that acquires the speed of the subject 201 based on changes in the plurality of center of gravity positions PG of the subject 201 acquired at respective times by the center-of-gravity acquisition unit 17.
The information processing device 1 can thereby acquire the speed of the subject 201, when the subject 201 moves, based on its center of gravity position PG.

The information processing device 1 may include an output unit that outputs at least one of the size of the subject 201 acquired by the size acquisition unit 16, the position of the subject 201 acquired by the position acquisition unit 18, and the speed of the subject 201 acquired by the speed acquisition unit 19.
The information processing device 1 can thereby transmit the acquired information regarding the subject 201 to the outside through the communication unit 23, display it on the display unit 24, and store it in the storage unit 22.

In the information processing device 1, the recognition unit 13 may generate a rectangular frame surrounding the subject 201 on the image as the first recognition frame 210.
The information processing device 1 can thereby recognize the subject 201 based on, for example, a learning model or pattern matching, and can generate the first recognition frame 210 (ROI) based on the recognized subject 201.

In the information processing device 1, the prediction unit 14 may generate a rectangular frame surrounding the subject 201 on the image as the second recognition frame 220.
The information processing device 1 can thereby generate the second recognition frame 220 (ROI) of the subject 201 based on the plurality of first recognition frames 210.

In the information processing device 1, the prediction unit 14 may set, as the reference line LS, a line extending in the horizontal direction of the image based on the image data, set a predetermined position on the reference line LS as an end point, and generate the second recognition frame 220 with its center position 220a on the center line that runs from that end point through the center positions 210a of the plurality of first recognition frames 210.
The information processing device 1 can thereby generate, based on the plurality of first recognition frames 210, a second recognition frame 220 that reflects the movement direction of the subject 201.

In the information processing device 1, when the subject 201 is located on the left side of the predetermined position, relative to the vertical direction of the image based on the image data, the generation unit 15 may generate a rectangular front frame 220b that touches the lower-left vertex of the second recognition frame 220 on the image and a rectangular back frame 220c that touches the upper-right vertex of the second recognition frame 220 on the image.
The information processing device 1 can thereby generate the front frame 220b and the back frame 220c when the movement direction of the subject 201 has a leftward component relative to the image.

In the information processing device 1, when the subject 201 is located on the right side of the predetermined position, relative to the vertical direction of the image based on the image data, the generation unit 15 may generate a rectangular front frame 220b that touches the lower-right vertex of the second recognition frame 220 on the image and a rectangular back frame 220c that touches the upper-left vertex of the second recognition frame 220 on the image.
The information processing device 1 can thereby generate the front frame 220b and the back frame 220c when the movement direction of the subject 201 has a rightward component relative to the image.

In the information processing device 1, the size acquisition unit 16 may acquire the distance from the fixed-point imaging unit 21 to the subject 201 based on the result of associating, in advance, positions in the image based on the image data generated by the fixed-point imaging unit 21 with the distances from the fixed-point imaging unit 21 at those positions.
The information processing device 1 can thereby use that distance as the basis for acquiring information regarding the subject 201.

In the information processing method, a computer executes: a fixed-point imaging step of imaging the subject 201 continuously over time to generate image data; a recognition step of recognizing the subject 201 in each of at least two temporally different images recorded in the image data captured in the fixed-point imaging step, and generating a plurality of first recognition frames 210 each surrounding the subject 201 at the respective time; a prediction step of predicting, based on the plurality of first recognition frames 210, the center positions 210a of the plurality of first recognition frames 210, and a reference line LS of the image based on the image data captured in the fixed-point imaging step, a second recognition frame 220 surrounding the subject 201 at a predetermined time later than the recording time of the images from which the first recognition frames 210 were generated in the recognition step, together with the center position 220a of the second recognition frame 220; a generation step of generating a front frame 220b corresponding to the outline of the front of the subject 201 and a back frame 220c corresponding to the outline of the back of the subject 201 such that each partially touches the second recognition frame 220 generated in the prediction step; and a size acquisition step of acquiring the size of the subject 201 based on the distance from the imaging position of the fixed-point imaging step to the subject 201, the size of the image based on the image data, and the size of at least one of the front frame 220b and the back frame 220c generated in the generation step.
The information processing method can thereby acquire information regarding the subject 201 recorded in the image.

The information processing program causes a computer to realize: a fixed-point imaging function that images the subject 201 continuously over time and generates image data; a recognition function that recognizes the subject 201 in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging function, and generates a plurality of first recognition frames 210 surrounding the subject 201 at the respective times; a prediction function that, based on the plurality of first recognition frames 210, the center positions 210a of those first recognition frames 210, and the reference line LS of the image based on the image data captured by the fixed-point imaging function, predicts a second recognition frame 220 surrounding the subject 201, and the center position 220a of that second recognition frame 220, at a predetermined time later than the recording time of the images from which the recognition function generated the first recognition frames 210; a generation function that generates a front frame 220b corresponding to the outline of the front of the subject 201 and a back frame 220c corresponding to the outline of the back of the subject 201, each partly in contact with the second recognition frame 220 generated by the prediction function; and a size acquisition function that acquires the size of the subject 201 based on the distance from the fixed-point imaging function to the subject 201, the size of the image based on the image data, and the size of at least one of the front frame 220b and the back frame 220c generated by the generation function.
Thereby, the information processing program can acquire information regarding the subject 201 recorded in the image.
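As an illustration only, the prediction and size acquisition functions summarized above might be sketched as follows. This is not the method of the embodiment: the sketch substitutes a plain linear extrapolation of the frame centers for the prediction that uses the reference line LS, and it assumes a pinhole camera whose horizontal field of view is known. The names `Box`, `predict_box`, `real_width_m`, and the parameter `hfov_deg` are hypothetical and do not appear in this specification.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned recognition frame in pixel coordinates (hypothetical type)."""
    cx: float  # center x
    cy: float  # center y
    w: float   # width
    h: float   # height


def predict_box(b0: Box, b1: Box, steps_ahead: float = 1.0) -> Box:
    """Extrapolate a later frame from two frames observed one time step apart.

    Assumption: center and size change linearly over time. The specification
    instead constrains the predicted center using the reference line LS.
    """
    k = steps_ahead
    return Box(
        cx=b1.cx + k * (b1.cx - b0.cx),
        cy=b1.cy + k * (b1.cy - b0.cy),
        w=max(0.0, b1.w + k * (b1.w - b0.w)),
        h=max(0.0, b1.h + k * (b1.h - b0.h)),
    )


def real_width_m(frame_w_px: float, image_w_px: float,
                 distance_m: float, hfov_deg: float) -> float:
    """Estimate the real width covered by a frame at a known distance.

    Assumption: pinhole camera. The scene width visible at distance d is
    2 * d * tan(hfov / 2); the frame covers frame_w_px / image_w_px of it.
    """
    scene_w_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return scene_w_m * frame_w_px / image_w_px


if __name__ == "__main__":
    first = Box(cx=100.0, cy=200.0, w=40.0, h=30.0)
    second = Box(cx=110.0, cy=190.0, w=44.0, h=33.0)
    print(predict_box(first, second))
    print(real_width_m(192.0, 1920.0, 10.0, 90.0))
```

In this sketch the frame width passed to `real_width_m` stands in for the width of the front frame 220b or the back frame 220c, and the distance would come from whatever association between image position and distance has been prepared in advance.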

Each unit of the information processing device 1 described above may be realized as a function of an arithmetic processing device or the like of a computer. That is, the setting unit 12, the recognition unit 13, the prediction unit 14, the generation unit 15, the size acquisition unit 16, the center of gravity acquisition unit 17, the position acquisition unit 18, the speed acquisition unit 19, and the output control unit 20 of the information processing device 1 may be realized, respectively, as a setting function, a recognition function, a prediction function, a generation function, a size acquisition function, a center of gravity acquisition function, a position acquisition function, a speed acquisition function, and an output control function performed by the arithmetic processing device or the like of a computer.
The information processing program can cause a computer to realize each of the functions described above. The information processing program may be recorded on a computer-readable non-transitory recording medium such as an external memory or an optical disc.
Further, as described above, each unit of the information processing device 1 may be realized by an arithmetic processing device or the like of a computer. The arithmetic processing device or the like is constituted by, for example, an integrated circuit. Therefore, each unit of the information processing device 1 may be realized as a circuit that constitutes the arithmetic processing device or the like. That is, the setting unit 12, the recognition unit 13, the prediction unit 14, the generation unit 15, the size acquisition unit 16, the center of gravity acquisition unit 17, the position acquisition unit 18, the speed acquisition unit 19, and the output control unit 20 of the information processing device 1 may be realized as a setting circuit, a recognition circuit, a prediction circuit, a generation circuit, a size acquisition circuit, a center of gravity acquisition circuit, a position acquisition circuit, a speed acquisition circuit, and an output control circuit that constitute the arithmetic processing device or the like of a computer.
Further, the fixed-point imaging unit 21 and the output unit (the storage unit 22, the communication unit 23, and the display unit 24) of the information processing device 1 may be realized, for example, as a fixed-point imaging function and an output function (a storage function, a communication function, and a display function) that include functions of an arithmetic processing device or the like. They may also be realized as a fixed-point imaging circuit and an output circuit (a storage circuit, a communication circuit, and a display circuit) by being constituted by, for example, integrated circuits or the like. They may also be configured as a fixed-point imaging device and an output device (a storage device, a communication device, and a display device) by being constituted by, for example, a plurality of devices.

1 Information processing device
11 Control unit
12 Setting unit
13 Recognition unit
14 Prediction unit
15 Generation unit
16 Size acquisition unit
17 Center of gravity acquisition unit
18 Position acquisition unit
19 Speed acquisition unit
20 Output control unit
21 Fixed-point imaging unit
22 Storage unit
23 Communication unit
24 Display unit

Claims

An information processing device comprising:
a fixed-point imaging unit that images a subject continuously over time and generates image data;
a recognition unit that recognizes the subject in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging unit, and generates a plurality of first recognition frames surrounding the subject at the respective times;
a prediction unit that, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured by the fixed-point imaging unit, predicts a second recognition frame surrounding the subject, and a center position of the second recognition frame, at a predetermined time later than the recording time of the images from which the recognition unit generated the first recognition frames;
a generation unit that generates a front frame corresponding to the outline of the front of the subject and a back frame corresponding to the outline of the back of the subject, each partly in contact with the second recognition frame generated by the prediction unit; and
a size acquisition unit that acquires the size of the subject based on the distance from the fixed-point imaging unit to the subject, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated by the generation unit.

The information processing device according to claim 1, comprising:
a center of gravity acquisition unit that acquires the position of the center of gravity of the subject based on the front frame and the back frame generated by the generation unit; and
a position acquisition unit that acquires the position of the subject by converting the center of gravity position acquired by the center of gravity acquisition unit into a predetermined coordinate system.

The information processing device according to claim 2, comprising a speed acquisition unit that acquires the speed of the subject based on changes over time in a plurality of center of gravity positions of the subject acquired by the center of gravity acquisition unit.

The information processing device according to claim 3, comprising an output unit that outputs at least one of the size of the subject acquired by the size acquisition unit, the position of the subject acquired by the position acquisition unit, and the speed of the subject acquired by the speed acquisition unit.

The information processing device according to claim 1, wherein the recognition unit generates a rectangular frame surrounding the subject on the image as the first recognition frame.

The information processing device according to claim 1, wherein the prediction unit generates a rectangular frame surrounding the subject on the image as the second recognition frame.

The information processing device according to any one of claims 1 to 6, wherein the prediction unit
sets, as the reference line, a line extending in the horizontal direction of the image based on the image data, and
sets a predetermined position on the reference line as an end point and generates the second recognition frame such that its center position lies on a center line passing from the end point through the center position of each of the plurality of first recognition frames.

The information processing device according to claim 7, wherein the generation unit,
when the subject is located on the left side in the vertical direction of the image based on the image data, with the predetermined position as a reference, generates a rectangular front frame in contact with the lower left vertex of the second recognition frame with respect to the image, and generates a rectangular back frame in contact with the upper right vertex of the second recognition frame with respect to the image.

The information processing device according to claim 7, wherein the generation unit,
when the subject is located on the right side in the vertical direction of the image based on the image data, with the predetermined position as a reference, generates a rectangular front frame in contact with the lower right vertex of the second recognition frame with respect to the image, and generates a rectangular back frame in contact with the upper left vertex of the second recognition frame with respect to the image.

The information processing device according to any one of claims 1 to 9, wherein the size acquisition unit acquires the distance from the fixed-point imaging unit to the subject based on a result of associating in advance positions in the image based on the image data generated by the fixed-point imaging unit with the distances from the fixed-point imaging unit at those positions.

An information processing method in which a computer executes:
a fixed-point imaging step of imaging a subject continuously over time and generating image data;
a recognition step of recognizing the subject in each of at least two temporally different images recorded in the image data captured in the fixed-point imaging step, and generating a plurality of first recognition frames surrounding the subject at the respective times;
a prediction step of predicting, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured in the fixed-point imaging step, a second recognition frame surrounding the subject, and a center position of the second recognition frame, at a predetermined time later than the recording time of the images from which the first recognition frames were generated in the recognition step;
a generation step of generating a front frame corresponding to the outline of the front of the subject and a back frame corresponding to the outline of the back of the subject, each partly in contact with the second recognition frame generated in the prediction step; and
a size acquisition step of acquiring the size of the subject based on the distance from the fixed-point imaging step to the subject, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated in the generation step.

An information processing program that causes a computer to realize:
a fixed-point imaging function that images a subject continuously over time and generates image data;
a recognition function that recognizes the subject in each of at least two temporally different images recorded in the image data captured by the fixed-point imaging function, and generates a plurality of first recognition frames surrounding the subject at the respective times;
a prediction function that, based on the plurality of first recognition frames, the center positions of the plurality of first recognition frames, and a reference line of the image based on the image data captured by the fixed-point imaging function, predicts a second recognition frame surrounding the subject, and a center position of the second recognition frame, at a predetermined time later than the recording time of the images from which the recognition function generated the first recognition frames;
a generation function that generates a front frame corresponding to the outline of the front of the subject and a back frame corresponding to the outline of the back of the subject, each partly in contact with the second recognition frame generated by the prediction function; and
a size acquisition function that acquires the size of the subject based on the distance from the fixed-point imaging function to the subject, the size of the image based on the image data, and the size of at least one of the front frame and the back frame generated by the generation function.