JP7507524B1 - Attention behavior calling system

Info

Publication number: JP7507524B1
Application number: JP2023167134A
Authority: JP
Inventors: 大介木村
Original assignee: Asilla Inc
Current assignee: Asilla Inc
Filing date: 2023-09-28
Publication date: 2024-06-28
Anticipated expiration: 2043-09-28

Abstract

[Problem] To provide an attention behavior calling system that, when an attention behavior such as abnormal behavior of a behavior entity captured in a video is detected, can issue an appropriate call according to the attributes and the like of that behavior entity.
[Solution] In the attention behavior calling system 1, feature points of a target behavior entity Z captured in a time-series image Y are detected, and when it is determined that an attention behavior has been performed based on the displacement of the stored feature points and the displacement of the detected feature points, a different call is made by voice according to the type of attention behavior. However, even when it is determined that an attention behavior has been performed, if the attribute or size of the target behavior entity Z satisfies a first predetermined condition, no call is made, or the content or manner of the call is changed.
[Selected figure] Figure 3

Description

The present invention relates to an attention behavior calling system that, when an attention behavior such as abnormal behavior of a behavior entity captured in a video is detected, can issue an appropriate call according to the attributes and the like of that behavior entity.

A technique is known in which, when the behavior of a behavior entity captured in time-series images differs from a predetermined "normal behavior," the behavior entity that behaved differently from the "normal behavior" is extracted from the time-series images (see, for example, Patent Document 1).

Patent Document 1: Japanese Patent No. 6525179

Using the above technique, it is conceivable to call out to a behavior entity that has behaved abnormally from a notification unit installed near the imaging means.

However, children, for example, often behave in ways that are judged to be abnormal, and some of these behaviors would be abnormal for an adult but cannot be called abnormal for a child. If calls are issued uniformly to adults and children alike in such a situation, the frequent calls become noise and may also undermine the credibility of the calls.

Therefore, an object of the present invention is to provide an attention behavior calling system that, when an attention behavior such as abnormal behavior of a behavior entity captured in a video is detected, can issue an appropriate call according to the attributes and the like of that behavior entity.

The present invention provides an attention behavior calling system comprising: an acquisition unit that acquires time-series images captured by an imaging means; a detection unit that detects feature points of a target behavior entity captured in the time-series images; a storage unit that stores the displacement of the feature points of a behavior entity when the behavior entity performs an attention behavior; a determination unit that determines whether the attention behavior has been performed based on the stored displacement of the feature points and the displacement of the feature points detected from each time-series image; an estimation unit that estimates an attribute or size of the target behavior entity based on the detected feature points or the appearance of the target behavior entity; and a calling unit that is provided around the imaging range of the imaging means and that, when it is determined that the attention behavior has been performed, issues a different call by voice according to the type of the attention behavior. Even when it is determined that the attention behavior has been performed, the calling unit does not issue the call, or changes the content or manner of the call, if the attribute or size satisfies a first predetermined condition.

With this configuration, when an attention behavior such as abnormal behavior of the target behavior entity is detected in the time-series images, an appropriate call can be issued according to the attributes and the like of the target behavior entity. For example, the call can be withheld when the attribute is "child," and the volume of the call can be raised when the attribute is "elderly."

In another aspect, the present invention provides an attention behavior calling program and an attention behavior calling method corresponding to the above attention behavior calling system.

According to the attention behavior calling system of the present invention, when an attention behavior such as abnormal behavior of a behavior entity captured in a video is detected, an appropriate call can be issued according to the attributes and the like of the behavior entity.

FIG. 1 is an explanatory diagram of time-series images according to the first embodiment of the present invention.
FIG. 2 is a block diagram of the attention behavior calling system according to the first embodiment of the present invention.
FIG. 3 is a flowchart of the attention behavior calling system according to the first embodiment of the present invention.
FIG. 4 is a block diagram of the attention behavior calling system according to the second embodiment of the present invention.
FIG. 5 is a block diagram of the attention behavior calling system according to a modified example of the present invention.

The attention behavior calling system 1 according to the first embodiment of the present invention will be described below with reference to FIGS. 1 to 3.

As shown in FIG. 1, the attention behavior calling system 1 issues a call according to the type of behavior of a target behavior entity Z captured in time-series images Y (in FIG. 1, the frames constituting a video) captured by an imaging means X. In this embodiment, a human is used as the target behavior entity Z, and for ease of understanding the target behavior entity Z is shown in simplified form as a skeleton only.

As shown in FIG. 2, the attention behavior calling system 1 includes an acquisition unit 2, a detection unit 3, a storage unit 4, a determination unit 5, an estimation unit 6, and a calling unit 7. In this embodiment, the attention behavior calling system 1 is assumed to be provided integrally with the imaging means X.

The acquisition unit 2 acquires the time-series images Y captured by the imaging means X.

The detection unit 3 detects feature points of the target behavior entity Z captured in the time-series images Y.

Various kinds of feature points are conceivable, but this embodiment is described using an example in which joints are detected as the feature points.

When joints are detected as feature points, the following method is conceivable, for example.

First, "joint identification criteria" and "behavior entity identification criteria" are stored in a storage unit (which may be the storage unit 4 or another storage unit).

The "joint identification criteria" are used to identify a plurality of human joints and indicate, for each joint, the shape, orientation, size, and the like used to identify it.

The "behavior entity identification criteria" indicate the "basic postures" of various human variations ("walking," "standing upright," etc.), the "range of motion of each joint," the "distance between joints" within a single human, and the like.

After a plurality of joints matching the "joint identification criteria" are detected, the joints belonging to a single target behavior entity Z are identified by referring to the "behavior entity identification criteria," which makes it possible to identify the joints contained in each target behavior entity Z.

Note that feature points may be detected individually for each time-series image Y, or may be detected collectively from a plurality of time-series images Y as long as their chronological order can be identified.

The storage unit 4 stores the displacement of the feature points of a behavior entity when the behavior entity performs an attention behavior. When joints are detected as feature points, the displacements of a plurality of joints are stored.

Examples of attention behaviors include falling, punching, and kicking, and a plurality of behaviors may be stored rather than just one. "Precursor behaviors," such as the preparatory movements for shoplifting, may also be stored as attention behaviors. In order to estimate behaviors other than attention behaviors (walking, standing still, etc.), the storage unit 4 may also store the displacement of the feature point information observed when those behaviors occur.

The determination unit 5 determines whether an attention behavior has been performed based on the stored displacement of the feature points and the displacement of the feature points detected from each time-series image Y.

For example, when the displacement of the detected feature points matches the displacement of the feature points of an attention behavior to at least a predetermined degree, it is conceivable to determine that "the attention behavior has been performed."
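The matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not specify a similarity measure, so cosine similarity over a flattened displacement vector, the threshold value 0.9, and the names `cosine_similarity` and `judge_attention_behavior` are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

# Assumption: a displacement is a flat sequence of (dx, dy) values for all
# joints over a fixed window of frames. The patent does not fix a format.
Displacement = Sequence[float]


def cosine_similarity(a: Displacement, b: Displacement) -> float:
    """Return the cosine similarity of two equally long vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("displacement vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def judge_attention_behavior(
    detected: Displacement,
    stored: Dict[str, Displacement],
    threshold: float = 0.9,  # hypothetical "predetermined degree" of agreement
) -> Optional[str]:
    """Return the best-matching stored attention behavior, or None if none reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in stored.items():
        score = cosine_similarity(detected, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


stored_displacements = {"fall": [0.0, -1.0, 0.0, -2.0], "kick": [1.0, 0.0, 2.0, 0.0]}
matched = judge_attention_behavior([0.0, -1.1, 0.1, -1.9], stored_displacements)
unmatched = judge_attention_behavior([1.0, 1.0, -1.0, -1.0], stored_displacements)
```

Returning the behavior name rather than a bare boolean keeps the type of the attention behavior available for choosing the call later.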

The estimation unit 6 estimates the attribute or size of the target behavior entity Z based on the detected feature points or the appearance of the target behavior entity Z.

The attribute or size of the target behavior entity Z may be estimated by a known method.

For example, when the target behavior entity Z is identified by its joints as described above, it is conceivable to estimate the size of the target behavior entity Z by determining the distance between the joint point of the head and the joint point of the ankle.
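A minimal sketch of the head-to-ankle size estimate follows. The joint names (`head`, `left_ankle`, `right_ankle`), the pixel coordinate convention, and the `metres_per_pixel` calibration factor are assumptions made for illustration; the patent only states that the distance between the head and ankle joint points may be used.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels


def estimate_size(joints: Dict[str, Point], metres_per_pixel: float) -> float:
    """Estimate body size as the head-to-ankle distance, scaled to metres.

    `metres_per_pixel` is a hypothetical camera calibration factor.
    """
    head = joints["head"]
    ankles = [joints[name] for name in ("left_ankle", "right_ankle") if name in joints]
    if not ankles:
        raise ValueError("at least one ankle joint is required")
    # Use the ankle farther from the head so a lifted foot does not shrink the estimate.
    pixel_height = max(math.dist(head, ankle) for ankle in ankles)
    return pixel_height * metres_per_pixel


example_joints = {
    "head": (100.0, 50.0),
    "left_ankle": (100.0, 350.0),
    "right_ankle": (110.0, 340.0),
}
height_m = estimate_size(example_joints, metres_per_pixel=0.004)
```

This estimate is only meaningful while the person is roughly upright, so a practical system would probably take it from frames preceding a fall rather than from the fall itself.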

In addition to the size described above, the attribute may be estimated from feature points such as posture and skeleton, and from appearance such as clothing, hair length, and hair color.

The calling unit 7 is provided around the imaging range of the imaging means X and, when it is determined that an attention behavior has been performed, issues a different call by voice according to the type of the attention behavior.

As examples of different calls according to the type of attention behavior, when the attention behavior is a criminal act (shoplifting, violence, etc.) the call may be "Shoplifting (violence) is a crime," and when the attention behavior is an accident (a fall, etc.) the call may be "Are you OK?", "Can you get up?", "Are you hurt?", and so on.

In the case of a criminal act, the call can be expected to stop the act. In the case of an accident, whether rescue is needed can be judged from the actions and gestures of the target behavior entity Z in response to the call.

Children, for example, often behave in ways that are judged to be attention behaviors (abnormal behaviors), and some of these behaviors would be abnormal if the target behavior entity Z were an adult but cannot be called abnormal for a child. If calls are issued uniformly to adults and children alike in such a situation, the frequent calls become noise and may also undermine the credibility of the calls.

Therefore, in this embodiment, even when it is determined that an attention behavior has been performed, the calling unit 7 does not issue the call, or changes the content or manner of the call, if the attribute or size satisfies a first predetermined condition.

For example, when the attribute is "child," the call may be withheld. When the attribute is "elderly," the volume may be raised in consideration of possible hearing loss (a change in the manner of the call), or the call may be changed to one asking people nearby for help (a change in the content of the call).

Similar control may also be performed simply when the size of the target behavior entity Z satisfies the first predetermined condition (for example, when "the size (height) of the target behavior entity Z is smaller than a predetermined value").
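The call selection described above might look like the sketch below. The behavior keys, the 1.2 m height limit, the 1.5x volume factor, and the wording of the bystander message are hypothetical; only the messages "Shoplifting is a crime," "Violence is a crime," and "Are you OK?" come from the examples in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    message: str
    volume: float  # 1.0 is the default loudness (hypothetical scale)


# Default call per type of attention behavior (messages taken from the text).
DEFAULT_MESSAGES = {
    "shoplifting": "Shoplifting is a crime.",
    "violence": "Violence is a crime.",
    "fall": "Are you OK?",
}


def select_call(
    behavior: str,
    attribute: str,
    height_m: Optional[float] = None,
    min_height_m: float = 1.2,  # hypothetical "predetermined value" for size
) -> Optional[Call]:
    """Return the call to issue, or None when the call is withheld."""
    # First predetermined condition, case 1: child, or smaller than the size limit.
    if attribute == "child" or (height_m is not None and height_m < min_height_m):
        return None
    message = DEFAULT_MESSAGES[behavior]
    volume = 1.0
    # First predetermined condition, case 2: elderly, so change manner and content.
    if attribute == "elderly":
        volume = 1.5  # louder call (change of manner)
        if behavior == "fall":
            # Hypothetical wording for a call addressed to people nearby (change of content).
            message = "Could someone nearby please help this person?"
    return Call(message, volume)


adult_fall = select_call("fall", "adult")
child_fall = select_call("fall", "child")
elderly_fall = select_call("fall", "elderly")
```

Returning `None` for a withheld call keeps the "do not call" and "call differently" branches in one place, which mirrors how the text treats them as alternatives under the same condition.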

In this way, the attention behavior calling system 1 according to this embodiment can issue an appropriate call according to the attributes and the like of the target behavior entity Z when an attention behavior such as abnormal behavior of the target behavior entity Z is detected in the time-series images Y.

Next, the flow of a call according to this embodiment will be described with reference to the flowchart in FIG. 3.

First, when the time-series images Y are acquired by the acquisition unit 2 (S1), the detection unit 3 detects the feature points of the target behavior entity Z captured in the time-series images Y (S2).

The attribute or size of the target behavior entity Z is then estimated based on the feature points detected in S2 or the appearance of the target behavior entity Z (S3).

Next, whether an attention behavior has been performed is determined based on the displacement of the feature points stored in the storage unit 4 and the displacement of the feature points detected in S2 (S4). S3 and S4 may be performed in the reverse order.

If it is determined that an attention behavior has been performed (S4: YES), it is determined whether the attribute or size estimated in S3 satisfies the first predetermined condition (S5).

If the first predetermined condition is not satisfied (S5: NO), a different call is issued by voice according to the type of attention behavior (S6).

On the other hand, if the first predetermined condition is satisfied (S5: YES), the call is not issued, or the content or manner of the call is changed (S7).
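The S1 to S7 flow can be sketched as a single function that takes each unit as a callable. The function name `process_frames`, the parameter names, and the string results are hypothetical; the detection, estimation, and determination logic is injected rather than implemented, since the text leaves those methods open.

```python
from typing import Any, Callable, List, Optional, Sequence


def process_frames(
    frames: Sequence[Any],                                   # S1: acquired time-series images
    detect: Callable[[Sequence[Any]], Any],                  # detection unit
    estimate: Callable[[Any, Sequence[Any]], Any],           # estimation unit
    judge: Callable[[Any], Optional[str]],                   # determination unit
    meets_first_condition: Callable[[Any], bool],
    call: Callable[[str], None],                             # calling unit, normal call
    call_modified: Optional[Callable[[str, Any], None]] = None,
) -> str:
    """Run one pass of the flow and report which branch was taken."""
    feature_points = detect(frames)                  # S2
    attribute = estimate(feature_points, frames)     # S3 (may also run after S4)
    behavior = judge(feature_points)                 # S4
    if behavior is None:                             # S4: NO
        return "no_attention_behavior"
    if not meets_first_condition(attribute):         # S5: NO
        call(behavior)                               # S6
        return "called"
    if call_modified is not None:                    # S5: YES, S7 with a changed call
        call_modified(behavior, attribute)
        return "modified_call"
    return "suppressed"                              # S5: YES, S7 with no call


issued: List[str] = []
adult_result = process_frames(
    frames=["frame1", "frame2"],
    detect=lambda fs: [0.0, -1.0],
    estimate=lambda pts, fs: "adult",
    judge=lambda pts: "fall",
    meets_first_condition=lambda attr: attr == "child",
    call=issued.append,
)
child_result = process_frames(
    frames=["frame1", "frame2"],
    detect=lambda fs: [0.0, -1.0],
    estimate=lambda pts, fs: "child",
    judge=lambda pts: "fall",
    meets_first_condition=lambda attr: attr == "child",
    call=issued.append,
)
```

Passing the units in as callables makes it straightforward to swap in a learned determination step or a cloud-hosted estimator without changing the control flow.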

As described above, in the attention behavior calling system 1 according to this embodiment, even when it is determined that an attention behavior has been performed, the call is not issued, or the content or manner of the call is changed, if the attribute or size satisfies the first predetermined condition.

With this configuration, when an attention behavior such as abnormal behavior of the target behavior entity Z is detected in the time-series images Y, an appropriate call can be issued according to the attributes and the like of the target behavior entity Z. For example, the call can be withheld when the attribute is "child," and the volume of the call can be raised when the attribute is "elderly."

Next, an attention behavior calling system 100 according to a second embodiment of the present invention will be described with reference to FIG. 4. Components identical to those in the first embodiment are given the same reference signs, and their description is omitted.

In this embodiment, the attention behaviors stored in the storage unit 4 are determined by learning.

Specifically, in addition to the configuration of the first embodiment, the attention behavior calling system 100 includes a learning side acquisition unit 11, a learning side detection unit 12, and a decision unit 13.

The learning side acquisition unit 11 acquires sample video (a plurality of sample time-series images) captured by the imaging means X installed so as to capture a predetermined range.

The learning side detection unit 12 detects the behavior of a sample behavior entity captured in the sample video.

To detect the behavior of the sample behavior entity, it is conceivable to store in a storage unit (which may be the storage unit 4 or another storage unit) the displacement of the feature points (the movement of each joint, etc.) observed when a sample behavior entity performs a given behavior, detect feature points in the same manner as the detection unit 3, and detect that "the given behavior has been performed" when the displacement of the detected feature point information matches the stored displacement of the feature points to at least a predetermined degree.

The decision unit 13 determines one or more "normal behaviors" in the predetermined range based on the many behaviors detected by the learning side detection unit 12. In this embodiment, the determined "normal behaviors" are stored in the storage unit 4 as attention behaviors.

"Normal behavior" can be determined by various criteria. For example, a behavior whose share of all detected behaviors is at or above a predetermined value (threshold) may be determined to be a "normal behavior."
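The share-based criterion for "normal behavior" can be sketched as below. The function name `decide_normal_behaviors` and the 10% default threshold are assumptions; the text only says that behaviors at or above some predetermined share of all detected behaviors may be treated as normal.

```python
from collections import Counter
from typing import Iterable, Set


def decide_normal_behaviors(detected_behaviors: Iterable[str], min_ratio: float = 0.1) -> Set[str]:
    """Return the behaviors whose share of all detections is at least `min_ratio`.

    `min_ratio` is a hypothetical threshold; the text does not give a value.
    """
    counts = Counter(detected_behaviors)
    total = sum(counts.values())
    if total == 0:
        return set()
    return {behavior for behavior, n in counts.items() if n / total >= min_ratio}


# Hypothetical sample: behaviors detected in the sample video of one location.
sample = ["walk"] * 70 + ["stand"] * 25 + ["run"] * 5
normal = decide_normal_behaviors(sample, min_ratio=0.1)
```

With this sample, "run" accounts for 5% of detections and falls below the threshold, so it would not be treated as normal for that location.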

The determination unit 5 then determines that an attention behavior has been performed when the displacement of the feature points detected from each time-series image Y does not correspond to a "normal behavior."

Specifically, when the displacement of the feature points detected from each time-series image Y does not correspond to a "normal behavior," that is, when it is determined that an attention behavior has occurred (S3 in FIG. 3: YES), the operations of S4 to S7 in FIG. 3 are performed as in the first embodiment.

As described above, the attention behavior calling system 100 according to this embodiment determines that an attention behavior has been performed when the displacement of the feature points detected from each time-series image Y does not correspond to a "normal behavior" determined by learning.

With this configuration, a call can be issued for behavior that is inappropriate for the place (the predetermined range), even if it is not clearly abnormal behavior such as violence or a fall. With this configuration, however, the behavior of a child, for example, becomes even more likely to be judged an attention behavior (abnormal behavior). Even so, because the call is not issued, or its content or manner is changed, when the attribute or size satisfies the first predetermined condition, frequent calls caused by the attention behavior (abnormal behavior) of a "child" or the like are suppressed.

The attention behavior calling system of the present invention is not limited to the embodiments described above, and various modifications and improvements are possible within the scope of the claims.

For example, even in a place where it is not appropriate for an adult to "run," a child may run without any malicious intent, and it is not appropriate to issue calls uniformly for the behavior "running" when there is no malice and the danger is low. Therefore, depending on the malice and degree of danger of the attention behavior performed, the call may be withheld, or its content or manner may be changed.

In this case, as shown in FIG. 5, the attention behavior calling system 1 may further include a risk determination unit 8 that refers to a plurality of time-series images Y showing the attention behavior determined to have been performed and individually determines the degree of danger of that attention behavior.

In this modified example, the check of "whether the attribute or size satisfies the first predetermined condition," which is also performed in the embodiments, doubles as the determination of malice. For example, when the attribute is "child" or "elderly," malice may be regarded as low. Malice may also be regarded as low simply when the size is at or below a predetermined value.

As for the degree of danger, when the attention behavior is "running," for example, the danger may be determined to be "high" if the speed of the target behavior entity Z is at or above a predetermined value and "low" if it is below that value.

When the attention behavior is "lying down," for example, the danger is low to begin with, so it may be determined to be "low" regardless of speed and the like. When the attention behavior is a "fall," however, it is preferable to determine the degree of danger taking the speed of the fall and the like into account.

Even when it is determined that an attention behavior has occurred, the calling unit 7 does not issue the call, or changes the content or manner of the call, if the attribute or size satisfies the first predetermined condition and the degree of danger satisfies a second predetermined condition set for each attribute or size.

For example, when the attribute is "child," malice can be presumed to be low, and if the second predetermined condition "running at low speed" is also satisfied, the degree of danger is low as well, so no call is issued.

On the other hand, when the attribute is "adult," malice can be presumed to be high, and a call may be issued regardless of the degree of danger.
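The combination of the first condition (malice, inferred from the attribute) and the second condition (degree of danger, set per attribute) can be sketched as follows. The attribute labels, behavior keys, and per-attribute speed limits in metres per second are hypothetical values chosen for illustration.

```python
from typing import Dict

# First predetermined condition: attributes regarded as "low malice" (from the text).
LOW_MALICE_ATTRIBUTES = {"child", "elderly"}

# Second predetermined condition, set per attribute: hypothetical speed limits in m/s.
SPEED_LIMIT_MPS: Dict[str, float] = {"child": 2.0, "elderly": 1.5}


def danger_is_low(behavior: str, speed_mps: float, speed_limit_mps: float) -> bool:
    """Danger is low for lying down regardless of speed, otherwise when below the limit."""
    if behavior == "lying_down":
        return True
    return speed_mps < speed_limit_mps


def should_call(attribute: str, behavior: str, speed_mps: float) -> bool:
    """Decide whether to issue the call for an attention behavior that was detected."""
    if attribute not in LOW_MALICE_ATTRIBUTES:
        # e.g. "adult": malice presumed high, so call regardless of danger.
        return True
    # Low malice: withhold the call only if the danger is also low.
    return not danger_is_low(behavior, speed_mps, SPEED_LIMIT_MPS[attribute])


slow_child = should_call("child", "running", 1.0)
fast_child = should_call("child", "running", 3.0)
slow_adult = should_call("adult", "running", 0.5)
```

Keeping the speed limits in a per-attribute table reflects the statement that the second condition is set for each attribute or size, and lets the limits be tuned per site.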

With this configuration, when an attention behavior such as abnormal behavior of the target behavior entity Z is detected in the time-series images Y, an even more appropriate call can be issued according to the malice and degree of danger of the attention behavior.

The call may also be further controlled according to the location.

In this case, as shown in FIG. 5, the attention behavior calling system 1 may further include a background detection unit 9 that detects the background captured in the time-series images Y.

The calling unit 7 then does not issue the call, or changes the content or manner of the call, according to the detected background.

For example, when the background is a staircase, the call may be withheld because the target behavior entity Z could be startled by the call and fall, and when the background is an elevator, the volume may be lowered because the calling unit 7 is close to the target behavior entity Z.

With this configuration, when an attention behavior such as abnormal behavior of the target behavior entity Z is detected in the time-series images Y, an even more appropriate call can be issued according to the background.

In this case, it is even more effective to also take the degree of danger described above into account.

For example, even for a "child" who is "running at low speed," the danger is higher when the background is a staircase, so a call may be issued.
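One way to combine background with the degree of danger is sketched below. It models two of the examples from the text: a staircase raising the danger for a slowly running child, and an elevator lowering the volume. The alternative policy of withholding calls on staircases is not modeled here and would need a different table entry. All names, background labels, and the 0.5 volume factor are hypothetical.

```python
from typing import Optional

LOW_MALICE_ATTRIBUTES = {"child", "elderly"}

# Backgrounds that raise the degree of danger (example from the text: staircase).
DANGER_RAISING_BACKGROUNDS = {"stairs"}

# Volume scale per background (example from the text: lower the volume in an elevator).
VOLUME_SCALE_BY_BACKGROUND = {"elevator": 0.5}


def plan_call(attribute: str, base_danger: str, background: str) -> Optional[float]:
    """Return the volume scale for the call, or None when the call is withheld.

    `base_danger` is "low" or "high" as determined without regard to background.
    """
    danger = "high" if background in DANGER_RAISING_BACKGROUNDS else base_danger
    if attribute in LOW_MALICE_ATTRIBUTES and danger == "low":
        return None  # low malice and low danger: no call
    return VOLUME_SCALE_BY_BACKGROUND.get(background, 1.0)


child_on_stairs = plan_call("child", "low", "stairs")
child_in_corridor = plan_call("child", "low", "corridor")
adult_in_elevator = plan_call("adult", "low", "elevator")
```

Applying the background to the danger level before the attribute check is a design choice of this sketch; it lets a location override an otherwise suppressed call without duplicating the attribute logic.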

Some users may also have requests such as "I want calls to be issued for all abnormal behavior, even by children."

Therefore, as shown in FIG. 5, the attention behavior calling system 1 may further include a setting unit 10 that allows the user to set whether the call is withheld, or its content or manner changed, when the age or size of the target behavior entity Z satisfies a predetermined condition.

In the above embodiments, the imaging means X is provided integrally with the calling unit 7, but the calling unit 7 only needs to be provided around the imaging range of the imaging means X so that it can call out to the target behavior entity Z.

In the above embodiments, all components are provided integrally, but this does not exclude, for example, some components (the storage unit 4, the determination unit 5, the estimation unit 6, etc.) being provided in a different location (the cloud, etc.).

The present invention is also applicable to a program and a method corresponding to the processing performed by each component as a controller, and to a recording medium storing the program. In the case of a recording medium, the program is installed on a computer or the like. The recording medium storing the program may be a non-transitory recording medium. A CD-ROM or the like is conceivable as a non-transitory recording medium, but the medium is not limited to this.

Reference Signs List
1, 100 Attention behavior calling system
2 Acquisition unit
3 Detection unit
4 Storage unit
5 Determination unit
6 Estimation unit
7 Calling unit
8 Risk determination unit
9 Background detection unit
10 Setting unit
11 Learning side acquisition unit
12 Learning side detection unit
13 Decision unit
X Imaging means
Y Time-series image
Z Behavior entity

Claims

An attention behavior calling system comprising:
an acquisition unit that acquires time-series images captured by an imaging means;
a detection unit that detects feature points of a target behavior entity captured in the time-series images;
a storage unit that stores the displacement of the feature points of a behavior entity when the behavior entity performs an abnormal behavior;
a determination unit that determines whether the abnormal behavior has been performed based on the stored displacement of the feature points and the displacement of the feature points detected from each time-series image;
an estimation unit that estimates an attribute or size of the target behavior entity based on the detected feature points or the appearance of the target behavior entity; and
a calling unit that is provided around the imaging range of the imaging means and that, when it is determined that the abnormal behavior has been performed, issues a different call by voice according to the type of the abnormal behavior,
wherein the calling unit does not issue the call, or changes the content or manner of the call, when the attribute or size satisfies a first predetermined condition, even if it is determined that the abnormal behavior has been performed.

The attention behavior calling system according to claim 1, further comprising:
a learning-side acquisition unit that acquires a sample video captured by the imaging means installed to capture a predetermined range;
a learning-side detection unit that detects behaviors of sample behavior objects captured in the sample video; and
a decision unit that decides one or more normal behaviors in the predetermined range based on the plurality of behaviors detected by the learning-side detection unit,
wherein the determination unit determines that the abnormal behavior has been performed when the displacement of the feature points detected from each time-series image does not correspond to the normal behavior.
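One way to picture the learning side of this claim is the following Python sketch. It is a simplification under stated assumptions, not the method of the specification: each sampled behavior is summarized as a single mean displacement `(dx, dy)`, displacements are quantized onto a grid of width `cell`, and a grid cell counts as a normal behavior when it accounts for at least `min_share` of the samples. The function names and both parameters are hypothetical.

```python
from collections import Counter

Displacement = tuple[float, float]
Cell = tuple[int, int]


def quantize(d: Displacement, cell: float = 0.5) -> Cell:
    """Map a displacement onto a coarse grid so similar motions share a key."""
    return (round(d[0] / cell), round(d[1] / cell))


def decide_normal(samples: list[Displacement], min_share: float = 0.2,
                  cell: float = 0.5) -> set[Cell]:
    """Treat grid cells that are frequent in the sample video as normal behaviors."""
    counts = Counter(quantize(d, cell) for d in samples)
    total = sum(counts.values())
    return {key for key, n in counts.items() if n / total >= min_share}


def is_abnormal(observed: Displacement, normal: set[Cell],
                cell: float = 0.5) -> bool:
    """A displacement that matches no learned normal behavior is abnormal."""
    return quantize(observed, cell) not in normal
```

Because the normal behaviors are learned per installation, the same motion can be normal in one imaging range and abnormal in another.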

The attention behavior calling system according to claim 1, further comprising a danger level determination unit that determines the danger level of each abnormal behavior determined to have been performed, by referring to a plurality of time-series images showing that abnormal behavior,
wherein, even when it is determined that the abnormal behavior has been performed, the calling unit does not make the call, or changes the content or manner of the call, when the attribute or size satisfies the first predetermined condition and the danger level satisfies a second predetermined condition set for each attribute or size.
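The combination of the two conditions can be sketched in Python as follows. The danger (risk) measure used here, the peak mean displacement magnitude over the referenced frames scaled by a hypothetical `MAX_SPEED`, is an assumption, as are the per-attribute ceilings in `DANGER_CEILING` and the function names.

```python
from math import hypot

Point = tuple[float, float]

MAX_SPEED = 10.0  # hypothetical displacement magnitude that maps to danger 1.0

# Hypothetical second predetermined condition, set per attribute: the call is
# withheld only while the danger level stays below the ceiling for that attribute.
DANGER_CEILING = {"child": 0.7}


def danger_level(frames: list[list[Point]]) -> float:
    """Peak mean displacement magnitude over several frames, clipped to [0, 1]."""
    peak = max(sum(hypot(dx, dy) for dx, dy in f) / len(f) for f in frames)
    return min(peak / MAX_SPEED, 1.0)


def should_call(attribute: str, danger: float) -> bool:
    """Withhold the call only when both conditions are satisfied."""
    first_condition = attribute in DANGER_CEILING
    second_condition = first_condition and danger < DANGER_CEILING[attribute]
    return not (first_condition and second_condition)
```

With these assumed values, low-danger abnormal behavior by a child is ignored, while the same child still receives a call once the danger level reaches the ceiling.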

The attention behavior calling system according to claim 1, further comprising a background detection unit that detects a background in the time-series images,
wherein the calling unit does not make the call, or changes the content or manner of the call, depending on the detected background.
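A background-dependent call can be sketched as a simple lookup. The background labels, the policy table `BACKGROUND_POLICY`, the replacement message and the function name below are all hypothetical; the sketch also assumes the background detector has already produced a label.

```python
from typing import Optional

# Hypothetical mapping from detected background to call policy.
BACKGROUND_POLICY = {
    "playground": ("suppress", None),
    "stairs": ("replace", "Please hold the handrail."),
}


def call_for_background(default_message: str, background: str) -> Optional[str]:
    """Return the message to play for this background, or None for no call."""
    action, replacement = BACKGROUND_POLICY.get(background, ("keep", None))
    if action == "suppress":
        return None
    if action == "replace":
        return replacement
    return default_message
```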

The attention behavior calling system according to claim 1, further comprising a setting unit that allows a user to set whether the call is to be withheld, or its content or manner changed, when the age or size of the behavior object satisfies a predetermined condition.
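The user-facing setting could be represented as a small configuration object, sketched below in Python. The `Mode` values, the `CallSettings` fields and their default thresholds (age 12, height 130 cm) are illustrative assumptions and do not come from the specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    CALL = "call"          # make the usual call even if the condition is met
    SUPPRESS = "suppress"  # make no call when the condition is met
    CHANGE = "change"      # play a different message when the condition is met


@dataclass
class CallSettings:
    """User-editable settings (all defaults are hypothetical)."""
    mode: Mode = Mode.SUPPRESS
    max_age: int = 12
    max_height_cm: float = 130.0
    changed_message: str = "Please be careful."


def apply_settings(message: str, age: int, height_cm: float,
                   settings: CallSettings) -> Optional[str]:
    """Return the message to play under the user's settings, or None for no call."""
    condition_met = age <= settings.max_age or height_cm <= settings.max_height_cm
    if not condition_met or settings.mode is Mode.CALL:
        return message
    if settings.mode is Mode.SUPPRESS:
        return None
    return settings.changed_message
```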

An attention behavior calling program executed by a computer that stores the displacement of feature points of a behavior object when the behavior object performs an abnormal behavior, the program causing the computer to execute:
a step of acquiring time-series images captured by an imaging means;
a step of detecting feature points of a target behavior object captured in the time-series images;
a step of determining whether the abnormal behavior has been performed, based on the stored displacement of the feature points and the displacement of the feature points detected from each time-series image;
a step of estimating an attribute or a size of the target behavior object based on the detected feature points or the appearance of the target behavior object; and
a step of issuing, around the imaging range of the imaging means, a voice call corresponding to the type of the abnormal behavior when it is determined that the abnormal behavior has been performed,
wherein, in the step of issuing the call, even when it is determined that the abnormal behavior has been performed, the call is not made, or the content or manner of the call is changed, when the attribute or size satisfies a first predetermined condition.

An attention behavior calling method executed by a computer that stores the displacement of feature points of a behavior object when the behavior object performs an abnormal behavior, the method comprising:
a step of acquiring time-series images captured by an imaging means;
a step of detecting feature points of a target behavior object captured in the time-series images;
a step of determining whether the abnormal behavior has been performed, based on the stored displacement of the feature points and the displacement of the feature points detected from each time-series image;
a step of estimating an attribute or a size of the target behavior object based on the detected feature points or the appearance of the target behavior object; and
a step of issuing, around the imaging range of the imaging means, a voice call corresponding to the type of the abnormal behavior when it is determined that the abnormal behavior has been performed,
wherein, in the step of issuing the call, even when it is determined that the abnormal behavior has been performed, the call is not made, or the content or manner of the call is changed, when the attribute or size satisfies a first predetermined condition.