JP2022050268A

JP2022050268A - System, electronic apparatus, method for controlling electronic apparatus, and program

Info

Publication number: JP2022050268A
Application number: JP2020156779A
Authority: JP
Inventors: 健宏馬渕; Takehiro Mabuchi; 智子浅野; Tomoko Asano; 裕香島; Yu Kashima; 賢也小林; Kenya Kobayashi; 永勲高; Younghun Ko
Original assignee: Kyocera Corp
Current assignee: Kyocera Corp
Priority date: 2020-09-17
Filing date: 2020-09-17
Publication date: 2022-03-30
Anticipated expiration: 2040-09-17
Also published as: JP7465773B2

Abstract

PROBLEM TO BE SOLVED: To provide a system, an electronic apparatus, a method for controlling an electronic apparatus, and a program that can contribute to the safety of the surroundings of a subject to be monitored. SOLUTION: A system comprises an imaging unit, an extraction unit, and a controller. The imaging unit captures images of a subject to be monitored and another person or an object related to the other person. The extraction unit extracts, from the images captured by the imaging unit, the coordinates of predetermined parts of the subject to be monitored and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person. Based on timing information indicating a start point and an end point of an action in which the subject to be monitored uses violence against the other person in sequential images captured by the imaging unit, the controller performs machine learning of the relationship between (a) the coordinates of the predetermined parts of the subject to be monitored and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (b) the start of the action in which the subject to be monitored uses violence against the other person. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a system, an electronic device, a method for controlling an electronic device, and a program.

Devices have been proposed for monitoring the behavior of a monitored person, such as a person requiring nursing or a person requiring care, at sites such as care facilities. For example, Patent Document 1 discloses a monitored-person system that detects predetermined behavior of a monitored person based on images obtained by an imaging device. Patent Document 2 discloses a fall prevention system that prevents a subject from falling while walking by attaching a detection device to the subject's foot. Patent Document 3 discloses a watching support device that determines the posture of a human body by detecting a temperature distribution. Cited Document 4 discloses a medical system for monitoring geriatric psychiatric patients at home, or in a nursing home or care facility.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2017-91552
Patent Document 2: Japanese Unexamined Patent Application Publication No. 2017-221502
Patent Document 3: Japanese Unexamined Patent Application Publication No. 2014-106636
Patent Document 4: Japanese Unexamined Patent Application Publication No. 2003-91790

It would be beneficial if monitoring a monitored target could contribute to the safety of the surroundings of the monitored target.

An object of the present disclosure is to provide a system, an electronic device, a method for controlling an electronic device, and a program that can contribute to the safety of the surroundings of a monitored target.

A system according to one embodiment includes an imaging unit, an extraction unit, and a controller. The imaging unit captures images of a monitored target and another person or an object related to the other person. The extraction unit extracts, from the images captured by the imaging unit, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person. Based on timing information indicating the start point and the end point of an action in which the monitored target uses violence against the other person in time-series images captured by the imaging unit, the controller performs machine learning of the relationship between (a) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (b) the start of the action in which the monitored target uses violence against the other person.
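As a non-limiting illustration of how timing information of this kind might be prepared for learning, the following Python sketch labels each frame of a time series as inside or outside an annotated violent action and pairs each label with that frame's coordinate features. The names `TimingInfo`, `label_frames`, and `build_training_set`, the use of frame indices for the start and end points, and the 0/1 labeling scheme are all assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TimingInfo:
    """Start and end of one annotated violent action, as frame indices (assumed format)."""
    start_frame: int
    end_frame: int


def label_frames(num_frames: int, events: Sequence[TimingInfo]) -> List[int]:
    """Return one label per frame: 1 inside any [start, end] interval, else 0."""
    labels = [0] * num_frames
    for event in events:
        # Clamp the annotated interval to the frames that actually exist.
        first = max(event.start_frame, 0)
        last = min(event.end_frame, num_frames - 1)
        for index in range(first, last + 1):
            labels[index] = 1
    return labels


def build_training_set(
    features: Sequence[Sequence[float]], events: Sequence[TimingInfo]
) -> List[Tuple[Sequence[float], int]]:
    """Pair each frame's coordinate features with its label (hypothetical helper)."""
    labels = label_frames(len(features), events)
    return list(zip(features, labels))
```

A supervised learner could then be trained on the resulting pairs. Other labeling schemes, such as marking only the frames immediately preceding the start point, would be equally compatible with the description.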

A system according to another embodiment includes:
an imaging unit that captures images of a monitored target and another person or an object related to the other person;
an extraction unit that extracts, from the images captured by the imaging unit, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
a controller that estimates the start of an action in which the monitored target uses violence against the other person from the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person extracted by the extraction unit, based on machine learning data obtained by machine learning of the relationship between (a) the coordinates of predetermined parts of a human and the coordinates of predetermined parts of another human or the coordinates of an object related to the other human between the start point and the end point of an action in which the human uses violence against the other human, and (b) the start of the action in which the human uses violence against the other human.

An electronic device according to one embodiment includes:
an extraction unit that extracts, from images captured so as to include a monitored target and another person or an object related to the other person, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
a controller that, based on timing information indicating the start point and the end point of an action in which the monitored target uses violence against the other person in time-series images captured so as to include the monitored target, performs machine learning of the relationship between (a) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (b) the start of the action in which the monitored target uses violence against the other person.

An electronic device according to another embodiment includes:
an extraction unit that extracts, from images captured so as to include a monitored target and another person or an object related to the other person, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
a controller that estimates the start of an action in which the monitored target uses violence against the other person from the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person extracted by the extraction unit, based on machine learning data obtained by machine learning of the relationship between (a) the coordinates of predetermined parts of a human and the coordinates of predetermined parts of another human or the coordinates of an object related to the other human between the start point and the end point of an action in which the human uses violence against the other human, and (b) the start of the action in which the human uses violence against the other human.

A method for controlling an electronic device according to one embodiment includes:
an imaging step of capturing images of a monitored target and another person or an object related to the other person;
an extraction step of extracting, from the images captured in the imaging step, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
a machine learning step of performing, based on timing information indicating the start point and the end point of an action in which the monitored target uses violence against the other person in time-series images captured in the imaging step, machine learning of the relationship between (a) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (b) the start of the action in which the monitored target uses violence against the other person.

A method for controlling an electronic device according to another embodiment includes:
an imaging step of capturing images of a monitored target and another person or an object related to the other person;
an extraction step of extracting, from the images captured in the imaging step, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
an estimation step of estimating the start of an action in which the monitored target uses violence against the other person from the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person extracted in the extraction step, based on machine learning data obtained by machine learning of the relationship between (a) the coordinates of predetermined parts of a human and the coordinates of predetermined parts of another human or the coordinates of an object related to the other human between the start point and the end point of an action in which the human uses violence against the other human, and (b) the start of the action in which the human uses violence against the other human.

A program according to one embodiment causes a computer to execute:
an imaging step of capturing images of a monitored target and another person or an object related to the other person;
an extraction step of extracting, from the images captured in the imaging step, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
a machine learning step of performing, based on timing information indicating the start point and the end point of an action in which the monitored target uses violence against the other person in time-series images captured in the imaging step, machine learning of the relationship between (a) the coordinates of the predetermined parts and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (b) the start of the action in which the monitored target uses violence against the other person.

A program according to another embodiment causes a computer to execute:
an imaging step of capturing images of a monitored target and another person or an object related to the other person;
an extraction step of extracting, from the images captured in the imaging step, the coordinates of predetermined parts of the monitored target and the coordinates of predetermined parts of the other person or the coordinates of the object related to the other person; and
an estimation step of estimating the start of an action in which the monitored target uses violence against the other person from the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person extracted in the extraction step, based on machine learning data obtained by machine learning of the relationship between (a) the coordinates of predetermined parts of a human and the coordinates of predetermined parts of another human or the coordinates of an object related to the other human between the start point and the end point of an action in which the human uses violence against the other human, and (b) the start of the action in which the human uses violence against the other human.

According to one embodiment, it is possible to provide a system, an electronic device, a method for controlling an electronic device, and a program that can contribute to the safety of the surroundings of a monitored target.

FIG. 1 is a functional block diagram showing the schematic configuration of a system according to one embodiment.
FIG. 2 is a flowchart explaining the operation of the system according to one embodiment in the learning phase.
FIG. 3 is a diagram showing an example of an image captured by the system according to one embodiment.
FIG. 4 is a diagram showing an example of an image captured by the system according to one embodiment.
FIG. 5 is a diagram showing an example of predetermined parts extracted by the system according to one embodiment.
FIG. 6 is a diagram showing an example of predetermined parts extracted from an image captured by the system according to one embodiment.
FIG. 7 is a diagram explaining the coordinates of predetermined parts extracted by the system according to one embodiment.
FIG. 8 is a diagram showing an example of predetermined parts extracted from an image captured by the system according to one embodiment.
FIG. 9 is a diagram showing an example of predetermined parts extracted from an image captured by the system according to one embodiment.
FIG. 10 is a diagram explaining timing information acquired by the system according to one embodiment.
FIG. 11 is a flowchart explaining the operation of the system according to one embodiment in the estimation phase.
FIG. 12 is a diagram explaining estimation by the system according to one embodiment.

In the present disclosure, an "electronic device" may be a device driven by electric power. A "system" may include a device driven by electric power. A "user" may be a person (typically a human) who uses the system and/or the electronic device according to one embodiment. Users may include a person who monitors a monitored target by using the system and/or the electronic device according to one embodiment. A "monitored target" may be a subject (for example, a human or an animal) monitored by the system and/or the electronic device according to one embodiment. Users may also include the monitored target.

The system according to one embodiment is expected to be used, for example, in specific facilities used by people engaged in social activities, such as companies, hospitals, nursing homes, schools, sports gyms, and care facilities. In a company, for example, understanding and/or managing the health of employees is extremely important. Likewise, understanding and/or managing the health of patients and medical staff in a hospital, and of residents and staff in a nursing home, is extremely important. Use of the system according to one embodiment is not limited to facilities such as the companies, hospitals, and nursing homes mentioned above. The system may be used in any facility where it is desirable to understand and/or manage the health of a monitored target. Such facilities may include non-commercial facilities such as the user's home. The system according to one embodiment may also be used inside vehicles such as trains, buses, and airplanes, and at stations, boarding areas, and the like.

The system according to one embodiment may be used, for example in a care facility, to monitor the behavior of a monitored target such as a person requiring nursing or a person requiring care. The system according to one embodiment can monitor an action in which a monitored target, such as a person requiring nursing or care, uses violence against another person, such as a staff member who nurses or cares for the monitored target. An action of using violence may include hitting, punching, pushing, kicking, ramming, or biting another person. An action of using violence may include pushing or shaking a chair or wheelchair in which another person is sitting. Actions of using violence are not limited to these and may include various other actions.

In particular, when a monitored target such as a person requiring nursing or care approaches another person, the system according to one embodiment can issue a predetermined warning before an action in which the monitored target uses violence against the other person starts, or before that action ends. Therefore, with the system according to one embodiment, staff at a care facility, for example, can recognize that the monitored target is about to use violence, or to continue using violence, against another person before the monitored target does so.

Hereinafter, a system according to one embodiment will be described in detail with reference to the drawings.

FIG. 1 is a diagram showing the schematic configuration of a system according to one embodiment. As shown in FIG. 1, the system 1 according to one embodiment may include an electronic device 10 and an imaging unit 20. The electronic device 10 and the imaging unit 20 may be connected by wire, wirelessly, or by a combination of the two. The system 1 according to one embodiment may omit some of the functional units shown in FIG. 1 and may include functional units other than those shown in FIG. 1. For example, the system 1 according to one embodiment may omit at least one of the warning unit 17 and the communication unit 19. As another example, the system 1 according to one embodiment may include a display capable of displaying images and/or a slot into which storage such as a memory card can be inserted.

The imaging unit 20 shown in FIG. 1 may include an image sensor that captures images electronically, as in a digital camera. The imaging unit 20 may include an imaging element that performs photoelectric conversion, such as a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) sensor. The imaging unit 20 may capture images of the monitored target T, for example as shown in FIG. 1. Here, the monitored target T may be, for example, a human. The imaging unit 20 may convert a captured image into a signal and transmit the signal to the electronic device 10. For example, the imaging unit 20 may transmit a signal based on the captured image to the extraction unit 11, the storage unit 13, and/or the controller 15 of the electronic device 10. The imaging unit 20 is not limited to an imaging device such as a digital camera and may be any device capable of imaging the monitored target T. The imaging unit 20 is connected to the electronic device 10 by wire, wirelessly, or by any combination of these.

In one embodiment, the imaging unit 20 may capture the monitored target T as still images at predetermined intervals (for example, 15 frames per second). In one embodiment, the imaging unit 20 may instead capture the monitored target T as continuous video.

As shown in FIG. 1, the electronic device 10 according to one embodiment may include an extraction unit 11, a storage unit 13, a controller 15, a warning unit 17, and a communication unit 19. The electronic device 10 according to one embodiment may omit some of the functional units shown in FIG. 1 and may include functional units other than those shown in FIG. 1. For example, the electronic device 10 according to one embodiment may include the machine learning data 132, described later, stored in the storage unit 13. Alternatively, at least part of the machine learning data 132 described later may be stored in an external device such as an external server.

The extraction unit 11 may have a function of extracting predetermined feature points from an image captured by the imaging unit 20. For example, the extraction unit 11 may extract, from an image of the monitored target T captured by the imaging unit 20, the coordinates of feature points such as predetermined parts of the body of the monitored target T. The feature points are described further below. In one embodiment, the extraction unit 11 may extract, from an image of the monitored target T captured by the imaging unit 20, the coordinates of parts of the monitored target T such as the head, trunk, limbs, and/or joints. The extraction unit 11 may be configured as dedicated hardware, may be configured at least partly in software, or may be configured entirely in software. In this way, the extraction unit 11 may extract the coordinates of predetermined parts of the monitored target T from an image captured by the imaging unit 20. The extraction unit 11 may extract, from an image of another person W (see FIG. 3 and other figures), the coordinates of feature points such as predetermined parts of the body of the other person W. The extraction unit 11 may extract the coordinates of predetermined parts of the other person W from an image captured by the imaging unit 20. The extraction unit 11 may recognize, from an image captured by the imaging unit 20, a chair, wheelchair, or the like in which the other person W is sitting. A chair, wheelchair, or the like in which the other person W is sitting is also referred to as an object related to the other person W.
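As a non-limiting illustration, the coordinates extracted for the monitored target T and the other person W could be arranged into a single fixed-length feature vector per frame before being passed to a learning or estimation stage. The following Python sketch assumes two-dimensional image coordinates, a hypothetical set of part names (`PARTS`), and a placeholder value for parts that were not detected. None of these choices is specified by the disclosure.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical subset of body parts; the disclosure mentions the head, trunk,
# limbs, and joints without fixing a particular list.
PARTS: Sequence[str] = ("head", "neck", "right_hand", "left_hand")


def to_feature_vector(
    target: Dict[str, Point],
    other: Dict[str, Point],
    parts: Sequence[str] = PARTS,
    missing: Point = (-1.0, -1.0),
) -> List[float]:
    """Flatten the part coordinates of target T and other person W into one list.

    The order is fixed (all parts of T, then all parts of W) so that every
    frame yields a vector of the same length. Undetected parts are replaced
    by the `missing` placeholder, which is an assumption of this sketch.
    """
    vector: List[float] = []
    for person in (target, other):
        for name in parts:
            x, y = person.get(name, missing)
            vector.extend((x, y))
    return vector
```

For example, with four parts per person the vector has 16 elements per frame. When the other person W is represented only by a related object such as a wheelchair, the second dictionary could instead hold the coordinates of that object under part names chosen for the purpose.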

The storage unit 13 may function as a memory that stores various kinds of information. The storage unit 13 may store, for example, programs executed by the controller 15 and the results of processing executed by the controller 15. The storage unit 13 may also function as a working memory for the controller 15. The storage unit 13 can be configured by, for example, a semiconductor memory, but is not limited to this and can be any storage device. For example, the storage unit 13 may be a storage medium such as a memory card inserted into the electronic device 10 according to one embodiment. The storage unit 13 may be the internal memory of a CPU used as the controller 15 described later, or may be connected to the controller 15 as a separate component.

As shown in FIG. 1, the storage unit 13 may store, for example, machine learning data 132. Here, the machine learning data 132 may be data generated by machine learning. Machine learning may be based on AI (Artificial Intelligence) technology that makes a specific task executable through training. More specifically, machine learning may be a technique in which an information processing device such as a computer learns from a large amount of data and automatically constructs an algorithm or model that performs tasks such as classification and/or prediction. In the present specification, machine learning may be regarded as a part of AI. In the present specification, machine learning may include supervised learning, in which features or rules of input data are learned based on correct answer data. Machine learning may also include unsupervised learning, in which features or rules of input data are learned without correct answer data. Machine learning may further include reinforcement learning, in which features or rules of input data are learned by giving rewards or penalties. In the present specification, machine learning may be any combination of supervised learning, unsupervised learning, and reinforcement learning. The concept of the machine learning data 132 of the present embodiment may include an algorithm that outputs a predetermined inference (estimation) result for input data by using a learned algorithm. As this algorithm, the present embodiment can use any appropriate algorithm, for example linear regression, which predicts the relationship between a dependent variable and independent variables; a neural network (NN), which is a mathematical model of the neurons of the human nervous system; the least squares method, which is calculated by squaring errors; a decision tree, which structures problem solving as a tree; or regularization, which transforms data in a predetermined way. The present embodiment may use deep learning. Deep learning is a kind of neural network, and the term refers to neural networks with many layers.

本開示の技術において、被監視対象Ｔの身体の動作ａと、この動作ａから発生する被監視対象Ｔの動作結果Ａとの間には、一般的に一定の関係が存在するものとしてよい。なお、ここでの動作結果には、被監視対象Ｔの動作、被監視対象Ｔの動作開始時点、被監視対象Ｔの動作から発生する事故及び事件その他の出来事などを含むとしてよい。例えば、被監視対象Ｔの身体の動作ａが行われ、この動作ａから被監視対象Ｔの動作結果Ａが発生したとする。また、被監視対象Ｔの身体の動作ｂが行われ、この動作ｂから被監視対象Ｔの動作結果Ｂが発生したとする。本開示の技術は、上記動作ａと動作結果Ａ、動作ｂと動作結果Ｂその他の動作と動作結果の関係を、機械学習データとして蓄積する。そして、本開示の技術は、動作ｘが抽出された場合に、上記機械学習データを用いて、動作ｘに関係する動作結果Ｘを推定するとしてよい。 In the technique of the present disclosure, it may be assumed that there is generally a certain relationship between the movement a of the body of the monitored target T and the movement result A of the monitored target T generated from this movement a. The operation result here may include the operation of the monitored target T, the operation start time of the monitored target T, an accident, an incident or other event generated from the operation of the monitored target T. For example, it is assumed that the body movement a of the monitored target T is performed, and the movement result A of the monitored target T is generated from this movement a. Further, it is assumed that the body movement b of the monitored target T is performed, and the movement result B of the monitored target T is generated from this movement b. The technique of the present disclosure accumulates the relationship between the operation a and the operation result A, the operation b and the operation result B, and other operations and the operation result as machine learning data. Then, in the technique of the present disclosure, when the motion x is extracted, the motion result X related to the motion x may be estimated by using the machine learning data.
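The accumulate-then-estimate idea described above can be sketched as follows. This is only an illustration under stated assumptions: the class `MotionMemory`, its method names, and the nearest-neighbour lookup are hypothetical choices made for this example. The disclosure does not prescribe this technique and elsewhere names other algorithms, such as neural networks.

```python
import math


class MotionMemory:
    """Hypothetical store of (motion, result) pairs, e.g. (a, A), (b, B)."""

    def __init__(self):
        self._pairs = []

    def accumulate(self, motion, result):
        # Store one observed motion (a numeric feature vector) with its result.
        self._pairs.append((tuple(motion), result))

    def estimate(self, motion):
        # Assumption: return the result of the closest stored motion.
        # A trained model could take the place of this lookup.
        if not self._pairs:
            return None
        query = tuple(motion)
        nearest = min(self._pairs, key=lambda pair: math.dist(pair[0], query))
        return nearest[1]


memory = MotionMemory()
memory.accumulate([0.0, 0.0], "A")    # motion a -> result A
memory.accumulate([10.0, 10.0], "B")  # motion b -> result B
estimated = memory.estimate([1.0, 1.0])  # motion x -> estimated result X
```

Here the motion x is closer to the stored motion a, so the estimated result is the one accumulated for a.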

In particular, in one embodiment, the machine learning data 132 may be data obtained by machine learning the movement of feature points when the monitored target T acts violently against another person W. Hereinafter, an action in which the monitored target T acts violently against another person W may be referred to as a "violent action". The machine learning data 132 may also be data obtained by machine learning the movement of feature points when a specific person (for example, a specific person requiring nursing care) acts, as the monitored target T, violently against another person W. The machine learning data 132 according to one embodiment will be described further below.

コントローラ１５は、電子機器１０を構成する各機能部をはじめとして、電子機器１０の全体を制御及び／又は管理する。コントローラ１５は、種々の機能を実行するための制御及び処理能力を提供するために、例えばＣＰＵ（Central Processing Unit）のような、少なくとも１つのプロセッサを含んでよい。コントローラ１５は、まとめて１つのプロセッサで実現してもよいし、いくつかのプロセッサで実現してもよいし、それぞれ個別のプロセッサで実現してもよい。プロセッサは、単一の集積回路として実現されてよい。集積回路は、ＩＣ（Integrated Circuit）ともいう。プロセッサは、複数の通信可能に接続された集積回路及びディスクリート回路として実現されてよい。プロセッサは、他の種々の既知の技術に基づいて実現されてよい。 The controller 15 controls and / or manages the entire electronic device 10 including each functional unit constituting the electronic device 10. The controller 15 may include at least one processor, such as a CPU (Central Processing Unit), to provide control and processing power to perform various functions. The controller 15 may be realized collectively by one processor, by several processors, or by individual processors. The processor may be realized as a single integrated circuit. The integrated circuit is also referred to as an IC (Integrated Circuit). The processor may be realized as a plurality of communicably connected integrated circuits and discrete circuits. The processor may be implemented on the basis of various other known techniques.

一実施形態において、コントローラ１５は、例えばＣＰＵ及び当該ＣＰＵで実行されるプログラムとして構成されてよい。コントローラ１５において実行されるプログラム、及び、コントローラ１５において実行された処理の結果などは、例えば記憶部１３に記憶されてよい。コントローラ１５は、コントローラ１５の動作に必要なメモリを適宜含んでもよい。一実施形態に係る電子機器１０のコントローラ１５の動作については、さらに後述する。 In one embodiment, the controller 15 may be configured as, for example, a CPU and a program executed by the CPU. The program executed by the controller 15, the result of the process executed by the controller 15, and the like may be stored in, for example, the storage unit 13. The controller 15 may appropriately include a memory necessary for the operation of the controller 15. The operation of the controller 15 of the electronic device 10 according to the embodiment will be further described later.

The warning unit 17 may issue a predetermined warning that calls the attention of, for example, the user of the system 1 or the electronic device 10, based on a predetermined warning signal output from the controller 15. The warning unit 17 may be any functional unit that issues the predetermined warning by stimulating at least one of the user's senses of hearing, sight, and touch, for example by sound, voice, light, characters, video, or vibration. Specifically, the warning unit 17 may be at least one of, for example, an audio output unit such as a buzzer or a speaker, a light emitting unit such as an LED, a display unit such as an LCD, and a tactile presentation unit such as a vibrator. In this way, the warning unit 17 may issue a predetermined warning based on a predetermined warning signal output from the controller 15. In one embodiment, the warning unit 17 may issue the predetermined warning as information that acts on at least one of hearing, sight, and touch.

In one embodiment, the warning unit 17 may issue, for example before the monitored target T starts a violent action against another person W, a warning that there is a risk that the monitored target T will start to act violently against the other person W. In one embodiment, the warning unit 17 may also issue, before the monitored target T ends a violent action against another person W, a warning that there is a risk that the monitored target T will continue to act violently against the other person W. For example, in one embodiment, when it is detected that there is a risk that the monitored target T will start or continue to act violently against another person W, the warning unit 17 that outputs visual information may warn the user to that effect by light emission, a predetermined display, or the like. Likewise, in one embodiment, when such a risk is detected, the warning unit 17 that outputs auditory information may warn the user to that effect by a predetermined sound, voice, or the like. In the present embodiment, the warning may combine light emission or a predetermined display with a predetermined sound or voice.
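The two kinds of warning described above (risk that violence will start, risk that it will continue) and their combination across visual and auditory outputs can be sketched as below. The function names, the message strings, and the representation of each output as a callable are assumptions made for this illustration and do not appear in the disclosure.

```python
def choose_warning(risk_detected, violence_in_progress):
    """Pick a warning message, or None when no risk has been detected."""
    if not risk_detected:
        return None
    if violence_in_progress:
        # Issued before the violent action ends.
        return "Risk that violence will continue"
    # Issued before the violent action starts.
    return "Risk that violence will start"


def issue_warning(message, outputs):
    # Each output is a hypothetical callable standing in for an LED,
    # a display, a speaker, and so on; several may be combined.
    for output in outputs:
        output(message)


log = []
message = choose_warning(risk_detected=True, violence_in_progress=False)
if message is not None:
    issue_warning(
        message,
        [
            lambda m: log.append(("display", m)),
            lambda m: log.append(("speaker", m)),
        ],
    )
```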

The electronic device 10 shown in FIG. 1 has a built-in warning unit 17. However, in the system 1 according to one embodiment, the warning unit 17 may be provided outside the electronic device 10. In this case, the warning unit 17 and the electronic device 10 may be connected by wire, wirelessly, or by a combination of wired and wireless connections.

The communication unit 19 has the function of an interface for wired or wireless communication. The communication method used by the communication unit 19 of one embodiment may be a wireless communication standard. For example, wireless communication standards include cellular phone communication standards such as 2G, 3G, 4G, and 5G. Cellular phone communication standards include, for example, LTE (Long Term Evolution), W-CDMA (Wideband Code Division Multiple Access), CDMA2000, PDC (Personal Digital Cellular), GSM (registered trademark) (Global System for Mobile communications), and PHS (Personal Handy-phone System). Wireless communication standards also include, for example, WiMAX (Worldwide Interoperability for Microwave Access), IEEE 802.11, WiFi, Bluetooth (registered trademark), IrDA (Infrared Data Association), and NFC (Near Field Communication). The communication unit 19 can support one or more of the above communication standards. The communication unit 19 may be configured to include, for example, an antenna for transmitting and receiving radio waves and an appropriate RF unit. The communication unit 19 may also be configured as an interface such as a connector for a wired connection to the outside. Since the communication unit 19 can be configured by known techniques for wireless communication, a more detailed description of its hardware and the like is omitted.

通信部１９が受信する各種の情報は、例えば記憶部１３及び／又はコントローラ１５に供給されてよい。通信部１９が受信する各種の情報は、例えば記憶部１３及び／又はコントローラ１５に内蔵されたメモリに記憶してもよい。また、通信部１９は、例えばコントローラ１５による処理結果、抽出部１１による抽出結果、及び／又は、記憶部１３に記憶された情報などを外部に送信してもよい。 Various types of information received by the communication unit 19 may be supplied to, for example, the storage unit 13 and / or the controller 15. Various information received by the communication unit 19 may be stored in, for example, a memory built in the storage unit 13 and / or the controller 15. Further, the communication unit 19 may transmit, for example, the processing result by the controller 15, the extraction result by the extraction unit 11, and / or the information stored in the storage unit 13 to the outside.

図１に示すような、一実施形態に係る電子機器１０を構成する各機能部の少なくとも一部は、ソフトウェアとハードウェア資源とが協働した具体的手段によって構成されてもよい。 As shown in FIG. 1, at least a part of each functional unit constituting the electronic device 10 according to the embodiment may be configured by a specific means in which software and hardware resources cooperate.

次に、一実施形態に係るシステム１の動作について説明する。 Next, the operation of the system 1 according to the embodiment will be described.

The operation of the system 1 according to one embodiment can typically be divided into a "learning phase" and an "estimation phase". In the learning phase, the system may machine-learn the relationship between (i) the positions (coordinates) of the parts of the body of a human such as the monitored target T in an action of violence against another person, together with the positions (coordinates) of the parts of the body of the other person W or the position (coordinates) of an object related to the other person W, and (ii) the timing of the violent action. In the estimation phase, based on the result of the machine learning in the learning phase, the system may estimate the start of a violent action from the positions (coordinates) of the parts of the body of the monitored target T in an action of violence against another person W, together with the positions (coordinates) of the parts of the body of the other person W or the position (coordinates) of an object related to the other person W. The operation in each of the learning phase and the estimation phase is described in more detail below. First, the operation in the learning phase is described. In the present disclosure, there may be, for example, a plurality of the systems 1 shown in FIG. 1, and the "learning phase" and the "estimation phase" may be carried out by different systems.

図２は、一実施形態に係るシステム１の学習フェーズにおける動作の一例を示すフローチャートである。図２は、一実施形態に係るシステム１に含まれる電子機器１０の学習フェーズにおける動作に焦点を当てたフローチャートとしてもよい。 FIG. 2 is a flowchart showing an example of the operation in the learning phase of the system 1 according to the embodiment. FIG. 2 may be a flowchart focusing on the operation of the electronic device 10 included in the system 1 according to the embodiment in the learning phase.

For example, a person suspected of developing dementia (for example, a person requiring nursing or care) may act violently against another person W such as a nurse or a caregiver. In such a case, the other person W may be injured. The other person W may also feel uncomfortable or in danger. That is, there is a risk that the safety of the other person W is threatened. Therefore, warning others that violence is about to be used against the other person W, or that violence has been used against the other person W, can contribute not only to the safety of the other person W around the monitored target T but also to the safety of the environment surrounding the monitored target T. In the learning phase of the system 1 according to one embodiment, as described above, the relationship between (i) the positions (coordinates) of the parts of the body of the monitored target T in an action in which a human such as the monitored target T acts violently against another person W, together with the positions (coordinates) of the parts of the body of the other person W or the position (coordinates) of an object related to the other person W, and (ii) the timing of the violent action may be machine-learned. This operation is described in more detail below.

At the time when the operation shown in FIG. 2 starts, the imaging unit 20 of the system 1 may already have started imaging a human such as the monitored target T. The time at which the operation shown in FIG. 2 starts may be the time at which the imaging unit 20 starts imaging a human such as the monitored target T. Alternatively, the time at which the operation shown in FIG. 2 starts may be the time at which, after the imaging unit 20 has started imaging, a human such as the monitored target T enters the imaging range of the imaging unit 20.

図２に示す動作が開始すると、電子機器１０のコントローラ１５は、撮像部２０によって撮像された画像を取得する（ステップＳ１１）。 When the operation shown in FIG. 2 starts, the controller 15 of the electronic device 10 acquires the image captured by the image pickup unit 20 (step S11).

図３は、図２に示したステップＳ１１においてコントローラ１５が取得した画像、すなわち撮像部２０によって撮像された画像の例を示す図である。 FIG. 3 is a diagram showing an example of an image acquired by the controller 15 in step S11 shown in FIG. 2, that is, an image captured by the imaging unit 20.

図３に示すように、撮像部２０は、例えば被監視対象Ｔのような人間が他の人物Ｗに近づいている状態を撮像してよい。図３は、被監視対象Ｔのような人間が他の人物Ｗに近づいている様子を模式的に示している。この場合、ステップＳ１１において、コントローラ１５は、被監視対象Ｔのような人間が他の人物Ｗに近づいている画像を取得する。後述のように、撮像部２０は、被監視対象Ｔのような人間が他の人物Ｗに近づいている状態以外の状態の画像を撮像してもよい。この場合、ステップＳ１１において、コントローラ１５は、被監視対象Ｔのような人間が他の人物Ｗに近づいている状態以外の状態の画像を取得してもよい。 As shown in FIG. 3, the image pickup unit 20 may take an image of a state in which a person such as a monitored target T is approaching another person W. FIG. 3 schematically shows how a person such as the monitored target T is approaching another person W. In this case, in step S11, the controller 15 acquires an image in which a person such as the monitored target T is approaching another person W. As will be described later, the image pickup unit 20 may capture an image in a state other than the state in which a person such as the monitored target T is approaching another person W. In this case, in step S11, the controller 15 may acquire an image of a state other than the state in which a person such as the monitored target T is approaching another person W.

図４は、図２に示したステップＳ１１においてコントローラ１５が取得した画像、すなわち撮像部２０によって撮像された画像の他の例を示す図である。図４に示すように、撮像部２０は、例えば被監視対象Ｔのような人間が他の人物Ｗを叩いている状態を撮像してよい。図４は、被監視対象Ｔが他の人物Ｗを叩いている様子を模式的に示している。図４において、被監視対象Ｔ及び／又は他の人物Ｗは、移動していてもよいし、静止していてもよい。この場合、ステップＳ１１において、コントローラ１５は、被監視対象Ｔのような人間及び他の人物Ｗを含む画像を取得する。後述のように、撮像部２０は、被監視対象Ｔのような人間が他の人物Ｗを叩いている状態以外の状態の画像を撮像してもよい。この場合、ステップＳ１１において、コントローラ１５は、被監視対象Ｔのような人間が他の人物Ｗを叩いている状態以外の状態の画像を取得してもよい。 FIG. 4 is a diagram showing another example of the image acquired by the controller 15 in step S11 shown in FIG. 2, that is, the image captured by the imaging unit 20. As shown in FIG. 4, the image pickup unit 20 may take an image of a state in which a person such as a monitored target T is hitting another person W. FIG. 4 schematically shows how the monitored target T is hitting another person W. In FIG. 4, the monitored target T and / or another person W may be moving or stationary. In this case, in step S11, the controller 15 acquires an image including a human being such as the monitored target T and another person W. As will be described later, the image pickup unit 20 may capture an image in a state other than the state in which a person such as the monitored target T is hitting another person W. In this case, in step S11, the controller 15 may acquire an image of a state other than the state in which a person such as the monitored target T is hitting another person W.

撮像部２０は、秒間所定数のフレームの各画像を撮像するものとしてよい。ここで、撮像部２０が撮像する画像は、連続するフレームの静止画としてもよいし、動画としてもよい。例えば、撮像部２０は、秒間１５フレームの画像を撮像するものとしてよい。ステップＳ１１において、コントローラ１５は、撮像部２０によって撮像された秒間所定数のフレームの画像を取得してよい。 The image pickup unit 20 may capture each image of a predetermined number of frames per second. Here, the image captured by the image pickup unit 20 may be a still image of continuous frames or a moving image. For example, the image pickup unit 20 may capture an image of 15 frames per second. In step S11, the controller 15 may acquire images of a predetermined number of frames per second captured by the image pickup unit 20.

図２に示すように、ステップＳ１１において撮像された画像を取得すると、抽出部１１は、被監視対象Ｔの身体における所定部位の座標を抽出する（ステップＳ１２）。ステップＳ１２における動作は、抽出部１１ではなく、コントローラ１５が行ってもよい。抽出部１１は、被監視対象Ｔの身体における所定部位の座標だけでなく、他の人物Ｗの身体における所定部位の座標をさらに抽出する。抽出部１１は、他の人物Ｗに関連する物体の座標をさらに抽出してもよい。 As shown in FIG. 2, when the image captured in step S11 is acquired, the extraction unit 11 extracts the coordinates of a predetermined portion of the body of the monitored target T (step S12). The operation in step S12 may be performed by the controller 15 instead of the extraction unit 11. The extraction unit 11 further extracts not only the coordinates of the predetermined portion of the body of the monitored target T but also the coordinates of the predetermined portion of the body of the other person W. The extraction unit 11 may further extract the coordinates of the object related to the other person W.

図５は、ステップＳ１２において抽出される被監視対象Ｔの身体における所定部位の例を示す図である。 FIG. 5 is a diagram showing an example of a predetermined portion of the body of the monitored target T extracted in step S12.

In step S12, the extraction unit 11 may extract the coordinates of predetermined parts of the body of the monitored target T, for example as shown in FIG. 5. As shown in FIG. 5, the predetermined parts whose coordinates are extracted in step S12 may include, for example, the neck, left shoulder, left elbow, left wrist, right shoulder, right elbow, and right wrist of the body of the monitored target T. As also shown in FIG. 5, the predetermined parts whose coordinates are extracted in step S12 may further include, for example, the left hip, left knee, left ankle, right hip, right knee, and right ankle of the body of the monitored target T. In this way, the coordinates of the predetermined parts extracted in step S12 may be, for example, the coordinates of predetermined joint points of the body of the monitored target T.
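The thirteen body parts listed above can be held in a fixed order so that every frame yields coordinates in the same layout. The sketch below assumes a hypothetical pose detector that returns a dictionary from part name to an (x, y) pair; the identifier names and the dictionary format are illustrative assumptions and are not specified in the disclosure.

```python
# Body parts named in the description of FIG. 5, in a fixed order.
KEYPOINT_NAMES = (
    "neck",
    "left_shoulder", "left_elbow", "left_wrist",
    "right_shoulder", "right_elbow", "right_wrist",
    "left_hip", "left_knee", "left_ankle",
    "right_hip", "right_knee", "right_ankle",
)


def select_keypoints(detected):
    """Return the (x, y) pairs in KEYPOINT_NAMES order.

    `detected` is assumed to map part names to (x, y); parts that the
    detector did not find are returned as None.
    """
    return [detected.get(name) for name in KEYPOINT_NAMES]


# Hypothetical detector output containing an extra part and missing others.
detected = {"neck": (120, 300), "left_wrist": (90, 210), "nose": (121, 330)}
points = select_keypoints(detected)
```

Keeping the order fixed means that the same index always refers to the same joint across frames, which the later per-frame table relies on.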

図６は、図４に示した画像において、被監視対象Ｔの身体における所定部位として抽出される座標の例を示す図である。図６に示す被監視対象Ｔの画像は、図４に示した被監視対象Ｔの画像と同じものを示している。図６は、図４に示した被監視対象Ｔの画像において、図５に示した被監視対象Ｔの身体における所定部位として抽出される座標を示している。 FIG. 6 is a diagram showing an example of coordinates extracted as a predetermined part of the body of the monitored target T in the image shown in FIG. The image of the monitored object T shown in FIG. 6 is the same as the image of the monitored object T shown in FIG. FIG. 6 shows the coordinates extracted as a predetermined part of the body of the monitored object T shown in FIG. 5 in the image of the monitored object T shown in FIG.

In step S12, the extraction unit 11 extracts the coordinates of the plurality of dots shown in FIG. 6 as the predetermined parts of the body of the monitored target T shown in FIG. 5. For example, the extraction unit 11 may extract the coordinates of the plurality of dots shown in FIG. 6 two-dimensionally, according to the coordinate axes shown in FIG. 6. That is, the extraction unit 11 may treat the lower-left corner of the imaging range of the image captured by the imaging unit 20 as the origin of the coordinate axes shown in FIG. 6. For example, the extraction unit 11 acquires the coordinates of the position of the neck of the monitored target T shown in FIG. 6 according to the coordinate axes shown in FIG. 6.
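Many image libraries report pixel positions with the origin at the top-left corner and the Y axis pointing down, whereas the coordinate axes described here place the origin at the lower-left corner. The following sketch shows one way to convert between the two. It assumes integer pixel rows and a known image height; the function name and the top-left input convention are assumptions of this example and are not stated in the disclosure.

```python
def to_lower_left_origin(x, y_from_top, image_height):
    """Convert a top-left-origin pixel position to a lower-left origin.

    X is unchanged; Y is flipped so that the bottom pixel row becomes 0.
    """
    return (x, image_height - 1 - y_from_top)


# A point on the top row of a 480-pixel-high image.
top_point = to_lower_left_origin(10, 0, 480)
# A point on the bottom row of the same image.
bottom_point = to_lower_left_origin(10, 479, 480)
```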

Here, when the imaging unit 20 captures a predetermined number of frames per second, the extraction unit 11 may extract the predetermined parts of the body of the monitored target T in each of the predetermined number of frames per second. Likewise, when the controller 15 acquires images of a predetermined number of frames per second, the extraction unit 11 may extract the predetermined parts of the body of the monitored target T in each of those frames. As an example, the extraction unit 11 may extract the predetermined parts of the body of the monitored target T at 15 frames per second.

FIG. 7 is a diagram collectively showing the coordinates of the predetermined parts extracted from the body of the monitored target T in, for example, the 15 frames of one second. As shown in FIG. 7, in step S12, the controller 15 (or the extraction unit 11) may arrange the coordinates of the predetermined parts extracted from the body of the monitored target T frame by frame. As shown in FIG. 7, the extraction unit 11 may extract, for each frame, the coordinates of the predetermined parts of the body of the monitored target T two-dimensionally (as X and Y coordinates). In the table shown in FIG. 7, each row schematically shows the predetermined parts of the body of the monitored target T extracted as X and Y coordinates in one frame. The rows of the table are ordered from top to bottom as time passes. The coordinates of the 15 frames shown in FIG. 7 may be obtained, for example, by tracking the coordinates over one second of an image (or moving image) such as that shown in FIG. 6. After the 15 frames shown in FIG. 7, the coordinates of the predetermined parts of the body of the monitored target T may continue to be extracted frame by frame.
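A table of the kind described for FIG. 7 (one row per frame, with X and Y for each part, grouped into one-second blocks of 15 frames) might be built as in the sketch below. The function names and the choice of non-overlapping one-second windows are assumptions for this illustration; the disclosure only states that extraction continues after the first 15 frames.

```python
def frame_to_row(points):
    """Flatten one frame's [(x, y), ...] into [x0, y0, x1, y1, ...]."""
    row = []
    for x, y in points:
        row.extend((x, y))
    return row


def one_second_windows(frames, frames_per_second=15):
    """Group per-frame rows, oldest first, into consecutive full windows."""
    rows = [frame_to_row(points) for points in frames]
    last_start = len(rows) - frames_per_second
    return [
        rows[start:start + frames_per_second]
        for start in range(0, last_start + 1, frames_per_second)
    ]


# 31 synthetic frames with 13 parts each; the trailing frame is left over.
frames = [[(f, f + 1)] * 13 for f in range(31)]
windows = one_second_windows(frames)
```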

In this way, in one embodiment, the extraction unit 11 may two-dimensionally extract the coordinates of a predetermined number of joint points of the body of the monitored target T from the image of each of the predetermined number of frames per second captured by the imaging unit 20.

The coordinates of predetermined parts of the body of the other person W may be extracted in the same manner as the coordinates of the predetermined parts of the body of the monitored target T. The coordinates of an object related to the other person W may also be extracted. Furthermore, coordinates may be extracted not only for the form of violence shown in FIG. 6, in which the monitored target T is hitting the other person W, but also for other forms of violence.

FIG. 8 is a diagram showing another example of the monitored target T and the other person W recognized in step S12. FIG. 8 schematically shows the monitored target T biting the other person W, with the feature points obtained as a result of the extraction unit 11 recognizing the monitored target T and the other person W superimposed on it.

FIG. 9 is a diagram showing another example of the monitored target T and the other person W recognized in step S12, together with a wheelchair C in which the other person W is sitting. FIG. 9 schematically shows the monitored target T strongly pushing the wheelchair C in which the other person W is sitting, with the feature points obtained as a result of the extraction unit 11 recognizing the monitored target T, the other person W, and the wheelchair C superimposed on it.

In step S12, the extraction unit 11 may recognize predetermined feature points of the monitored target T and the other person W not only as shown in FIG. 6 but also as shown in FIGS. 8 and 9. The extraction unit 11 may two-dimensionally extract the monitored target T and the other person W shown in FIGS. 8 and 9 according to the coordinate axes shown in FIGS. 8 and 9. When recognizing the monitored target T and the other person W, the extraction unit 11 may extract their positions (for example, coordinates) in the image. The extraction unit 11 may also extract the wheelchair C as an object related to the other person W, and extract the coordinates of the wheelchair C or the coordinates of feature points of the wheelchair C.

Once the coordinates of the predetermined parts have been extracted in step S12, the extraction unit 11 normalizes the coordinates according to the maximum and minimum values of each of the X and Y coordinates in the extracted predetermined number of frames (for example, the 15 frames of one second) (step S13).

The coordinates of the predetermined parts extracted in step S12 are expected to vary depending on, for example, the body size of the monitored target T. They are also expected to vary depending on, for example, the distance between the imaging unit 20 and the monitored target T and the direction from the imaging unit 20 toward the monitored target T. Therefore, in one embodiment, the X-direction component and the Y-direction component of the coordinates extracted in step S12 are each normalized so that the extracted coordinates can be used for machine learning in a general-purpose manner.

この場合、例えば１秒間の１５フレームにおいて抽出されたＸ，Ｙ座標のそれぞれの最大値及び最小値に基づいて、抽出されるＸ，Ｙ座標を正規化してもよい。ここで、ステップＳ１２において抽出されたＸ座標の最大値をＸｍａｘとし、ステップＳ１２において抽出されたＸ座標の最小値をＸｍｉｎとする。また、正規化後のＸ座標の最大値をＸ’ｍａｘとする。この場合、以下の式（１）を用いて、正規化前のＸ座標（Ｘ）を、正規化後のＸ座標（Ｘ’）に変換することができる。 In this case, for example, the extracted X and Y coordinates may be normalized based on the maximum and minimum values of the extracted X and Y coordinates in 15 frames per second. Here, the maximum value of the X coordinate extracted in step S12 is Xmax, and the minimum value of the X coordinate extracted in step S12 is Xmin. Further, the maximum value of the X coordinate after normalization is set to X'max. In this case, the X coordinate (X) before normalization can be converted into the X coordinate (X') after normalization by using the following equation (1).

Ｘ’＝（（Ｘ－Ｘｍｉｎ）／（Ｘｍａｘ－Ｘｍｉｎ））・Ｘ’ｍａｘ（１） X'= ((X-Xmin) / (Xmax-Xmin)) · X'max (1)

同様に、ステップＳ１２において抽出されたＹ座標の最大値をＹｍａｘとし、ステップＳ１２において抽出されたＹ座標の最小値をＹｍｉｎとする。また、正規化後のＹ座標の最大値をＹ’ｍａｘとする。この場合、以下の式（２）を用いて、正規化前のＹ座標（Ｙ）を、正規化後のＹ座標（Ｙ’）に変換することができる。 Similarly, the maximum value of the Y coordinate extracted in step S12 is Ymax, and the minimum value of the Y coordinate extracted in step S12 is Ymin. Further, the maximum value of the Y coordinate after normalization is set to Y'max. In this case, the Y coordinate (Y) before normalization can be converted into the Y coordinate (Y') after normalization by using the following equation (2).

Ｙ’＝（（Ｙ－Ｙｍｉｎ）／（Ｙｍａｘ－Ｙｍｉｎ））・Ｙ’ｍａｘ（２） Y'= ((Y-Ymin) / (Ymax-Ymin)) · Y'max (2)

上記の式（１）及び式（２）に従って、抽出された座標のＸ方向成分及びＹ方向成分を正規化することにより、被監視対象Ｔの個体差、及び撮像部２０が被監視対象Ｔを撮像した環境などが機械学習に与える影響を低減することが期待できる。 Normalizing the X-direction component and the Y-direction component of the extracted coordinates according to the above equations (1) and (2) can be expected to reduce the influence on machine learning of individual differences among monitored targets T and of the environment in which the image pickup unit 20 imaged the monitored target T.
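The normalization of equations (1) and (2) can be illustrated with a minimal Python sketch. The function name `normalize_window`, the default output maxima of 1.0, and the handling of a degenerate range (Xmax equal to Xmin) are assumptions made for illustration; the source does not specify them.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_window(points: List[Point], x_out_max: float = 1.0,
                     y_out_max: float = 1.0) -> List[Point]:
    """Apply equations (1) and (2) to every (X, Y) in one window of frames."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def scale(v: float, v_min: float, v_max: float, out_max: float) -> float:
        # Assumption (not in the source): a degenerate range maps to 0.0
        # so that the division in equations (1) and (2) is never by zero.
        if v_max == v_min:
            return 0.0
        return (v - v_min) / (v_max - v_min) * out_max

    return [(scale(x, x_min, x_max, x_out_max),
             scale(y, y_min, y_max, y_out_max)) for x, y in points]


# Three coordinates pooled from one window (e.g. the 15 frames in one second).
print(normalize_window([(10, 20), (30, 60), (20, 40)]))
# -> [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Here `x_out_max` and `y_out_max` play the roles of X'max and Y'max, and the minimum and maximum are taken over all coordinates in the window, as the text describes.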

このように、一実施形態において、抽出部１１は、２次元的に抽出された被監視対象Ｔの身体における所定数の関節点の座標の各方向成分を、当該各方向成分の最大値及び最小値に基づいて正規化してもよい。また、このような動作は、抽出部１１ではなくコントローラ１５が行ってもよい。 As described above, in one embodiment, the extraction unit 11 may normalize each direction component of the two-dimensionally extracted coordinates of a predetermined number of joint points in the body of the monitored target T based on the maximum value and the minimum value of that direction component. Such an operation may be performed by the controller 15 instead of the extraction unit 11.

ステップＳ１３に示す座標の正規化により、図７に示す各座標（Ｘ，Ｙ）は、それぞれ座標（Ｘ’，Ｙ’）に正規化される。 By normalizing the coordinates shown in step S13, each of the coordinates (X, Y) shown in FIG. 7 is normalized to the coordinates (X', Y'), respectively.

他の人物Ｗの身体における所定部位の座標は、被監視対象Ｔの身体における所定部位の座標と同様に正規化されてもよい。他の人物Ｗに関連する物体の座標がさらに正規化されてもよい。 The coordinates of the predetermined part in the body of the other person W may be normalized in the same manner as the coordinates of the predetermined part in the body of the monitored object T. The coordinates of the object associated with the other person W may be further normalized.

ステップＳ１３において座標が正規化されたら、コントローラ１５は、タイミング情報を取得する（ステップＳ１４）。ステップＳ１４においてコントローラ１５がタイミング情報を取得するために、ステップＳ１１において取得された画像、又は、ステップＳ１２において抽出された座標において、予めタイミング情報が付与（設定）されている必要がある。また、ステップＳ１２において抽出された座標ではなく、ステップＳ１３において正規化された座標において、予めタイミング情報が付与（設定）されていてもよい。以下、このようなタイミング情報について、さらに説明する。 When the coordinates are normalized in step S13, the controller 15 acquires the timing information (step S14). In order for the controller 15 to acquire the timing information in step S14, the timing information needs to be added (set) in advance to the image acquired in step S11 or the coordinates extracted in step S12. Further, timing information may be given (set) in advance at the coordinates normalized in step S13 instead of the coordinates extracted in step S12. Hereinafter, such timing information will be further described.

一実施形態において、電子機器１０が機械学習するためのデータとして、例えば被監視対象Ｔのような人間が他の人物Ｗに対して暴力を振るう動作の開始時点及び終了時点を示す情報を用意する必要がある。被監視対象Ｔが他の人物Ｗに対して暴力を振るう動作の開始時点及び終了時点を示す情報（タイミング情報）があれば、電子機器１０は、このタイミング情報を例えば教師データとして機械学習を行うことができる。以上のように、被監視対象Ｔが他の人物Ｗに対して暴力を振るう動作の開始時点及び終了時点を示す情報を、「タイミング情報」とも記す。すなわち、「タイミング情報」とは、撮像部２０によって撮像された経時的な画像において暴力動作の開始時点及び終了時点を示す情報としてよい。 In one embodiment, information indicating the start time point and the end time point of an action in which a human, such as the monitored target T, uses violence against another person W needs to be prepared as data for machine learning by the electronic device 10. If such information (timing information) is available, the electronic device 10 can perform machine learning using this timing information as, for example, teacher data. As described above, the information indicating the start time point and the end time point of the action in which the monitored target T uses violence against the other person W is also referred to as "timing information". That is, the "timing information" may be information indicating the start time point and the end time point of the violent movement in the time-dependent image captured by the image pickup unit 20.

このようなタイミング情報は、例えばスタッフなどの人員によって付与（設定）されてよい。すなわち、例えば行動学の専門家又は介護施設の職員などが、撮像部２０によって撮像された被監視対象Ｔの画像を観察（視認）しながら、暴力動作の開始時点及び終了時点を示すタイミング情報を付与してよい。また、このようなタイミング情報を付与するための所定の基準を予め設けることにより、行動学の専門家又は介護施設の職員などではない一般的な人員であっても、タイミング情報を付与することができる。上述のように、タイミング情報は、ステップＳ１１において取得された画像データにおいて付与（設定）されてよい。また、タイミング情報は、ステップＳ１２において抽出された座標データにおいて付与（設定）されてよい。また、タイミング情報は、ステップＳ１３において正規化された座標データにおいて付与（設定）されてよい。 Such timing information may be given (set) by personnel such as staff. That is, for example, an ethology expert or a staff member of a nursing care facility may give timing information indicating the start time point and the end time point of the violent movement while observing (viewing) the image of the monitored target T captured by the image pickup unit 20. In addition, by establishing predetermined criteria for giving such timing information in advance, even general personnel who are not ethology experts or nursing care facility staff can give the timing information. As described above, the timing information may be given (set) in the image data acquired in step S11. The timing information may also be given (set) in the coordinate data extracted in step S12, or in the coordinate data normalized in step S13.

図１０は、タイミング情報の設定について説明する図である。図１０においては、画像データが各フレームごとに時間の経過とともに連続している様子を概念的に示している。すなわち、図１０において、画像データは、時間の経過に従って、フレーム１、フレーム２、フレーム３、…のように連続していることを示している。また、図１０においては、画像データにタイミング情報を付与（設定）する例を示している。しかしながら、上述のように、図１０に示す画像データは、座標データに代えてもよいし、正規化された座標データに代えてもよい。 FIG. 10 is a diagram illustrating the setting of timing information. FIG. 10 conceptually shows how the image data is continuous with the passage of time for each frame. That is, in FIG. 10, it is shown that the image data is continuous like frame 1, frame 2, frame 3, ... With the passage of time. Further, FIG. 10 shows an example of adding (setting) timing information to the image data. However, as described above, the image data shown in FIG. 10 may be replaced with the coordinate data or may be replaced with the normalized coordinate data.

図１０に示す「入室」の時点において、例えば撮像部２０が設置された部屋に被監視対象Ｔが入室した様子が、撮像部２０によって撮像された画像のデータ（画像データ）に示されていたとする。図１０に示す「暴力動作の開始」の時点において、被監視対象Ｔが他の人物Ｗに対して暴力を振るう動作を開始した様子が、画像データに示されていたとする。図１０に示す「暴力動作の終了」の時点において、被監視対象Ｔが他の人物Ｗに対して暴力を振るう動作を終了した様子が、画像データに示されていたとする。図１０に示す「退室」の時点において、被監視対象Ｔが、例えば撮像部２０が設置された部屋から退室した様子が、画像データに示されていたとする。 Suppose that, at the time point of "entry" shown in FIG. 10, the data of the image captured by the image pickup unit 20 (image data) shows the monitored target T entering, for example, the room in which the image pickup unit 20 is installed. Suppose that, at the time point of "start of violent movement" shown in FIG. 10, the image data shows the monitored target T starting an action of using violence against another person W. Suppose that, at the time point of "end of violent movement" shown in FIG. 10, the image data shows the monitored target T finishing the action of using violence against the other person W. Suppose that, at the time point of "leaving the room" shown in FIG. 10, the image data shows the monitored target T leaving, for example, the room in which the image pickup unit 20 is installed.

以上のように撮像された画像データにおいて、例えば行動学の専門家又は介護施設の職員その他の一般的な人員などによって、少なくとも「暴力動作の開始」の時点及び「暴力動作の終了」の時点を示すタイミング情報が付与（設定）されてよい。 In the image data captured as described above, timing information indicating at least the time point of "start of violent movement" and the time point of "end of violent movement" may be given (set) by, for example, an ethology expert, a staff member of a nursing care facility, or other general personnel.

ここで、「暴力動作の開始」の時点とは、撮像部２０が設置された部屋に入室してきた被監視対象Ｔが、例えば図３に示すような他の人物Ｗに近づく動きを開始した時点としてもよい。 Here, the time point of "start of violent movement" may be the time point at which the monitored target T, having entered the room in which the image pickup unit 20 is installed, starts to move toward another person W, for example as shown in FIG. 3.

一般的に、図６に示すように被監視対象Ｔが他の人物Ｗを叩く態様で暴力動作を開始する場合、被監視対象Ｔの手首を表す特徴点Ｔａが他の人物Ｗの少なくとも一部の特徴点に対して近づき始める。したがって、一実施形態において、スタッフなどの人員は、被監視対象Ｔの手首を表す特徴点Ｔａが他の人物Ｗの少なくとも一部の特徴点に対して近づき始めた時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。例えば、スタッフなどの人員は、図６において、被監視対象Ｔの手首の特徴点Ｔａと他の人物Ｗの肘を表す特徴点Ｗａとの距離Ｄ０が所定の閾値未満に変化する時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。また、例えば、スタッフなどの人員は、図６において、被監視対象Ｔが広がる範囲と、他の人物Ｗが広がる範囲とが重ならない状態から重なる状態に変化する時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。また、被監視対象Ｔが他の人物Ｗを叩く態様で暴力動作を開始する場合、被監視対象Ｔが腕を振り上げることもある。スタッフなどの人員は、被監視対象Ｔの手首を表す特徴点Ｔａが肘を表す特徴点よりも高い位置に上がった時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してもよい。 Generally, when the monitored target T starts a violent movement by hitting another person W as shown in FIG. 6, the feature point Ta representing the wrist of the monitored target T begins to approach at least some of the feature points of the other person W. Therefore, in one embodiment, personnel such as staff may give (set) timing information with the time point at which the feature point Ta representing the wrist of the monitored target T begins to approach at least some of the feature points of the other person W as the time point of "start of violent movement". For example, in FIG. 6, personnel such as staff may give (set) timing information with the time point at which the distance D0 between the feature point Ta of the wrist of the monitored target T and the feature point Wa representing the elbow of the other person W falls below a predetermined threshold value as the time point of "start of violent movement". Also, for example, in FIG. 6, personnel such as staff may give (set) timing information with the time point at which the range over which the monitored target T extends and the range over which the other person W extends change from a non-overlapping state to an overlapping state as the time point of "start of violent movement". In addition, when the monitored target T starts a violent movement by hitting another person W, the monitored target T may swing an arm upward. Personnel such as staff may give (set) timing information with the time point at which the feature point Ta representing the wrist of the monitored target T rises above the feature point representing the elbow as the time point of "start of violent movement".
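The two coordinate-based criteria above (the distance D0 falling below a threshold, and the wrist rising above the elbow) can be sketched in Python as follows. This is only an illustration of how such criteria might be checked on extracted coordinates; the source describes them as criteria applied by personnel. The function names are hypothetical, and the code assumes image coordinates in which a smaller Y value means a higher position.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def first_frame_below_threshold(wrist_t: List[Point], elbow_w: List[Point],
                                threshold: float) -> Optional[int]:
    """Index of the first frame where D0 changes from >= threshold to < threshold."""
    prev_below = None
    for i, (ta, wa) in enumerate(zip(wrist_t, elbow_w)):
        below = math.dist(ta, wa) < threshold  # D0 between Ta and Wa
        if prev_below is False and below:
            return i
        prev_below = below
    return None


def first_frame_wrist_above_elbow(wrist_t: List[Point],
                                  elbow_t: List[Point]) -> Optional[int]:
    """Index of the first frame where the wrist of T is higher than the elbow of T."""
    for i, (wrist, elbow) in enumerate(zip(wrist_t, elbow_t)):
        # Assumption: the Y axis points downward, so "higher" means smaller Y.
        if wrist[1] < elbow[1]:
            return i
    return None


# D0 is 10, 7, 2 over three frames; it first drops below 5 at frame index 2.
print(first_frame_below_threshold([(0, 0), (3, 0), (8, 0)], [(10, 0)] * 3, 5.0))
# -> 2
```

A sequence that is already below the threshold in its first frame yields no transition, which matches the wording "changes to less than a predetermined threshold value".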

また、被監視対象Ｔが図８に示すように他の人物Ｗに噛みつく態様で暴力動作を開始する場合、被監視対象Ｔの首の上部又は頭部を表す特徴点Ｔａが他の人物Ｗの少なくとも一部の特徴点に対して近づき始める。したがって、一実施形態において、スタッフなどの人員は、被監視対象Ｔの首の上部又は頭部を表す特徴点Ｔａが他の人物Ｗの少なくとも一部の特徴点に対して近づき始めた時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。例えば、スタッフなどの人員は、図８において、被監視対象Ｔの首又は頭部を表す特徴点Ｔａと他の人物Ｗの手首を表す特徴点Ｗａとの距離Ｄ０が所定の閾値未満に変化する時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。また、例えば、スタッフなどの人員は、図８において、被監視対象Ｔが広がる範囲と、他の人物Ｗが広がる範囲とが重ならない状態から重なる状態に変化する時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。また、被監視対象Ｔが他の人物Ｗに噛みつく態様で暴力動作を開始する場合、被監視対象Ｔが上半身を不自然に屈めることもある。スタッフなどの人員は、被監視対象Ｔの上半身を表す特徴点が前屈を表す位置関係に移動し始めた時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してもよい。 Further, when the monitored target T starts a violent movement by biting another person W as shown in FIG. 8, the feature point Ta representing the upper part of the neck or the head of the monitored target T begins to approach at least some of the feature points of the other person W. Therefore, in one embodiment, personnel such as staff may give (set) timing information with the time point at which the feature point Ta representing the upper part of the neck or the head of the monitored target T begins to approach at least some of the feature points of the other person W as the time point of "start of violent movement". For example, in FIG. 8, personnel such as staff may give (set) timing information with the time point at which the distance D0 between the feature point Ta representing the neck or head of the monitored target T and the feature point Wa representing the wrist of the other person W falls below a predetermined threshold value as the time point of "start of violent movement". Also, for example, in FIG. 8, personnel such as staff may give (set) timing information with the time point at which the range over which the monitored target T extends and the range over which the other person W extends change from a non-overlapping state to an overlapping state as the time point of "start of violent movement". In addition, when the monitored target T starts a violent movement by biting another person W, the monitored target T may bend the upper body unnaturally. Personnel such as staff may give (set) timing information with the time point at which the feature points representing the upper body of the monitored target T begin to move into a positional relationship representing forward bending as the time point of "start of violent movement".

また、被監視対象Ｔが図９に示すように他の人物Ｗが座っている車椅子Ｃを押したり揺らしたりして衝撃を与える態様で暴力動作を開始する場合、被監視対象Ｔの手首を表す特徴点Ｔａが他の人物Ｗが座っている車椅子Ｃの少なくとも一部に対して近づき始める。したがって、一実施形態において、スタッフなどの人員は、被監視対象Ｔの手首を表す特徴点Ｔａが他の人物Ｗが座っている車椅子Ｃの少なくとも一部に接触した時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。例えば、スタッフなどの人員は、図９において、被監視対象Ｔの手首を表す特徴点Ｔａと他の人物Ｗが座っている車椅子ＣのハンドルＣａとが接触した時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。また、例えば、スタッフなどの人員は、図９において、被監視対象Ｔが広がる範囲と、他の人物Ｗが座っている車椅子Ｃが広がる範囲とが重ならない状態から重なる状態に変化する時点を「暴力動作の開始」の時点として、タイミング情報を付与（設定）してよい。 Further, when the monitored target T starts a violent movement by pushing or shaking the wheelchair C in which another person W is sitting so as to give an impact, as shown in FIG. 9, the feature point Ta representing the wrist of the monitored target T begins to approach at least part of the wheelchair C in which the other person W is sitting. Therefore, in one embodiment, personnel such as staff may give (set) timing information with the time point at which the feature point Ta representing the wrist of the monitored target T comes into contact with at least part of the wheelchair C in which the other person W is sitting as the time point of "start of violent movement". For example, in FIG. 9, personnel such as staff may give (set) timing information with the time point at which the feature point Ta representing the wrist of the monitored target T comes into contact with the handle Ca of the wheelchair C in which the other person W is sitting as the time point of "start of violent movement". Also, for example, in FIG. 9, personnel such as staff may give (set) timing information with the time point at which the range over which the monitored target T extends and the range over which the wheelchair C in which the other person W is sitting extends change from a non-overlapping state to an overlapping state as the time point of "start of violent movement".
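The criterion "the two ranges change from a non-overlapping state to an overlapping state" can be sketched as follows. The source does not define how the "range" of a person or wheelchair is computed; the sketch assumes, purely for illustration, that it is the axis-aligned bounding box of the extracted feature points. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(points: List[Point]) -> Box:
    """Axis-aligned box enclosing all feature points (assumed meaning of 'range')."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one point (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def first_overlap_frame(frames_t: List[List[Point]],
                        frames_other: List[List[Point]]) -> Optional[int]:
    """Index of the first frame where the ranges go from not overlapping to overlapping."""
    prev = None
    for i, (pts_t, pts_o) in enumerate(zip(frames_t, frames_other)):
        now = boxes_overlap(bounding_box(pts_t), bounding_box(pts_o))
        if prev is False and now:
            return i
        prev = now
    return None


# T moves right by 2 units per frame toward a stationary wheelchair C.
frames_t = [[(0, 0), (1, 2)], [(2, 0), (3, 2)], [(4, 0), (5, 2)]]
frames_c = [[(5, 0), (6, 2)]] * 3
print(first_overlap_frame(frames_t, frames_c))
# -> 2
```

Treating touching boxes as overlapping loosely mirrors the "contact" criterion for the wheelchair handle Ca, although real contact detection would need more than bounding boxes.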

このように、一実施形態において、タイミング情報における暴力動作の開始時点は、撮像部２０によって撮像された経時的な画像において被監視対象Ｔの各特徴点が他の人物Ｗの少なくとも一部の特徴点、又は、他の人物Ｗが座っている椅子若しくは車椅子の少なくとも一部に対して所定距離以内に近づいたタイミングを示してもよい。また、タイミング情報における暴力動作の開始時点は、撮像部２０によって撮像された経時的な画像において被監視対象Ｔの各特徴点の位置が通常の動作から外れたタイミングを示してもよい。 As described above, in one embodiment, the start time point of the violent movement in the timing information may indicate the timing at which, in the time-dependent image captured by the image pickup unit 20, each feature point of the monitored target T comes within a predetermined distance of at least some of the feature points of the other person W, or of at least part of the chair or wheelchair in which the other person W is sitting. The start time point of the violent movement in the timing information may also indicate the timing at which, in the time-dependent image captured by the image pickup unit 20, the position of each feature point of the monitored target T deviates from normal movement.

また、「暴力動作の終了」の時点とは、撮像部２０によって撮像された経時的な画像において被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子からある程度（例えば１ｍなど）離れたと判断し得るタイミングとしてよい。したがって、一実施形態において、スタッフなどの人員は、被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子から例えば１ｍなどの所定の距離だけ離れた時点を「暴力動作の終了」の時点として、タイミング情報を付与（設定）してよい。 Further, the time point of "end of violent movement" may be the timing at which, in the time-dependent image captured by the image pickup unit 20, the monitored target T can be judged to have moved a certain distance (for example, 1 m) away from the other person W or from the chair or wheelchair in which the other person W is sitting. Therefore, in one embodiment, personnel such as staff may give (set) timing information with the time point at which the monitored target T has moved a predetermined distance, such as 1 m, away from the other person W or from the chair or wheelchair in which the other person W is sitting as the time point of "end of violent movement".

このように、一実施形態において、タイミング情報における暴力動作の終了時点は、撮像部２０によって撮像された経時的な画像において被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子から所定の距離だけ離れたタイミングを示してもよい。 As described above, in one embodiment, the end time point of the violent movement in the timing information may indicate the timing at which, in the time-dependent image captured by the image pickup unit 20, the monitored target T has moved a predetermined distance away from the other person W or from the chair or wheelchair in which the other person W is sitting.

また、「暴力動作の終了」の時点とは、被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子からある程度（例えば１ｍなど）離れた状態で所定時間（例えば１分など）が経過したと判断し得るタイミングとしてよい。したがって、一実施形態において、スタッフなどの人員は、被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子から例えば１ｍなど所定の距離だけ離れた状態で所定時間経過した時点を「暴力動作の終了」の時点として、タイミング情報を付与（設定）してよい。 Further, the time point of "end of violent movement" may be the timing at which it can be judged that a predetermined time (for example, 1 minute) has elapsed with the monitored target T remaining a certain distance (for example, 1 m) away from the other person W or from the chair or wheelchair in which the other person W is sitting. Therefore, in one embodiment, personnel such as staff may give (set) timing information with the time point at which a predetermined time has elapsed with the monitored target T remaining a predetermined distance, such as 1 m, away from the other person W or from the chair or wheelchair in which the other person W is sitting as the time point of "end of violent movement".

このように、一実施形態において、タイミング情報における暴力動作の終了時点は、撮像部２０によって撮像された経時的な画像において被監視対象Ｔが他の人物Ｗ又は他の人物Ｗが座っている椅子若しくは車椅子から所定の距離だけ離れた状態で所定時間が経過したタイミングを示してもよい。 As described above, in one embodiment, the end time point of the violent movement in the timing information may indicate the timing at which, in the time-dependent image captured by the image pickup unit 20, a predetermined time has elapsed with the monitored target T remaining a predetermined distance away from the other person W or from the chair or wheelchair in which the other person W is sitting.
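Both end-point criteria (separation by a predetermined distance, and separation maintained for a predetermined time) can be expressed with one minimal Python sketch. It assumes that a per-frame distance between the monitored target T and the other person W (or the chair or wheelchair) is already available in meters; how pixel coordinates would be converted to meters is not described in the source. The function name `end_frame` and its parameters are hypothetical.

```python
from typing import List, Optional


def end_frame(distances: List[float], start_frame: int,
              min_distance: float, hold_frames: int) -> Optional[int]:
    """First frame at or after start_frame at which the distance has stayed
    >= min_distance for hold_frames consecutive frames.

    hold_frames=1 corresponds to the 'separated by a predetermined distance'
    criterion; a larger value adds the 'for a predetermined time' condition
    (e.g. 1 minute at 15 frames per second would be 900 frames).
    """
    run = 0  # length of the current run of "far enough" frames
    for i in range(start_frame, len(distances)):
        run = run + 1 if distances[i] >= min_distance else 0
        if run >= hold_frames:
            return i
    return None


distances_m = [2.0, 0.3, 0.2, 1.2, 0.8, 1.1, 1.3, 1.5]
print(end_frame(distances_m, start_frame=1, min_distance=1.0, hold_frames=1))
# -> 3
print(end_frame(distances_m, start_frame=1, min_distance=1.0, hold_frames=3))
# -> 7
```

With the hold condition, the brief separation at frame 3 is ignored because the target comes back within 1 m at frame 4, which is the kind of premature end point that the second criterion is meant to avoid.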

タイミング情報において、上述した「暴力動作の開始」の時点及び「暴力動作の終了」の時点は、必ずしもこの順序で付与（設定）しなくてもよい。すなわち、タイミング情報において、最初に「暴力動作の終了」の時点を設定してから、「暴力動作の開始」の時点を設定してもよい。 In the timing information, the time points of the above-mentioned "start of violent movement" and the time point of "end of violent movement" do not necessarily have to be given (set) in this order. That is, in the timing information, the time point of "end of violent action" may be set first, and then the time point of "start of violent action" may be set.

例えば、スタッフなどの人員は、撮像部２０によって撮像された被監視対象Ｔの画像を再生して観察（視認）している際、「暴力動作の開始」の時点を見極めるのが必ずしも容易でないことも想定される。このような場合、まず、スタッフなどの人員は、撮像部２０によって撮像された被監視対象Ｔの画像を再生して観察（視認）している際に、「暴力動作の終了」の時点をタイミング情報として設定してよい。次に、スタッフなどの人員は、「暴力動作の終了」の時点から、撮像部２０によって撮像された被監視対象Ｔの画像を時間的に逆に再生して観察（視認）してもよい。ここで、撮像部２０によって撮像された被監視対象Ｔの画像を時間的に逆に再生する際には、当該逆再生の速度をある程度低下させてもよい。このようにすれば、スタッフなどの人員は、タイミング情報において、「暴力動作の終了」の時点よりも前の時点である「暴力動作の開始」の時点を容易に設定することができる。例えば、スタッフなどの人員は、時間的に逆に再生させた画像データを視認しながら、被監視対象Ｔの左尻又は右尻が椅子の座面において被監視対象Ｔの前方に最初に移動する時点を見出して、当該時点をタイミング情報として設定してよい。 For example, it may not always be easy for personnel such as staff to determine the time point of "start of violent movement" while playing back and observing (viewing) the image of the monitored target T captured by the image pickup unit 20. In such a case, personnel such as staff may first set the time point of "end of violent movement" as timing information while playing back and observing (viewing) the image of the monitored target T captured by the image pickup unit 20. Next, starting from the time point of "end of violent movement", personnel such as staff may play back the image of the monitored target T captured by the image pickup unit 20 in reverse and observe (view) it. When the image is played back in reverse, the reverse playback speed may be reduced to some extent. In this way, personnel such as staff can easily set, in the timing information, the time point of "start of violent movement", which precedes the time point of "end of violent movement". For example, while viewing the image data played back in reverse, personnel such as staff may find the time point at which the left or right buttock of the monitored target T first moves toward the front of the monitored target T on the seat surface of the chair, and set that time point as timing information.

上述のようにして画像データ（又は座標データ若しくは正規化された座標データ）においてタイミング情報が付与（設定）されたら、コントローラ１５は、ステップＳ１４において当該タイミング情報を取得することができる。電子機器１０のコントローラ１５は、ステップＳ１４において、上述のようなタイミング情報を、例えば通信部１９を介して外部のネットワークなどから取得してもよい。また、コントローラ１５は、ステップＳ１４において、上述のようなタイミング情報を、例えばメモリカードなどのストレージを挿入可能な電子機器１０のスロットなどから取得してもよい。また、システム１が画像を表示可能なディスプレイを含む場合、当該ディスプレイに画像データなどを表示してもよい。この場合、スタッフなどの人員がディスプレイを視認しながら設定するタイミング情報は、例えば電子機器１０の操作部などを介して入力されてよい。 When the timing information is added (set) in the image data (or the coordinate data or the normalized coordinate data) as described above, the controller 15 can acquire the timing information in step S14. In step S14, the controller 15 of the electronic device 10 may acquire the timing information as described above from an external network or the like via, for example, the communication unit 19. Further, in step S14, the controller 15 may acquire the timing information as described above from a slot of an electronic device 10 into which a storage such as a memory card can be inserted. Further, when the system 1 includes a display capable of displaying an image, image data or the like may be displayed on the display. In this case, the timing information set by a staff member or the like while visually recognizing the display may be input via, for example, an operation unit of the electronic device 10.

ステップＳ１４においてタイミング情報を取得したら、コントローラ１５は、画像データから抽出された座標と、暴力動作の開始との関係を、タイミング情報に基づいて機械学習する（ステップＳ１５）。ステップＳ１５において、コントローラ１５は、暴力動作の開始時点と終了時点との間における被監視対象Ｔの所定部位（関節点）の座標、及び、他の人物Ｗの所定部位（関節点）の座標又は他の人物Ｗに関連する物体の座標と、暴力動作の開始との関係を、タイミング情報に基づいて学習するものとしてよい。ここで、コントローラ１５は、画像データ（又は座標データ、若しくは正規化された座標データ）、及びこれに設定されたタイミング情報に基づいて、機械学習を行ってもよい。以下、ステップＳ１５における機械学習の結果として生成されるデータを、「機械学習データ」と記すことがある。 After acquiring the timing information in step S14, the controller 15 machine-learns the relationship between the coordinates extracted from the image data and the start of the violent movement based on the timing information (step S15). In step S15, the controller 15 determines the coordinates of the predetermined portion (joint point) of the monitored target T between the start time and the end time of the violent movement, and the coordinates of the predetermined portion (joint point) of the other person W. The relationship between the coordinates of the object related to the other person W and the start of the violent movement may be learned based on the timing information. Here, the controller 15 may perform machine learning based on the image data (or coordinate data or normalized coordinate data) and the timing information set in the image data. Hereinafter, the data generated as a result of machine learning in step S15 may be referred to as "machine learning data".

上述のような機械学習を行うことにより、電子機器１０は、暴力動作が開始するタイミングと、暴力動作の開始時点から終了時点までの間における被監視対象Ｔの所定部位（関節点）の座標の動きとの関連を把握することができる。このため、電子機器１０によれば、後述の推定フェーズにおいて、被監視対象Ｔの所定部位（関節点）の座標の動きに基づいて、暴力動作の開始時点を推定し得る。 By performing the machine learning as described above, the electronic device 10 has the coordinates of the timing at which the violent movement starts and the coordinates of the predetermined portion (joint point) of the monitored target T between the start time and the end time of the violent movement. It is possible to grasp the relationship with movement. Therefore, according to the electronic device 10, in the estimation phase described later, the start time of the violent movement can be estimated based on the movement of the coordinates of the predetermined portion (joint point) of the monitored target T.

このように、一実施形態において、コントローラ１５は、暴力動作の開始時点と終了時点との間における被監視対象Ｔの所定部位の座標、及び、他の人物Ｗの所定部位（関節点）の座標又は他の人物Ｗに関連する物体の座標と、暴力動作の開始との関係を、タイミング情報に基づいて機械学習する。ここで、タイミング情報とは、上述のように、撮像部２０によって撮像された経時的な画像において暴力動作の開始時点及び終了時点を示す情報としてよい。なお、本開示において、コントローラ１５が機械学習する、暴力行為の開始時点と終了時点との間における被監視対象Ｔの所定部位の座標と、暴力行為の開始との関係の数は、例えば1以上、10以上、100以上、1000以上など、少なくとも1以上の数であればよい。 As described above, in one embodiment, the controller 15 has the coordinates of the predetermined portion of the monitored target T between the start time and the end time of the violent movement, and the coordinates of the predetermined portion (joint point) of the other person W. Alternatively, the relationship between the coordinates of the object related to the other person W and the start of the violent movement is machine-learned based on the timing information. Here, the timing information may be information indicating the start time point and the end time point of the violent movement in the time-dependent image captured by the image pickup unit 20 as described above. In the present disclosure, the number of relationships between the coordinates of the predetermined portion of the monitored target T between the start time and the end time of the violent act and the start of the violent act, which the controller 15 machine-learns, is, for example, 1 or more. , 10 or more, 100 or more, 1000 or more, etc., as long as it is at least 1 or more.

後述の推定フェーズにおいて電子機器１０が暴力動作の開始時点を推定する精度を向上するために、比較的多数のサンプル（例えば被監視対象Ｔのような人間）のデータについて、機械学習を行ってもよい。機械学習を行う際のサンプルのデータを多くすることにより、後述の推定フェーズにおいて電子機器１０が暴力動作の開始時点を推定する際の精度を高めることが期待できる。したがって、機械学習を行う際のサンプルのデータを多くすることにより、電子機器１０が暴力動作の開始を推定して警告を発する際に、誤報を発したり、失報したりするといったことを低減し得る。なお、本開示において、コントローラ１５が機械学習する、サンプルの数は、例えば1以上、10以上、100以上、1000以上など、少なくとも1以上の数であればよい。 In order to improve the accuracy with which the electronic device 10 estimates the start time point of the violent movement in the estimation phase described later, machine learning may be performed on data from a relatively large number of samples (for example, humans such as the monitored target T). Increasing the amount of sample data used for machine learning can be expected to improve the accuracy with which the electronic device 10 estimates the start time point of the violent movement in the estimation phase described later. Therefore, increasing the amount of sample data used for machine learning can reduce false alarms and missed alarms when the electronic device 10 estimates the start of a violent movement and issues a warning. In the present disclosure, the number of samples machine-learned by the controller 15 may be any number of at least 1, for example, 1 or more, 10 or more, 100 or more, or 1000 or more.

上述の機械学習において、タイミング情報は、撮像部２０によって撮像された経時的な画像において暴力動作の開始時点及び終了時点を示す情報とした。一実施形態において、タイミング情報は、撮像部２０によって撮像された経時的な画像において暴力動作の開始時点及び終了時点以外の時点を示す情報を含むものとしてもよい。 In the above-mentioned machine learning, the timing information is information indicating the start time point and the end time point of the violent movement in the time-dependent image captured by the image pickup unit 20. In one embodiment, the timing information may include information indicating a time point other than the start time point and the end time point of the violent movement in the time-dependent image captured by the image pickup unit 20.

例えば、図１０に示す学習データ（４）の区間の画像データは、暴力動作の開始時点と終了時点との間に存在する。このため、学習データ（４）は、危険な（すなわち暴力を振るう動作の可能性が高い）クラスに分類されるデータとして、コントローラ１５に機械学習させてよい。図１０に示す学習データ（４）の区間の画像データは、暴力動作の開始時点とほぼ同じ時点において開始している。一方、図１０に示す学習データ（４）の区間の画像データは、暴力動作の終了時点とは異なる時点において終了している。 For example, the image data of the section of the learning data (4) shown in FIG. 10 exists between the start time point and the end time point of the violent movement. Therefore, the learning data (4) may be machine-learned by the controller 15 as data classified into a dangerous class (that is, there is a high possibility of violent behavior). The image data of the section of the learning data (4) shown in FIG. 10 starts at substantially the same time as the start time of the violent movement. On the other hand, the image data in the section of the learning data (4) shown in FIG. 10 ends at a time different from the end time of the violent movement.

一方、図１０に示す学習データ（１）乃至（３）、及び学習データ（５）の区間の画像データは、暴力動作の開始時点と終了時点との間に存在しない。このため、これらの学習データは、正常な（すなわち危険が少ない（被監視対象Ｔが他の人物Ｗに対して暴力を振るうリスクが低い））クラスに分類されるデータとして、コントローラ１５に機械学習させてよい。これらの学習データの区間の画像データは、暴力動作の開始時点とは異なる時点において開始し、暴力動作の終了時点とは異なる時点において終了している。 On the other hand, the image data in the sections of the learning data (1) to (3) and the learning data (5) shown in FIG. 10 do not lie between the start time point and the end time point of the violent movement. Therefore, these learning data may be machine-learned by the controller 15 as data classified into a normal class (that is, a class with little danger, in which the risk that the monitored target T uses violence against another person W is low). The image data in the sections of these learning data start at a time point different from the start time point of the violent movement and end at a time point different from the end time point of the violent movement.

このように、一実施形態において、タイミング情報は、撮像部２０によって撮像された経時的な画像において暴力動作の開始時点及び終了時点以外の時点を示す情報を含んでもよい。このようなタイミング情報に基づいて、コントローラ１５は、暴力動作の開始時点と終了時点との間における被監視対象Ｔ、及び、他の人物Ｗの所定部位（関節点）の座標又は他の人物Ｗに関連する物体の座標の所定部位の座標と、暴力動作の開始との関係を機械学習してもよい。 As described above, in one embodiment, the timing information may include information indicating a time point other than the start time point and the end time point of the violent movement in the time-dependent image captured by the image pickup unit 20. Based on such timing information, the controller 15 may machine-learn the relationship between, on the one hand, the coordinates of the predetermined portion of the monitored target T and the coordinates of the predetermined portion (joint point) of the other person W or the coordinates of the object related to the other person W between the start time point and the end time point of the violent movement and, on the other hand, the start of the violent movement.

コントローラ１５は、図１０に示すように、危険なクラスに分類される学習データ（４）のみならず、正常なクラスに分類される学習データ（１）乃至（３）、及び学習データ（５）のような学習データにも基づいて、機械学習を行ってよい。コントローラ１５は、危険なクラスに分類される学習データ及び正常なクラスに分類される学習データに基づいて機械学習を行うことにより、暴力動作の開始を推定する精度を高めることができる。 As shown in FIG. 10, the controller 15 may perform machine learning based not only on the learning data (4) classified into the dangerous class but also on learning data such as the learning data (1) to (3) and the learning data (5) classified into the normal class. By performing machine learning based on both the learning data classified into the dangerous class and the learning data classified into the normal class, the controller 15 can improve the accuracy of estimating the start of a violent movement.
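The assignment of learning data sections to the dangerous and normal classes can be sketched as follows. The sketch assumes that each section is a (first frame, last frame) pair and that a section is "dangerous" only when it lies entirely between the start and end time points; how sections that straddle a boundary should be labeled is not stated in the source, and the frame numbers used are invented for illustration.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # (first_frame, last_frame), inclusive


def label_windows(windows: List[Window], start: int, end: int) -> List[str]:
    """Label each section using the timing information (start, end)."""
    labels = []
    for first, last in windows:
        # Assumption: only sections fully inside [start, end] are "dangerous";
        # everything else, including partially overlapping sections, is "normal".
        inside = start <= first and last <= end
        labels.append("dangerous" if inside else "normal")
    return labels


# Hypothetical layout resembling FIG. 10: five sections, with the violent
# movement starting at frame 45 and ending at frame 70.
sections = [(0, 14), (15, 29), (30, 44), (45, 59), (75, 89)]
print(label_windows(sections, start=45, end=70))
# -> ['normal', 'normal', 'normal', 'dangerous', 'normal']
```

As with the learning data (4) in the text, the fourth section begins at the start time point but ends before the end time point, and is still labeled dangerous.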

次に、推定フェーズにおける動作について、説明する。 Next, the operation in the estimation phase will be described.

図１１は、一実施形態に係るシステム１の推定フェーズにおける動作の一例を示すフローチャートである。図１１は、一実施形態に係るシステム１に含まれる電子機器１０の推定フェーズにおける動作に焦点を当てたフローチャートとしてもよい。 FIG. 11 is a flowchart showing an example of the operation in the estimation phase of the system 1 according to the embodiment. FIG. 11 may be a flowchart focusing on the operation in the estimation phase of the electronic device 10 included in the system 1 according to the embodiment.

上述のように、例えば認知症の発症が疑われる者など（例えば要看護者又は要介護者など）は、看護者又は介護者などの他の人物Ｗに対して暴力を振るってしまうことがある。このような場合、他の人物Ｗがケガをすることがある。また、他の人物Ｗが不快に感じたり危険を感じたりする。つまり、他の人物Ｗの安全が脅かされるリスクがある。一実施形態に係るシステム１の推定フェーズにおいては、学習フェーズにおいて得られた機械学習データを利用して、被監視対象Ｔが他の人物Ｗに対して暴力を振るう動作における、被監視対象Ｔの身体における所定部位の位置（座標）及び他の人物Ｗの身体における所定部位の位置（座標）又は他の人物Ｗに関連する物体の位置（座標）から、暴力動作の開始を推定してよい。以下、このような動作について、より詳細に説明する。システム１によって被監視対象Ｔを監視することで、例えば介護施設又は病院などのスタッフは、暴力動作が開始することを、被監視対象Ｔが実際に暴力動作を終了する前に認識することができる。 As described above, for example, a person suspected of developing dementia (for example, a person requiring nursing or care) may use violence against another person W such as a nurse or a caregiver. In such a case, the other person W may be injured. The other person W may also feel uncomfortable or endangered. That is, there is a risk that the safety of the other person W will be threatened. In the estimation phase of the system 1 according to one embodiment, the machine learning data obtained in the learning phase may be used to estimate the start of a violent movement from the position (coordinates) of the predetermined portion of the body of the monitored target T, and the position (coordinates) of the predetermined portion of the body of the other person W or the position (coordinates) of the object related to the other person W, in an action in which the monitored target T uses violence against the other person W. Such an operation will be described in more detail below. By monitoring the monitored target T with the system 1, a staff member of, for example, a nursing care facility or a hospital can recognize that a violent movement is starting before the monitored target T actually ends the violent movement.

When the operation shown in FIG. 11 starts, the controller 15 of the electronic device 10 acquires machine learning data (step S21). The machine learning data acquired in step S21 may be the data generated as a result of the machine learning in step S15 shown in FIG. 2. That is, the machine learning data may be data obtained by machine learning the relationship between (a) the coordinates of predetermined parts of a person's body, together with the coordinates of predetermined parts of another person's body or the coordinates of an object related to that other person, between the start point and the end point of a motion in which the person (such as the monitored target T) acts violently toward the other person (such as the other person W), and (b) the start of the motion in which the person acts violently toward the other person.

If the machine learning data has already been acquired when the operation shown in FIG. 11 starts, the controller 15 need not acquire it again in step S21. The machine learning data acquired in step S21 is used in step S25, described later. In an embodiment, therefore, the machine learning data need not be acquired in step S21 and may be acquired at any time before step S25.

Once the machine learning data is acquired in step S21, the controller 15 acquires the images captured by the imaging unit 20 (step S22). The operation in step S22 may be performed in the same manner as step S11 shown in FIG. 2.

Once the captured images are acquired in step S22, the extraction unit 11 extracts the coordinates of predetermined parts of the body of the monitored target T (step S23). The extraction unit 11 extracts not only those coordinates but also the coordinates of predetermined parts of the body of the other person W. The extraction unit 11 may further extract the coordinates of an object related to the other person W. The operation in step S23 may be performed in the same manner as step S12 shown in FIG. 2. That is, for example, the extraction unit 11 may two-dimensionally extract the coordinates of a predetermined number of joint points of the body of the monitored target T from each of the predetermined number of frames per second captured by the imaging unit 20.

Once the coordinates are extracted in step S23, the extraction unit 11 normalizes the coordinates according to the maximum and minimum values of each of the coordinates (X, Y) over the extracted predetermined number of frames (for example, the 15 frames of one second) (step S24). The operation in step S24 may be performed in the same manner as step S13 shown in FIG. 2. That is, for example, the extraction unit 11 may normalize each directional component of the two-dimensionally extracted coordinates of the predetermined number of joint points of the body of the monitored target T based on the maximum and minimum values of that directional component. The extraction unit 11 may likewise normalize each directional component of the two-dimensionally extracted coordinates of the predetermined number of joint points of the body of the other person W based on the maximum and minimum values of that directional component.
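The normalization of step S24 can be illustrated with a minimal sketch. The text only states that each directional component is normalized based on its maximum and minimum values over the window of frames; the mapping to the range [0, 1], the function name `normalize_window`, the data layout, and the handling of a zero-width range are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (X, Y) of one joint point
Frame = List[Point]           # all joint points extracted from one frame


def normalize_window(frames: List[Frame]) -> List[Frame]:
    """Min-max normalize X and Y independently over a window of frames."""
    xs = [x for frame in frames for x, _ in frame]
    ys = [y for frame in frames for _, y in frame]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Assumption: if a component never varies, use a span of 1.0 so that
    # the component maps to 0.0 instead of dividing by zero.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    return [
        [((x - x_min) / x_span, (y - y_min) / y_span) for x, y in frame]
        for frame in frames
    ]
```

In this sketch all points passed in share one X range and one Y range. Whether the monitored target T and the other person W are normalized together or separately is not specified in the text, so the caller decides which points to pass.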

Before executing step S24, the controller 15 may determine, based on the coordinates of the predetermined parts of the monitored target T and the coordinates of the predetermined parts of the other person W, whether the monitored target T has come within a predetermined range of the other person W. The controller 15 may likewise determine, based on the coordinates of the predetermined parts of the monitored target T and the coordinates of an object related to the other person W, whether the monitored target T has come within a predetermined range of that object. The controller 15 may perform the coordinate normalization of step S24 when the monitored target T has come within a predetermined range of the other person W or of the object related to the other person W.

The controller 15 may calculate the distance between coordinates for at least some combinations of the coordinates of at least some of the predetermined parts of the monitored target T and the coordinates of at least some of the predetermined parts of the other person W. The controller 15 may instead calculate the distance between coordinates for all combinations of the coordinates of all the predetermined parts of the monitored target T and the coordinates of all the predetermined parts of the other person W. Based on the calculated distances, the controller 15 may determine whether the monitored target T has approached the other person W. The controller 15 may also calculate the distance between coordinates for combinations of the coordinates of at least some of the predetermined parts of the monitored target T and the coordinates of an object related to the other person W. Based on the calculated distances, the controller 15 may determine whether the monitored target T has approached the object related to the other person W.
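The proximity check described above can be sketched as follows. The text says only that the determination is based on the calculated distances; using the minimum Euclidean distance over all coordinate pairs and comparing it with a threshold is an assumption, as are the names `min_pair_distance` and `is_within_range`.

```python
import math
from itertools import product
from typing import List, Tuple

Point = Tuple[float, float]


def min_pair_distance(target: List[Point], other: List[Point]) -> float:
    """Smallest Euclidean distance over all (target, other) coordinate pairs."""
    return min(math.dist(p, q) for p, q in product(target, other))


def is_within_range(target: List[Point], other: List[Point],
                    threshold: float) -> bool:
    """Assumed rule: 'approaching' means the closest pair is within threshold."""
    return min_pair_distance(target, other) <= threshold
```

Here `other` may hold the joint coordinates of the other person W or the coordinates of an object related to the other person W; passing a subset of the joints corresponds to evaluating only some of the combinations.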

Once the coordinates are normalized in step S24, the controller 15 determines, based on the machine learning data acquired in step S21, whether the start of a violent motion is estimated from the coordinates normalized in step S24 (step S25).

If the start of a violent motion is estimated in step S25, that is, if the risk that a violent motion is about to start has increased, the controller 15 outputs a predetermined warning signal (step S26). In step S26, the controller 15 may output the predetermined warning signal to the warning unit 17. This allows the warning unit 17 to issue a predetermined warning.

On the other hand, if the start of a violent motion is not estimated in step S25, that is, if the risk that a violent motion is about to start has not increased, the controller 15 may skip step S26 and end the operation shown in FIG. 11. When the operation shown in FIG. 11 ends, the controller 15 may start it again. For example, the controller 15 may repeat the operation shown in FIG. 11 each time coordinates are extracted from the image data. That is, if the extraction unit 11 extracts the coordinates (X, Y) from image data at 15 frames per second, for example, the controller 15 may perform the estimation of the start of a violent motion in step S25 15 times per second.
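One pass through steps S24 to S26 can be sketched as below. This is only an outline of the control flow: the normalization routine, the estimator built from the machine learning data, and the warning output are passed in as callables, and the names `run_estimation_pass`, `normalize`, `predicts_onset`, and `warn` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Frame = List[Point]


def run_estimation_pass(
    window: Sequence[Frame],
    normalize: Callable[[Sequence[Frame]], Sequence[Frame]],
    predicts_onset: Callable[[Sequence[Frame]], bool],
    warn: Callable[[], None],
) -> bool:
    """Run steps S24 to S26 once over already-extracted coordinates."""
    normalized = normalize(window)    # step S24: normalize the coordinates
    if predicts_onset(normalized):    # step S25: estimate from learned data
        warn()                        # step S26: output the warning signal
        return True
    return False                      # step S26 is skipped
```

Calling this function once per newly extracted frame corresponds to repeating the operation of FIG. 11 each time coordinates are extracted.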

Thus, in an embodiment, the controller 15 may estimate the start of a violent motion, based on the machine learning data, from the coordinates of the predetermined parts of the monitored target T and either the coordinates of the predetermined parts of the other person W or the coordinates of an object related to the other person W, as extracted by the extraction unit 11. The extraction unit 11 may extract these coordinates from images of a predetermined number of frames per unit time captured by the imaging unit 20. In that case, the controller 15 may estimate the start of a violent motion from the coordinates so extracted by the extraction unit 11.

As shown in FIG. 11, the controller 15 may output the predetermined warning signal in step S26 immediately after estimating the start of a violent motion in step S25. The controller 15 may therefore output the predetermined warning signal before the actual violent motion ends. In other words, upon estimating the start of a violent motion, the controller 15 may output the predetermined warning signal before that motion ends. Where possible, the controller 15 may also output the predetermined warning signal before the actual violent motion starts. In other words, upon estimating the start of a violent motion, the controller 15 may output the predetermined warning signal before that motion starts.

The system 1 according to an embodiment can machine-learn the relationship between the coordinates of the joint points of the body of the monitored target T, together with the coordinates of the joint points of the body of the other person W or the coordinates of an object related to the other person W, in a motion in which a person such as the monitored target T acts violently toward the other person W, and the timing of that violent motion. Based on the result of the machine learning, the system 1 can also estimate the start of a violent motion from the coordinates of the joint points of the body of the monitored target T and either the coordinates of the joint points of the body of the other person W or the coordinates of an object related to the other person W. The system 1 can therefore issue a predetermined warning before the monitored target T begins to act violently toward the other person W, or while the violence is continuing and before it ends. As a result, staff of a nursing care facility, for example, can recognize that a monitored target T such as a person requiring nursing or a care recipient is about to act violently toward the other person W, or is about to continue doing so, either before the violence begins or while it is still in progress. The system 1 according to an embodiment can thus contribute to the safety of those around the monitored target T.

FIG. 12 is a diagram that further explains the estimation processing described for step S25 of FIG. 11. The left column of FIG. 12 shows the state of the monitored target T captured in the image data acquired, for example, in step S22 of FIG. 11. As shown in FIG. 12, suppose that in the captured image data the monitored target T enters the room in which the imaging unit 20 is installed, approaches the other person W, reaches out toward the other person W, and starts a violent motion. From the moment the violent motion starts, the controller 15 performs the operations from step S22 onward shown in FIG. 11 as processing at 15 frames per second. That is, in the system 1, the imaging unit 20 may capture images at 15 frames per second, and the controller 15 may acquire images at 15 frames per second. The extraction unit 11 may extract, from the images at 15 frames per second, the coordinates of the joint points of the body of the monitored target T and either the coordinates of the joint points of the body of the other person W or the coordinates of an object related to the other person W. The controller 15 (or the extraction unit 11) may normalize the coordinates extracted from the images at 15 frames per second. Further, the controller 15 may estimate the start of a violent motion, based on the machine learning data, from the normalized coordinates at 15 frames per second.

The center column of FIG. 12 conceptually shows the controller 15 continuously acquiring frames of image data. Here, the image data of each frame may be the coordinates extracted from the image data or the normalized coordinates. In the center column of FIG. 12, the hatched image data represents the frames of the one second following the start of the violent motion.

In this situation, the controller 15 may estimate the start of a violent motion at that point in time based on the 15 frames of the one second following the start of the violent motion (frame 1 to frame 15) (estimation 1 in FIG. 12). Next, the controller 15 may estimate the start of a violent motion at that point in time based on the 15 frames of the one second from frame 2 to frame 16 (estimation 2 in FIG. 12). Likewise, the controller 15 may estimate the start of a violent motion at that point in time based on the 15 frames of the one second from frame (N-14) to frame N (estimation 3 in FIG. 12). By repeating this operation, the controller 15 estimates the start of a violent motion 15 times per second. Consequently, even if estimation 1, estimation 2, and estimation 3 in FIG. 12 each fail, for some reason, to estimate the start of a violent motion that should have been estimated, the system 1 according to an embodiment can reduce the risk of a missed detection by performing the estimation 15 times per second.
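The overlapping windows of estimation 1, estimation 2, and estimation 3 (frames 1 to 15, frames 2 to 16, and frames N-14 to N) amount to a sliding window that advances by one frame. A minimal sketch, assuming a hypothetical helper name `sliding_windows` and that no estimation is run until a full window has been collected:

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def sliding_windows(frames: Iterable[T], size: int = 15) -> Iterator[List[T]]:
    """Yield the most recent `size` frames each time a new frame arrives."""
    buffer: Deque[T] = deque(maxlen=size)  # oldest frame drops out automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == size:            # wait until one full window exists
            yield list(buffer)
```

With 15 frames arriving per second and a window size of 15, this yields one window per incoming frame, which is where the figure of 15 estimations per second comes from.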

The embodiment above described an example in which, as shown in FIG. 5, the extraction unit 11 extracts the coordinates of 13 parts as the joint points of the monitored target T. In an embodiment, however, the extraction unit 11 may extract the coordinates of more than 13 parts or of fewer than 13 parts. The embodiment above also described an example in which the system 1 processes 15 frames per second. In an embodiment, however, the system 1, or each functional unit constituting the system 1, may process more than 15 frames per second or fewer than 15 frames per second. In an embodiment, the number of joint points handled by the system 1 and/or the number of frames it processes may be adjusted so that the estimation of the start of a violent motion gives reasonable results.

Thus, in an embodiment, the controller 15 may determine at least one of the number of frames and the number of joint points so that the validity of the estimation of the start of a violent motion is equal to or greater than a predetermined level.
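The text does not say how the controller 15 makes this determination. One possible reading, given purely as an illustrative sketch, is a search over candidate settings that returns the least costly pair whose measured validity reaches the required level. The `validity` callable (for example, accuracy on held-out labeled footage), the cost measure (frames multiplied by joint points), and the name `choose_configuration` are all assumptions.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def choose_configuration(
    frame_counts: Iterable[int],
    joint_counts: Iterable[int],
    validity: Callable[[int, int], float],
    minimum: float,
) -> Optional[Tuple[int, int]]:
    """Return the cheapest (frames, joints) pair whose validity >= minimum."""
    # Assumed cost: number of coordinates handled per window.
    candidates = sorted(product(frame_counts, joint_counts),
                        key=lambda c: c[0] * c[1])
    for frames, joints in candidates:
        if validity(frames, joints) >= minimum:
            return frames, joints
    return None  # no candidate reaches the required validity
```

Returning `None` leaves it to the caller to widen the candidate ranges or lower the required level.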

Because the embodiment above uses captured image data, monitoring was performed with visible light as the detection target. The present disclosure is not limited to this case, however, and any other detection target, such as any electromagnetic wave, sound wave, temperature, or vibration, may be used.

Although embodiments of the present disclosure have been described with reference to the drawings and examples, note that those skilled in the art can make various variations and modifications based on the present disclosure. Such variations and modifications are therefore included in the scope of the present disclosure. For example, the functions included in each component or each step can be rearranged in any logically consistent manner, and multiple components or steps can be combined into one or divided. Although the embodiments of the present disclosure have been described mainly in terms of an apparatus, they can also be realized as a method that includes the steps executed by each component of the apparatus. They can likewise be realized as a method executed by a processor included in the apparatus, as a program, or as a storage medium on which the program is recorded. These are also to be understood as falling within the scope of the present disclosure.

The embodiment described above is not limited to implementation as the system 1. For example, it may be implemented as the electronic device 10 included in the system 1. It may also be implemented as a monitoring method performed by a device such as the electronic device 10, or as a program executed by a device such as the electronic device 10 or by an information processing apparatus (for example, a computer). Further, in the technique of the present disclosure, the components of the electronic device 10 shown in FIG. 1 need not all reside in a single housing or server. For example, parts such as the controller and the storage unit among the components of the electronic device 10 may be connected to one another by a wired network, a wireless network, or a combination of these, and may be located as desired in different housings, servers, devices, rooms, buildings, regions, countries, and so on.

1 System
10 Electronic device
11 Extraction unit
13 Storage unit
132 Machine learning data
15 Controller
17 Warning unit
19 Communication unit
20 Imaging unit

Claims

A system comprising:
an imaging unit that captures images of a monitored target and of another person or an object related to the other person;
an extraction unit that extracts, from the images captured by the imaging unit, coordinates of predetermined parts of the monitored target and either coordinates of predetermined parts of the other person or coordinates of the object related to the other person; and
a controller that, based on timing information indicating a start point and an end point of a motion in which the monitored target acts violently toward the other person in the time-series images captured by the imaging unit, machine-learns the relationship between (i) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (ii) the start of the motion in which the monitored target acts violently toward the other person.

The system according to claim 1, wherein the controller, based on timing information that includes a point in time other than the start point and the end point of the motion in which the monitored target acts violently toward the other person in the time-series images captured by the imaging unit, machine-learns the relationship between (i) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (ii) the start of the motion in which the monitored target acts violently toward the other person.

The system according to claim 1 or 2, wherein the extraction unit two-dimensionally extracts, from images of a predetermined number of frames per second captured by the imaging unit, coordinates of a predetermined number of joint points of the body of the monitored target and either coordinates of a predetermined number of joint points of the body of the other person or coordinates of the object related to the other person.

The system according to any one of claims 1 to 3, wherein the extraction unit normalizes each directional component of the two-dimensionally extracted coordinates of the predetermined number of joint points of the body of the monitored target, and of the coordinates of the predetermined number of joint points of the body of the other person or the coordinates of the object related to the other person, based on the maximum and minimum values of that directional component.

The system according to any one of claims 1 to 4, wherein, in the timing information, the point in time of the motion in which the monitored target acts violently toward the other person indicates the timing at which the monitored target has moved a predetermined distance away from the other person in the time-series images captured by the imaging unit.

The system according to any one of claims 1 to 5, wherein, in the timing information, the start point of the motion in which the monitored target acts violently toward the other person indicates the timing at which the monitored target comes within a predetermined distance of the other person in the time-series images captured by the imaging unit.

The system according to any one of claims 1 to 6, wherein, in the timing information, the start point of the motion in which the monitored target acts violently toward the other person indicates the timing at which a wrist of the monitored target begins to approach the other person or the object related to the other person in the time-series images captured by the imaging unit.

A system comprising:
an imaging unit that captures images of a monitored target and of another person or an object related to the other person;
an extraction unit that extracts, from the images captured by the imaging unit, coordinates of predetermined parts of the monitored target and either coordinates of predetermined parts of the other person or coordinates of the object related to the other person; and
a controller that, based on machine learning data obtained by machine learning the relationship between (i) coordinates of predetermined parts of a person and either coordinates of predetermined parts of another person or coordinates of an object related to that other person, between a start point and an end point of a motion in which the person acts violently toward the other person, and (ii) the start of the motion in which the person acts violently toward the other person, estimates the start of a motion in which the monitored target acts violently toward the other person from the coordinates of the predetermined parts of the monitored target and either the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person, as extracted by the extraction unit.

The system according to claim 8, wherein the extraction unit extracts, from images of a predetermined number of frames per unit time captured by the imaging unit, the coordinates of the predetermined parts of the monitored target and either the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person, and
the controller estimates the start of the motion in which the monitored target acts violently toward the other person from the coordinates of the predetermined parts of the monitored target and either the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person, as extracted by the extraction unit.

The system according to claim 8 or 9, wherein, upon estimating the start of the motion in which the monitored target acts violently toward the other person, the controller outputs a predetermined warning signal before that motion starts.

The system according to claim 8 or 9, wherein, upon estimating the start of the motion in which the monitored target acts violently toward the other person, the controller outputs a predetermined warning signal before that motion ends.

The system according to any one of claims 8 to 11, wherein the extraction unit two-dimensionally extracts, from images of a predetermined number of frames per second captured by the imaging unit, coordinates of a predetermined number of joint points of the body of the monitored target and either coordinates of a predetermined number of joint points of the body of the other person or coordinates of the object related to the other person.

The system according to claim 12, wherein the extraction unit normalizes each directional component of the two-dimensionally extracted coordinates of the predetermined number of joint points of the body of the monitored target, and of the coordinates of the predetermined number of joint points of the body of the other person or the coordinates of the object related to the other person, based on the maximum and minimum values of that directional component.

The system according to claim 12 or 13, wherein the controller determines at least one of the number of frames and the number of joint points such that the validity of the estimation of the start of the motion in which the monitored target acts violently toward the other person is equal to or greater than a predetermined level.

An electronic device comprising:
an extraction unit that extracts, from captured images that include a monitored target and another person or an object related to the other person, coordinates of predetermined parts of the monitored target and either coordinates of predetermined parts of the other person or coordinates of the object related to the other person; and
a controller that, based on timing information indicating a start point and an end point of a motion in which the monitored target acts violently toward the other person in time-series images captured so as to include the monitored target, machine-learns the relationship between (i) the coordinates of the predetermined parts of the monitored target and the coordinates of the predetermined parts of the other person or the coordinates of the object related to the other person between the start point and the end point, and (ii) the start of the motion in which the monitored target acts violently toward the other person.

An electronic device comprising:
an extraction unit that extracts, from an image captured including a monitored object and another person or an object related to the other person, the coordinates of a predetermined part of the monitored object and the coordinates of a predetermined part of the other person or the coordinates of the object related to the other person; and
a controller that, based on machine learning data obtained by machine-learning the relationship between the coordinates of a predetermined part of a human and the coordinates of a predetermined part of another human or the coordinates of an object related to the other human, taken between a start time and an end time of an action in which the human acts violently against the other human, and the start of the action in which the human acts violently against the other human, estimates the start of an action in which the monitored object acts violently against the other person from the coordinates of the predetermined part of the monitored object and the coordinates of the predetermined part of the other person or the coordinates of the object related to the other person extracted by the extraction unit.
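The estimation side can be sketched as a small streaming wrapper around a trained model. Everything named here is hypothetical: `StartEstimator`, the `model` callable that maps a flattened window of coordinates to a score, and the fixed threshold are assumptions used to show the data flow, not the method defined by the claim.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence


class StartEstimator:
    """Keeps the most recent frames and asks a model whether violence is starting."""

    def __init__(
        self,
        model: Callable[[List[float]], float],  # hypothetical trained scorer
        window: int,
        threshold: float = 0.5,
    ) -> None:
        self.model = model
        self.window = window
        self.threshold = threshold
        self._buffer: Deque[List[float]] = deque(maxlen=window)

    def update(self, coords: Sequence[float]) -> bool:
        """Add one frame of extracted coordinates; return True if a start is estimated."""
        self._buffer.append(list(coords))
        if len(self._buffer) < self.window:
            return False  # not enough frames yet
        features = [c for frame in self._buffer for c in frame]
        return self.model(features) >= self.threshold


if __name__ == "__main__":
    # Toy stand-in for learned data: large coordinate sums score high.
    def toy_model(features: List[float]) -> float:
        return 1.0 if sum(features) > 10 else 0.0

    estimator = StartEstimator(toy_model, window=2)
    for frame in ([1.0, 1.0], [2.0, 2.0], [5.0, 5.0]):
        print(estimator.update(frame))
```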

A method for controlling an electronic device, comprising:
an imaging step of imaging a monitored object and another person or an object related to the other person;
an extraction step of extracting, from the image captured in the imaging step, the coordinates of a predetermined part of the monitored object and the coordinates of a predetermined part of the other person or the coordinates of the object related to the other person; and
a machine learning step of machine-learning, based on timing information indicating a start time and an end time of an action in which the monitored object acts violently against the other person in time-series images captured in the imaging step, the relationship between the coordinates of the predetermined part of the monitored object and the coordinates of the predetermined part of the other person or the coordinates of the object related to the other person, taken between the start time and the end time, and the start of the action in which the monitored object acts violently against the other person.

A method for controlling an electronic device, comprising:
an imaging step of imaging a monitored object and another person or an object related to the other person;
an extraction step of extracting, from the image captured in the imaging step, the coordinates of a predetermined part of the monitored object and the coordinates of a predetermined part of the other person or the coordinates of the object related to the other person; and
an estimation step of estimating, based on machine learning data obtained by machine-learning the relationship between the coordinates of a predetermined part of a human and the coordinates of a predetermined part of another human or the coordinates of an object related to the other human, taken between a start time and an end time of an action in which the human acts violently against the other human, and the start of the action in which the human acts violently against the other human, the start of an action in which the monitored object acts violently against the other person from the coordinates of the predetermined part of the monitored object and the coordinates of the predetermined part of the other person or the coordinates of the object related to the other person extracted in the extraction step.

A program that causes a computer to execute:
an imaging step of imaging a monitored object and another person or an object related to the other person;
an extraction step of extracting, from the image captured in the imaging step, the coordinates of a predetermined part of the monitored object and the coordinates of a predetermined part of the other person or the coordinates of the object related to the other person; and
a machine learning step of machine-learning, based on timing information indicating a start time and an end time of an action in which the monitored object acts violently against the other person in time-series images captured in the imaging step, the relationship between the coordinates of the predetermined part of the monitored object and the coordinates of the predetermined part of the other person or the coordinates of the object related to the other person, taken between the start time and the end time, and the start of the action in which the monitored object acts violently against the other person.

A program that causes a computer to execute:
an imaging step of imaging a monitored object and another person or an object related to the other person;
an extraction step of extracting, from the image captured in the imaging step, the coordinates of a predetermined part of the monitored object and the coordinates of a predetermined part of the other person or the coordinates of the object related to the other person; and
an estimation step of estimating, based on machine learning data obtained by machine-learning the relationship between the coordinates of a predetermined part of a human and the coordinates of a predetermined part of another human or the coordinates of an object related to the other human, taken between a start time and an end time of an action in which the human acts violently against the other human, and the start of the action in which the human acts violently against the other human, the start of an action in which the monitored object acts violently against the other person from the coordinates of the predetermined part of the monitored object and the coordinates of the predetermined part of the other person or the coordinates of the object related to the other person extracted in the extraction step.