JP6829368B2 - Information processing equipment, information processing methods, and programs

Info

Publication number: JP6829368B2
Application number: JP2016235922A
Authority: JP
Inventors: Tomoaki Nakamura (中村友昭); Takayuki Nagai (長井隆行); 池田佳那
Original assignee: THE UNIVERSITY OF ELECTRO-COMMUNICATIONS
Current assignee: THE UNIVERSITY OF ELECTRO-COMMUNICATIONS
Priority date: 2016-12-05
Filing date: 2016-12-05
Publication date: 2021-02-10
Anticipated expiration: 2036-12-05
Also published as: JP2018092421A

Description

The present disclosure relates to an information processing device, an information processing method, and a program, and in particular to an information processing device, an information processing method, and a program that make it possible to effectively extract a desired classification result from a large amount of data.

In recent years, it has become possible to measure the position of a person by using an RGB-D sensor, which acquires color (RGB) images and depth images with a color camera and an infrared camera, or a laser range finder, which measures the reflection intensity of an infrared laser.

For example, the present inventors have proposed a method that uses two RGB-D sensors to identify a plurality of children and to track the position of each child in order to extract a movement trajectory (see, for example, Non-Patent Document 1).

Non-Patent Document 1: Bin Zhang, Tomoaki Nakamura, Kasumi Abe, Muhammad Attamimi, Takayuki Nagai, Takashi Omori, Natsuki Oka, Masahide Kaneko, "Children's Behavior Tracking and Personal Authentication Using Multiple Kinects," Annual Conference of the Japanese Society for Artificial Intelligence, 4K4-1, 2016

Incidentally, to make effective use of the large amount of data (position, velocity, and so on) acquired by continuously performing the measurement described above, the data must be classified and a desired classification result extracted. However, extracting a desired classification result from a large amount of data requires, for example, the laborious step of setting in advance what kind of classification result is needed, so effectively extracting the desired classification result has been very difficult.

The present disclosure has been made in view of such circumstances, and makes it possible to effectively extract a desired classification result from a large amount of data.

An information processing device according to one aspect of the present disclosure includes: a first classification unit that classifies the time-series data for each observation target into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets; and a second classification unit that classifies the set of first classification items for the plurality of observation targets as a whole into second classification items by unsupervised clustering of feature quantities representing the characteristics of that set.

An information processing method or program according to one aspect of the present disclosure includes the steps of: classifying the time-series data for each observation target into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets; and classifying the set of first classification items for the plurality of observation targets as a whole into second classification items by unsupervised clustering of feature quantities representing the characteristics of that set.

In one aspect of the present disclosure, the time-series data for each observation target is classified into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets, and the set of first classification items for the plurality of observation targets as a whole is classified into second classification items by unsupervised clustering of feature quantities representing the characteristics of that set.

According to one aspect of the present disclosure, a desired classification result can be effectively extracted from a large amount of data.

FIG. 1 is a diagram explaining the outline of the activity classification process to which the present technology is applied.

FIG. 2 is a block diagram showing a configuration example of an embodiment of an information processing system to which the present technology is applied.

FIG. 3 is a diagram explaining the classification of behaviors.

FIG. 4 is a diagram explaining the generation of activity feature quantities.

FIG. 5 is a diagram explaining the classification of activities.

FIG. 6 is a flowchart explaining the activity classification process.

FIG. 7 is a block diagram showing a configuration example of an embodiment of a computer to which the present technology is applied.

Hereinafter, specific embodiments to which the present technology is applied will be described in detail with reference to the drawings.

<Outline of activity classification process>

First, an outline of the activity classification process will be described with reference to FIG. 1.

The present embodiment describes an activity classification process that, as in the observation scene shown in FIG. 1, for example, observes children playing freely and classifies, from the observation results, what kinds of activities (play) are taking place.

First, the children present at the place where observation is performed are identified as observation targets, the position of each observation target is measured, and a movement trajectory that tracks the position of each observation target at each time is extracted. Then, the position and velocity of each observation target are clustered without supervision by an HDP-HMM (Hierarchical Dirichlet Process-Hidden Markov Model). This makes it possible to classify (discretize) the position and velocity of each observation target into local behaviors.

Furthermore, a behavior frequency histogram is calculated by counting, for each arbitrary time window, how often each behavior was performed. For example, if the set of behaviors of the observation targets as a whole is regarded as an activity, the behavior frequency histogram can be used as an activity feature quantity representing the characteristics of each activity. The behavior frequency histograms, which are the activity feature quantities, can then be classified into activities by unsupervised clustering with LDA (Latent Dirichlet Allocation), HDP-LDA (Hierarchical Dirichlet Process-LDA), or the like.

In this way, by performing two-stage unsupervised clustering on the observation results of the children in the observation scene, a set of behaviors of a plurality of children can be classified as a specific activity.

<Configuration example of information processing device>

FIG. 2 is a block diagram showing a configuration example of an embodiment of an information processing system to which the present technology is applied.

As shown in FIG. 2, the information processing system 11 includes an observation device 12, an input device 13, a storage device 14, and an activity classification processing device 15. As illustrated, the observation device 12, the input device 13, and the storage device 14 are each connected to the activity classification processing device 15.

The observation device 12 includes, for example, a plurality of RGB-D sensors, and supplies the activity classification processing device 15 with color images and depth images obtained by imaging the observation scene from a plurality of directions. The observation device 12 only needs to be configured so that the movement trajectories described later can be extracted; the type and number of sensors used in the observation device 12 are not limited to those described in the present embodiment.

The input device 13 includes, for example, a keyboard and a mouse, and inputs various values entered by the user (for example, the window width described later) to the activity classification processing device 15.

The storage device 14 includes, for example, a hard disk drive or a memory, and stores various data that the activity classification processing device 15 holds temporarily while performing the activity classification process, as well as the classification results obtained from that process.

The activity classification processing device 15 includes a movement trajectory extraction unit 21, a behavior classification unit 22, an activity feature quantity generation unit 23, and an activity classification unit 24. As illustrated, the movement trajectory extraction unit 21 is connected to the behavior classification unit 22, the behavior classification unit 22 is connected to the activity feature quantity generation unit 23, and the activity feature quantity generation unit 23 is connected to the activity classification unit 24.

The movement trajectory extraction unit 21 identifies each observation target and determines its position based on the color images and depth images supplied from the observation device 12. The movement trajectory extraction unit 21 then tracks the change in position as each observation target moves, thereby extracting a movement trajectory for each observation target, and supplies the trajectories to the behavior classification unit 22.

For example, the movement trajectory extraction unit 21 can apply object recognition to the color images supplied from the observation device 12 and identify each child based on the face, clothing color, and other features appearing in the color images. The movement trajectory extraction unit 21 can also determine the position (xy coordinates) of each child identified in the color images according to the distance to the child obtained from the depth images supplied from the observation device 12. The position of each child may be specified by, for example, a coordinate position on the color image or, when the position of each child can be determined in real space, a coordinate position in real space. The method of determining the position of each child is not limited to these.

The behavior classification unit 22 performs unsupervised clustering, by HDP-HMM, of the position and velocity of each observation target obtained from the movement trajectories supplied from the movement trajectory extraction unit 21. The behavior classification unit 22 thereby classifies the data for each observation target into behaviors (classification items), each composed of a position and a velocity, and supplies the behaviors of each observation target to the activity feature quantity generation unit 23.

Here, the HDP-HMM is a hierarchical Dirichlet process hidden Markov model, one of the models expressed by hidden states and probabilistic transitions between those states. For example, it can estimate an optimal number of states according to the complexity of the training data, without the number of states being fixed in advance.

For example, as shown in FIG. 3, the behavior classification unit 22 clusters the position and velocity at each time t of the movement trajectories of a first observation target and a second observation target, and thereby dynamically assigns behavior IDs (identifications) that identify the behaviors performed by the first and second observation targets. As a result, the same behavior ID is assigned, for example, to similar positions and velocities (the ranges enclosed by broken lines in FIG. 3) in the movement trajectories of the first and second observation targets. In this way, the behavior classification unit 22 can classify similar positions and velocities in the movement trajectory of each observation target into the corresponding behaviors.
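The position and velocity values that are clustered can be derived from a movement trajectory roughly as sketched below. This is only an illustration under assumptions that the description does not specify: the function name `position_velocity_features`, the fixed sampling interval `dt`, the use of backward differences, and the zero velocity at the first sample are all hypothetical choices.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Feature = Tuple[float, float, float, float]


def position_velocity_features(trajectory: List[Point], dt: float = 1.0) -> List[Feature]:
    """Return one (x, y, vx, vy) tuple per time step of a trajectory.

    Velocity is estimated with a backward difference over the sampling
    interval dt (an assumption; the description does not fix a method).
    """
    features: List[Feature] = []
    for t, (x, y) in enumerate(trajectory):
        if t == 0:
            # Assumption: no earlier sample exists, so velocity is set to zero.
            vx, vy = 0.0, 0.0
        else:
            prev_x, prev_y = trajectory[t - 1]
            vx, vy = (x - prev_x) / dt, (y - prev_y) / dt
        features.append((x, y, vx, vy))
    return features
```

Each resulting tuple would then be one observation passed to the clustering stage, which assigns it a behavior ID.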

Based on the behaviors of the plurality of observation targets supplied from the behavior classification unit 22, the activity feature quantity generation unit 23 generates activity feature quantities representing the characteristics of the activity of the observation targets as a whole, and supplies them to the activity classification unit 24.

For example, as shown in FIG. 4, the activity feature quantity generation unit 23 divides the behaviors of the first and second observation targets into windows of an arbitrary width (time span) that the user enters by operating the input device 13. The activity feature quantity generation unit 23 then counts how many times each behavior ID appears within each window and generates the resulting behavior frequency histogram (a fixed-length vector) as the activity feature quantity. The windows may be placed back to back as shown in FIG. 4 or, for example, may be set to overlap by a predetermined width.
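The windowed counting described above can be sketched as follows. The function name `behavior_histograms`, the dictionary layout of the input, and the `step` parameter used to express overlap are assumptions made for illustration; the description only states that behavior IDs are counted per window and that windows may be contiguous or overlapping.

```python
from collections import Counter
from typing import Dict, List, Optional


def behavior_histograms(behavior_ids: Dict[str, List[int]],
                        num_behaviors: int,
                        window: int,
                        step: Optional[int] = None) -> List[List[int]]:
    """Count behavior IDs over all observation targets in each time window.

    behavior_ids maps a target name to its per-time-step behavior IDs.
    step == window gives contiguous windows; step < window gives overlap.
    """
    if step is None:
        step = window
    # Assumption: only the time span covered by every target is used.
    length = min(len(seq) for seq in behavior_ids.values())
    histograms: List[List[int]] = []
    for start in range(0, length - window + 1, step):
        counts: Counter = Counter()
        for seq in behavior_ids.values():
            counts.update(seq[start:start + window])
        # Fixed-length vector: one bin per behavior ID.
        histograms.append([counts[b] for b in range(num_behaviors)])
    return histograms
```

Because the counts are pooled over all targets, the histogram does not depend on which target performed which behavior.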

The activity classification unit 24 performs unsupervised clustering, by LDA, HDP-LDA, or the like, of the activity feature quantities supplied from the activity feature quantity generation unit 23, and thereby classifies the set of behaviors of the plurality of observation targets as a whole into activities (classification items). The activity classification unit 24 then stores the resulting classification results in the storage device 14.

Here, LDA is latent Dirichlet allocation, a method that estimates latent states (topics) for documents and words. For example, given the number of latent states in advance, it can estimate activities (latent states) by treating each activity feature quantity as a "document" and each behavior as a "word". In HDP-LDA, the number of latent states is determined automatically according to the complexity of the data.

For example, as shown in FIG. 5, the activity classification unit 24 can dynamically assign the same activity ID to similar activity feature quantities, based on the similarity between the activity feature quantities of the time windows for which they were generated (the similarity in histogram shape, as illustrated). The number of classes into which the activity classification unit 24 classifies activities (the number of activity IDs) may be entered by the user operating the input device 13, or the activity classification unit 24 may estimate an appropriate number of classes from all the activity feature quantities.
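A minimal sketch of this second clustering stage is given below, treating each behavior frequency histogram as a "document" and each behavior ID as a "word". The description does not state how LDA inference is performed; collapsed Gibbs sampling is used here purely as an assumed, commonly used choice. The function name `lda_activity_ids`, the hyperparameters `alpha` and `beta`, the iteration count, and the rule of taking the dominant topic of each window as its activity ID are likewise assumptions.

```python
import random
from typing import List


def lda_activity_ids(histograms: List[List[int]], num_topics: int,
                     alpha: float = 0.1, beta: float = 0.1,
                     iters: int = 200, seed: int = 0) -> List[int]:
    """Assign one activity ID per histogram with a small collapsed Gibbs LDA."""
    rng = random.Random(seed)
    num_words = len(histograms[0])
    # Expand each histogram into a list of "word" tokens (behavior IDs).
    docs = [[w for w, c in enumerate(h) for _ in range(c)] for h in histograms]
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * num_words for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign: List[List[int]] = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the conditional.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [(doc_topic[d][k] + alpha)
                           * (topic_word[k][w] + beta)
                           / (topic_total[k] + num_words * beta)
                           for k in range(num_topics)]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Assumption: the activity ID of a window is its most frequent topic.
    return [max(range(num_topics), key=lambda k, d=d: doc_topic[d][k])
            for d in range(len(docs))]
```

Here the number of topics is supplied by the caller, which corresponds to the case where the number of classes is given in advance; estimating that number from the data, as HDP-LDA does, is outside this sketch.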

The information processing system 11 is configured as described above. By performing unsupervised clustering in two stages, first in the behavior classification unit 22 and then in the activity classification unit 24, it can effectively extract classification results for the various activities of the plurality of observation targets as a whole. The information processing system 11 can thereby semi-automate the annotation of activities of a plurality of observation targets. That is, activity IDs are assigned automatically, but the meaning of each activity is not; the meaning is expected to be assigned, depending on the situation, by an observer or another system on the basis of the activity IDs.

Conventionally, for example, a process that classifies the activities of a plurality of observation targets has required the behaviors to be classified to be set in advance as training data. Deciding how to set up the training data and similar tasks took considerable effort, so the process could not be performed easily.

In contrast, the information processing system 11 can dynamically assign activity IDs from the movement trajectories of a plurality of observation targets without training data being set in advance, and can effectively extract classification results. What kind of activity each activity ID corresponds to can be determined, after the activities have been classified, by a person who views the images or other records.

Specifically, the information processing system 11 can, for example, automatically classify what kinds of play the children at a nursery school are engaged in, after which a nursery teacher can label the content of the play (for example, hide-and-seek or tag). Furthermore, by classifying the children's activities over a long period with the information processing system 11, the growth of those children as a group can be observed.

Furthermore, the information processing system 11 can, for example, search for similar activities (observation scenes) observed in the past by referring to the classification results accumulated in the storage device 14.
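One simple way such a search could work is to look up stored time windows that share an activity ID with a query window, as sketched below. The function name `find_similar_windows` and the representation of the stored results as a list of activity IDs indexed by window are hypothetical; the description does not specify a storage layout or a search method.

```python
from typing import List


def find_similar_windows(activity_ids: List[int], query_window: int) -> List[int]:
    """Return indices of the other windows that share the query's activity ID."""
    target = activity_ids[query_window]
    return [i for i, a in enumerate(activity_ids)
            if a == target and i != query_window]
```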

In particular, because the information processing system 11 uses activity feature quantities for the plurality of observation targets as a whole, it can accurately classify their overall activity even when, for example, the accuracy of identifying individual observation targets is low or the number of observation targets is not fully known.

<Flowchart of activity classification process>

Next, the activity classification process executed by the activity classification processing device 15 will be described with reference to the flowchart shown in FIG. 6.

Processing starts, for example, when color images and depth images covering a certain period of time have been supplied from the observation device 12. In step S11, the movement trajectory extraction unit 21 extracts a movement trajectory for each observation target based on the color images and depth images supplied from the observation device 12. The process by which the movement trajectory extraction unit 21 extracts the movement trajectory of each observation target is described in detail in Non-Patent Document 1 cited above.

In step S12, the behavior classification unit 22 performs unsupervised clustering by HDP-HMM on the movement trajectories extracted by the movement trajectory extraction unit 21 in step S11, thereby classifying the position and velocity of each observation target into behaviors.

In step S13, the activity feature quantity generation unit 23 generates histograms representing how often each behavior classified by the behavior classification unit 22 in step S12 was performed, as activity feature quantities representing the characteristics of the activity of the observation targets as a whole.

In step S14, the activity classification unit 24 classifies the activity feature quantities generated by the activity feature quantity generation unit 23 in step S13 into activities by unsupervised clustering with LDA, HDP-LDA, or the like, and outputs the resulting activity IDs as the classification result.

As described above, by performing two-stage unsupervised clustering on the movement trajectories of a plurality of observation targets (a large amount of data), the activity classification processing device 15 can effectively extract classification results in which the data is classified into activities of the observation targets as a whole. The activity classification processing device 15 can perform such clustering in real time, for example, in step with the continuous supply of color images and depth images being captured. The activity classification processing device 15 may of course also process color images and depth images that have already been recorded.

In addition to classifying children's play as described above, the present technology can be used, for example, to classify the sports being played in a gymnasium in order to manage the operation of the gymnasium. The present technology can also be used for crime prevention, for example, by classifying the movements of people in a specific area and extracting people who behave abnormally.

In the present embodiment, the position and velocity of each observation target (the velocity being a processing result obtained by computing the temporal change of position along the movement trajectory) have been described as the data to be clustered. However, the movement trajectory of each observation target may itself be clustered, for example. Other processing results can also be clustered, such as acceleration, obtained by computing the temporal change of velocity along the movement trajectory of each observation target, or a relationship (for example, a correlation coefficient), obtained by computing the mutual relationship between the movement trajectories of a plurality of observation targets.
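The two alternative processing results mentioned here can be computed roughly as sketched below. The function names `acceleration` and `pearson`, the fixed sampling interval `dt`, the use of the Pearson coefficient as the correlation coefficient, and the value returned for a constant series are assumptions for illustration; the description does not say which correlation measure is meant or which signals are correlated.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def acceleration(velocities: List[Vector], dt: float = 1.0) -> List[Vector]:
    """Temporal change of velocity between consecutive samples."""
    return [((vx2 - vx1) / dt, (vy2 - vy1) / dt)
            for (vx1, vy1), (vx2, vy2) in zip(velocities, velocities[1:])]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series,
    for example the x coordinates of two targets' trajectories."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a constant series is treated as uncorrelated.
        return 0.0
    return cov / (norm_a * norm_b)
```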

Furthermore, the information processing system 11 may use time-series data other than the movement trajectories of observation targets. By unsupervised clustering of the data values and processing results of such time-series data, it can extract classification results for desired classification items other than behaviors and activities, for example. The activity feature quantity used in the information processing system 11 is not limited to the behavior frequency histogram described above, as long as it represents the characteristics of the activity of the plurality of observation targets as a whole.

The processes described with reference to the above flowchart do not necessarily have to be performed in time series in the order shown in the flowchart; they also include processes executed in parallel or individually (for example, parallel processing or object-based processing). The program may be processed by a single CPU or processed in a distributed manner by a plurality of CPUs.

The series of processes described above (the information processing method) can be executed by hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a program recording medium on which it is recorded onto a computer built into dedicated hardware or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed.

FIG. 7 is a block diagram showing a hardware configuration example of a computer that executes the series of processes described above by means of a program.

In the computer, a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, and a RAM (Random Access Memory) 103 are connected to one another by a bus 104.

An input/output interface 105 is also connected to the bus 104. Connected to the input/output interface 105 are an input unit 106 including a keyboard, a mouse, a microphone, and the like; an output unit 107 including a display, a speaker, and the like; a storage unit 108 including a hard disk, a non-volatile memory, and the like; a communication unit 109 including a network interface and the like; and a drive 110 that drives a removable medium 111 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.

In the computer configured as described above, the series of processes described above is performed when the CPU 101 loads, for example, a program stored in the storage unit 108 into the RAM 103 via the input/output interface 105 and the bus 104 and executes it.

The program executed by the computer (CPU 101) is provided either recorded on the removable medium 111, which is a package medium such as a magnetic disk (including a flexible disk), an optical disk (CD-ROM (Compact Disc-Read Only Memory), DVD (Digital Versatile Disc), or the like), a magneto-optical disk, or a semiconductor memory, or via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.

The program can then be installed in the storage unit 108 via the input/output interface 105 by mounting the removable medium 111 in the drive 110. The program can also be received by the communication unit 109 via a wired or wireless transmission medium and installed in the storage unit 108. Alternatively, the program can be installed in advance in the ROM 102 or the storage unit 108.

Note that the present embodiment is not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present disclosure.

11 information processing system, 12 observation device, 13 input device, 14 storage device, 15 activity classification processing device, 21 movement trajectory extraction unit, 22 behavior classification unit, 23 activity feature quantity generation unit, 24 activity classification unit

Claims

An information processing apparatus comprising:
a first classification unit that classifies the time-series data of each observation target into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets; and
a second classification unit that classifies a set of the first classification items for the plurality of observation targets as a whole into second classification items by unsupervised clustering of feature quantities representing characteristics of the set of the first classification items for the plurality of observation targets as a whole.

The information processing apparatus according to claim 1, wherein the first classification item is composed of a data value of the time-series data for each observation target and a processing result obtained by processing the time-series data.

The information processing apparatus according to claim 1 or 2, further comprising a feature quantity generation unit that generates the feature quantity for the plurality of observation targets as a whole in an arbitrary time period, based on the first classification item of each observation target classified by the first classification unit.

The information processing apparatus according to claim 3, wherein the feature quantity generation unit generates, as the feature quantity, a histogram that counts the number of times the first classification item of each of the plurality of observation targets appears in an arbitrary time period.
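The histogram feature recited above can be illustrated with a minimal Python sketch. The function name `activity_histogram`, the `(time, item)` data layout, the half-open time window, and the target identifiers are all hypothetical choices made for illustration; the claim itself does not prescribe any of them.

```python
from collections import Counter


def activity_histogram(labels_by_target, start, end, num_items):
    """Count how often each first classification item appears in [start, end).

    labels_by_target maps a target id to a list of (time, item) pairs, where
    item is an integer in range(num_items). Counts are pooled over all
    targets, so the result describes the group as a whole. (Assumed layout.)
    """
    counts = Counter(
        item
        for series in labels_by_target.values()
        for t, item in series
        if start <= t < end
    )
    # Fixed-length vector so histograms from different windows are comparable.
    return [counts.get(i, 0) for i in range(num_items)]


# Hypothetical first classification items (0, 1, 2) for two observation targets.
labels = {
    "child_a": [(0, 0), (1, 0), (2, 1), (3, 2)],
    "child_b": [(0, 1), (1, 1), (2, 1), (3, 0)],
}
hist = activity_histogram(labels, start=0, end=3, num_items=3)
print(hist)
```

Because the vector always has `num_items` entries, histograms computed for different time periods can be fed directly to a later clustering step.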

The information processing apparatus according to any one of claims 1 to 4, further comprising an extraction unit that extracts, as the time-series data, movement trajectories obtained by tracking the positions of the plurality of observation targets as they move.

The information processing apparatus according to claim 5, wherein
the first classification unit classifies the behavior of each observation target as the first classification item based on the movement trajectory of each observation target extracted by the extraction unit, and
the second classification unit classifies the activity of the plurality of observation targets as a whole as the second classification item based on the set of behaviors.

An information processing method comprising the steps of:
classifying the time-series data of each observation target into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets; and
classifying a set of the first classification items for the plurality of observation targets as a whole into second classification items by unsupervised clustering of feature quantities representing characteristics of the set of the first classification items for the plurality of observation targets as a whole.

A program that causes a computer to execute information processing comprising the steps of:
classifying the time-series data of each observation target into first classification items by unsupervised clustering of time-series data obtained by observing a plurality of observation targets; and
classifying a set of the first classification items for the plurality of observation targets as a whole into second classification items by unsupervised clustering of feature quantities representing characteristics of the set of the first classification items for the plurality of observation targets as a whole.
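The two-stage classification recited in the claims can be sketched in Python as follows. This is only an illustration under stated assumptions: the claims do not name a clustering algorithm here, so a small deterministic k-means is used purely as a stand-in; the one-dimensional "speed" samples, the fixed window length, the equal-length series, and all function and variable names are hypothetical.

```python
import math


def kmeans(points, k, iters=20):
    """Minimal k-means stand-in with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign


def classify_activities(series, k_behavior, k_activity, window):
    """Two-stage sketch; assumes every target has a series of equal length."""
    targets = sorted(series)
    n = len(series[targets[0]])
    # Stage 1: cluster every sample of every target into behavior labels.
    pooled = [[v] for t in targets for v in series[t]]
    flat = kmeans(pooled, k_behavior)
    behaviors = {t: flat[i * n:(i + 1) * n] for i, t in enumerate(targets)}
    # Feature: per-window histogram of behavior labels over all targets.
    hists = []
    for s in range(0, n, window):
        h = [0] * k_behavior
        for t in targets:
            for label in behaviors[t][s:s + window]:
                h[label] += 1
        hists.append(h)
    # Stage 2: cluster the window histograms into activity labels.
    activities = kmeans(hists, k_activity)
    return behaviors, activities


# Hypothetical per-step speeds for two observation targets.
series = {
    "a": [0.1, 0.2, 2.0, 2.1, 0.1, 0.2, 1.9, 2.0],
    "b": [0.2, 0.1, 2.1, 2.0, 0.2, 0.1, 2.0, 1.9],
}
behaviors, activities = classify_activities(series, 2, 2, window=2)
print(behaviors)
print(activities)
```

In this toy data both targets alternate between slow and fast movement in step with each other, so the window histograms fall into two groups and the second stage assigns alternating activity labels to the four windows. A real implementation would likely replace k-means with a clustering method suited to time-series data.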