JP2019191946A - Information processing device

Info

Publication number: JP2019191946A
Application number: JP2018084268A
Authority: JP
Inventors: Yunjeong Yang (允禎梁); Masatake Shimodaira (真武下平); Masatsugu Sakon (昌嗣左近); Risa Natsukawa (里紗夏川); Yoshihiro Yasuhara (吉洋安原)
Original assignee: Pioneer Electronic Corp
Current assignee: Pioneer Corp
Priority date: 2018-04-25
Filing date: 2018-04-25
Publication date: 2019-10-31

Abstract

PROBLEM TO BE SOLVED: To make it possible to designate a position for predetermined processing accurately and easily. SOLUTION: In a control unit 4, a position recognition unit 41 acquires point cloud information and, on the basis of that information, recognizes a position in a vehicle interior that a person indicates using, for example, a fingertip. A sound image control unit 42 then localizes a sound image at the position recognized by the position recognition unit 41. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus that performs predetermined processing.

In vehicle-mounted audio systems, for example, it is already known to control the localization position of a sound image by applying delay time differences to the outputs of a plurality of speakers.

Patent Document 1 describes accepting a designation of a localization position through a position input on a dashboard image 350 of a localization position reception window 300, and setting the delay time of the audio signal output to each speaker 3 so that the sound image is localized at the accepted position. It also describes accepting the designation by displaying an in-vehicle layout image 850, which is a layout image of the vehicle in the horizontal direction.

Patent Document 1: JP 2006-196941 A

In the invention described in Patent Document 1, the localization position is designated on a two-dimensional image. A sound image, however, is actually localized at a position in three-dimensional space, so it is difficult to designate the position precisely using a two-dimensional image.

One example of the problem to be solved by the present invention is to make the designation of a position for predetermined processing accurate and easy.

To solve the above problem, the invention according to claim 1 comprises: a first acquisition unit that acquires information on an object present in a predetermined space; a first recognition unit that recognizes, on the basis of the information on the object, a position in the predetermined space that a person present in the predetermined space indicates with a part of the body; and a processing unit that performs processing based on the position recognized by the first recognition unit.

The invention according to claim 5 is an information processing method executed by an information processing apparatus that performs predetermined processing, the method comprising: a first acquisition step of acquiring information on an object present in a predetermined space; a first recognition step of recognizing, on the basis of the information on the object, a position in the predetermined space that a person present in the predetermined space indicates with a part of the body; and a processing step of performing processing based on the position recognized in the first recognition step.

The invention according to claim 6 causes a computer to execute the information processing method according to claim 5.

FIG. 1 is a functional configuration diagram of an acoustic system having an information processing apparatus according to a first embodiment of the present invention.
FIG. 2 is a diagram in which a person such as a user indicates a position with an index finger.
FIG. 3 is a flowchart of the operation of the information processing apparatus shown in FIG. 1.
FIG. 4 is a diagram of another method of indicating a position.
FIG. 5 is a functional configuration diagram of an input device having an information processing apparatus according to a second embodiment of the present invention.
FIG. 6 is an explanatory diagram of the detection of an input operation on an operation image displayed on a display surface.
FIG. 7 is an explanatory diagram of an operation for executing the function associated with a highlighted icon.
FIG. 8 is a flowchart of the operation of the information processing apparatus shown in FIG. 5.
FIG. 9 is an explanatory diagram of another operation for executing the function associated with a highlighted icon.

An information processing apparatus according to one embodiment of the present invention is described below. In this apparatus, a first acquisition unit acquires information on an object present in a predetermined space, and a first recognition unit recognizes, on the basis of that information, a position in the predetermined space that a person present in the space indicates with a part of the body. A processing unit then performs processing based on the position recognized by the first recognition unit. Processing can thus be based on the position that the person indicates with a part of the body. Because the position can be designated directly in three-dimensional space, the position for the predetermined processing can be designated accurately and easily.

The first acquisition unit may acquire the information on the object present in the predetermined space from a sensor that can measure the distance to an object by emitting electromagnetic waves into the predetermined space and receiving the electromagnetic waves reflected by objects in that space. Using such a sensor makes it easy to acquire information on objects present in the predetermined space. In addition, because the distance to the object can be measured, an accurate position is easy to identify.

The apparatus may further include a second acquisition unit that acquires uttered speech and a second recognition unit that recognizes the content of the uttered speech, and the processing unit may perform processing based on the position recognized by the first recognition unit and the content recognized by the second recognition unit. Because the processing then also takes the content of the uttered speech into account, the position can be identified with higher accuracy.

The processing unit may perform processing that localizes, at the position recognized by the first recognition unit, a sound image formed in the predetermined space by sounds emitted from a plurality of speakers. This allows the position at which the sound image is to be localized to be designated directly in a three-dimensional space such as a room.

In an information processing method according to one embodiment of the present invention, information on an object present in a predetermined space is acquired in a first acquisition step, and a position in the predetermined space that a person present in the space indicates with a part of the body is recognized in a first recognition step on the basis of that information. Processing based on the position recognized in the first recognition step is then performed in a processing step. Processing can thus be based on the position that the person indicates with a part of the body. Because the position can be designated directly in three-dimensional space, the position for the predetermined processing can be designated accurately and easily.

The information processing method described above may be executed by a computer. A computer can then be used to perform processing based on the position, recognized in the first recognition step, that a person present in the predetermined space indicates with a part of the body. Because the position can be designated directly in three-dimensional space, the position for the predetermined processing can be designated accurately and easily.

An acoustic system having an information processing apparatus according to a first embodiment of the present invention will be described with reference to FIGS. 1 to 4. The acoustic system is mounted on a vehicle such as an automobile. FIG. 1 shows the functional configuration of the acoustic system having the information processing apparatus. The acoustic system includes an audio device 1, delay adjustment units 2 (2R, 2L), speakers 3 (3R, 3L), a control unit 4, a microphone 5, and a lidar 7. In this embodiment, the acoustic system controls the sound in the vehicle interior, which serves as the predetermined space.

The audio device 1 is a device that outputs the audio signal the user intends to listen to, such as a CD player, a memory audio player, a radio tuner, or a receiver for streaming distribution.

In this embodiment, the delay adjustment unit 2 consists of a delay adjustment unit 2L for the left channel and a delay adjustment unit 2R for the right channel. The delay adjustment unit 2L delays the left-channel audio signal output from the audio device 1 to the left-channel speaker 3L by a set delay time. The delay adjustment unit 2R delays the right-channel audio signal output from the audio device 1 to the right-channel speaker 3R by a set delay time.

The control unit 4 includes a position recognition unit 41, a sound image control unit 42, and a voice recognition unit 43. The control unit 4 is constituted by, for example, a microcomputer having a CPU. The position recognition unit 41, the sound image control unit 42, and the voice recognition unit 43 may be implemented as functions of a computer program executed by the CPU. The control unit 4 functions as the information processing apparatus according to this embodiment.

On the basis of the point cloud information output by the lidar 7, the position recognition unit 41 recognizes the position that the user indicates with a part of the body (for example, a fingertip). An example of position recognition by the position recognition unit 41 is described with reference to FIG. 2, which shows a person such as a user indicating a position with the index finger IF.

In FIG. 2, the user's hand H, including the index finger IF, is scanned by the lidar 7 and acquired as point cloud information. From the acquired point cloud information, the position recognition unit 41 recognizes the hand H and the index finger IF by well-known object recognition. Once the index finger IF is recognized, the position of the fingertip in the cabin space can be recognized, because the point cloud information includes the distance and direction between the lidar 7 and the fingertip of the recognized index finger IF.
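The step from "distance and direction" to a position in the cabin can be illustrated with a minimal sketch. It assumes that the direction is given as azimuth and elevation angles and that the lidar's axes are aligned with the cabin's axes; the function name `lidar_to_cabin` and its parameters are hypothetical and do not appear in the source.

```python
import math


def lidar_to_cabin(distance_m, azimuth_rad, elevation_rad,
                   lidar_origin=(0.0, 0.0, 0.0)):
    """Convert one lidar return (range plus direction) to (x, y, z) in the cabin.

    Assumption: the lidar axes are parallel to the cabin axes, so only a
    translation by the lidar mounting position is needed.
    """
    horizontal = distance_m * math.cos(elevation_rad)  # projection on the x-y plane
    x = lidar_origin[0] + horizontal * math.cos(azimuth_rad)
    y = lidar_origin[1] + horizontal * math.sin(azimuth_rad)
    z = lidar_origin[2] + distance_m * math.sin(elevation_rad)
    return (x, y, z)
```

A real installation would generally also need a rotation between the sensor frame and the cabin frame, which this sketch leaves out.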

Object recognition need not rely on the point cloud information alone. For example, a separate camera may be provided, the index finger IF may be recognized from an image captured by the camera, and the point cloud corresponding to the index finger IF may be extracted by matching the point cloud image against the camera image.

On the basis of the fingertip position recognized by the position recognition unit 41 and the speech information recognized by the voice recognition unit 43, the sound image control unit 42 adjusts and sets the delay amounts of the delay adjustment units 2L and 2R so that the sound image formed in the vehicle interior by the sounds emitted from the speakers 3L and 3R is localized at the designated position (specific position). In other words, the control unit 4 (sound image control unit 42) functions as a localization control unit that localizes, at a specific position, the sound image formed in the vehicle interior by the sounds emitted from the speakers 3L and 3R.

Adjustment of the localization position in the sound image control unit 42 is described next. First, the distance between the fingertip position recognized by the position recognition unit 41 and each of the speakers 3L and 3R is obtained. Because the speakers 3L and 3R are fixed in the vehicle interior, the distance and direction from each speaker to the lidar 7 are known. Therefore, from the distance and direction between the fingertip of the index finger IF and the lidar 7, and the distance and direction between the speakers 3L and 3R and the lidar 7, the distance from the speaker 3L to the fingertip and the distance from the speaker 3R to the fingertip can be obtained. The output of the speaker 3L (3R) with the shorter distance is then delayed according to the difference between its distance and that of the speaker 3R (3L) with the longer distance.
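The delay rule above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: positions are taken as (x, y, z) coordinates in metres in a common cabin frame, the path difference is converted to time with a fixed speed of sound of 343 m/s, and the name `speaker_delays` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def speaker_delays(target, speaker_l, speaker_r):
    """Return (delay_left_s, delay_right_s) for a target localization position.

    The speaker nearer to the target is delayed by the path-length
    difference divided by the speed of sound; the farther one gets no delay.
    """
    d_l = math.dist(target, speaker_l)  # distance from speaker 3L to the fingertip
    d_r = math.dist(target, speaker_r)  # distance from speaker 3R to the fingertip
    diff_s = abs(d_l - d_r) / SPEED_OF_SOUND_M_S
    if d_l < d_r:
        return (diff_s, 0.0)
    return (0.0, diff_s)
```

When the target is equidistant from both speakers, the difference is zero and neither channel is delayed.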

The voice recognition unit 43 recognizes the speech signal (utterance content) of the sound collected by the microphone 5 using a well-known speech recognition algorithm, and outputs the recognized utterance content to the sound image control unit 42 as speech information.

The microphone 5 is installed at a position in the vehicle interior where the user's voice can be picked up, such as the dashboard or the ceiling. The microphone 5 picks up what the user or another person says, converts it into an electrical signal, and outputs it as a speech signal to the control unit 4 (voice recognition unit 43).

The lidar 7 is installed in the vehicle interior, which serves as the predetermined space, or at a position from which the vehicle interior can be scanned. The lidar 7 is a sensor that recognizes objects present in the surroundings it scans, and is also written LiDAR (Light Detection And Ranging). It is a known sensor that emits electromagnetic waves such as laser light, discretely measures the direction and distance to objects in the scanning range from the reflected waves (reflected light), and recognizes the position, shape, and the like of each object as a three-dimensional point cloud. The point cloud recognized by the lidar 7 is therefore output as point cloud information, which is the information on objects present in the predetermined space.

The number of lidars 7 is not limited to one; a plurality may be installed. In the case of a vehicle, for example, the number and installation positions of the lidars 7 may be determined as appropriate according to the size of the cabin space concerned.

Although not shown in FIG. 1, the acoustic system may include a display unit for confirming operations and for various displays.

Next, the operation (information processing method) of the control unit 4 configured as described above will be described with reference to the flowchart of FIG. 3. An information processing program can be obtained by implementing the flowchart of FIG. 3 as a program executed by the CPU of the control unit 4.

First, in step S11, the position recognition unit 41 acquires point cloud information from the lidar 7. That is, the position recognition unit 41 functions as the first acquisition unit that acquires information on an object present in the predetermined space.

Next, in step S12, the position recognition unit 41 recognizes the position of the fingertip of the index finger IF on the basis of the point cloud information acquired from the lidar 7, as described above. That is, the position recognition unit 41 functions as the first recognition unit that recognizes, on the basis of the information on the object, the position in the predetermined space that a person present in the space indicates with a part of the body.

Next, in step S13, the voice recognition unit 43 performs speech recognition on the speech signal acquired from the microphone 5. That is, the voice recognition unit 43 functions as the second acquisition unit that acquires uttered speech and as the second recognition unit that recognizes the content of the uttered speech. Step S13 may be performed in parallel with step S12.

Next, in step S14, the sound image control unit 42 adjusts the delay adjustment units 2L and 2R so that the sound image is localized at the designated localization position, on the basis of the position recognized in step S12 and the speech recognized in step S13. This step is executed only when a specific utterance such as "here" or "localize it here" is input from the microphone 5. In other words, the instruction is given by voice as well as by pointing with a finger, so that the designation of the position at which the sound image is to be localized is recognized reliably. That is, the sound image control unit 42 functions as the processing unit that performs processing based on the position recognized by the first recognition unit and the content recognized by the second recognition unit. The specific utterance to be spoken may be presented to the user on a display unit (not shown).

Next, in step S15, the sound image control unit 42 displays, for example, a message asking whether readjustment is necessary on a display unit or the like (not shown). If an input indicating that readjustment is necessary is made (YES), the process returns to step S11 and the point cloud information is acquired again. If an input indicating that readjustment is unnecessary is made (NO), the flowchart ends.

As is clear from the above description, step S11 functions as the first acquisition step, step S12 as the first recognition step, and step S14 as the processing step.
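One pass through steps S11 to S14, including the condition that localization runs only when a specific utterance accompanies the pointing gesture, might be sketched as below. The three callables stand in for the position recognition unit, the voice recognition unit, and the delay adjustment; their names, the use of `None` for "nothing recognized", and the exact-match comparison against trigger phrases are all assumptions made for illustration.

```python
# Example trigger phrases based on the ones quoted in the text.
TRIGGER_PHRASES = ("here", "localize it here")


def run_once(get_fingertip, get_utterance, apply_localization):
    """Run steps S11 to S14 once; return True if localization was applied.

    get_fingertip:      hypothetical callable returning (x, y, z) or None (S11, S12)
    get_utterance:      hypothetical callable returning recognized text or None (S13)
    apply_localization: hypothetical callable that sets the delays for a position (S14)
    """
    position = get_fingertip()
    utterance = get_utterance()
    if position is None or utterance is None:
        return False
    if utterance.strip().lower() not in TRIGGER_PHRASES:
        return False  # pointing without the spoken instruction is ignored
    apply_localization(position)
    return True
```

Step S15 would wrap this in a loop that repeats while the user asks for readjustment.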

In the above description, the localization position is designated by pointing with a finger, but the designation is not limited to this. For example, as shown in FIG. 4, the position may be designated by moving the finger in a circle so as to enclose a certain range. In this case, the range around which the finger is circling is identified from the change in the finger position over time, and the center of that range is regarded as the localization position.
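As a rough sketch of taking "the center of the range" from a circling gesture, the fingertip positions sampled over time can be reduced to the midpoint of their bounding box. The source does not say how the center is computed; the bounding-box midpoint is one simple choice assumed here, and the name `circled_center` is hypothetical.

```python
def circled_center(fingertip_positions):
    """Return the midpoint of the bounding box of sampled (x, y, z) positions.

    Assumption: the samples cover roughly one full circle, so the bounding
    box of the trajectory approximates the enclosed range.
    """
    if not fingertip_positions:
        raise ValueError("at least one fingertip position is required")
    center = []
    for axis in range(3):
        values = [p[axis] for p in fingertip_positions]
        center.append((min(values) + max(values)) / 2.0)
    return tuple(center)
```

The bounding-box midpoint is less sensitive to uneven sampling along the path than a plain average of the samples would be.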

Processing based on the position is not limited to setting the localization position of a sound image; the position may also be used for correcting the frequency characteristics at a specific position. That is, an equalizer or the like is adjusted so that the frequency characteristics at the position designated with a finger or the like become flat. The user can then listen to music and the like with suitable frequency characteristics at the designated position.

As another example, suppose that a movable directional speaker is being used for a hands-free call with its directivity aimed at the driver. If a person says "point the sound over there" while indicating a direction by hand, the speaker can be controlled so that its directivity faces that direction.

Likewise, when an image is being displayed by a projector or the like on, for example, the windshield, and the user reclines the seat and lies face up, the user can move the image to the ceiling surface of the vehicle interior by pointing at the ceiling by hand. The technique can also be used to change the spot position of a directional light used for reading or the like. In designating these positions, the index finger IF need not be used; another finger may be used instead. For example, when the hand is waved from right to left, the position at which the hand is detected at the left end may be taken as the designated position.

According to this embodiment, in the control unit 4, the position recognition unit 41 acquires point cloud information and, on the basis of that information, recognizes the position in the vehicle interior that a person indicates with, for example, a fingertip. The sound image control unit 42 then localizes the sound image at the position recognized by the position recognition unit 41. The sound image can thus be localized on the basis of the position that a person in the vehicle interior indicates with a finger or the like. Because the position can be designated directly in a three-dimensional space such as the vehicle interior, the position at which the sound image is to be localized can be designated accurately and easily.

The position recognition unit 41 acquires the point cloud information from the lidar 7, so the point cloud information can be acquired easily. In addition, because the distance to the object can be measured, an accurate position is easy to identify.

The control unit 4 also includes the voice recognition unit 43, which acquires uttered speech and recognizes its content, and the sound image control unit 42 localizes the sound image on the basis of the position recognized by the position recognition unit 41 and the content recognized by the voice recognition unit 43. Because the processing also takes the content of the uttered speech into account, the position can be identified with higher accuracy.

Next, an information processing apparatus according to a second embodiment of the present invention will be described with reference to FIGS. 5 to 9. Parts identical to those of the first embodiment described above are given the same reference numerals, and their description is omitted.

FIG. 5 shows the functional configuration of an input device that includes a control unit 4A as the information processing apparatus according to this embodiment. The input device includes the control unit 4A, a lidar 7, and a projector 8. The lidar 7 is the same as in the first embodiment.

The projector 8 is installed, for example, in the vehicle interior, and displays images for various operations and the like using the windshield, the ceiling surface, or a similar surface as the display surface.

The control unit 4A includes a motion recognition unit 44 and an input control unit 45. Using the point cloud information output by the lidar 7, the motion recognition unit 44 recognizes a predetermined motion such as a gesture from the change over time in the distance from the lidar 7 to a part of the user's body (for example, a fingertip).

The input control unit 45 changes the image displayed by the projector 8 on the basis of the motion recognized by the motion recognition unit 44. The input control unit 45 also outputs commands and the like to devices outside the input device so that the function corresponding to the recognized motion is executed.

An example of motion recognition by the motion recognition unit 44 and of image changes by the input control unit 45 is described with reference to FIG. 6. FIG. 6 is an explanatory diagram of the detection of an input operation on an operation image C that the projector 8 displays on a display surface D.

As shown in FIG. 6, the operation image C is displayed on the display surface D, and a plurality of icons I1 to I15 are arranged in the operation image C. When one of the icons I1 to I15 is operated, the function assigned to that icon is executed. However, when the display surface D is a windshield, a ceiling surface, or the like, it may be out of the user's reach. Moreover, the display surface D is not a touch panel in the first place, so a user sitting in a seat cannot operate it by touch.

In this embodiment, therefore, the operation image C is operated remotely: the selection of the icons I1 to I15 shown in FIG. 6 is performed by recognizing gestures made with the user's hand H from the change over time in the point cloud information acquired by the lidar 7. First, in the operation image C, the input control unit 45 highlights the icon at a predetermined default position, such as the center of the screen (icon I8 in FIG. 6), to put it in the selected state. Then, to select the icon I7 to the left of the icon I8, for example, the user waves the hand H to the left. The motion recognition unit 44 detects the change over time in the position of the hand H across a plurality of frames of the point cloud information detected by the lidar 7, which reveals the direction in which the hand H moved (was waved) and allows the gesture or other motion to be recognized. As in the first embodiment, the hand H is recognized from the point cloud information by object recognition.

動作認識部４４で手Ｈを左に振るジェスチャ等の動作を認識したことにより、入力制御部４５はハイライト表示するアイコンを１つ左にずらす（アイコンＩ７をハイライト表示する）。なお、図６は左側のアイコンをハイライト表示する操作を説明したが、右側や上下のアイコンをハイライト表示する操作も同様にすることができる。 When the motion recognition unit 44 recognizes a motion such as a gesture of waving the hand H to the left, the input control unit 45 shifts the highlight one icon to the left (highlights the icon I7). Although FIG. 6 illustrates the operation of highlighting the icon on the left, icons to the right, above, or below can be highlighted in the same way.
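
The highlight-shifting behavior described above can be sketched roughly as follows. This is only an illustration under several assumptions that the description does not state: the icons I1 to I15 are assumed to be laid out in 3 rows of 5, numbered row by row from the top left (so that I8 is the center icon), the hand position is assumed to be already reduced to a 2D centroid per frame with x increasing to the right and y increasing upward, and the names `swipe_direction`, `move_highlight`, and the `min_travel` threshold are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) hand centroid per frame, in meters (assumed)

# Assumed layout: 3 rows x 5 columns, numbered row by row, so I8 is the center.
ROWS, COLS = 3, 5


def swipe_direction(track: Sequence[Point], min_travel: float = 0.10) -> Optional[str]:
    """Classify the net displacement of the hand over several frames.

    Returns "left", "right", "up", "down", or None when the hand moved
    less than min_travel (a hypothetical threshold).
    """
    if len(track) < 2:
        return None
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    if max(abs(dx), abs(dy)) < min_travel:
        return None
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"


def move_highlight(icon: int, direction: Optional[str]) -> int:
    """Return the icon number (1-based) to highlight after a swipe.

    The highlight stays where it is at the edge of the grid.
    """
    row, col = divmod(icon - 1, COLS)
    if direction == "left":
        col = max(col - 1, 0)
    elif direction == "right":
        col = min(col + 1, COLS - 1)
    elif direction == "up":
        row = max(row - 1, 0)
    elif direction == "down":
        row = min(row + 1, ROWS - 1)
    return row * COLS + col + 1
```

With the default icon I8 highlighted, a track whose x coordinate decreases by 0.2 m is classified as a leftward wave and moves the highlight to I7, matching the example in the text.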

次に、ハイライト表示したアイコンに割り当てられている機能を実行させる操作について図７を参照して説明する。本実施例では、ハイライト表示したアイコンに割り当てられている機能を実行させる操作としてはクリック操作或いはタップ操作に相当する動作を認識することにより行う。 Next, the operation for executing the function assigned to the highlighted icon will be described with reference to FIG. 7. In this embodiment, the function assigned to the highlighted icon is executed by recognizing a motion corresponding to a click operation or a tap operation.

クリック操作或いはタップ操作をする際は、図７に示したように、例えば人差し指ＩＦを指の付け根から表示面Ｄに向かって曲げるようにする。つまり、指の腹の部分が表示面Ｄに近づくようにするのが一般的である。本実施例では、この動作をライダ７で検出する。 When performing a click or tap operation, the user bends, for example, the index finger IF from its base toward the display surface D, as shown in FIG. 7. In other words, the pad of the finger generally moves closer to the display surface D. In this embodiment, this motion is detected by the lidar 7.

図７において、ライダ７は、表示面Ｄの近傍に設置されているものとする。即ち、人物の指先が近づくことで、ライダ７と指先の距離が変化する位置に設置されている。 In FIG. 7, the lidar 7 is assumed to be installed near the display surface D. That is, it is installed at a position where the distance between the lidar 7 and the fingertip changes as the person's fingertip approaches.

図７の場合においては、動作認識部４４が人差し指ＩＦを曲げる前の距離（図７実線ｄ）と人差し指ＩＦを曲げた後の距離（図７破線ｄ’）との変化を検出することによりクリック操作或いはタップ操作に相当する動作を認識する。つまり、クリック操作或いはタップ操作の際には、経時的な変化としてｄ＞ｄ’となることから、このような変化を検出することで、クリック操作或いはタップ操作に相当する動作を認識することができる。 In the case of FIG. 7, the motion recognition unit 44 recognizes a motion corresponding to a click or tap operation by detecting the change between the distance before the index finger IF is bent (solid line d in FIG. 7) and the distance after it is bent (broken line d' in FIG. 7). In other words, during a click or tap operation the distance changes over time such that d > d', so detecting this change makes it possible to recognize a motion corresponding to a click or tap operation.

なお、図７の説明はクリック操作やタップ操作であったが、ダブルクリック操作やダブルタップ操作も同様にして認識することができる。つまり、ｄ＞ｄ’の経時的変化を２回連続して検出した場合はダブルクリック操作やダブルタップ操作と認識すればよい。 Although FIG. 7 was described in terms of click and tap operations, double-click and double-tap operations can be recognized in the same way. That is, when the change over time d > d' is detected twice in succession, it may be recognized as a double-click or double-tap operation.
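
As a rough illustration of the d > d' criterion, the sketch below scans a per-frame series of lidar-to-fingertip distances, reports each frame at which the distance drops below the resting distance d, and treats two such drops in quick succession as a double click. The description gives no thresholds or timing, so `min_drop`, `max_gap_frames`, the use of the first sample as the resting distance, and the function names are all assumptions made for this example.

```python
from typing import List, Sequence


def approach_events(distances: Sequence[float], min_drop: float = 0.02) -> List[int]:
    """Return the frame indices at which the fingertip distance falls
    at least min_drop below the resting distance (the d > d' change).

    The first sample is taken as the resting distance d (an assumption).
    A new event is only reported after the finger has returned to rest.
    """
    events: List[int] = []
    if not distances:
        return events
    rest = distances[0]          # d: distance before the finger is bent
    pressed = False
    for i, dist in enumerate(distances):
        if not pressed and rest - dist >= min_drop:
            events.append(i)     # d' observed: finger moved toward the sensor
            pressed = True
        elif pressed and rest - dist < min_drop / 2:
            pressed = False      # finger straightened again
    return events


def classify(distances: Sequence[float],
             min_drop: float = 0.02,
             max_gap_frames: int = 10) -> str:
    """Classify a distance series as "none", "click", or "double_click"."""
    events = approach_events(distances, min_drop)
    if len(events) >= 2 and events[1] - events[0] <= max_gap_frames:
        return "double_click"
    if events:
        return "click"
    return "none"
```

Requiring the finger to return near the resting distance before a second event is counted is one simple way to keep a single slow bend from being read as two clicks.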

次に、上述した構成の制御部４Ａの動作（情報処理方法）について図８のフローチャートを参照して説明する。また、図８に示したフローチャートを制御部４Ａが有するＣＰＵで実行するプログラムとして構成することで情報処理プログラムとすることができる。 Next, the operation (information processing method) of the control unit 4A configured as described above will be described with reference to the flowchart of FIG. 8. In addition, an information processing program can be obtained by implementing the flowchart shown in FIG. 8 as a program executed by the CPU of the control unit 4A.

まず、ステップＳ２１において、動作認識部４４は、点群情報をライダ７から取得する。即ち、動作認識部４４は、所定空間内に電磁波を出射し、所定空間内の物体によって反射された電磁波を受信することで物体までの距離を測定可能なセンサ（ライダ７）から距離情報を取得する取得部として機能する。 First, in step S21, the motion recognition unit 44 acquires point cloud information from the lidar 7. That is, the motion recognition unit 44 functions as an acquisition unit that acquires distance information from a sensor (the lidar 7) that can measure the distance to an object by emitting an electromagnetic wave into a predetermined space and receiving the electromagnetic wave reflected by the object in that space.

次に、ステップＳ２２において、動作認識部４４は、上述したように、ライダ７から取得した点群情報に基づいて、ハイライト表示するアイコンの変更や、クリック操作等の手Ｈ（指）によってなされたジェスチャ等の動作を認識する。即ち、動作認識部４４は、センサから人物の身体の一部（指先）までの距離の経時的変化に基づいて、画像の特定位置（特定のアイコン）に対する操作を認識する認識部として機能する。 Next, in step S22, as described above, the motion recognition unit 44 recognizes, based on the point cloud information acquired from the lidar 7, motions such as gestures made with the hand H (finger), for example changing the highlighted icon or performing a click operation. That is, the motion recognition unit 44 functions as a recognition unit that recognizes an operation on a specific position (a specific icon) in an image based on the change over time in the distance from the sensor to a part of the person's body (the fingertip).

次に、ステップＳ２３において、入力制御部４５は、ステップＳ２２で動作認識部４４が認識した動作に応じた処理を実行する。例えば、図６に示したような手を振る動作に応じてハイライト表示するアイコンを変更する。或いは図７に示したようなクリック操作等に応じてクリックされたアイコンに割り当てられている機能を実行する。即ち、入力制御部４５は、認識部の認識結果に基づいて、表示部に表示されている画像に対して、所定の処理を行う処理部として機能する。 Next, in step S23, the input control unit 45 executes processing corresponding to the motion recognized by the motion recognition unit 44 in step S22. For example, it changes the highlighted icon in response to a hand-waving motion as shown in FIG. 6, or executes the function assigned to the clicked icon in response to a click operation or the like as shown in FIG. 7. That is, the input control unit 45 functions as a processing unit that performs predetermined processing on the image displayed on the display unit based on the recognition result of the recognition unit.

以上の説明から明らかなように、ステップＳ２１が取得工程、ステップＳ２２が認識工程、ステップＳ２３が処理工程として機能する。 As is clear from the above description, step S21 serves as the acquisition step, step S22 as the recognition step, and step S23 as the processing step.
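
The three steps of FIG. 8 can be pictured as one pass of a simple loop: acquire a frame (S21), recognize a motion from recent frames (S22), and dispatch the matching processing (S23). The sketch below is a hypothetical arrangement, not the structure of the control unit 4A itself; the `ControlUnit` class, the `window` size, the frame representation, and the motion names are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Frame = List[Tuple[float, float, float]]  # one point cloud frame (assumed format)


@dataclass
class ControlUnit:
    acquire: Callable[[], Frame]                        # S21: read one frame from the sensor
    recognize: Callable[[List[Frame]], Optional[str]]   # S22: motion name or None
    handlers: Dict[str, Callable[[], None]]             # S23: processing per motion
    window: int = 5                                     # frames kept for recognition (assumed)
    history: List[Frame] = field(default_factory=list)

    def step(self) -> Optional[str]:
        """Run steps S21 to S23 once and return the recognized motion, if any."""
        self.history.append(self.acquire())             # S21: acquisition step
        self.history = self.history[-self.window:]
        motion = self.recognize(self.history)           # S22: recognition step
        if motion in self.handlers:                     # S23: processing step
            self.handlers[motion]()
        return motion
```

Passing the acquisition, recognition, and processing stages in as callables keeps the sketch independent of any particular sensor driver or display code.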

本実施例によれば、制御部４Ａは、動作認識部４４がライダ７から距離情報を取得し、ライダ７から人物の指までの距離の経時的変化に基づいて、表示面Ｄの表示された操作画像Ｃに対するクリック操作等の動作を認識する。そして、入力制御部４５が動作認識部４４の認識結果に基づいて、表示面Ｄに表示されている操作画像Ｃに対して、クリック操作等に対応する処理を行う。このようにすることにより、タッチパネルや押しボタン等の直接的な入力装置を介さずに、例えば指等の身体の一部の距離の経時的変化に基づいて、操作画像Ｃに対して遠隔入力を容易に行うことができる、つまり操作画像Ｃの操作を容易に行うことができる。 According to this embodiment, in the control unit 4A, the motion recognition unit 44 acquires distance information from the lidar 7 and recognizes a motion such as a click operation on the operation image C displayed on the display surface D, based on the change over time in the distance from the lidar 7 to the person's finger. The input control unit 45 then performs processing corresponding to the click operation or the like on the operation image C displayed on the display surface D, based on the recognition result of the motion recognition unit 44. In this way, remote input to the operation image C can be performed easily, based on the change over time in the distance to a part of the body such as a finger, without using a direct input device such as a touch panel or push buttons; that is, the operation image C can be operated easily.

また、ジェスチャ等の動作で操作することができるので画像に対して直感的な操作をすることが可能となり操作性が向上する。また、ライダ７の検出範囲であれば、いかなる姿勢であっても動作の検出が可能となるので姿勢を問わず操作が可能となる。 In addition, since the operation can be performed by a gesture or the like, an intuitive operation can be performed on the image, and the operability is improved. In addition, since the motion can be detected in any posture within the detection range of the lidar 7, the operation can be performed regardless of the posture.

また、人物の指先が近づくことで距離が変化する位置にライダ７が設置されている。このようにすることにより、カメラ等で撮像された画像では検出が困難な、クリック操作等を容易に検出することができる。人物の指先が近づくような位置にセンサを設置した場合、そのセンサがカメラであると、画像の変化が少なく検出が困難になる場合が多い。一方ライダ７の場合は、距離の変化により検出するので、人物の指先が近づくような位置に設置されても検出が可能となる。また、クリック操作は、指の動きが小さい場合があるので、カメラ等で撮像された画像では動きが捉えられない場合がある。それに対して、ライダ７を利用すれば、指までの距離の情報が得られるので、小さい動きであっても検出が可能である。 In addition, the lidar 7 is installed at a position where the distance changes as the person's fingertip approaches. This makes it easy to detect click operations and the like, which are difficult to detect in images captured by a camera or the like. When a sensor is installed at a position that the person's fingertip approaches and that sensor is a camera, the image often changes little, which makes detection difficult. The lidar 7, by contrast, detects changes in distance, so detection is possible even when it is installed at such a position. Moreover, a click operation may involve only a small finger movement, which images captured by a camera or the like may fail to capture. With the lidar 7, information on the distance to the finger is obtained, so even small movements can be detected.

さらに、ライダ７を利用すると、暗い状態でも認識が可能となる。また、ライダ７のレーザ光は透過するためガラスに映ったものに対して誤認識しない。 Furthermore, using the lidar 7 makes recognition possible even in the dark. In addition, because the laser light of the lidar 7 passes through glass, objects reflected in the glass are not misrecognized.

なお、第２の実施例では、図６や図７に示したような方法で選択するアイコンの変更やクリック操作等を認識していたが、図９に示したような方法でもよい。図９は、表示面Ｄにおいて選択したいアイコンを人差し指ＩＦで指し示すことで選択する方法の説明図である。 In the second embodiment, the change of the icon to be selected, the click operation, and the like are recognized by the method shown in FIGS. 6 and 7, but the method shown in FIG. 9 may be used. FIG. 9 is an explanatory diagram of a method of selecting an icon to be selected on the display surface D by pointing with the index finger IF.

図９に示した方法は、まず、ライダ７で取得した点群情報から、人差し指ＩＦの先端位置ＩＦ１と根本位置ＩＦ２とを特定する。そして、先端位置ＩＦ１と根本位置ＩＦ２を結び、さらに表示面Ｄへ向けて延長した線Ｌと表示面Ｄとが交差する点Ｐを人差し指ＩＦが指し示す位置と認識する。表示面Ｄは車室内に固定されているので、ライダ７との相対位置は予め求めることが可能である。したがって、ライダ７の位置を原点として、表示面Ｄの四隅の位置座標を予め求めておけば、点Ｐと四隅との相対位置から点Ｐが表示面Ｄ内のどこに位置するかを求めることは可能である。そして、点Ｐの表示面Ｄにおける位置に対応するアイコンが選択されたとしてハイライト表示すればよい。 In the method shown in FIG. 9, the tip position IF1 and the base position IF2 of the index finger IF are first identified from the point cloud information acquired by the lidar 7. Then, the point P at which the line L, which connects the base position IF2 and the tip position IF1 and is extended toward the display surface D, intersects the display surface D is recognized as the position indicated by the index finger IF. Since the display surface D is fixed in the passenger compartment, its position relative to the lidar 7 can be determined in advance. Therefore, if the position coordinates of the four corners of the display surface D are determined in advance with the position of the lidar 7 as the origin, where the point P lies within the display surface D can be determined from the position of the point P relative to the four corners. The icon corresponding to the position of the point P on the display surface D is then treated as selected and highlighted.
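
The geometry of FIG. 9 amounts to intersecting a ray with a plane and then expressing the hit point relative to the corners of the display. The following sketch shows one way this could be computed, with the lidar 7 as the coordinate origin. It assumes a flat, rectangular display surface described by three of its corners, and an icon grid of 3 rows by 5 columns numbered from the top left; the function names, the grid layout, and the sample coordinates in the usage note are hypothetical.

```python
from typing import Optional, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pointed_position(tip: Vec, base: Vec, top_left: Vec, top_right: Vec,
                     bottom_left: Vec) -> Optional[Tuple[float, float]]:
    """Intersect the line L (from base IF2 through tip IF1) with the display.

    Returns (s, r) in [0, 1] x [0, 1], where s runs left to right and r runs
    top to bottom, or None if the finger does not point at the display.
    """
    u = sub(top_right, top_left)       # horizontal edge of the display
    v = sub(bottom_left, top_left)     # vertical edge of the display
    normal = cross(u, v)
    direction = sub(tip, base)
    denom = dot(normal, direction)
    if abs(denom) < 1e-9:
        return None                    # line L is parallel to the display
    t = dot(normal, sub(top_left, tip)) / denom
    if t < 0:
        return None                    # finger points away from the display
    p = (tip[0] + t * direction[0],
         tip[1] + t * direction[1],
         tip[2] + t * direction[2])    # point P on the display plane
    rel = sub(p, top_left)
    s = dot(rel, u) / dot(u, u)        # valid because u and v are perpendicular
    r = dot(rel, v) / dot(v, v)
    if not (0.0 <= s <= 1.0 and 0.0 <= r <= 1.0):
        return None                    # P lies outside the display surface
    return s, r


def icon_at(s: float, r: float, rows: int = 3, cols: int = 5) -> int:
    """Map normalized display coordinates to a 1-based icon number."""
    col = min(int(s * cols), cols - 1)
    row = min(int(r * rows), rows - 1)
    return row * cols + col + 1
```

For example, with a 1.0 m by 0.6 m display centered 1 m in front of the sensor, a finger pointing straight ahead from (0, 0, 0.4) to (0, 0, 0.5) hits the center of the display, which maps to icon I8 under the assumed layout.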

選択するアイコンが決定した後は、図７に示した方法でクリック操作等を認識してもよいし、図９に示した手Ｈの状態のまま、人差し指ＩＦを表示面Ｄへ近づけるような動作をしてクリック操作としてもよい。 After the icon to be selected has been determined, the click operation or the like may be recognized by the method shown in FIG. 7, or a motion of moving the index finger IF closer to the display surface D while keeping the hand H in the state shown in FIG. 9 may be treated as the click operation.

また、本発明は上記実施例に限定されるものではない。即ち、当業者は、従来公知の知見に従い、本発明の骨子を逸脱しない範囲で種々変形して実施することができる。かかる変形によってもなお本発明の情報処理装置を具備する限り、勿論、本発明の範疇に含まれるものである。 Further, the present invention is not limited to the above embodiment. That is, those skilled in the art can implement various modifications in accordance with conventionally known knowledge without departing from the scope of the present invention. Of course, such modifications are included in the scope of the present invention as long as the information processing apparatus of the present invention is provided.

４制御部
４１位置認識部（第１取得部、第１認識部）
４２音像制御部（処理部）
４３音声認識部（第２取得部、第２認識部）
４Ａ制御部
４４動作認識部（取得部、認識部）
４５入力制御部（処理部）
4 Control unit
41 Position recognition unit (first acquisition unit, first recognition unit)
42 Sound image control unit (processing unit)
43 Voice recognition unit (second acquisition unit, second recognition unit)
4A Control unit
44 Motion recognition unit (acquisition unit, recognition unit)
45 Input control unit (processing unit)

Claims

A first acquisition unit that acquires information about an object existing in a predetermined space;
A first recognition unit for recognizing a position in the predetermined space indicated by a part of a body of a person existing in the predetermined space based on information on the object;
A processing unit that performs processing based on the position recognized by the first recognition unit;
An information processing apparatus comprising:

The information processing apparatus according to claim 1, wherein the first acquisition unit acquires the information about the object existing in the predetermined space from a sensor capable of measuring a distance to the object by emitting an electromagnetic wave into the predetermined space and receiving the electromagnetic wave reflected by the object in the predetermined space.

The information processing apparatus according to claim 1, further comprising:
a second acquisition unit that acquires an uttered voice; and
a second recognition unit that recognizes the content of the uttered voice,
wherein the processing unit performs processing based on the position recognized by the first recognition unit and the content recognized by the second recognition unit.

The information processing apparatus according to claim 1, wherein the processing unit performs a process of localizing, at the position recognized by the first recognition unit, a sound image formed in the predetermined space by sounds emitted from a plurality of speakers.

An information processing method executed by an information processing apparatus that performs predetermined processing,
A first acquisition step of acquiring information about an object existing in a predetermined space;
A first recognition step for recognizing a position in the predetermined space indicated by a part of a body of a person existing in the predetermined space based on information on the object;
A processing step for performing processing based on the position recognized in the first recognition step;
An information processing method comprising:

An information processing program for causing a computer to execute the information processing method according to claim 5.