JP2024013956A

JP2024013956A - Information processing method, information processing device and program

Info

Publication number: JP2024013956A
Application number: JP2022116437A
Authority: JP
Inventors: 誠柿沼; コーティサックスマン; 斎藤　親
Original assignee: Exa Wizards Inc
Current assignee: Exa Wizards Inc
Priority date: 2022-07-21
Filing date: 2022-07-21
Publication date: 2024-02-01
Anticipated expiration: 2042-07-21
Also published as: JP7462869B2; JP2024014658A

Abstract

[Problem] To record a photographed person who is away from a camera at an appropriate timing.
[Solution]
An information processing method according to an embodiment is executed by an information processing apparatus that includes a display and a camera that photographs the display side. The method includes a process of acquiring a moving image photographed by the camera, a process of estimating a skeleton of the photographed person included in the moving image, a process of determining whether the estimated skeleton satisfies a start condition, and a process of starting recording when the start condition is satisfied.
[Selection diagram] Figure 5

Description

本発明は、情報処理方法、情報処理装置及びプログラムに関する。 The present invention relates to an information processing method, an information processing device, and a program.

近年、運動中の動画を撮影してＡＩで解析するサービスが広がりつつある。このようなサービスを実現する技術として、引用文献１には、被検者の動画像を取得する画像取得部と、前記動画像から被検者の体勢を画像認識する所定の体勢認識方法により被検者の複数の身体特徴点を推定する特徴推定部と、所定の基準部位の実世界における長さである基準長さを記憶する基準記憶部と、前記推定された複数の身体特徴点から定まる前記基準部位に相当する画像上の距離と前記基準長さとの比に基づき、前記被検者の動作の評価に用いる値として、前記複数の身体特徴点の間の画像上の距離から被検者の動作状態を示す値を求める動作解析部と、前記動作状態を示す値を出力する出力部と、を備えることを特徴とする動作状態評価システムが開示されている。 In recent years, services that record videos of people exercising and analyze them using AI have been spreading. As a technology for realizing such a service, Cited Document 1 discloses a motion state evaluation system comprising: an image acquisition unit that acquires a moving image of a subject; a feature estimation unit that estimates a plurality of body feature points of the subject by a predetermined posture recognition method that recognizes the subject's posture from the moving image; a reference storage unit that stores a reference length, which is the real-world length of a predetermined reference part; a motion analysis unit that, based on the ratio between the reference length and the on-image distance corresponding to the reference part as determined from the estimated body feature points, obtains a value indicating the subject's motion state from the on-image distances between the body feature points, as a value used for evaluating the subject's motion; and an output unit that outputs the value indicating the motion state.

特許第６７０３１９９号 Japanese Patent No. 6703199

このようなサービスで利用される動画は、運動中の被撮影者の全身が写っていることが必要な場合が多い。そして、カメラで全身を写すためには、被撮影者がカメラからある程度離れる必要がある。 Videos used in such services often need to show the whole body of the person being photographed while exercising. In order to take a picture of the whole body with a camera, the person being photographed needs to be a certain distance away from the camera.

一方、ＡＩで解析する動画は、余計なシーンが写っていないのが好ましい。このため、適切なタイミングで録画を開始及び終了することが重要である。 On the other hand, it is preferable that videos analyzed by AI do not include unnecessary scenes. Therefore, it is important to start and end recording at appropriate timings.

カメラから離れた位置から適切なタイミングで録画を開始及び終了することは困難であるため、被撮影者が適切な動画を自分で録画するのは難しかった。 Since it is difficult to start and stop recording at an appropriate timing from a position away from the camera, it has been difficult for the person being photographed to record an appropriate video by themselves.

本発明は、上記の課題を鑑みてなされたものであり、カメラから離れた被撮影者を適切なタイミングで録画することを目的とする。 The present invention has been made in view of the above-mentioned problems, and an object of the present invention is to record a photographed person who is away from the camera at an appropriate timing.

一実施形態に係る情報処理方法は、ディスプレイと、前記ディスプレイ側を撮影するカメラと、を備えた情報処理装置が実行する情報処理方法であって、前記カメラが撮影した動画を取得する処理と、前記動画に含まれる被撮影者の骨格を推定する処理と、推定された前記骨格が開始条件を満たしたか判定する処理と、前記開始条件が満たされた場合、録画を開始する処理と、を含む。 An information processing method according to an embodiment is executed by an information processing apparatus that includes a display and a camera that photographs the display side. The method includes a process of acquiring a moving image photographed by the camera, a process of estimating a skeleton of the photographed person included in the moving image, a process of determining whether the estimated skeleton satisfies a start condition, and a process of starting recording when the start condition is satisfied.

一実施形態によれば、カメラから離れた被撮影者を適切なタイミングで録画することができる。 According to one embodiment, a person to be photographed who is away from the camera can be recorded at an appropriate timing.

情報処理装置のハードウェア構成の一例を示す図である。 FIG. 1 is a diagram illustrating an example of the hardware configuration of the information processing device.
情報処理装置の機能構成の一例を示す図である。 FIG. 2 is a diagram illustrating an example of the functional configuration of the information processing device.
情報処理装置が実行する処理の一例を示すフローチャートである。 FIG. 3 is a flowchart illustrating an example of processing executed by the information processing device.
待機画面Ｓｃ１の一例を示す図である。 FIG. 4 is a diagram showing an example of the standby screen Sc1.
待機画面Ｓｃ１の一例を示す図である。 FIG. 5 is a diagram showing an example of the standby screen Sc1.
カウントダウン画面Ｓｃ２の一例を示す図である。 FIG. 6 is a diagram showing an example of the countdown screen Sc2.
録画画面Ｓｃ３の一例を示す図である。 FIG. 7 is a diagram showing an example of the recording screen Sc3.
録画画面Ｓｃ３の一例を示す図である。 FIG. 8 is a diagram showing an example of the recording screen Sc3.
録画画面Ｓｃ３の一例を示す図である。 FIG. 9 is a diagram showing an example of the recording screen Sc3.

以下、本発明の各実施形態について、添付の図面を参照しながら説明する。なお、各実施形態に係る明細書及び図面の記載に関して、実質的に同一の機能構成を有する構成要素については、同一の符号を付することにより重複した説明を省略する。 Hereinafter, each embodiment of the present invention will be described with reference to the accompanying drawings. Note that in the descriptions of the specifications and drawings related to each embodiment, the same reference numerals are given to the constituent elements having substantially the same functional configuration to omit redundant explanation.

＜撮影装置１の概要＞ <Overview of photographing device 1>
まず、本実施形態に係る撮影装置１の概要について説明する。本実施形態に係る撮影装置１は、ディスプレイ及びカメラを備えた情報処理装置である。撮影装置１は、例えば、ＰＣ（Personal Computer）、スマートフォン、タブレット端末又はデジタルカメラであるが、これに限られない。撮影装置１は、ユーザの運動中の動画を、適切なタイミングで自動的に撮影する。ここでいう運動は、例えば、歩行、走行、水泳、ダンス、トレーニング、ピラティス、ヨガ又は任意のスポーツのフォームであるが、これに限られない。 First, an overview of the photographing device 1 according to the present embodiment will be explained. The photographing device 1 according to this embodiment is an information processing device including a display and a camera. The photographing device 1 is, for example, a PC (Personal Computer), a smartphone, a tablet terminal, or a digital camera, but is not limited thereto. The photographing device 1 automatically photographs a video of a user exercising at an appropriate timing. The exercise referred to here is, for example, walking, running, swimming, dancing, training, Pilates, yoga, or the form of any sport, but is not limited thereto.

この撮影装置１を利用することで、ユーザは、自分が運動中の動画であって、ＡＩによる解析に適した動画を、自分で撮影（自撮り）することができる。本明細書において、ユーザは、動画の被撮影者に相当する。 By using this photographing device 1, the user can photograph (self-portrait) a video of himself/herself exercising that is suitable for analysis by AI. In this specification, a user corresponds to a person being photographed in a video.

より詳細には、ユーザは、ディスプレイを自分に向けた状態で撮影装置１を設置し、ディスプレイ側を撮影するカメラを起動し、自分の運動がカメラに写る位置まで移動する。そうすると、撮影装置１は、カメラに写ったユーザ（被撮影者）の骨格情報に基づいて、録画を開始し、終了する。また、撮影装置１は、録画の開始タイミングや終了タイミングがわかるように、ディスプレイに所定の画面を表示する。ユーザは、ディスプレイに表示された画面にしたがって、運動を開始し、終了する。これにより、運動中以外の余計なシーンが写っていない、ユーザの運動中の動画が撮影される。 More specifically, the user installs the photographing device 1 with the display facing toward himself, activates the camera that photographs the display side, and moves to a position where his movements are captured on the camera. Then, the photographing device 1 starts and ends recording based on the skeletal information of the user (photographed person) captured by the camera. The photographing device 1 also displays a predetermined screen on the display so that the user can know the start timing and end timing of recording. The user starts and finishes exercise according to the screen displayed on the display. As a result, a video of the user exercising, which does not include unnecessary scenes other than the exercise, is captured.

こうして撮影された動画は、ＡＩにより被撮影者の運動を解析する任意の動画解析サービスにおける、解析対象の動画として利用できる。例えば、歩行解析サービスを利用して歩行中の動画を解析することで、動画から自分の歩行能力を評価することができる。 The video shot in this way can be used as a video to be analyzed in any video analysis service that uses AI to analyze the motion of the person being photographed. For example, by analyzing a video of a person walking using a gait analysis service, it is possible to evaluate one's own walking ability from the video.

＜情報処理装置のハードウェア構成＞ <Hardware configuration of information processing device>
次に、情報処理装置１００のハードウェア構成について説明する。図１は、情報処理装置１００のハードウェア構成の一例を示す図である。図１に示すように、情報処理装置１００は、バスＢを介して相互に接続された、プロセッサ１０１と、メモリ１０２と、ストレージ１０３と、通信Ｉ／Ｆ１０４と、入出力Ｉ／Ｆ１０５と、ドライブ装置１０６と、を備える。 Next, the hardware configuration of the information processing device 100 will be explained. FIG. 1 is a diagram illustrating an example of the hardware configuration of the information processing device 100. As shown in FIG. 1, the information processing device 100 includes a processor 101, a memory 102, a storage 103, a communication I/F 104, an input/output I/F 105, and a drive device 106, which are interconnected via a bus B.

プロセッサ１０１は、ストレージ１０３に記憶されたＯＳ（Operating System）を含む各種のプログラムをメモリ１０２に展開して実行することにより、情報処理装置１００の各構成を制御し、情報処理装置１００の機能を実現する。プロセッサ１０１は、例えば、ＣＰＵ（Central Processing Unit）、ＭＰＵ（Micro Processing Unit）、ＧＰＵ（Graphics Processing Unit）、ＡＳＩＣ（Application Specific Integrated Circuit）、ＤＳＰ（Digital Signal Processor）、又はこれらの組み合わせである。 The processor 101 controls each component of the information processing device 100 and realizes the functions of the information processing device 100 by loading various programs, including an OS (Operating System), stored in the storage 103 into the memory 102 and executing them. The processor 101 is, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), an ASIC (Application Specific Integrated Circuit), a DSP (Digital Signal Processor), or a combination thereof.

メモリ１０２は、例えば、ＲＯＭ（Read Only Memory）、ＲＡＭ（Random Access Memory）、又はこれらの組み合わせである。ＲＯＭは、例えば、ＰＲＯＭ（Programmable ROM）、ＥＰＲＯＭ（Erasable Programmable ROM）、ＥＥＰＲＯＭ（Electrically Erasable Programmable ROM）、又はこれらの組み合わせである。ＲＡＭは、例えば、ＤＲＡＭ（Dynamic RAM）、ＳＲＡＭ（Static RAM）、又はこれらの組み合わせである。 The memory 102 is, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), or a combination thereof. The ROM is, for example, a PROM (Programmable ROM), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), or a combination thereof. The RAM is, for example, DRAM (Dynamic RAM), SRAM (Static RAM), or a combination thereof.

ストレージ１０３は、ＯＳを含む各種のプログラム及びデータを記憶する。ストレージ１０３は、例えば、フラッシュメモリ、ＨＤＤ（Hard Disk Drive）、ＳＳＤ（Solid State Drive）、ＳＣＭ（Storage Class Memories）、又はこれらの組み合わせである。 The storage 103 stores various programs and data including an OS. The storage 103 is, for example, a flash memory, an HDD (Hard Disk Drive), an SSD (Solid State Drive), an SCM (Storage Class Memories), or a combination thereof.

通信Ｉ／Ｆ１０４は、情報処理装置１００を、ネットワークを介して外部装置に接続し、通信を制御するためのインタフェースである。ネットワークは、例えば、有線ＬＡＮ（Local Area Network）、無線ＬＡＮ、インターネット、公衆回線網、モバイルデータ通信網、又はこれらの組み合わせである。通信Ｉ／Ｆ１０４は、例えば、Ｂｌｕｅｔｏｏｔｈ（登録商標）、Ｗｉ－Ｆｉ（登録商標）、ＺｉｇＢｅｅ（登録商標）、Ｅｔｈｅｒｎｅｔ（登録商標）、又は光通信に準拠したアダプタであるが、これに限られない。 The communication I/F 104 is an interface for connecting the information processing device 100 to an external device via a network and controlling communication. The network is, for example, a wired LAN (Local Area Network), a wireless LAN, the Internet, a public line network, a mobile data communication network, or a combination thereof. The communication I/F 104 is, for example, an adapter compliant with Bluetooth (registered trademark), Wi-Fi (registered trademark), ZigBee (registered trademark), Ethernet (registered trademark), or optical communication, but is not limited thereto.

入出力Ｉ／Ｆ１０５は、情報処理装置１００に入力装置１０７及び出力装置１０８を接続するためのインタフェースである。入力装置１０７は、例えば、マウス、キーボード、タッチパネル、マイク、スキャナ、カメラ、各種センサ、操作ボタン、又はこれらの組み合わせである。出力装置１０８は、例えば、ディスプレイ、プロジェクタ、プリンタ、スピーカ、バイブレータ、又はこれらの組み合わせである。撮影装置１は、ディスプレイと、ディスプレイ側を撮影するカメラと、を少なくとも備える。 The input/output I/F 105 is an interface for connecting the input device 107 and the output device 108 to the information processing device 100. The input device 107 is, for example, a mouse, keyboard, touch panel, microphone, scanner, camera, various sensors, operation buttons, or a combination thereof. Output device 108 is, for example, a display, a projector, a printer, a speaker, a vibrator, or a combination thereof. The photographing device 1 includes at least a display and a camera for photographing the display side.

ドライブ装置１０６は、ディスクメディア１０９のデータを読み書きする。ドライブ装置１０６は、例えば、磁気ディスクドライブ、光学ディスクドライブ、光磁気ディスクドライブ、又はこれらの組み合わせである。ディスクメディア１０９は、例えば、ＣＤ（Compact Disc）、ＤＶＤ（Digital Versatile Disc）、ＦＤ（Floppy Disk）、ＭＯ（Magneto-Optical disk）、ＢＤ（Blu-ray（登録商標） Disc）、又はこれらの組み合わせである。 The drive device 106 reads and writes data on the disk medium 109. The drive device 106 is, for example, a magnetic disk drive, an optical disk drive, a magneto-optical disk drive, or a combination thereof. The disk medium 109 is, for example, a CD (Compact Disc), a DVD (Digital Versatile Disc), an FD (Floppy Disk), an MO (Magneto-Optical disk), a BD (Blu-ray (registered trademark) Disc), or a combination thereof.

なお、本実施形態において、プログラムは、情報処理装置１００の製造段階でメモリ１０２又はストレージ１０３に書き込まれてもよいし、ネットワークを介して情報処理装置１００に提供されてもよいし、ディスクメディア１０９などの非一時的でコンピュータ読み取り可能な記録媒体を介して情報処理装置１００に提供されてもよい。 Note that in this embodiment, the program may be written into the memory 102 or the storage 103 at the manufacturing stage of the information processing device 100, may be provided to the information processing device 100 via a network, or may be provided to the information processing device 100 via a non-transitory computer-readable recording medium such as the disk medium 109.

＜撮影装置１の機能構成＞ <Functional configuration of photographing device 1>
次に、撮影装置１の機能構成について説明する。図２は、撮影装置１の機能構成の一例を示す図である。図２に示すように、撮影装置１は、通信部１１と、記憶部１２と、制御部１３と、を備える。 Next, the functional configuration of the photographing device 1 will be explained. FIG. 2 is a diagram showing an example of the functional configuration of the photographing device 1. As shown in FIG. 2, the photographing device 1 includes a communication unit 11, a storage unit 12, and a control unit 13.

通信部１１は、通信Ｉ／Ｆ１０４により実現される。通信部１１は、ネットワークを介して、外部装置との間で情報の送受信を行う。通信部１１は、例えば、動画ファイル１２４を動画解析サービスに送信したり、動画解析サービスから解析結果を受信したりする。 The communication unit 11 is realized by the communication I/F 104. The communication unit 11 transmits and receives information to and from an external device via the network. For example, the communication unit 11 transmits the video file 124 to a video analysis service and receives analysis results from the video analysis service.

記憶部１２は、メモリ１０２及びストレージ１０３により実現される。記憶部１２は、撮影動画１２１と、骨格情報１２２と、条件情報１２３と、動画ファイル１２４と、を記憶する。 The storage unit 12 is realized by a memory 102 and a storage 103. The storage unit 12 stores a captured video 121, skeleton information 122, condition information 123, and a video file 124.

撮影動画１２１は、カメラが撮影した動画である。撮影動画１２１は、記憶部１２に一時的に保存される。 The captured video 121 is a video captured by a camera. The captured video 121 is temporarily stored in the storage unit 12.

骨格情報１２２は、撮影動画１２１に写った被撮影者の骨格に関する情報である。骨格情報１２２は、予め設計された１又は複数の骨格の種類及び座標を示す情報を含む。骨格情報は、撮影動画１２１のフレームごとに記憶される。なお、骨格は、任意に設計可能である。 The skeletal information 122 is information regarding the skeletal structure of the person photographed in the photographed video 121. The skeleton information 122 includes information indicating the type and coordinates of one or more skeletons designed in advance. Skeletal information is stored for each frame of the captured video 121. Note that the skeleton can be designed arbitrarily.
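The per-frame skeleton information described above can be pictured as a small data structure. The following is a minimal sketch, not the patent's implementation: the class names `Keypoint` and `SkeletonInfo`, the keypoint type strings, and the use of pixel coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Keypoint:
    kind: str   # skeleton point type, e.g. "head" (type names are hypothetical)
    x: float    # horizontal pixel coordinate in the frame (assumed unit)
    y: float    # vertical pixel coordinate in the frame (assumed unit)


@dataclass
class SkeletonInfo:
    # Maps a frame index of the captured video to the keypoints estimated for it.
    frames: Dict[int, List[Keypoint]] = field(default_factory=dict)

    def save(self, frame_index: int, keypoints: List[Keypoint]) -> None:
        """Store the keypoints estimated for one frame."""
        self.frames[frame_index] = list(keypoints)

    def latest(self) -> Optional[List[Keypoint]]:
        """Return the keypoints of the most recent frame, or None if empty."""
        if not self.frames:
            return None
        return self.frames[max(self.frames)]
```

Keying the store by frame index mirrors the statement that skeleton information is stored for each frame of the captured video 121.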

条件情報１２３は、録画（動画ファイル１２４の生成）の開始条件及び終了条件に関する情報である。開始条件及び終了条件は、骨格情報１２２に基づいて設定される。開始条件及び終了条件について、詳しくは後述する。 Condition information 123 is information regarding start conditions and end conditions for recording (generation of video file 124). The start condition and end condition are set based on the skeleton information 122. The start condition and end condition will be described in detail later.

動画ファイル１２４は、撮影装置１により生成された、被撮影者（ユーザ）の運動中の動画のファイルである。撮影動画１２１のうち、録画開始から録画終了までの部分が、１つの動画ファイル１２４として保存される。 The video file 124 is a file, generated by the photographing device 1, of a video of the photographed person (user) exercising. The portion of the captured video 121 from the start of recording to the end of recording is saved as one video file 124.

制御部１３は、プロセッサ１０１がメモリ１０２からプログラムを読み出して実行し、他のハードウェア構成と協働することにより実現される。制御部１３は、撮影装置１の動作全体を制御する。制御部１３は、取得部１３１と、推定部１３２と、判定部１３３と、表示制御部１３４と、撮影制御部１３５と、動画生成部１３６と、を備える。 The control unit 13 is realized by the processor 101 reading a program from the memory 102, executing it, and cooperating with other hardware configurations. The control unit 13 controls the entire operation of the photographing device 1. The control unit 13 includes an acquisition unit 131, an estimation unit 132, a determination unit 133, a display control unit 134, a shooting control unit 135, and a video generation unit 136.

取得部１３１は、撮影を開始したカメラから動画を取得する。取得部１３１は、取得した動画を撮影動画１２１として記憶部１２に保存する。 The acquisition unit 131 acquires a moving image from the camera that has started shooting. The acquisition unit 131 stores the acquired video in the storage unit 12 as the captured video 121.

推定部１３２は、骨格推定（姿勢推定）モデルを利用して、撮影動画１２１の各フレーム（画像）に含まれる被撮影者の骨格を推定する。骨格推定モデルとして、ＯｐｅｎＰｏｓｅなどの既存の任意のモデルを利用できる。推定部１３２は、画像から被撮影者を検出し、検出した被撮影者の骨格を推定してもよいし、画像からダイレクトに骨格を推定してもよい。推定部１３２は、推定した骨格に関する情報を骨格情報１２２として記憶部１２に保存する。 The estimation unit 132 estimates the skeleton of the photographed person included in each frame (image) of the photographed video 121 using a skeleton estimation (posture estimation) model. Any existing model such as OpenPose can be used as the skeleton estimation model. The estimation unit 132 may detect the photographed person from the image and estimate the skeleton of the detected photographed person, or may directly estimate the skeleton from the image. The estimation unit 132 stores information regarding the estimated skeleton in the storage unit 12 as skeleton information 122.

判定部１３３は、骨格情報１２２及び条件情報１２３に基づいて、録画の開始条件及び終了条件が満たされたか判定する。開始条件及び終了条件について、詳しくは後述する。 The determining unit 133 determines whether the recording start condition and end condition are satisfied based on the skeleton information 122 and the condition information 123. The start condition and end condition will be described in detail later.

表示制御部１３４は、ディスプレイに表示される画面を制御する。 The display control unit 134 controls the screen displayed on the display.

撮影制御部１３５は、カメラによる撮影を制御する。 The photographing control unit 135 controls photographing by the camera.

動画生成部１３６は、撮影動画１２１から動画ファイル１２４を生成する。 The video generation unit 136 generates a video file 124 from the captured video 121.

なお、撮影装置１の機能構成は、上記の例に限られない。例えば、撮影装置１は、上記の機能構成の一部を備え、残りの機能構成を、ネットワークを介して接続された外部装置が備えてもよい。また、撮影装置１は、上記以外の機能構成を備えてもよい。また、撮影装置１の各機能構成は、上記の通り、ソフトウェアにより実現されてもよいし、ＩＣチップ、ＳｏＣ（System on Chip）、ＬＳＩ（Large Scale Integration）、マイクロコンピュータ等のハードウェアによって実現されてもよい。 Note that the functional configuration of the photographing device 1 is not limited to the above example. For example, the photographing device 1 may include a part of the above functional configuration, and the remaining functional configuration may be provided by an external device connected via a network. Further, the photographing device 1 may have functional configurations other than those described above. Further, each functional configuration of the photographing device 1 may be realized by software, as described above, or by hardware such as an IC chip, an SoC (System on Chip), an LSI (Large Scale Integration), or a microcomputer.

＜撮影装置１が実行する処理＞ <Processing executed by photographing device 1>
次に、本実施形態に係る撮影装置１が実行する処理について説明する。図３は、撮影装置１が実行する処理の一例を示すフローチャートである。以下では、撮影装置１がスマートフォンであり、撮影装置１により、カメラに向かって所定距離だけ歩行する被撮影者の動画を撮影する場合を例に説明する。 Next, processing executed by the photographing device 1 according to this embodiment will be described. FIG. 3 is a flowchart illustrating an example of a process executed by the photographing device 1. In the following, an example will be described in which the photographing device 1 is a smartphone and the photographing device 1 photographs a video of a person to be photographed walking a predetermined distance toward the camera.

（ステップＳ１０１） (Step S101)
まず、ユーザは、撮影装置１を三脚などに設置し、撮影装置１を操作して、撮影装置１のディスプレイ側を撮影するカメラを起動する。撮影制御部１３５は、ユーザの操作に応じて、ディスプレイ側のカメラによる撮影を開始する（ステップＳ１０１）。なお、この時点では、録画はされない。 First, the user sets the photographing device 1 on a tripod or the like, operates the photographing device 1, and activates the camera that photographs the display side of the photographing device 1. The photographing control unit 135 starts photographing using the camera on the display side in response to the user's operation (step S101). Note that no recording is performed at this point.

（ステップＳ１０２） (Step S102)
カメラによる撮影が開始されると、表示制御部１３４は、ディスプレイに待機画面Ｓｃ１を表示する（ステップＳ１０２）。待機画面Ｓｃ１は、録画の開始条件が満たされるのを待機する間に表示される画面である。待機画面Ｓｃ１は、撮影が開始されてから録画が開始されるまで表示される画面である。 When the camera starts photographing, the display control unit 134 displays a standby screen Sc1 on the display (step S102). The standby screen Sc1 is a screen that is displayed while waiting for the recording start condition to be met. The standby screen Sc1 is a screen that is displayed from the start of shooting until the start of recording.

図４は、待機画面Ｓｃ１の一例を示す図である。図４の待機画面Ｓｃ１は、終了ボタンＢ１と、撮影動画表示領域Ａ１と、待機表示領域Ａ２と、を有する。 FIG. 4 is a diagram showing an example of the standby screen Sc1. The standby screen Sc1 in FIG. 4 includes an end button B1, a captured video display area A1, and a standby display area A2.

終了ボタンＢ１は、カメラによる撮影を終了させ、待機画面Ｓｃ１の表示を終了させるためのボタンである。ユーザが終了ボタンＢ１を選択すると、カメラによる撮影を終了し、例えば、アプリのホーム画面などがディスプレイに表示される。図４の例では、終了ボタンＢ１は、待機表示領域Ａ２の左上に配置されているが、終了ボタンＢ１の位置は任意である。 The end button B1 is a button for ending photographing by the camera and ending the display of the standby screen Sc1. When the user selects the end button B1, the photographing by the camera ends and, for example, the home screen of the application is displayed on the display. In the example of FIG. 4, the end button B1 is arranged at the upper left of the standby display area A2, but the position of the end button B1 is arbitrary.

撮影動画表示領域Ａ１は、撮影動画１２１をリアルタイムで表示する領域である。図４の例では、ディスプレイの外周部を除く部分が撮影動画表示領域Ａ１となっており、撮影動画表示領域Ａ１には、突き当たりにドアが設置された廊下が表示されている。 The photographed video display area A1 is an area for displaying the photographed video 121 in real time. In the example of FIG. 4, the portion of the display excluding the outer periphery is the photographed video display area A1, and the photographed video display area A1 displays a hallway with a door installed at the end.

待機表示領域Ａ２は、待機画面Ｓｃ１が表示されていること、すなわち、まだ録画が開始されていないことを、撮影装置１から離れたユーザにわかりやすく表示するための領域である。図４の例では、撮影動画表示領域Ａ１の外周を囲む部分が単色の待機表示領域Ａ２となっている。待機表示領域Ａ２は、例えば、明るい緑であるが、これに限られない。また、待機表示領域Ａ２は、単色でなくてもよいし、点滅表示などがされてもよい。歩行動画を撮影するためにディスプレイ側の離れた位置に移動したユーザは、ディスプレイに表示された待機表示領域Ａ２を見ることで、まだ録画が開始されていないことを容易に把握することができる。 The standby display area A2 is an area for clearly displaying to a user who is away from the photographing device 1 that the standby screen Sc1 is being displayed, that is, that recording has not yet started. In the example of FIG. 4, a portion surrounding the outer periphery of the captured video display area A1 is a monochrome standby display area A2. The standby display area A2 is, for example, bright green, but is not limited to this. Further, the standby display area A2 does not have to be a single color, or may be displayed in a blinking manner. A user who has moved to a remote position on the display side to capture a walking video can easily understand that recording has not yet started by looking at the standby display area A2 displayed on the display.

（ステップＳ１０３） (Step S103)
一方で、カメラによる撮影が開始されると、取得部１３１は、カメラから動画をリアルタイムで取得し、取得した動画を撮影動画１２１として記憶部１２に保存する（ステップＳ１０３）。表示制御部１３４は、こうして随時保存される撮影動画１２１を撮影動画表示領域Ａ１に表示する。取得部１３１は、カメラによる撮影が終了するまで動画の取得を継続する。 Meanwhile, when photographing by the camera is started, the acquisition unit 131 acquires a video from the camera in real time, and stores the acquired video in the storage unit 12 as the captured video 121 (step S103). The display control unit 134 displays the captured video 121, which is saved continuously in this way, in the captured video display area A1. The acquisition unit 131 continues to acquire the video until the camera finishes shooting.

（ステップＳ１０４） (Step S104)
次に、推定部１３２は、骨格推定モデルを利用して、最新の撮影動画１２１から、ユーザ（被撮影者）の骨格を推定する（ステップＳ１０４）。推定部１３２は、推定の結果として得られた骨格に関する情報を、推定の対象となったフレームと対応づけて、骨格情報１２２として記憶部１２に保存する。推定部１３２は、録画が終了するまで骨格の推定を継続する。 Next, the estimation unit 132 estimates the skeleton of the user (photographed person) from the latest captured video 121 using the skeleton estimation model (step S104). The estimation unit 132 stores information regarding the skeleton obtained as a result of the estimation in the storage unit 12 as skeleton information 122 in association with the frame targeted for estimation. The estimation unit 132 continues estimating the skeleton until recording ends.

図５は、待機画面Ｓｃ１の一例を示す図である。図５の例では、歩行動画を撮影するために、ユーザが廊下の突き当たりに移動しており、突き当たりにいるユーザが映った撮影動画１２１が撮影動画表示領域Ａ１に表示されている。また、撮影動画１２１から推定されたユーザ（被撮影者）の骨格が、撮影動画１２１に重畳して表示されている。 FIG. 5 is a diagram showing an example of the standby screen Sc1. In the example of FIG. 5, the user is moving to the end of a hallway to take a walking video, and a shot video 121 showing the user at the end is displayed in the shot video display area A1. Further, the skeleton of the user (person to be photographed) estimated from the photographed video 121 is displayed superimposed on the photographed video 121.

（ステップＳ１０５） (Step S105)
続いて、判定部１３３は、骨格情報１２２に基づいて、録画の開始条件が満たされたか判定する（ステップＳ１０５）。判定部１３３は、録画の開始条件が満たされるまで判定を継続する（ステップＳ１０５：ＮＯ）。録画の開始条件が満たされた場合（ステップＳ１０５：ＹＥＳ）、処理はステップＳ１０６に進む。 Subsequently, the determination unit 133 determines whether the recording start condition is satisfied based on the skeleton information 122 (step S105). The determination unit 133 continues the determination until the recording start condition is satisfied (step S105: NO). If the recording start condition is satisfied (step S105: YES), the process proceeds to step S106.

ここで、開始条件について、詳しく説明する。上述の通り、開始条件は、ユーザ（被撮影者）の骨格に基づいて設定される。より詳細には、開始条件は、ユーザ（被撮影者）の骨格の大きさに基づいて設定される。ユーザ（被撮影者）の骨格の大きさは、撮影装置１に対するユーザ（被撮影者）の距離に相当する。 Here, the start condition will be explained in detail. As described above, the start condition is set based on the user's (photographed person's) skeleton. More specifically, the start condition is set based on the size of the user's (photographed person's) skeleton. The size of the user's (photographed person's) skeleton corresponds to the distance of the user (photographed person) from the photographing device 1.

開始条件は、例えば、ディスプレイの高さＨに対するユーザ（被撮影者）の全身の骨格の高さｈ１（大きさ）の割合が閾値以下である期間が、所定時間（所定フレーム）継続することである。閾値及び所定時間は任意に設定可能である。ここで、開始条件が、ディスプレイの高さＨに対するユーザ（被撮影者）の全身の骨格の高さｈ１の割合が５０％以下である期間が１秒継続することであった場合について考える。 The start condition is, for example, that a period in which the ratio of the height h1 (size) of the user's (photographed person's) whole-body skeleton to the height H of the display is at or below a threshold continues for a predetermined time (a predetermined number of frames). The threshold and the predetermined time can be set arbitrarily. Here, consider a case where the start condition is that a period in which the ratio of the height h1 of the user's (photographed person's) whole-body skeleton to the height H of the display is 50% or less continues for 1 second.

この場合、判定部１３３は、まず、骨格情報１２２に基づいて、各フレームに対するユーザの全身の骨格の高さｈ１を計算する。全身の骨格の高さｈ１は、例えば、全身の骨格の座標のうち、一番上の座標と一番下の座標との差である（図５参照）。 In this case, the determination unit 133 first calculates the height h1 of the user's entire body skeleton for each frame based on the skeleton information 122. The height h1 of the whole body skeleton is, for example, the difference between the top coordinate and the bottom coordinate among the coordinates of the whole body skeleton (see FIG. 5).
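The height calculation described above reduces to a difference between the topmost and bottommost keypoint coordinates. A minimal sketch follows, assuming keypoints are given as `(x, y)` pixel pairs; the function name `skeleton_height` and the choice to return 0.0 when no keypoints are present are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def skeleton_height(keypoints: Iterable[Tuple[float, float]]) -> float:
    """Return h1: the vertical extent of the skeleton in one frame.

    keypoints: (x, y) coordinates of the estimated skeleton points.
    Returns 0.0 when no keypoints are given (an assumed convention).
    """
    ys = [y for _, y in keypoints]
    if not ys:
        return 0.0
    # Difference between the bottommost and topmost coordinates.
    return max(ys) - min(ys)
```

For example, keypoints at y = 100, 250, and 400 give a height of 300, whichever of the two extremes is the head or the foot.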

次に、判定部１３３は、予め設定されたディスプレイの高さＨに対する、高さｈ１の割合（ｈ１／Ｈ）を計算し、その割合が５０％以下であるか判定する。判定部１３３は、割合が５０％以下である期間が１秒（撮影動画が３０ｆｐｓである場合、３０フレーム）継続した場合、開始条件が満たされたと判定する。 Next, the determination unit 133 calculates the ratio (h1/H) of the height h1 to the preset height H of the display, and determines whether the ratio is 50% or less. The determining unit 133 determines that the start condition is satisfied if the period in which the ratio is 50% or less continues for 1 second (30 frames if the captured video is 30 fps).
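The "ratio at or below 50% for 1 second" test can be sketched as a counter of consecutive qualifying frames. This is an illustrative sketch only: the class name `StartCondition`, its parameters, and the decision to reset the counter when no skeleton is detected (`h1` is `None`) are assumptions not stated in the source.

```python
from typing import Optional


class StartCondition:
    """Tracks whether h1 / H has stayed at or below a threshold long enough."""

    def __init__(self, display_height: float, ratio_threshold: float = 0.5,
                 fps: int = 30, hold_seconds: float = 1.0) -> None:
        self.display_height = display_height
        self.ratio_threshold = ratio_threshold
        # 1 second at 30 fps corresponds to 30 consecutive frames.
        self.required_frames = int(round(fps * hold_seconds))
        self.streak = 0

    def update(self, h1: Optional[float]) -> bool:
        """Feed the skeleton height of one frame; return True once satisfied."""
        if h1 is not None and h1 / self.display_height <= self.ratio_threshold:
            self.streak += 1
        else:
            # Assumption: a missing skeleton or an oversized one resets the run.
            self.streak = 0
        return self.streak >= self.required_frames
```

With a display height of 1000 and h1 = 400 on every frame, `update` first returns `True` on the 30th consecutive call.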

これにより、図５の例のように、撮影装置１からユーザ（被撮影者）が所定の距離だけ離れた場合に録画を開始することができる。言い換えると、ユーザが撮影装置１から所定の距離だけ離れるまでの移動期間を録画しないようにすることができる。 Thereby, as in the example of FIG. 5, recording can be started when the user (photographed person) has moved a predetermined distance away from the photographing device 1. In other words, the period during which the user is still moving away to that distance is not recorded.
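One way to see why a threshold on skeleton size acts like a minimum distance is the pinhole camera approximation. The source does not state this model; it is assumed here for illustration only. The symbols f (focal length in pixels), S (the person's real height), d (distance from the camera), and θ (the ratio threshold) are introduced for this sketch, and h1 and H are assumed to be measured in the same pixel units.

```latex
h_1 \approx \frac{f\,S}{d}
\qquad\Longrightarrow\qquad
\frac{h_1}{H} \le \theta
\iff
d \ge \frac{f\,S}{\theta\,H}
```

Under these assumptions, a smaller threshold θ corresponds to a larger required distance d.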

なお、開始条件は上記の例に限られない。例えば、高さｈ１は、ユーザ（被撮影者）の全身ではなく、一部（例えば、上半身）の高さであってもよい。また、高さＨは、ディスプレイの高さではなく、撮影動画表示領域Ａ１又は待機表示領域Ａ２の高さであってもよい。また、高さｈ１の割合と閾値を比較する代わりに、高さｈ１と閾値を比較してもよい。 Note that the starting conditions are not limited to the above example. For example, the height h1 may be the height of a part (for example, the upper body) of the user (person to be photographed) instead of the whole body. Further, the height H may be the height of the photographed video display area A1 or the standby display area A2 instead of the height of the display. Furthermore, instead of comparing the proportion of the height h1 and the threshold value, the height h1 and the threshold value may be compared.

また、開始条件は、ディスプレイの幅Ｗに対するユーザ（被撮影者）の全身の骨格の幅ｗ（大きさ）の割合が閾値以下である期間が、所定時間（所定フレーム）継続することであってもよい。幅ｗは、ユーザ（被撮影者）の全身ではなく、一部（例えば、上半身）の幅であってもよい。また、幅Ｗは、ディスプレイの幅ではなく、撮影動画表示領域Ａ１又は待機表示領域Ａ２の幅であってもよい。また、幅ｗの割合と閾値を比較する代わりに、幅ｗと閾値を比較してもよい。 The start condition may also be that a period in which the ratio of the width w (size) of the user's (photographed person's) whole-body skeleton to the width W of the display is at or below a threshold continues for a predetermined time (a predetermined number of frames). The width w may be the width of a part (for example, the upper body) of the user (photographed person) rather than the whole body. Furthermore, the width W may be the width of the captured video display area A1 or the standby display area A2 instead of the width of the display. Further, instead of comparing the ratio of the width w with the threshold, the width w itself may be compared with a threshold.

また、開始条件は、ディスプレイ上におけるユーザ（被撮影者）の位置に基づいて設定されてもよい。例えば、開始条件として、人検出により検出されたユーザ（被撮影者）の座標の範囲を設定することが考えられる。これにより、ディスプレイに対してユーザ（被撮影者）が所定の位置にいる場合に録画を開始することができる。 Further, the start condition may be set based on the position of the user (person to be photographed) on the display. For example, as a starting condition, it is possible to set a range of coordinates of a user (person to be photographed) detected by human detection. Thereby, recording can be started when the user (person to be photographed) is at a predetermined position with respect to the display.
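A position-based start condition can be sketched as a containment test against a configured coordinate range. The sketch below is an assumption-laden illustration: it uses the bounding box of the skeleton keypoints as a stand-in for the person-detection result, and the function name `skeleton_in_region` and the `(left, top, right, bottom)` region format are hypothetical.

```python
from typing import Sequence, Tuple

Region = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed format


def skeleton_in_region(keypoints: Sequence[Tuple[float, float]],
                       region: Region) -> bool:
    """Return True when every keypoint lies inside the configured region."""
    if not keypoints:
        return False  # Assumption: no detected person means the condition fails.
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    left, top, right, bottom = region
    # Compare the bounding box of the keypoints with the allowed range.
    return (left <= min(xs) and max(xs) <= right
            and top <= min(ys) and max(ys) <= bottom)
```

A duration requirement, as in the height-based condition, could be layered on top by counting consecutive frames for which this check holds.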

なお、開始条件は、上記の例に限られない。開始条件は、高さｈ１、幅ｗ及びユーザの位置のうち２つ以上の組み合わせに基づいて設定されてもよいし、その他の条件を組み合わせて設定されてもよい。 Note that the start conditions are not limited to the above example. The start condition may be set based on a combination of two or more of the height h1, the width w, and the user's position, or may be set based on a combination of other conditions.
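Combining several criteria can be sketched as composing per-frame checks with a logical AND. This is only one possible reading: the source does not say how conditions are combined, and the helper names `all_of`, `height_ratio_at_most`, and `width_ratio_at_most` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Keypoints = Sequence[Tuple[float, float]]
Check = Callable[[Keypoints], bool]


def height_ratio_at_most(display_height: float, threshold: float) -> Check:
    """Per-frame check: h1 / H <= threshold."""
    def check(keypoints: Keypoints) -> bool:
        ys = [y for _, y in keypoints]
        return bool(ys) and (max(ys) - min(ys)) / display_height <= threshold
    return check


def width_ratio_at_most(display_width: float, threshold: float) -> Check:
    """Per-frame check: w / W <= threshold."""
    def check(keypoints: Keypoints) -> bool:
        xs = [x for x, _ in keypoints]
        return bool(xs) and (max(xs) - min(xs)) / display_width <= threshold
    return check


def all_of(*checks: Check) -> Check:
    """Combine checks so that the result holds only when every check holds."""
    def combined(keypoints: Keypoints) -> bool:
        return all(check(keypoints) for check in checks)
    return combined
```

For instance, `all_of(height_ratio_at_most(1000, 0.5), width_ratio_at_most(500, 0.3))` accepts a frame only when both the height and the width ratios are within their thresholds.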

(Step S106)
When the start condition is satisfied, the display control unit 134 displays the countdown screen Sc2 on the display (step S106). The countdown screen Sc2 shows the time remaining until recording starts.

FIG. 6 shows an example of the countdown screen Sc2. In the example of FIG. 6, a large "5" is displayed on the countdown screen Sc2, meaning that 5 seconds remain until recording starts. On the countdown screen Sc2, the displayed number decreases by 1 at a time, and recording starts when it reaches "0". By displaying the countdown screen Sc2 after the start condition is satisfied, the user (photographed person) can easily grasp the exact timing at which recording will start. The countdown time is not limited to 5 seconds and can be set arbitrarily. The determination unit 133 may also continue to determine whether the start condition is satisfied from the start of the countdown until its end. In this case, if the start condition ceases to be satisfied during the countdown, the process returns to step S105. This ensures that recording starts with the start condition reliably satisfied.
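The countdown with continued determination could be sketched, purely as an illustration, as follows. The function name, the outcome strings, and the simplification of one determination result per displayed number are assumptions of this sketch; an actual implementation would presumably evaluate the condition per frame.

```python
from typing import Iterable, List, Tuple


def run_countdown(
    condition_samples: Iterable[bool],
    seconds: int = 5,
) -> Tuple[List[int], str]:
    """Simulate the countdown screen.

    `condition_samples` holds one start-condition result per displayed number.
    Returns the numbers that were shown and the outcome.
    """
    shown: List[int] = []
    samples = iter(condition_samples)
    for remaining in range(seconds, 0, -1):
        # A missing sample is treated as "condition not satisfied".
        if not next(samples, False):
            return shown, "back_to_S105"
        shown.append(remaining)
    # The count has reached 0: recording starts (step S107).
    return shown, "start_recording"
```

With the condition satisfied throughout, the numbers 5 to 1 are shown and recording starts; if the condition fails while "3" is about to be shown, the sketch returns to step S105 after showing only 5 and 4.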

(Step S107)
When the countdown reaches "0", the video generation unit 136 starts recording (step S107). Specifically, it acquires the captured video 121 and starts generating the video file 124.

(Step S108)
Meanwhile, the display control unit 134 displays the recording screen Sc3 on the display. The recording screen Sc3 is the screen displayed during recording.

FIGS. 7 to 9 show examples of the recording screen Sc3. The recording screen Sc3 in FIGS. 7 to 9 has an end button B1, a stop button B2, a captured video display area A1, and a progress bar b.

The end button B1 is a button for ending photographing by the camera and closing the recording screen Sc3. When the user selects the end button B1, photographing by the camera ends and, for example, the home screen of the application is displayed on the display. In the examples of FIGS. 7 to 9, the end button B1 is placed at the upper left of the captured video display area A1, but its position is arbitrary.

The stop button B2 is a button for stopping and resuming recording. When the user selects the stop button B2, recording stops; when the user selects it again, recording resumes. In the examples of FIGS. 7 to 9, the stop button B2 is placed at the lower center of the captured video display area A1, but its position is arbitrary.

The captured video display area A1 is an area that displays the captured video 121 in real time. In the examples of FIGS. 7 to 9, the captured video display area A1 occupies the display except for its upper part and shows a hallway with a door at the far end. The user (photographed person) is standing at the end of the hallway.

The progress bar b is a bar that displays the progress rate of the recording. In the examples of FIGS. 7 to 9, the progress bar b is displayed at the top of the captured video display area A1, but its position is arbitrary. Also, in the examples of FIGS. 7 to 9, the progress bar b shrinks (approaches 0%) as the progress rate increases, that is, the bar represents the remaining progress. Alternatively, the bar may extend (approach 100%) as the progress rate increases, that is, the bar may represent the progress rate itself. The recording progress may also be displayed on the display in a form other than a bar.

(Step S109)
The display control unit 134 also calculates the progress rate of the recording (step S109). The progress rate is calculated based on the skeleton of the user (photographed person) so that it is 0% when the recording start condition is satisfied and 100% when the recording end condition is satisfied.
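The embodiment does not state the formula used for the progress rate. One possible reading, given here only as an assumption, is a linear interpolation of the skeleton size ratio between its value at the start condition and its value at the end condition, clamped to the range 0% to 100%. The function and parameter names are hypothetical.

```python
def progress_rate(ratio: float, start_ratio: float, end_ratio: float) -> float:
    """Map a skeleton size ratio to a progress rate in percent, clamped to [0, 100].

    `start_ratio` is the ratio at which the start condition is met (0%),
    `end_ratio` the ratio at which the end condition is met (100%).
    Linear interpolation is an assumption; the source gives no formula.
    """
    if end_ratio == start_ratio:
        raise ValueError("start_ratio and end_ratio must differ")
    fraction = (ratio - start_ratio) / (end_ratio - start_ratio)
    return 100.0 * min(1.0, max(0.0, fraction))
```

Because the interpolation is signed, the same function also covers the case where the user moves away from the device, in which `start_ratio` is larger than `end_ratio`.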

(Step S110)
Next, the determination unit 133 determines, based on the skeleton information 122, whether the recording end condition is satisfied (step S110). The determination unit 133 continues this determination until the end condition is satisfied (step S110: NO). When the end condition is satisfied (step S110: YES), the process proceeds to step S111.

The end condition will now be described in detail. As described above, the end condition is set based on the skeleton of the user (photographed person). More specifically, it is set based on the size of the skeleton of the user (photographed person). The size of the skeleton corresponds to the distance between the user (photographed person) and the photographing device 1: the closer the user is, the larger the skeleton appears.

The end condition is, for example, that the ratio of the height h2 (size) of the upper-body skeleton of the user (photographed person) to the height H of the display remains at or above a threshold for a predetermined time (a predetermined number of frames). The threshold and the predetermined time can be set arbitrarily. Consider the case where the end condition is that the ratio of the height h2 of the upper-body skeleton of the user (photographed person) to the height H of the display remains at 50% or more for 0.1 seconds.

In this case, the determination unit 133 first calculates, based on the skeleton information 122, the height h2 of the user's upper-body skeleton for each frame. The height h2 of the upper-body skeleton is, for example, the difference between the topmost and bottommost coordinates among the skeleton coordinates designated as the upper body (see FIG. 8).
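The calculation of the height h2 could look like the following minimal sketch. The joint names and the dictionary layout of the skeleton information are hypothetical; the embodiment does not list which joints are designated as the upper body.

```python
from typing import Mapping, Sequence, Tuple

# Hypothetical joint names; the source does not specify the upper-body joints.
UPPER_BODY_JOINTS = ("nose", "left_shoulder", "right_shoulder", "left_hip", "right_hip")


def upper_body_height(
    skeleton: Mapping[str, Tuple[float, float]],
    joints: Sequence[str] = UPPER_BODY_JOINTS,
) -> float:
    """Height h2: difference between the top and bottom y-coordinates of the selected joints.

    Joints missing from `skeleton` (for example, not detected) are skipped.
    """
    ys = [skeleton[name][1] for name in joints if name in skeleton]
    if not ys:
        raise ValueError("no upper-body joints detected")
    return max(ys) - min(ys)
```

Joints outside the designated set, such as ankles, do not affect the result, which is why the value remains available when the lower body is out of frame.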

Next, the determination unit 133 calculates the ratio (h2/H) of the height h2 to the preset height H of the display and determines whether the ratio is 50% or more. If the ratio remains at 50% or more for 0.1 seconds (3 frames when the captured video is 30 fps), the determination unit 133 determines that the end condition is satisfied.
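The determination described above, including the conversion from 0.1 seconds to 3 frames at 30 fps, could be sketched as follows. The function names are hypothetical, and rounding the frame count to the nearest integer is an assumption of this sketch.

```python
from typing import Iterable, Optional


def required_frames(duration_s: float, fps: float) -> int:
    """Number of consecutive frames corresponding to `duration_s` at `fps` (at least 1)."""
    return max(1, round(duration_s * fps))


def end_frame_index(
    ratios: Iterable[float],
    threshold: float = 0.5,
    duration_s: float = 0.1,
    fps: float = 30.0,
) -> Optional[int]:
    """Index of the frame at which the end condition is first satisfied, or None.

    `ratios` holds h2 / H for each frame, in order.
    """
    needed = required_frames(duration_s, fps)
    streak = 0
    for index, ratio in enumerate(ratios):
        # A frame below the threshold interrupts the period and resets the count.
        streak = streak + 1 if ratio >= threshold else 0
        if streak >= needed:
            return index
    return None
```

For the ratio sequence 0.3, 0.52, 0.4, 0.5, 0.55, 0.6, the isolated 0.52 does not end the recording; the condition is first satisfied at the sixth frame (index 5), after three consecutive frames at 50% or more.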

As a result, as in the example of FIG. 9, recording can be ended when the user (photographed person) is at a predetermined distance from the photographing device 1, that is, when the user (photographed person) has approached to within a predetermined distance of the photographing device 1. In other words, the period after the user has approached to the predetermined distance is not recorded. In addition, because the end condition is determined based on the upper-body skeleton, recording can be ended even when the user (photographed person) is so close to the photographing device 1 that the lower body is out of frame, as in the example of FIG. 9.

Note that the end condition is not limited to the above example. For example, the height h2 may be the height of a body part other than the upper body, or of the whole body, instead of the upper body of the user (photographed person). The height H may be the height of the captured video display area A1 instead of the height of the display. Instead of comparing the ratio of the height h2 with a threshold, the height h2 itself may be compared with a threshold.

The end condition may also be that the ratio of the width w (size) of the whole-body skeleton of the user (photographed person) to the width W of the display remains at or above a threshold for a predetermined time (a predetermined number of frames). The width w may be the width of a part of the user (photographed person), such as the upper body, instead of the whole body. The width W may be the width of the captured video display area A1 instead of the width of the display. Instead of comparing the ratio of the width w with a threshold, the width w itself may be compared with a threshold.

The end condition may also be set based on the position of the user (photographed person) on the display. For example, a range of coordinates for the user (photographed person) detected by person detection may be set as the end condition. Recording can then be ended when the user (photographed person) is at a predetermined position relative to the display.

Note that the end condition is not limited to the above examples. The end condition may be set based on a combination of two or more of the height h2, the width w, and the position of the user, or in combination with other conditions. When the user performs an exercise that moves away from the photographing device 1, the start condition and end condition described above may be reversed.

(Step S111)
When the end condition is satisfied, the video generation unit 136 ends recording (step S111). Specifically, it stops acquiring the captured video 121, generates a video file 124 containing the captured video 121 acquired from the start to the end of recording, and stores it in the storage unit 12.

<Summary>
As described above, according to the present embodiment, the photographing device 1 determines whether the recording start condition is satisfied based on the skeleton of the user (photographed person) included in the captured video 121, and starts recording when the start condition is satisfied. Recording can thereby be started at the moment the user (photographed person) reaches a predetermined distance from the photographing device 1 that is suitable for starting the exercise. As a result, a video file 124 suitable for AI-based exercise analysis can be generated, free of the unnecessary scenes between the user (photographed person) activating the camera of the photographing device 1 and moving to the predetermined distance.

Further, according to the present embodiment, the photographing device 1 determines whether the recording end condition is satisfied based on the skeleton of the user (photographed person) included in the captured video 121, and ends recording when the end condition is satisfied. Recording can thereby be ended at the moment the user (photographed person) reaches a predetermined distance from the photographing device 1 that is suitable for ending the exercise. As a result, a video file 124 suitable for AI-based exercise analysis can be generated, free of unnecessary scenes after the user (photographed person) has finished the exercise.

<Supplementary notes>
The present embodiment includes the following disclosure.

(Supplementary note 1)
An information processing method executed by an information processing device including a display and a camera that photographs the display side, the method including:
a process of acquiring a video captured by the camera;
a process of estimating a skeleton of a photographed person included in the video;
a process of determining whether the estimated skeleton satisfies a start condition; and
a process of starting recording when the start condition is satisfied.

(Supplementary note 2)
The information processing method according to supplementary note 1, wherein the start condition is set based on the size of at least a part of the skeleton of the photographed person.

(Supplementary note 3)
The information processing method according to supplementary note 1, wherein the start condition is set based on the height of at least a part of the skeleton of the photographed person.

(Supplementary note 4)
The information processing method according to supplementary note 1, wherein the start condition is set based on the ratio of the height of at least a part of the skeleton of the photographed person to the height of the display.

(Supplementary note 5)
The information processing method according to supplementary note 1, wherein the process of starting recording includes a process of displaying, on the display, the time until recording starts.

(Supplementary note 6)
The information processing method according to supplementary note 1, further including:
a process of determining, after the start of recording, whether the skeleton satisfies an end condition; and
a process of ending recording when the end condition is satisfied.

(Supplementary note 7)
The information processing method according to supplementary note 6, wherein the end condition is set based on the size of at least a part of the skeleton of the photographed person.

(Supplementary note 8)
The information processing method according to supplementary note 6, wherein the end condition is set based on the height of at least a part of the skeleton of the photographed person.

(Supplementary note 9)
The information processing method according to supplementary note 6, wherein the end condition is set based on the ratio of the height of at least a part of the skeleton of the photographed person to the height of the display.

(Supplementary note 10)
The information processing method according to supplementary note 6, wherein the end condition is set based on the ratio of the height of the upper-body skeleton of the photographed person to the height of the display.

(Supplementary note 11)
The information processing method according to supplementary note 1, further including a process of displaying the progress rate of the recording on the display after the start of recording.

(Supplementary note 12)
The information processing method according to supplementary note 11, wherein the progress rate is displayed as a bar on the display.

(Supplementary note 13)
A program for causing an information processing device including a display and a camera that photographs the display side to execute an information processing method including:
a process of acquiring a video captured by the camera;
a process of estimating a skeleton of a photographed person included in the video;
a process of determining whether the estimated skeleton satisfies a start condition; and
a process of starting recording when the start condition is satisfied.

(Supplementary note 14)
An information processing device including:
a display;
a camera that photographs the display side;
an acquisition unit that acquires a video captured by the camera;
an estimation unit that estimates a skeleton of a photographed person included in the video;
a determination unit that determines whether the estimated skeleton satisfies a start condition; and
a photographing unit that starts recording when the start condition is satisfied.

The embodiments disclosed herein are illustrative in all respects and should not be considered restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims. Furthermore, the present invention is not limited to the embodiments described above; various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

1: Photographing device
11: Communication unit
12: Storage unit
13: Control unit
121: Captured video
122: Skeleton information
123: Condition information
124: Video file
131: Acquisition unit
132: Estimation unit
133: Determination unit
134: Display control unit
135: Photographing control unit
136: Video generation unit

Claims

1. An information processing method executed by an information processing device including a display and a camera that photographs the display side, the method comprising:
a process of acquiring a video captured by the camera;
a process of estimating a skeleton of a photographed person included in the video;
a process of determining whether the estimated skeleton satisfies a start condition; and
a process of starting recording when the start condition is satisfied.

2. The information processing method according to claim 1, wherein the start condition is set based on the size of at least a part of the skeleton of the photographed person.

3. The information processing method according to claim 1, wherein the start condition is set based on the height of at least a part of the skeleton of the photographed person.

4. The information processing method according to claim 1, wherein the start condition is set based on the ratio of the height of at least a part of the skeleton of the photographed person to the height of the display.

5. The information processing method according to claim 1, wherein the process of starting recording includes a process of displaying, on the display, the time until recording starts.

6. The information processing method according to claim 1, further comprising:
a process of determining, after the start of recording, whether the skeleton satisfies an end condition; and
a process of ending recording when the end condition is satisfied.

7. The information processing method according to claim 6, wherein the end condition is set based on the size of at least a part of the skeleton of the photographed person.

8. The information processing method according to claim 6, wherein the end condition is set based on the height of at least a part of the skeleton of the photographed person.

9. The information processing method according to claim 6, wherein the end condition is set based on the ratio of the height of at least a part of the skeleton of the photographed person to the height of the display.

10. The information processing method according to claim 6, wherein the end condition is set based on the ratio of the height of the upper-body skeleton of the photographed person to the height of the display.

11. The information processing method according to claim 1, further comprising a process of displaying the progress rate of the recording on the display after the start of recording.

12. The information processing method according to claim 11, wherein the progress rate is displayed as a bar on the display.

13. A program for causing an information processing device including a display and a camera that photographs the display side to execute an information processing method comprising:
a process of acquiring a video captured by the camera;
a process of estimating a skeleton of a photographed person included in the video;
a process of determining whether the estimated skeleton satisfies a start condition; and
a process of starting recording when the start condition is satisfied.

14. An information processing device comprising:
a display;
a camera that photographs the display side;
an acquisition unit that acquires a video captured by the camera;
an estimation unit that estimates a skeleton of a photographed person included in the video;
a determination unit that determines whether the estimated skeleton satisfies a start condition; and
a photographing unit that starts recording when the start condition is satisfied.