WO2024135232A1 - Impact detection device, impact detection method, and program

Impact detection device, impact detection method, and program

Info

Publication number
WO2024135232A1
WO2024135232A1 (PCT/JP2023/042297)
Authority
WO
WIPO (PCT)
Prior art keywords
swing
video data
impact
frame
reference motion
Application number
PCT/JP2023/042297
Other languages
French (fr)
Japanese (ja)
Inventor
良平 田嶋
拓也 中島
和彦 山本
栄美 伊藤
Original Assignee
ヤマハ株式会社
Application filed by ヤマハ株式会社
Publication of WO2024135232A1

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63B: APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
    • A63B69/00Training appliances or apparatus for special sports
    • A63B69/36Training appliances or apparatus for special sports for golf


Landscapes

  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physical Education & Sports Medicine (AREA)
  • Image Analysis (AREA)

Abstract

The present invention comprises: a first acquisition unit 101 that acquires moving image data which includes sound recorded at the time of a user's swing and which captures the swing in a plurality of frames; a second acquisition unit 102 that acquires reference motion data which represents the time transition of a swing and to which an identifier indicating the timing of impact in the swing is attached; an identification unit 104 that identifies, in the moving image data, a section including a frame corresponding to the timing given the identifier in the reference motion data; and a determination unit 105 that determines whether a frame including a ball-hitting sound is present in the section and, when such a frame is determined to be present, determines that a swing with an impact is included in the moving image data.

Description

Impact detection device, impact detection method, and program
This disclosure relates to, for example, an impact detection device, an impact detection method, and a program.
Conventionally, in sports and the like, a swing with a certain pattern is filmed and the filmed video data is analyzed. This makes it possible to evaluate and diagnose the swing. An example of such technology reduces the processing cost required for image recognition of video data of a swing and improves the accuracy of motion analysis that uses the image recognition results (see, for example, Patent Document 1).
Patent Document 1: JP 2021-125075 A
However, the technology described in Patent Document 1 has the problem that it is difficult to distinguish a swing that does not involve an impact (a hit), specifically a mere practice swing or a whiff, from a swing that actually involves an impact.
In view of such circumstances, one aspect of the present disclosure aims to provide a technique that makes it easy to distinguish a practice swing or a whiff from a swing that involves an impact.
An impact detection device according to one aspect of the present disclosure includes: a first acquisition unit that acquires video data including sounds recorded when a user swings, the video data capturing the swing in a plurality of frames; a second acquisition unit that acquires reference motion data showing the time progression of a swing by a model, the reference motion data being given an identifier indicating the timing of impact in the swing; an identification unit that identifies a section of the video data that includes a frame corresponding to the timing given the identifier in the reference motion data; and a determination unit that determines whether or not a frame including a hitting sound is present in the section and, if it is determined that a frame including a hitting sound is present in the section, determines that the video data includes a swing with an impact.
FIG. 1 is a diagram showing an example of a usage state of an impact detection device according to an embodiment.
FIG. 2 is a diagram showing the hardware configuration of the impact detection device.
FIG. 3 is a functional block diagram of the impact detection device.
FIG. 4 is a diagram showing an example of video data.
FIG. 5 is a diagram showing an example of reference motion data.
FIG. 6 is a flowchart showing the operation of the impact detection device.
FIG. 7 is a diagram showing impact detection in the relationship between video data and audio data.
FIG. 8 is a diagram showing an example of video data containing a series of swings.
FIG. 9 is a diagram showing the extraction of video data.
FIG. 10 is a diagram showing an example of playback of video data.
An impact detection device according to an embodiment of the present disclosure will now be described with reference to the drawings.
In each drawing, the dimensions and scale of each part differ as appropriate from the actual ones. In addition, since the embodiments described below are preferred specific examples, various technically preferable limitations are attached to them, but the scope of the present invention is not limited to these embodiments unless a statement specifically limiting the present invention appears in the following description.
FIG. 1 is a diagram showing an example of a usage state of the impact detection device 1 according to the embodiment. The impact detection device 1 is, for example, an information processing device with a shooting function, specifically a smartphone. The impact detection device 1 shoots a swing accompanied by an impact by a user U, for example a golf swing, and analyzes the shot video data. Note that the impact detection device 1 may be a device other than a smartphone, such as a portable terminal device or a personal computer, that has a function of analyzing video data.
FIG. 2 is a diagram showing the hardware configuration of the impact detection device 1. Impact detection by the impact detection device 1 is achieved through the cooperation of hardware and software. The impact detection device 1 includes a processing device 10, a photographing device 11, an operation input device 12, a display device 13, a storage device 14, and a communication device 15.
The processing device 10 is composed of one or more arithmetic processing circuits, such as a CPU (Central Processing Unit), and centrally controls the elements of the impact detection device 1, which is an information processing device. The processing device 10 may also be composed of circuits such as a DSP (Digital Signal Processor) or an ASIC (Application Specific Integrated Circuit) in addition to a CPU.
The photographing device 11 photographs the swing of the user U and outputs video data including the recorded audio and video.
The operation input device 12 accepts operations by a user. The operation input device 12 is, for example, a plurality of operators pressed by the user, or a touch panel that detects contact by the user and is overlaid on the image display surface of the display device 13. Note that the user who operates the operation input device 12 may be the user who made the swing or another user.
The display device 13 is composed of, for example, a liquid crystal panel or an organic EL panel, and displays various images under the control of the processing device 10. For example, the photographed swing and an editing screen for that swing are displayed on the display device 13.
The storage device 14 is one or more memories configured from known recording media such as magnetic recording media or semiconductor recording media, and stores the programs executed by the processing device 10, various data used by the processing device 10, and video data of the swing. The storage device 14 may be configured as a combination of multiple types of recording media. A portable recording medium that can be attached to and detached from the impact detection device 1, or an external recording medium (for example, online storage) with which the impact detection device 1 can communicate via a network, may also be used as part of the storage device 14.
The communication device 15 communicates with a server or the like via a network under the control of the processing device 10.
FIG. 3 is a block diagram illustrating the functional configuration of the processing device 10. By executing a program stored in the storage device 14, the processing device 10 functions as a plurality of elements (a control unit 100, a first acquisition unit 101, a second acquisition unit 102, an analysis unit 103, an identification unit 104, and a determination unit 105) for detecting an impact frame, that is, a frame of the video data in which an impact occurs. After an impact frame is detected in the video data, the processing device 10 also functions as an editing unit 106 for editing the video data and a playback unit 107 for playing back the video data.
Note that some of the functions of the processing device 10 may be offloaded to another device, for example a server connected via a network.
FIG. 4 is a diagram showing an example of video data. As shown in the figure, the video data is the target of impact detection and captures a golf swing by the user U, photographed by, for example, the impact detection device 1. The video data includes a video portion made up of a plurality of frames Fr and an audio portion synchronized with the video portion, although the audio portion is omitted in FIG. 4. In the video portion, a swing image is recorded for each frame Fr, and in the audio portion, sounds picked up during the swing are recorded.
Each frame Fr of the video data is associated with a frame number, a serial number counted from the start frame.
In the figure, among the typical movements of a swing, the address, top, impact, and finish are shown as (A), (C), (E), and (G), respectively.
The video data is, for example, in the MPEG format, but any format is acceptable as long as it includes a video portion consisting of a series of multiple frames and an audio portion synchronized with the video portion.
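As a concrete illustration of this structure, the following minimal Python sketch models a video portion of numbered frames together with its synchronized audio portion. The class name, field names, and the frame-to-sample conversion are assumptions for illustration, not part of the publication.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class SwingVideo:
        """Video portion plus the audio portion synchronized with it."""
        frames: list          # frames[k] = image of frame number k (serial from the start frame)
        fps: float            # frame rate of the video portion, e.g. 30.0
        audio: np.ndarray     # mono audio samples picked up during the swing
        sample_rate: int      # audio sampling rate, e.g. 48_000

        def frame_to_sample(self, frame_no: int) -> int:
            """Index of the first audio sample belonging to a given frame number."""
            return round(frame_no / self.fps * self.sample_rate)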
The impact detection device 1 refers to reference motion data in order to detect impact frames in the video data. The reference motion data is explained next.
FIG. 5 is a diagram for explaining the reference motion data.
The reference motion data is data showing a model, or exemplary, swing. In detail, the reference motion data shows the time transition of the three-dimensional coordinates of each joint J in the skeletal information obtained when the model M swings. The reference for the three-dimensional coordinates is, for example, the center-of-gravity coordinates of both feet of the model M, and the three-dimensional coordinates of each joint J are defined for each frame Fr as coordinates relative to the center-of-gravity coordinates.
As with the video data, each frame Fr of the reference motion data is associated with a frame number, a serial number counted from the start frame.
Examples of the joints J include the left knee, right knee, left hip joint, right hip joint, left shoulder joint, right shoulder joint, left elbow, and right elbow.
In the figure, the joints J of a right-handed model M during a swing are shown as black circles when viewed from the front (from the east, if the ball is hit toward the north). Representative parts of the swing, namely the address, takeback, top, downswing, impact, follow-through, and finish, are illustrated in that order as (1) to (7).
Among the frames of the swing shown by the reference motion data, the frame indicating the impact is tagged in advance with an identifier to distinguish it from the other frames.
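As an illustration only, reference motion data of this kind could be represented as follows. The joint list matches the examples given above, while the class and field names and the frame rate are hypothetical.

    from dataclasses import dataclass

    # Joints named in the embodiment (knees, hips, shoulders, elbows).
    JOINTS = ["l_knee", "r_knee", "l_hip", "r_hip",
              "l_shoulder", "r_shoulder", "l_elbow", "r_elbow"]

    @dataclass
    class ReferenceMotionData:
        # frames[k][joint] = (x, y, z) of joint J in frame number k,
        # relative to the center-of-gravity coordinates of both feet
        frames: list[dict[str, tuple[float, float, float]]]
        impact_frame: int   # frame tagged in advance with the impact identifier
        fps: float = 30.0   # assumed frame rate of the reference data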
Since the reference motion data indicates the three-dimensional coordinates of each joint J, it is possible to view the swing of the model M from any point by specifying a viewpoint.
Generally speaking, swings differ depending on attributes of the user U such as gender, dominant hand, generation (age), and golf skill (professional, semi-professional, amateur). For this reason, multiple types of reference motion data are prepared according to these attributes and organized into a database. The database of reference motion data may be stored in the storage device 14 or on a server in the cloud.
In addition, to improve the searchability of the database of reference motion data, a tree diagram showing the similarity of the swings (the time transitions of the joint coordinates) indicated by the reference motion data may be created in advance.
Next, the operation by which the impact detection device 1 detects impact frames in video data will be described.
FIG. 6 is a flowchart showing the operation of the impact detection device 1. This operation is executed when, for example, the user U operates an icon on the impact detection device 1 corresponding to an application program that detects impact frames.
When the application program is executed, the control unit 100 instructs the first acquisition unit 101 to acquire the video data in which impact frames are to be detected (step Sa11).
The instructed first acquisition unit 101 displays, for example, a list of one or more pieces of video data on the display device 13 and prompts the user U to select the video file in which impact frames are to be detected. When the user U selects a video file, the first acquisition unit 101 acquires the selected file as the video file subject to impact frame detection.
The video file may be acquired from the storage device 14 or from an external device via a network. The video files stored in the storage device 14 include the most recent video file captured by the photographing device 11 as well as video files captured in the past.
The control unit 100 transfers the video data acquired by the first acquisition unit 101 to the analysis unit 103. Using a known method, the analysis unit 103 analyzes the time transition of the joint coordinates of the swinging user in the video data.
The analysis unit 103 transfers information indicating the analyzed time transition of the joint coordinates to the second acquisition unit 102.
The second acquisition unit 102 searches the database to identify and acquire the reference motion data most similar to the time transition of the joint coordinates obtained by analyzing the video data (step Sa12). The reference motion data most similar to the analyzed time transition of the joint coordinates may be identified using, for example, Subsequence Dynamic Time Warping.
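The publication names Subsequence Dynamic Time Warping but does not spell it out. The following is a minimal NumPy sketch of the idea, with free start and end points so that the reference swing may match anywhere inside the analyzed sequence; the `trajectory` attribute and the shape of the database are assumptions.

    import numpy as np

    def subsequence_dtw_cost(query: np.ndarray, series: np.ndarray) -> float:
        """Cost of the best alignment of `query` (T1 x D joint coordinates
        per frame) to a subsequence of `series` (T2 x D); lower is more similar."""
        t1, t2 = len(query), len(series)
        d = np.linalg.norm(query[:, None, :] - series[None, :, :], axis=2)
        acc = np.full((t1, t2), np.inf)
        acc[0] = d[0]                           # the match may start anywhere
        for i in range(1, t1):
            for j in range(t2):
                prev = acc[i - 1, j]
                if j > 0:
                    prev = min(prev, acc[i, j - 1], acc[i - 1, j - 1])
                acc[i, j] = d[i, j] + prev
        return float(acc[-1].min())             # and end anywhere

    def most_similar_reference(user_traj: np.ndarray, database: list):
        """Step Sa12: pick the reference motion data whose joint-coordinate
        time transition best matches the analyzed user trajectory."""
        return min(database,
                   key=lambda ref: subsequence_dtw_cost(ref.trajectory, user_traj))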
The control unit 100 transfers the video data acquired by the first acquisition unit 101 and the reference motion data acquired by the second acquisition unit 102 to the identification unit 104. The identification unit 104 determines the correspondence between each frame of the video data and each frame of the reference motion data, specifically, which frame of the reference motion data each frame of the video data corresponds to.
The upper part of FIG. 7 shows the correspondence between each frame of the video data and each frame of the reference motion data; specifically, black dots indicate which frame of the reference motion data each frame of the video data corresponds to.
For example, frame (A) of the video data corresponds to the address (1) frame of the reference motion data. Similarly, frame (E) of the video data corresponds to the impact (5) frame, and frame (G) of the video data corresponds to the finish (7) frame of the reference motion data.
After determining the correspondence, the identification unit 104 identifies, as the section of the video data likely to contain an impact frame, a section S extending from, for example, 0.3 seconds before to, for example, 0.3 seconds after frame (E) of the video data corresponding to the impact (5) of the reference motion data (step Sa13).
The identification unit 104 transfers information indicating the identified section S to the determination unit 105.
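Converting the 0.3-second window around frame (E) into a frame range is straightforward; a sketch, with the frame rate and the clipping at frame 0 as assumptions:

    def section_s(frame_e: int, fps: float,
                  before_s: float = 0.3, after_s: float = 0.3) -> range:
        """Step Sa13: frames from 0.3 s before to 0.3 s after frame (E)."""
        start = max(0, frame_e - round(before_s * fps))
        end = frame_e + round(after_s * fps)
        return range(start, end + 1)

    # At 30 fps with frame (E) = 150, the section covers frames 141..159.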
Next, the determination unit 105 determines whether exactly one impact sound is present in the audio portion of the section S identified by the identification unit 104 (step Sa14). An impact sound is a sound whose volume exceeds a threshold within an extremely short period (for example, 0.01 seconds). The lower part of FIG. 7 shows an example waveform of the audio portion of the section S identified by the identification unit 104, in a case where two impact sounds are present in the section S.
If the determination unit 105 determines that exactly one impact sound is present in the audio portion of section S (if the determination result of step Sa14 is "Yes"), it detects the frame of the video portion of section S at which that impact sound reaches its peak value as the impact frame (step Sa15).
If it is determined that the number of impact sounds in the audio portion of section S is not one (if the determination result of step Sa14 is "No"), the determination unit 105 then determines whether two or more impact sounds are present in the audio portion of section S (step Sa16).
If it determines that two or more impact sounds are present in the audio portion of section S (if the determination result of step Sa16 is "Yes"), the determination unit 105 regards, among the two or more impact sounds present in section S, the impact sound whose timing is closest to frame (E) of the video data as the impact that occurred in the swing by the user U. The determination unit 105 then identifies the frame at which that impact sound reaches its peak value as the impact frame (step Sa17).
At a golf driving range, many users practice at the same time, so many impact sounds can occur within a short period. The lower part of FIG. 7 shows an example in which impact sounds occur at timings a and b in the audio portion of section S. In this case, if the volume of the impact sound at timing b is greater than that of the impact sound at timing a, the impact in user U's swing could be erroneously detected as having occurred at timing b.
In contrast, in this embodiment it is not the volume that decides: the impact sound at timing a, the one closest to frame (E), is determined to have been produced by the swing of user U. Therefore, even when many users practice at the same time, the impact frame in user U's swing can be identified with high accuracy.
Note that the case where the number of impact sounds in the audio portion of section S is not one (the determination result of step Sa14 is "No") and is also not two or more (the determination result of step Sa16 is "No") is the case where no impact sound is present in the audio portion of section S. In this case there is no impact frame in the video data, so the determination unit 105 notifies the control unit 100, and the control unit 100 executes error processing (step Sa18).
In this embodiment, when the swing by the user U is a mere practice swing, no impact sound occurs in section S, and error processing is therefore executed in step Sa18. Consequently, for a swing without an impact sound, frame (E) is never identified as an impact frame.
Examples of error processing include displaying on the display device 13 a message indicating that there is no impact frame in the video data, or that the swing in the video data is a practice swing.
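Steps Sa14 to Sa18 can be summarized in code. The sketch below first detects impact sounds as short segments whose amplitude exceeds a threshold (one interpretation of the 0.01-second criterion; the threshold value, the assumption of audio normalized to [-1, 1], and the skip-ahead heuristic are all illustrative), then applies the one / two-or-more / none branching described above.

    import numpy as np

    def detect_impact_sounds(audio: np.ndarray, sr: int,
                             threshold: float = 0.5, win_s: float = 0.01) -> list[int]:
        """Sample indices of peaks whose amplitude exceeds `threshold`
        within a window of roughly 0.01 s."""
        win = max(1, int(win_s * sr))
        hits, i = [], 0
        while i < len(audio):
            seg = np.abs(audio[i:i + win])
            if seg.size and seg.max() > threshold:
                hits.append(i + int(seg.argmax()))  # sample of the peak value
                i += win * 5                        # skip so one strike yields one hit
            else:
                i += win
        return hits

    def find_impact_frame(audio: np.ndarray, sr: int, fps: float,
                          section: range, frame_e: int):
        """Steps Sa14 to Sa18: map impact sounds inside section S to one frame."""
        lo = section.start / fps * sr
        hi = section.stop / fps * sr
        peaks = [p for p in detect_impact_sounds(audio, sr) if lo <= p < hi]
        if not peaks:
            return None                             # Sa18: no impact (practice swing)
        frames = [round(p / sr * fps) for p in peaks]
        if len(frames) == 1:
            return frames[0]                        # Sa15: frame of the single peak
        return min(frames, key=lambda f: abs(f - frame_e))  # Sa17: closest to (E)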
According to this embodiment, it is not only possible to distinguish practice swings and whiffs from swings with an actual hit, but also to accurately detect the impact frame in the footage of a user's swing even when several people are swinging at the same time.
The embodiment has been described on the assumption that the user U makes a single swing, but in practice the swing is repeated several times between the start and end of filming. When the video data contains multiple swings, an impact frame can simply be identified for each swing.
FIG. 8 shows the correspondence between each frame of the video data and each frame of the reference motion data when the video data contains multiple swings. The correspondence shown in the figure is an example in which one piece of video data contains three swings.
In this example the audio portion is not shown, but if an impact sound occurs near frame (E) of each swing, the frame in which that impact sound occurs is detected as the impact frame.
When the video data contains multiple swings, the editing unit 106 may edit the video data and cut out each swing after the impact frame of each swing has been identified.
Specifically, when frame (E) has been identified as the impact frame for each of the three swings in the video data, the editing unit 106 cuts out, as one swing, the period from 2.0 seconds before frame (E) to 2.0 seconds after frame (E), as shown in FIG. 9.
In the illustrated example, three pieces of video data each showing a swing are cut out. Information such as the shooting date and time is associated with the video data cut out in this way.
The 2.0 seconds before frame (E) is an example of the first time, and the 2.0 seconds after frame (E) is an example of the second time. The first time and the second time may each be set to any duration by the user U.
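A sketch of the cut-out computation, with the first and second times defaulting to the 2.0 seconds of the example; the clipping at the video boundaries is an assumption:

    def clip_bounds(impact_frame: int, fps: float, total_frames: int,
                    first_s: float = 2.0, second_s: float = 2.0) -> tuple[int, int]:
        """One swing = frames from `first_s` before to `second_s` after impact."""
        start = max(0, impact_frame - round(first_s * fps))
        end = min(total_frames - 1, impact_frame + round(second_s * fps))
        return start, end

    # Three identified impact frames yield three clips, one per swing:
    # clips = [clip_bounds(e, fps, total_frames) for e in impact_frames]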
When video data has been cut out for each swing, the playback unit 107 may, for example, play back two clips side by side for comparison, as shown in FIG. 10. In this side-by-side configuration, if the clips are aligned at the frames (E) identified as impact frames, as shown in the figure, the user U can easily grasp how the two swings differ before and after impact.
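Aligning two clips at their impact frames reduces to delaying the clip whose impact comes earlier; a sketch, with frame-based start offsets as an assumption:

    def aligned_start_offsets(impact_a: int, impact_b: int) -> tuple[int, int]:
        """Start delays (in frames) so both impact frames (E) coincide."""
        lead = max(impact_a, impact_b)
        return lead - impact_a, lead - impact_b

    # Impact at frame 55 in clip A and frame 62 in clip B:
    # A is delayed by 7 frames, B starts immediately, so the impacts line up.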
The number of pieces of video data to be played back is not limited to two and may be three or more. The video data to be played back is not limited to data stored in the storage device 14 and may be video data stored on a server in the cloud.
For example, by comparing one's own swing with the swing of a more advanced player, it becomes easier to see where one's swing falls short. Likewise, by comparing a past swing with a current swing, it is possible to gauge one's improvement. Note that the playback unit 107 is not limited to constant-speed playback and may also play back in slow motion, fast forward, or frame by frame.
In the embodiment, the application program that identifies impact frames is executed in response to an operation by the user U, but it may instead be executed in response to, for example, the completion of filming of a swing. In that configuration, the first acquisition unit 101 acquires the video data of the filmed swing as the target of impact detection.
In the embodiment, the analysis of the video data is performed by the analysis unit 103 of the impact detection device 1, but the analysis may also be performed by a device other than the impact detection device 1, for example an external device, with the second acquisition unit 102 acquiring the information indicating the time transition of the joint coordinates that results from that analysis.
In the embodiment, a golf swing is given as an example of a swing, but the technique is also applicable to other swings accompanied by an impact sound, such as those in tennis, baseball, and table tennis.
From the above description, preferred aspects of the present disclosure can be understood as follows, for example.
An impact detection device according to one aspect (aspect 1) of the present disclosure includes: a first acquisition unit that acquires video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames; a second acquisition unit that acquires reference motion data indicating the time progression of a swing by a model, the reference motion data being assigned an identifier indicating the timing of impact in the swing; an identification unit that identifies, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and a determination unit that determines whether a frame containing a hitting sound exists in the section and, if so, determines that the video data includes a swing with impact.
According to aspect 1, first, a section of the video data is identified that includes the frame corresponding to the timing to which the impact identifier is assigned in the reference motion data; second, if a frame containing a hitting sound exists in that section, the video data is determined to include a swing with impact. Conversely, if no frame containing a hitting sound exists in that section, the video data is determined not to include a swing with impact. Aspect 1 therefore makes it possible to distinguish swings without impact, such as practice swings and whiffs, from swings with impact.
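As a minimal sketch of this determination, the following assumes hitting sounds have already been detected and reduced to a list of times; the half-second window is an assumed value, since the disclosure does not fix the length of the section.

```python
# Minimal sketch of the aspect-1 determination: the swing is judged to
# involve impact only if a hitting sound falls inside the section around
# the timing carrying the impact identifier. The window length is an
# assumed value, not one specified in the disclosure.

def has_impact(hit_sound_times, identifier_time, window=0.5):
    """True if any detected hitting sound lies within +/- window
    seconds of the timing to which the impact identifier is assigned."""
    return any(abs(t - identifier_time) <= window for t in hit_sound_times)

# A practice swing or whiff produces no hitting sound in the section:
print(has_impact([], identifier_time=3.2))          # False -> no impact
print(has_impact([3.1, 7.8], identifier_time=3.2))  # True  -> impact swing
```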
In the impact detection device according to aspect 2, a specific example of aspect 1, when two or more frames containing a hitting sound exist in the section, the determination unit determines, as the impact frame, the frame closest to the timing to which the identifier is assigned.
According to aspect 2, when two or more frames containing a hitting sound exist in the section, the frame closest to the timing to which the identifier is assigned is determined to be the impact frame. A frame containing the target user's hitting sound can therefore be identified accurately, without being affected by hitting sounds made by other users.
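The selection in aspect 2 reduces to a nearest-timing choice, sketched below under the assumption that the candidate hitting-sound times come from an unspecified detector; all names are hypothetical.

```python
# Minimal sketch of aspect 2: when several frames in the section contain
# a hitting sound (e.g., a neighbouring player on a driving range), the
# one whose timing is closest to the identifier timing is taken as the
# impact frame.

def pick_impact_time(candidate_times, identifier_time):
    """Return the candidate hitting-sound time closest to the timing
    to which the impact identifier is assigned."""
    return min(candidate_times, key=lambda t: abs(t - identifier_time))

# Two hitting sounds in the section: 3.15 s (target user) and 3.45 s
# (another player); the identifier sits at 3.2 s.
print(pick_impact_time([3.15, 3.45], 3.2))  # 3.15
```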
In the impact detection device according to aspect 3, a specific example of aspect 1, a plurality of pieces of reference motion data are prepared, one per model, and the second acquisition unit acquires, from among the plurality of pieces of reference motion data, one piece of reference motion data that is similar to the swing shown in the video data.
According to aspect 3, the section is identified using, from among the plurality of pieces of reference motion data, the reference motion data similar to the swing shown in the video data.
In the impact detection device according to aspect 4, a specific example of aspect 3, the reference motion data is data indicating the time progression of joint coordinates in the model; the device has an analysis unit that analyzes the time progression of the user's joint coordinates in the video data; and the second acquisition unit acquires, from among the plurality of pieces of reference motion data, the reference motion data whose time progression is most similar to the time progression of the user's joint coordinates in the video data.
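One way to realize the similarity selection of aspects 3 and 4 is sketched below. Per-frame Euclidean distance after linear resampling is an assumed similarity measure chosen for brevity (the disclosure does not fix one), and the joint count and array shapes are illustrative.

```python
# Minimal sketch of aspect 4: choose, among several reference motion
# data sets, the one whose joint-coordinate time series is most similar
# to the user's. Each series is an array of shape (frames, joints * 2).

import numpy as np

def resample(series, n):
    """Linearly resample a (frames, dims) array to n frames."""
    idx = np.linspace(0, len(series) - 1, n)
    lo = np.floor(idx).astype(int)
    hi = np.ceil(idx).astype(int)
    frac = (idx - lo)[:, None]
    return series[lo] * (1 - frac) + series[hi] * frac

def most_similar_reference(user, references, n=100):
    """Return the index of the reference motion closest to the user's."""
    u = resample(np.asarray(user, dtype=float), n)
    dists = [np.linalg.norm(resample(np.asarray(r, dtype=float), n) - u)
             for r in references]
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
user = rng.random((90, 34))                  # 90 frames, 17 joints (x, y)
refs = [rng.random((120, 34)) for _ in range(3)]
print(most_similar_reference(user, refs))    # index of the closest model
```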
The impact detection device according to aspect 5, a specific example of aspect 2, has an editing unit that edits the video data; taking the impact frame identified by the identification unit as a reference, the editing unit cuts out, from the video data, the frames from a first time before the impact frame to a second time after it.
According to aspect 5, the editing unit cuts out, from the video data, the span from the frame a first time before the impact frame to the frame a second time after the impact frame.
In the impact detection device according to aspect 6, a specific example of aspect 5, when the video data includes two or more swings, the editing unit cuts out the video data for each of the swings. According to aspect 6, a series of video data including multiple swings is cut out swing by swing.
The impact detection device according to aspect 7, a specific example of aspect 6, includes a playback unit that plays back, side by side, two or more pieces of video data cut out by the editing unit. According to aspect 7, swings can be compared by playing back two or more videos.
An impact detection method according to a preferred aspect 8 of the present disclosure causes a computer to execute: a step of acquiring video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames; a step of acquiring reference motion data indicating the time progression of a swing by a model, the reference motion data being assigned an identifier indicating the timing of impact in the swing; a step of identifying, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and a step of determining whether a frame containing a hitting sound exists in the section and, if so, determining that the video data includes a swing with impact.
A program according to a preferred aspect 9 of the present disclosure causes a computer to function as: a first acquisition unit that acquires video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames; a second acquisition unit that acquires reference motion data indicating the time progression of a swing by a model, the reference motion data being assigned an identifier indicating the timing of impact in the swing; an identification unit that identifies, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and a determination unit that determines whether a frame containing a hitting sound exists in the section and, if so, determines that the video data includes a swing with impact.
1...impact detection device, 10...processing device, 11...imaging device, 12...operation input device, 13...display device, 14...storage device, 15...communication device, 100...control unit, 101...first acquisition unit, 102...second acquisition unit, 103...analysis unit, 104...identification unit, 105...determination unit, 106...editing unit, 107...playback unit.

Claims (9)

  1.  An impact detection device comprising:
      a first acquisition unit that acquires video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames;
      a second acquisition unit that acquires reference motion data indicating a time progression of a swing by a model, the reference motion data being assigned an identifier indicating a timing of impact in the swing;
      an identification unit that identifies, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and
      a determination unit that determines whether a frame containing a hitting sound exists in the section and, when determining that a frame containing a hitting sound exists in the section, determines that the video data includes a swing with impact.
  2.  The impact detection device according to claim 1, wherein,
      when two or more frames containing a hitting sound exist in the section,
      the determination unit determines, as an impact frame, the frame closest to the timing to which the identifier is assigned.
  3.  The impact detection device according to claim 1, wherein
      a plurality of pieces of the reference motion data are prepared, one per model, and
      the second acquisition unit acquires, from among the plurality of pieces of reference motion data, one piece of reference motion data similar to the swing shown in the video data.
  4.  The impact detection device according to claim 3, wherein
      the reference motion data is data indicating a time progression of joint coordinates in the model,
      the device further comprises an analysis unit that analyzes a time progression of joint coordinates of the user in the video data, and
      the second acquisition unit acquires, from among the plurality of pieces of reference motion data, the reference motion data having the time progression most similar to the time progression of the joint coordinates of the user in the video data.
  5.  The impact detection device according to claim 2, further comprising an editing unit that edits the video data, wherein,
      taking the impact frame identified by the identification unit as a reference,
      the editing unit cuts out, from the video data, frames from a first time before the impact frame to a second time after the impact frame.
  6.  The impact detection device according to claim 5, wherein,
      when the video data includes two or more swings,
      the editing unit cuts out the video data for each of the swings.
  7.  The impact detection device according to claim 6, further comprising a playback unit that plays back, side by side, two or more pieces of video data cut out by the editing unit.
  8.  An impact detection method that causes a computer to execute:
      a step of acquiring video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames;
      a step of acquiring reference motion data indicating a time progression of a swing by a model, the reference motion data being assigned an identifier indicating a timing of impact in the swing;
      a step of identifying, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and
      a step of determining whether a frame containing a hitting sound exists in the section and, when it is determined that a frame containing a hitting sound exists in the section, determining that the video data includes a swing with impact.
  9.  A program that causes a computer to function as:
      a first acquisition unit that acquires video data including sound recorded during a swing by a user, the video data capturing the swing over a plurality of frames;
      a second acquisition unit that acquires reference motion data indicating a time progression of a swing by a model, the reference motion data being assigned an identifier indicating a timing of impact in the swing;
      an identification unit that identifies, in the video data, a section including a frame corresponding to the timing to which the identifier is assigned in the reference motion data; and
      a determination unit that determines whether a frame containing a hitting sound exists in the section and, when it is determined that a frame containing a hitting sound exists in the section, determines that the video data includes a swing with impact.
PCT/JP2023/042297 2022-12-22 2023-11-27 Impact detection device, impact detection method, and program WO2024135232A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-205279 2022-12-22
JP2022205279A JP2024089828A (en) 2022-12-22 2022-12-22 Impact detection device, impact detection method and program

Publications (1)

Publication Number Publication Date
WO2024135232A1 (en)

Family

ID=91588199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/042297 WO2024135232A1 (en) 2022-12-22 2023-11-27 Impact detection device, impact detection method, and program

Country Status (2)

Country Link
JP (1) JP2024089828A (en)
WO (1) WO2024135232A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10258146A (en) * 1997-03-17 1998-09-29 Yamaha Corp Form analyzer and image recording and reproducing device used for the same
US20060008116A1 (en) * 2002-06-06 2006-01-12 Kiraly Christopher M Flight parameter measurement system
JP2016168196A (en) * 2015-03-13 2016-09-23 ヤマハ株式会社 Swing measurement system
JP2017169203A (en) * 2017-04-12 2017-09-21 カシオ計算機株式会社 Image processing system, image processing method and program
JP2021058760A (en) * 2021-01-19 2021-04-15 株式会社ユピテル Device and program
KR102281124B1 (en) * 2020-12-24 2021-07-23 주식회사 유라이크 User's Golf Swing Video Editing Method and Management Server Used Therein
JP2021125075A (en) * 2020-02-07 2021-08-30 株式会社Nttドコモ Information processing device

Also Published As

Publication number Publication date
JP2024089828A (en) 2024-07-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 23906601
Country of ref document: EP
Kind code of ref document: A1