WO2017209260A1 - Audio training device and audio training method - Google Patents

Audio training device and audio training method Download PDF

Info

Publication number
WO2017209260A1
WO2017209260A1 (PCT/JP2017/020514)
Authority
WO
WIPO (PCT)
Prior art keywords
user
voice
sound
guide
posture
Prior art date
Application number
PCT/JP2017/020514
Other languages
French (fr)
Japanese (ja)
Inventor
旭 保彦
満 細尾
Original Assignee
ヤマハ株式会社 (Yamaha Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ヤマハ株式会社 (Yamaha Corporation)
Publication of WO2017209260A1

Links

Images

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63B: APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
    • A63B69/00: Training appliances or apparatus for special sports
    • A63B69/36: Training appliances or apparatus for special sports for golf
    • A63B71/00: Games or sports accessories not covered in groups A63B1/00 - A63B69/00
    • A63B71/06: Indicating or scoring devices for games or players, or for other sports activities
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00: Teaching not covered by other main groups of this subclass
    • G09B5/00: Electrically-operated educational appliances
    • G09B5/04: Electrically-operated educational appliances with audible presentation of the material to be studied
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K: SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00: Acoustics not otherwise provided for
    • G10K15/04: Sound-producing devices

Definitions

  • This disclosure relates to a voice learning device and a voice learning method for teaching a user's movement in real time, using sound, while the user performs a predetermined movement.
  • Devices that analyze and evaluate a predetermined user motion, for example a golf swing, have been put into practical use (for example, Patent Document 1 and Patent Document 2). These devices analyze the user's swing and, after the swing, display the trajectory and speed to inform the user.
  • However, such conventional devices do not notify the user in real time, during the swing, of the quality of the swing, for example a deviation of the swing trajectory. Moreover, even if a device displayed a trajectory deviation in real time during the swing, the user could not look at the display while swinging.
  • An object of this disclosure is to provide a voice learning device that can teach a predetermined movement, such as a golf swing, in real time during the movement, letting the user judge the quality of the movement without looking at a display.
  • The present disclosure provides a voice learning device including: a mounting unit worn near the user's ear canal, having a sound emitting unit that emits a guide voice to the user; a sensor that detects the posture or movement of the user; and a guide voice generation unit that generates the guide voice determined in real time based on the detected posture or movement of the user.
  • The present disclosure also provides a voice learning method including: a detection step of detecting the posture or movement of a user; a generation step of generating a guide voice determined in real time based on the detected posture or movement; and a sound emitting step of emitting the guide voice during the user's movement.
  • According to the present disclosure, a predetermined movement such as a golf swing can be taught in real time, using sound, while the movement is performed.
  • FIG. 1 is a diagram illustrating a usage pattern of a voice learning device (voice instructor) according to an embodiment of the present disclosure.
  • FIG. 2 is a block diagram of the voice learning device.
  • FIG. 3 is a diagram illustrating an example of changes in the user's swing and head speed.
  • FIG. 4 is a diagram illustrating an example of changes in the user's swing and head speed.
  • FIG. 5 is a diagram showing a guide voice generation rule of the voice learning device.
  • FIG. 6A and FIG. 6B are flowcharts showing a guide voice generation procedure and a reference update procedure of the voice learning device.
  • FIG. 1 is a diagram showing how the voice learning device 1 is used.
  • FIG. 2 is a block diagram of the voice learning device 1.
  • The voice learning device 1 has a mounting unit 2 worn on the user's ear; a small speaker 14 and a motion sensor 15 are built into the mounting unit 2.
  • The motion sensor 15 detects the movement of the user P, in particular the movement of the head, for example during a swing of the golf club C.
  • The voice learning device 1 detects the head movement of the user P with the motion sensors 15L and 15R provided in the mounting units 2L and 2R on both ears, and the control unit 10 notifies the user P of the difference between that movement and a reference by sound (the guide voice) through the small speakers 14L and 14R. The user P can thus correct the swing in real time by listening to the guide voice while swinging.
  • The voice learning device 1 includes mounting units 2 (2L, 2R) worn on both ears of the user, and a control box 3 provided near the middle of the cable connecting the left and right mounting units 2L and 2R.
  • The left and right mounting units 2L and 2R incorporate small speakers 14L and 14R, which emit sound into the user's outer ear, and motion sensors 15L and 15R, which detect movement.
  • A 9-axis MEMS sensor can be used as each of the motion sensors 15L and 15R.
  • the motion sensors 15L and 15R are also referred to as a left ear sensor 15L and a right ear sensor 15R.
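As a rough illustration of how head orientation might be derived from such a 9-axis sensor, the sketch below fuses a gyroscope yaw rate with a magnetometer heading using a complementary filter. The class name, filter weight, and 10 ms step are illustrative assumptions, not details taken from the patent.

```python
class HeadYawEstimator:
    """Complementary filter sketch: integrate the gyroscope yaw rate for
    responsiveness and lean on the magnetometer heading to cancel drift.
    A 9-axis MEMS sensor provides both signals."""

    def __init__(self, alpha=0.98):
        self.alpha = alpha  # weight given to the gyro-integrated path
        self.yaw = 0.0      # degrees; 0 = heading at address

    def update(self, gyro_z_dps, mag_heading_deg, dt=0.01):
        """Advance the estimate by one 10 ms sensor sample."""
        gyro_yaw = self.yaw + gyro_z_dps * dt  # fast but drifts over time
        self.yaw = (self.alpha * gyro_yaw
                    + (1.0 - self.alpha) * mag_heading_deg)
        return self.yaw
```

In a real device the same idea would be applied per axis (yaw, pitch, roll), typically via an off-the-shelf sensor-fusion filter.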
  • From the small speakers 14L and 14R, a guide voice that coaches the golf swing in real time is emitted; that is, the guide voice is emitted while the user P is performing the golf swing.
  • the left mounting portion 2L and the control box 3, and the right mounting portion 2R and the control box 3 are connected by a stereo cable.
  • the control box 3 transmits audio (guide audio) to be heard by the user P to the left and right mounting portions 2L and 2R, and acquires detection values from the motion sensors 15L and 15R provided in the left and right mounting portions 2L and 2R.
  • the control box 3 includes a control unit 10 which is a small computer.
  • a left ear sensor 15L, a right ear sensor 15R, a memory 12, and a sound source 11 are connected to the control unit 10.
  • The control box 3 may be a dedicated unit of the voice learning device 1, or it may be realized by a multi-function mobile phone (smartphone) and an application program.
  • the mounting unit 2 and the multi-function telephone may communicate with each other via Bluetooth (registered trademark).
  • The control unit 10 estimates and analyzes the movement of the user P from the detection values input from the left ear sensor 15L and the right ear sensor 15R, and determines, from the difference between the analysis result and the reference, the mode of the guide voice that the sound source 11 is to generate.
  • The mode of the guide voice includes the beating added to the guide voice, its volume, left/right balance, and localization.
  • The reference is stored in the memory 12. This determination of the guide voice mode is performed, for example, every 10 milliseconds, and the result is output to the sound source 11 as guide voice control information. Details of determining the mode of the guide voice are described later (see FIG. 4).
  • The sound source (sound generator) 11 generates the guide voice based on the control information input from the control unit 10 and controls changes in its mode.
  • the guide sound is amplified by the left and right amplifiers (drivers) 13L and 13R and emitted from the small speakers 14L and 14R.
  • In FIG. 1, the amplifiers 13L and 13R are provided in the control box 3, but they may instead be provided in the mounting units 2L and 2R.
  • FIG. 3 and FIG. 4 show the transition of posture and head orientation when the user P swings the golf club C. FIG. 3 shows an example of a preferable swing by the user P, and FIG. 4 an example of an unfavorable swing. Here, the forward direction of head movement is the rightward direction in the drawings (the ball-flight direction), and the opposite direction is backward.
  • In the preferred example of FIG. 3, when the user P swings the golf club C to hit the ball, the user P first takes the address posture (A), then performs a backswing from that posture (B) to reach the top posture (C).
  • During the backswing, the golf club C moves (turns) slowly in the direction opposite to the swing direction.
  • The user P starts the downswing from the top posture (C), (D), accelerates through impact (E), and finishes the swing at the follow-through (F).
  • FIG. 5 is a diagram illustrating rules for controlling the mode of the guide voice that the voice learning device 1 emits to the user P.
  • The voice learning device 1 of this embodiment detects the head movement of the user P and estimates and analyzes the movement of the whole body. Specifically, the front/back movement and the turning of the head of the user P are detected from the detection values of the left and right ear sensors 15L and 15R.
  • The head movement of the user P represents the posture of the user P; from it, the movement of the whole body can be estimated, such as whether the body is facing the ball or whether the body has opened.
  • The reference, which is model data for the head movement and head turning being analyzed, is stored in the memory 12 and read out in synchronization with the swing of the user P for comparison with the analyzed content.
  • The reference may be data recording an ideal (instructor's) swing, or the user's own past data from a good-form swing.
  • The sound source 11 generates a tone of a predetermined frequency (for example, 440 Hz) as the guide voice.
  • A sine wave may be used, but a sawtooth wave, which contains overtones and whose localization is easier to perceive, may also be used.
  • The control unit 10 compares the head turning angle and movement amount detected in the address, backswing, and top postures with the reference, and gives the guide voice an amount of beating corresponding to the deviation.
  • A beat is the periodic interference of two tones, perceived as a loudness fluctuation (amplitude modulation, AM) at a rate equal to the frequency difference between the two tones.
  • The number of beats per second may be increased as the deviation grows and decreased as the deviation shrinks.
  • Alternatively, the depth of amplitude modulation may be adjusted: the larger the deviation, the stronger the beating (deeper modulation); the smaller the deviation, the weaker the beating (shallower modulation). Because the deviation of the head's turning angle and movement amount from the reference is reflected in the beating of the guide voice, the user P can correct the swing posture so as to eliminate the beats while listening to the guide voice.
  • The scale factor is adjusted so that the beat rate stays around 10 Hz, as in ordinary instrument tuning. Alternatively, two frequency-controlled tones, guide tone 1 and guide tone 2, may be generated and played simultaneously so that a predetermined number of beats occurs.
  • The magnitude of the beating includes the beat frequency, the beat amplitude (modulation depth), or both.
  • The number of beats and the beat depth are made to correspond to the deviation of the front/back movement amount and the deviation of the turning angle, respectively.
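The mapping just described, movement deviation to beat rate and turning deviation to beat depth, can be sketched as follows. The thresholds, scale factors, and function names are illustrative assumptions, not values from the patent.

```python
import math

BASE_FREQ = 440.0    # guide tone frequency (Hz), as in the text
SAMPLE_RATE = 44100

def beat_params(move_dev_cm, turn_dev_deg,
                max_move_cm=10.0, max_turn_deg=30.0, max_beat_hz=10.0):
    """Map the front/back movement deviation to a beat rate and the
    turning-angle deviation to a beat (modulation) depth, both clamped."""
    rate = max_beat_hz * min(abs(move_dev_cm) / max_move_cm, 1.0)
    depth = min(abs(turn_dev_deg) / max_turn_deg, 1.0)
    return rate, depth

def am_guide_samples(rate_hz, depth, n):
    """440 Hz guide tone, amplitude-modulated so the listener hears
    `rate_hz` beats per second with the given depth (0..1)."""
    out = []
    for i in range(n):
        t = i / SAMPLE_RATE
        # Envelope swings between 1.0 and (1.0 - depth) at rate_hz.
        env = 1.0 - depth * 0.5 * (1.0 - math.cos(2.0 * math.pi * rate_hz * t))
        out.append(env * math.sin(2.0 * math.pi * BASE_FREQ * t))
    return out
```

With zero deviation the envelope stays flat and the user hears a steady tone, which matches the "correct the swing until the beating disappears" behaviour described above.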
  • The beats may be produced by amplitude-modulating the guide voice; alternatively, the basic guide tone (440 Hz) may be emitted to one ear while a tone whose frequency is raised or lowered according to the deviation of the swing trajectory (440 ± b Hz) is emitted to the other ear, causing a beat to arise in the user P's hearing.
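The two-ear alternative, where one ear receives the 440 Hz guide tone and the other a tone shifted by a deviation-dependent amount b, might look like this minimal sketch; the scale factor and clamping limit are assumptions.

```python
def binaural_beat_tones(trajectory_dev, hz_per_unit=2.0,
                        base_hz=440.0, max_beat_hz=10.0):
    """Return (one-ear, other-ear) frequencies: the base tone goes to one
    ear, the other ear gets the tone shifted by b = clamp(k * deviation),
    so a beat of |b| Hz arises in the listener's hearing."""
    b = max(-max_beat_hz, min(max_beat_hz, hz_per_unit * trajectory_dev))
    return base_hz, base_hz + b
```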
  • The above describes notifying the user P of the deviation of the head's turning angle and movement amount from the reference by means of the guide voice's beating. The volume balance or the localization may also be used to notify the user P of a deviation; examples are described below.
  • The control unit 10 compares the analyzed front/back position of the head with the reference and, if there is a deviation, changes the left/right volume balance to notify the user P. When the head has shifted behind the reference, the left channel is made louder to draw attention to the left; when the head has shifted ahead of the reference, the right channel is made louder to draw attention to the right. This tells the user P that the head position has shifted and in which direction to return, so the user P can correct the head position reflexively. Note that, depending on the type of sound, lowering the volume may draw attention instead; in that case the correspondence between displacement direction and volume balance may be reversed.
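A minimal sketch of the volume-balance rule above: the channel on the side the user should attend to is made louder relative to the other, here by attenuating the opposite channel, which yields the same balance. The sensitivity constant is an illustrative assumption.

```python
def balance_gains(head_dev_cm, sensitivity=0.1):
    """Left/right channel gains. Head behind the reference (negative
    deviation) makes the left side louder relative to the right; head
    ahead of the reference makes the right side louder."""
    pan = max(-1.0, min(1.0, sensitivity * head_dev_cm))
    left = 1.0 if pan <= 0.0 else 1.0 - pan    # pan > 0: head is ahead
    right = 1.0 if pan >= 0.0 else 1.0 + pan   # pan < 0: head is behind
    return left, right
```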
  • The control unit 10 compares the analyzed head turning with the reference and changes the localization of the guide voice when they deviate. Normally the guide voice is localized at the top of the head or at the face. When the head has turned too far to the right, the guide voice is localized to the left rear; the user P's attention is drawn to the left rear, and the excess right turn can be stopped (returned). Likewise, when the head has turned too far to the left, the guide voice is localized to the right rear, drawing the user P's attention to the right rear and stopping the excess left turn.
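Localization to the left rear or right rear could be approximated with interaural time and level differences (ITD/ILD), as in the sketch below. True rear localization needs HRTF filtering, and all constants here are illustrative assumptions.

```python
import math

def localize(angle_deg, sample_rate=44100, max_itd_s=0.0007):
    """Interaural time/level difference cues for a source at `angle_deg`
    (0 = front, 90 = right, -90 = left). Returns per-channel delays in
    samples and gains; the leading (nearer) ear gets zero delay and
    full gain."""
    a = math.radians(angle_deg)
    itd = max_itd_s * math.sin(a)            # positive: right ear leads
    delay_l = max(0.0, itd) * sample_rate    # delay the far (left) ear
    delay_r = max(0.0, -itd) * sample_rate
    ild = 0.5 * (1.0 + math.sin(a))          # 0 = full left, 1 = full right
    return {"delay_l_samples": delay_l, "delay_r_samples": delay_r,
            "gain_l": 1.0 - 0.5 * ild, "gain_r": 0.5 + 0.5 * ild}
```

For the patent's use case, an angle such as -135 degrees (left rear) or +135 degrees (right rear) would be fed in, with HRTF filtering added to disambiguate front from rear.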
  • The analysis of the motion-sensor detection values and the mode-control processing of the guide voice described above are executed at short intervals (for example, every 10 milliseconds), so the mode of the guide voice generated by the sound source 11 is controlled in real time. While swinging the golf club C, the user P can correct the address position, the backswing posture, the posture during the swing, and so on by listening to the guide voice as its mode changes.
  • FIG. 6A is a flowchart of the swing guide process of the control unit 10.
  • the control unit 10 repeats this process every 10 milliseconds.
  • The control unit 10 acquires the detection values of the left and right ear sensors 15L and 15R (S20) and analyzes the front/back movement and left/right turning of the head from those values (S21). The control unit 10 then compares the front/back movement of the head with the reference and determines the number of beats and the left/right volume balance of the guide voice according to the deviation (S23).
  • Next, the control unit 10 compares the turning angle of the head with the reference and determines the beat depth and the localization position of the guide voice from the deviation (S14). The sound source 11 is then instructed to generate the guide voice in the modes determined in S13, S14, S23, and S24.
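The decision portion of this 10 ms loop, comparing the analyzed head motion with the reference and producing the guide voice mode, can be sketched as follows; the data layout and scale factors are illustrative assumptions rather than the patent's actual structures.

```python
from dataclasses import dataclass

@dataclass
class Reference:
    move_cm: float    # reference front/back head movement
    turn_deg: float   # reference head turning angle

def guide_mode(move_cm, turn_deg, ref, max_beat_hz=10.0):
    """Decide the guide voice mode from the deviation between the
    analyzed head motion and the reference (one pass of the loop)."""
    move_dev = move_cm - ref.move_cm
    turn_dev = turn_deg - ref.turn_deg
    beat_hz = min(max_beat_hz, abs(move_dev))    # 1 Hz per cm of error
    beat_depth = min(1.0, abs(turn_dev) / 30.0)  # full depth at 30 degrees
    # Head behind the reference: alert to the left; ahead: to the right.
    balance = "left" if move_dev < 0 else "right" if move_dev > 0 else "center"
    return {"beat_hz": beat_hz, "beat_depth": beat_depth, "balance": balance}
```

The returned dictionary plays the role of the guide voice control information handed to the sound source 11 every cycle.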
  • FIG. 6B is a flowchart illustrating an example of a reference update procedure.
  • a procedure for registering a record of a past swing as a reference is shown.
  • The memory 12 has an area for storing analysis data of multiple swings; for each swing by the user P, the detection values of the motion sensors 15L and 15R are analyzed (S30) and the swing's analysis data is stored (S31).
  • The swing in S30 may be a practice swing using an already registered reference. After swinging one or more times, the user P, judging that a good swing is among them, performs a predetermined reference update operation on the voice learning device 1.
  • When the reference update operation is performed, the control unit 10 advances the process to the history selection process in S33, where it accepts a history selection from the user P. Since the memory 12 stores analysis data for multiple past swings, the user P selects the one that seems good. The control unit 10 then transfers the selected analysis data to the reference storage area (S34) and returns to the normal processing of receiving motion-sensor detection values for each swing.
  • Alternatively, the control unit 10 may automatically register the stored analysis data as the reference, or multiple sets of analysis data may be stored in the memory 12 and their average used as the reference.
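The reference update of FIG. 6(B), either copying one user-selected swing analysis or averaging the stored analyses, might be sketched as below; the dictionary representation of a swing analysis is an assumption made for illustration.

```python
def update_reference(history, selected=None):
    """Produce a new reference from stored swing analyses: copy the
    user-selected one (the S33-S34 path), or average all of them when
    no selection is given, as the text suggests."""
    if selected is not None:
        return dict(history[selected])
    n = len(history)
    return {key: sum(h[key] for h in history) / n for key in history[0]}
```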
  • The objects of analysis are not limited to the address position, backswing posture, swing trajectory, swing speed, left/right movement of the head, and turning of the head; some of these, or other elements related to the movement of the user P, may be analyzed.
  • the change mode of the guide voice for notifying the deviation between the analysis result and the reference in real time is not limited to the above.
  • A spoken guide phrase may also be generated and emitted; examples include "Keep your eyes on the ball!" and "Don't lunge with your upper body!".
  • The present disclosure can be applied to movements other than the golf swing, for example a baseball bat swing or a tennis racket swing. Nor is the analysis target limited to swings: anything for which a model posture or motion can be compared with the analyzed posture or motion, such as dance, is applicable.
  • The transmission medium of the sound emitting unit is not limited to air vibration; any medium that transmits vibration to the human auditory organ may be used.
  • The voice learning device of the present disclosure includes: a mounting unit worn near the user's ear canal, having a sound emitting unit that emits a guide voice to the user; a sensor that detects the posture or movement of the user; and a guide voice generation unit that generates the guide voice determined in real time based on the detected posture or movement of the user.
  • The guide voice generation unit controls at least one of the beating, volume, left/right balance, and localization of the guide voice based on the detected posture or movement of the user.
  • The guide voice generation unit may generate a pulse sound as the guide voice and control at least one of the interval and the volume of the pulse sound based on the detected posture or movement of the user.
  • The voice learning device according to any one of (1) to (3) above further includes a reference memory that stores a reference for the posture or movement of the user; the guide voice generation unit compares the detected posture or movement of the user with the reference and controls the mode of the guide voice in real time based on the comparison result.
  • The voice learning device according to (1) to (4) above may further include a musical sound generator that generates a musical sound; the guide voice generated by the guide voice generation unit and the musical sound generated by the musical sound generator are emitted together.
  • The guide voice generation unit controls the mode of the guide voice so as to indicate, based on the comparison result, the deviation between the detected posture or movement of the user and the reference.
  • The sound emitting unit and the sensor may be built into the mounting unit.
  • the mounting unit is an earphone.
  • The voice learning method of the present disclosure includes: a detection step of detecting the posture or movement of a user; a generation step of generating a guide voice determined in real time based on the detected posture or movement; and a sound emitting step of emitting the guide voice during the user's movement.
  • In the generation step, at least one of the beating, volume, left/right balance, and localization of the guide voice is controlled based on the detected posture or movement of the user.
  • In the generation step, a pulse sound may be generated as the guide voice, and at least one of the interval and the volume of the pulse sound is controlled based on the detected posture or movement of the user.
  • The voice learning method according to (9) to (12) above may further include a musical sound generating step of generating a musical sound; the guide voice generated in the generation step and the musical sound generated in the musical sound generating step are emitted together.
  • In the generation step, the mode of the guide voice is controlled so as to indicate, based on the comparison result, the deviation between the detected posture or movement of the user and the reference.
  • According to these methods, a predetermined movement such as a golf swing can be taught in real time, using sound, while the movement is performed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Physical Education & Sports Medicine (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Acoustics & Sound (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Rehabilitation Tools (AREA)

Abstract

An audio training device (1) is provided with: a wearable unit (2) worn by a user and having a sound emitting unit (14) for emitting guidance audio to the user; a sensor (15) for detecting user posture or movement; and a guidance audio generating unit (10) for generating guidance audio determined in real time on the basis of user posture or movement detected by the sensor (15).

Description

Voice learning device and voice learning method
This disclosure relates to a voice learning device and a voice learning method for teaching a user's movement in real time, using sound, while the user performs a predetermined movement.
Devices that analyze and evaluate a predetermined user motion, for example a golf swing, have been put into practical use (for example, Patent Document 1 and Patent Document 2). These devices analyze the user's swing and, after the swing, display the trajectory and speed to inform the user.
Patent Document 1: Japanese Laid-Open Patent Publication No. 2010-025737
Patent Document 2: Japanese Laid-Open Patent Publication No. 2012-254205
However, such conventional devices do not notify the user in real time, during the swing, of the quality of the swing, for example a deviation of the swing trajectory. Moreover, even if a device displayed a trajectory deviation in real time during the swing, the user could not look at the display while swinging.
An object of this disclosure is to provide a voice learning device that can teach a predetermined movement such as a golf swing in real time during the movement, that is, let the user judge the quality of the movement without looking at a display.
The present disclosure provides a voice learning device including: a mounting unit worn near the user's ear canal, having a sound emitting unit that emits a guide voice to the user; a sensor that detects the posture or movement of the user; and a guide voice generation unit that generates the guide voice determined in real time based on the detected posture or movement of the user.
The present disclosure also provides a voice learning method including: a detection step of detecting the posture or movement of a user; a generation step of generating a guide voice determined in real time based on the detected posture or movement; and a sound emitting step of emitting the guide voice during the user's movement.
According to the present disclosure, a predetermined movement such as a golf swing can be taught in real time, using sound, while the movement is performed.
FIG. 1 is a diagram illustrating a usage pattern of a voice learning device (voice instructor) according to an embodiment of the present disclosure. FIG. 2 is a block diagram of the voice learning device. FIG. 3 is a diagram illustrating an example of changes in the user's swing and head speed. FIG. 4 is a diagram illustrating another example of changes in the user's swing and head speed. FIG. 5 is a diagram showing the guide voice generation rules of the voice learning device. FIG. 6(A) and FIG. 6(B) are flowcharts showing the guide voice generation procedure and the reference update procedure of the voice learning device.
FIG. 1 shows how the voice learning device 1 is used, and FIG. 2 is a block diagram of the voice learning device 1. The voice learning device 1 has mounting units 2 worn on the user's ears; a small speaker 14 and a motion sensor 15 are built into each mounting unit 2. The motion sensor 15 detects the movement of the user P, in particular the movement of the head, for example during a swing of the golf club C. The voice learning device 1 detects the head movement of the user P with the motion sensors 15L and 15R provided in the mounting units 2L and 2R on both ears, and the control unit 10 notifies the user P of the difference between that movement and a reference by sound (the guide voice) through the small speakers 14L and 14R. The user P can thus correct the swing in real time by listening to the guide voice while swinging.
In FIG. 1, the user P wears the voice learning device 1 and takes the address posture holding the golf club C. The voice learning device 1 includes the mounting units 2 (2L, 2R) worn on both ears and a control box 3 provided near the middle of the cable connecting the left and right mounting units 2L and 2R. As described above, the left and right mounting units 2L and 2R incorporate the small speakers 14L and 14R, which emit sound into the user's outer ear, and the motion sensors 15L and 15R, which detect movement. A 9-axis MEMS sensor can be used for each of the motion sensors 15L and 15R; they are also called the left ear sensor 15L and the right ear sensor 15R. The small speakers 14L and 14R emit a guide voice that coaches the golf swing in real time, that is, while the user P is swinging. The left mounting unit 2L and the control box 3, and the right mounting unit 2R and the control box 3, are each connected by a stereo cable. The control box 3 transmits the audio (guide voice) for the user P to hear to the left and right mounting units 2L and 2R, and acquires detection values from the motion sensors 15L and 15R.
The control box 3 contains the control unit 10, a small computer, to which the left ear sensor 15L, the right ear sensor 15R, the memory 12, and the sound source 11 are connected. The control box 3 may be a dedicated unit of the voice learning device 1, or it may be realized by a multi-function mobile phone (smartphone) and an application program; in that case, the mounting units 2 and the phone may communicate via Bluetooth (registered trademark).
The control unit 10 estimates and analyzes the movement of the user P from the detection values input from the left ear sensor 15L and the right ear sensor 15R, and determines, from the difference between the analysis result and the reference, the mode of the guide voice to be generated by the sound source 11. The mode of the guide voice includes the beating added to the guide voice, its volume, left/right balance, and localization. The reference is stored in the memory 12. This mode determination is performed, for example, every 10 milliseconds, and the result is output to the sound source 11 as guide voice control information; details are described later (see FIG. 4). The sound source (sound generator) 11 generates the guide voice based on the control information from the control unit 10 and controls changes in its mode. The guide voice is amplified by the left and right amplifiers (drivers) 13L and 13R and emitted from the small speakers 14L and 14R. In FIG. 1 the amplifiers 13L and 13R are located in the control box 3, but they may instead be located in the mounting units 2L and 2R.
 FIGS. 3 and 4 show the transition of the posture and the head orientation of the user P when swinging the golf club C. FIG. 3 shows an example in which the user P makes a preferable swing; FIG. 4 shows an example of an unfavorable swing. Here, forward head movement is to the right in the figures (the direction of ball flight), and backward movement is in the opposite direction.
 In the preferable example of FIG. 3, when the user P swings the golf club C to hit the ball, the user P first takes an address posture (A), performs a backswing from this posture (B), and reaches the top posture (C). The detection values of the motion sensors 15 show that in this process the head turns to the right by only a small angle and its front-rear position hardly changes. The golf club C moves (turns) slowly in the direction opposite to the swing direction. The user P starts the downswing from the top posture (C) (D) and, while accelerating, passes through impact (E) and finishes the swing with the follow-through (F). At the start of the downswing the head moves forward and turns to the right; as the swing proceeds, the forward movement and the right turn are canceled (the head moves and turns in the opposite direction). At the moment of impact, the head faces the front (turning angle = 0) and has moved largely backward. Thereafter, with the follow-through, the head turns largely to the left (the direction of ball flight), the backward movement is canceled, and the head moves slightly forward.
 In the unfavorable example of FIG. 4, when the user P swings the golf club C to hit the ball, the user P likewise first takes an address posture (A), performs a backswing from this posture (B), and reaches the top posture (C). The process up to this point is almost the same as in the preferable example of FIG. 3. The user P starts the downswing from the top posture (C) (D) and, while accelerating, passes through impact (E) and finishes with the follow-through (F); however, from the start of the downswing the head moves forward and begins to turn to the left. Even at the moment of impact, the head remains moved forward and is turning to the left (the direction of ball flight). The follow-through then proceeds in that posture.
 FIG. 5 explains the rules by which the audio training device 1 controls the mode of the guide sound emitted to the user P. The audio training device 1 of this embodiment detects the movement of the head of the user P and estimates and analyzes the movement of the whole body. Specifically, the front-rear movement and the turning of the head are detected based on the detection values of the left and right ear sensors 15L and 15R. The movement of the head represents the posture of the user P, and from it, whole-body movements can be estimated, such as whether the user is still facing the ball or whether the body has opened.
 The reference, which is model data concerning the front-rear movement and the turning of the head of the user P being analyzed, is stored in the memory 12, read out in synchronization with the swing of the user P, and compared with the analysis result. The reference may be data recording an ideal swing (a teacher's, for example) or the user's own past data recorded in good form.
 The sound source 11 generates, as the guide sound, a tone of a predetermined frequency (for example, 440 Hz). A sine wave may be used, but a sawtooth wave, which contains harmonics and whose localization is easy to perceive, may also be used.
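As an illustrative sketch (the function name and defaults are our assumptions, not from the patent), a guide tone of either waveform can be synthesized sample by sample:

```python
import math

def guide_tone(freq_hz, duration_s, sample_rate=44100, wave="sawtooth"):
    """Generate one channel of the guide tone as floats in [-1, 1].
    The sawtooth contains every harmonic, which makes its localization
    easier to perceive than that of a pure sine."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        if wave == "sawtooth":
            phase = (freq_hz * t) % 1.0
            samples.append(2.0 * phase - 1.0)   # rising ramp each cycle
        else:
            samples.append(math.sin(2.0 * math.pi * freq_hz * t))
    return samples
```

Note that a naive sawtooth like this aliases at audio sample rates; a practical implementation would use a band-limited oscillator.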
 The control unit 10 compares the head turning angle and movement amount detected in the address posture, the backswing posture, and the top posture with the reference, and adds to the guide sound a beat whose amount corresponds to the deviation. A beat is the difference waveform of two tones and has a frequency equal to the frequency difference between the two tones. Specifically, the guide sound may be amplitude-modulated (AM) with a low-frequency signal waveform corresponding to the beat. The larger the deviation, the higher the beat rate is made; as the deviation shrinks, the beat rate is lowered. For example, the beat rate (beat frequency) can be determined by the following equation:

  beat rate = coefficient × (1 − (analysis value / reference value)) × guide sound frequency
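The equation and the amplitude modulation can be sketched as follows. This is a hedged illustration: the `coeff` default is arbitrary, and the absolute value is our addition so that an analysis value overshooting the reference also yields a positive rate.

```python
import math

def beat_rate(analysis, reference, guide_freq=440.0, coeff=0.01):
    """Beat rate = coefficient x (1 - analysis/reference) x guide frequency.
    abs() makes overshoot and undershoot both produce a positive rate."""
    return coeff * abs(1.0 - analysis / reference) * guide_freq

def apply_beat(samples, rate_hz, depth=1.0, sample_rate=44100):
    """Amplitude-modulate the guide tone with a low-frequency envelope
    at the beat rate; depth=1 lets the tone fade fully out each beat."""
    out = []
    for i, x in enumerate(samples):
        t = i / sample_rate
        env = 1.0 - depth * 0.5 * (1.0 - math.cos(2.0 * math.pi * rate_hz * t))
        out.append(x * env)
    return out
```

A zero deviation (analysis equal to the reference) gives a rate of zero, i.e. a steady tone.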
 The depth of the amplitude modulation may also be adjusted: the larger the deviation, the larger the beat (deeper modulation), and the smaller the deviation, the smaller the beat (shallower modulation). Because the deviation of the head turning angle and movement amount from the reference is reflected in the beat rate of the guide sound, the user P can correct the swing posture while listening so that the beat disappears. The coefficient is adjusted so that the beat rate is around 10 Hz, as in ordinary instrument tuning. Alternatively, a first guide tone and a second guide tone whose frequencies are controlled so that a predetermined beat rate occurs may be generated and sounded simultaneously.
 As described above, the magnitude of the beat includes the beat frequency and/or the beat depth (modulation depth). In this embodiment, the beat rate corresponds to the deviation of the front-rear movement amount, and the beat depth to the deviation of the turning angle. The beat may be produced by amplitude-modulating the guide sound; alternatively, the basic guide tone (440 Hz) may be emitted to one ear and a swing-analysis tone (440 ± b Hz), whose frequency is raised or lowered according to the deviation of the swing trajectory, emitted to the other ear, so that the beat arises in the hearing of the user P.
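The per-ear variant might be sketched like this (function name and defaults are illustrative assumptions): one channel carries the 440 Hz guide tone and the other a tone offset by the desired beat frequency b, so the beat is perceived by the listener rather than present in either signal.

```python
import math

def binaural_pair(base_hz=440.0, beat_hz=4.0, duration_s=0.5, sample_rate=44100):
    """Left channel: the basic guide tone. Right channel: the swing-analysis
    tone at base + beat_hz. The perceived beat rate is the |difference|."""
    n = int(duration_s * sample_rate)
    left = [math.sin(2.0 * math.pi * base_hz * i / sample_rate)
            for i in range(n)]
    right = [math.sin(2.0 * math.pi * (base_hz + beat_hz) * i / sample_rate)
             for i in range(n)]
    return left, right
```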
 The above describes an example in which the deviation of the head turning angle and movement amount from the reference is conveyed to the user P by the beat of the guide sound. Instead of, or in addition to, the beat, the volume balance or the localization may be used to convey the deviation. Examples follow.
 The control unit 10 compares the analyzed front-rear position of the head with the reference and, if it deviates, changes the left-right volume balance to notify the user P. That is, when the head has shifted backward relative to the reference, the left channel is made louder to draw attention to the left; when the head has shifted forward, the right channel is made louder to draw attention to the right. This both informs the user P that the head position has deviated and draws attention in the direction of correction, so that the user P can correct the head position reflexively. Depending on the type of sound, lowering the volume may attract attention more effectively; in that case, the correspondence between the deviation direction and the volume balance may be reversed.
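One possible mapping from the front-rear deviation to channel gains is sketched below; the dead zone and the 6 dB step are assumed values, not from the patent.

```python
def balance_gains(shift_mm, dead_zone_mm=5.0, step_db=6.0):
    """Return (left_boost_db, right_boost_db) for a head that has shifted
    shift_mm from the reference (+ = forward, - = backward).
    A backward shift boosts the left channel to draw attention left;
    a forward shift boosts the right channel."""
    if abs(shift_mm) <= dead_zone_mm:
        return (0.0, 0.0)            # within tolerance: keep equal volume
    if shift_mm < 0:                 # behind the reference
        return (step_db, 0.0)
    return (0.0, step_db)            # ahead of the reference
```

For sounds where a quieter channel attracts more attention, the two non-zero branches would simply be swapped, as the text notes.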
 The control unit 10 compares the analyzed turning of the head with the reference and, if it deviates, changes the localization of the guide sound. The guide sound is normally localized at the top of the head or in front of the face. When the head of the user P has turned further right than the reference, the guide sound is localized to the left rear; this draws the user's interest to the left rear and stops (undoes) the excess right turn. Conversely, when the head has turned further left than the reference, the guide sound is localized to the right rear, drawing the interest of the user P to the right rear and stopping the excess left turn.
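A rough sketch of how such a localization could be rendered over the earphones combines an interaural time difference (a Woodworth-style approximation, strictly valid only for frontal angles) with a crude interaural level difference. The head radius, the 6 dB maximum shadow, and the whole model are simplifying assumptions, not the patent's method.

```python
import math

def localize(azimuth_deg, sample_rate=44100, head_radius_m=0.0875):
    """Return ((delay_left, delay_right) in samples, (gain_left, gain_right))
    for a source at azimuth_deg (0 = front, + = right, -135 = left rear)."""
    az = math.radians(azimuth_deg)
    c = 343.0                                        # speed of sound, m/s
    itd_s = head_radius_m / c * (az + math.sin(az))  # + = left ear delayed
    delay = abs(itd_s) * sample_rate
    shadow = min(1.0, abs(azimuth_deg) / 90.0)       # 0 at front, 1 at the side
    far = 10.0 ** (-6.0 * shadow / 20.0)             # far-ear attenuation
    if itd_s >= 0:       # source on the right: left ear late and quieter
        return (delay, 0.0), (far, 1.0)
    return (0.0, delay), (1.0, far)
```

With an azimuth of -135° (left rear), the right channel is delayed and attenuated, so the guide sound appears behind the left shoulder.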
 The above analysis of the motion sensor detection values and the guide sound mode control are executed at short intervals (for example, 10 milliseconds), so the mode of the guide sound generated by the sound source 11 is controlled in real time. While swinging the golf club C, the user P can correct the address position, the backswing posture, the posture during the swing, and so on by listening to the guide sound controlled in these various modes.
 FIG. 6(A) is a flowchart of the swing guide process of the control unit 10, which the control unit 10 executes repeatedly every 10 milliseconds. The control unit 10 acquires the detection values of the left and right ear sensors 15L and 15R (S20) and analyzes the front-rear movement and the left-right turning of the head based on them (S21). The control unit 10 then compares the front-rear movement of the head with the reference and determines the beat rate and the left-right volume balance of the guide sound according to the deviation (S23). The control unit 10 also compares the head turning angle with the reference and determines the beat depth and the localization of the guide sound based on the deviation (S24). It then instructs the sound source 11 to generate the guide sound in the mode determined in S23 and S24.
 FIG. 6(B) is a flowchart showing an example of the reference update procedure, here the procedure for registering a record of a past swing as the reference. The memory 12 has an area for storing the analysis data of a plurality of swings; for every swing of the user P, the detection values of the motion sensors 15L and 15R are analyzed (S30) and the analysis data of that swing is stored (S31). The swing in S30 may be a practice swing performed using an already registered reference. After one or more swings, the user P, judging that a good swing occurred, performs a predetermined reference update operation on the audio training device 1. When the reference update operation is performed (YES in S32), the control unit 10 advances to the history selection process (S33), in which it accepts the user P's selection from the history. When the memory 12 can store analysis data for a plurality of past swings, the user P selects the one that seemed good. The control unit 10 then copies the selected analysis data into the reference storage area (S34) and returns to the normal process of accepting motion sensor detection values during swings.
 Alternatively, the memory 12 may store only the most recent swing's analysis data, in which case the control unit 10 automatically registers the stored data as the reference when the reference update operation is performed. The memory 12 may also store the analysis data of a plurality of swings and use their average as the reference.
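The storage-and-promotion scheme of FIG. 6(B), together with the automatic and averaging variants of this paragraph, can be sketched as follows (class and method names are our own):

```python
class ReferenceStore:
    """Holds the analysis data of recent swings and lets the user promote
    one of them, or their element-wise average, to be the new reference."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self.history = []            # analysis vectors, most recent last
        self.reference = None

    def record(self, analysis):
        """S31: store one swing's analysis, discarding the oldest."""
        self.history.append(list(analysis))
        if len(self.history) > self.capacity:
            self.history.pop(0)

    def promote(self, index=-1):
        """S33/S34: copy the selected swing into the reference area.
        With capacity 1 this is the automatic-update variant."""
        self.reference = list(self.history[index])

    def promote_average(self):
        """Averaging variant: element-wise mean of the stored swings."""
        n = len(self.history)
        self.reference = [sum(v) / n for v in zip(*self.history)]
```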
 The objects of analysis are not limited to the address position, the backswing posture, the swing trajectory, the swing speed, the movement of the head, and the turning of the head. Only some of these, or other elements of the motion of the user P, may be analyzed.
 The manner in which the guide sound changes to announce, in real time, the deviation between the analysis result and the reference is likewise not limited to the above. In place of, or together with, the guide sound, spoken guide phrases may be generated and emitted, for example, "Keep your eyes on the ball!" or "Don't lunge with your upper body!"
 The above embodiment describes analyzing and coaching the swing of the golf club C, but the present disclosure is applicable beyond the golf swing, for example to a baseball bat swing or a tennis racket swing. Nor is the object of analysis limited to swings: anything whose posture and motion can be compared with a model posture and motion, such as dance, may be used. Furthermore, the transmission medium of the sound emitting unit is not limited to vibration of the air; any medium that conveys vibration to the human auditory organ may be used.
 Here, the embodiments of the present disclosure are summarized as follows.
(1) An audio training device comprising: a mounting unit worn near the user's ear canal, having a sound emitting unit that emits a guide sound to the user and a sensor that detects the posture or motion of the user; and a guide sound generation unit that generates the guide sound, which is determined in real time based on the detected posture or motion of the user.
(2) In the audio training device of (1), the guide sound generation unit controls at least one of the beat, volume, left-right balance, and localization of the guide sound based on the detected posture or motion of the user.
(3) In the audio training device of (1), the guide sound generation unit generates a pulse sound as the guide sound and controls at least one of the interval and the volume of the pulse sound based on the detected posture or motion of the user.
(4) The audio training device of any of (1) to (3) further comprises a reference memory that stores a reference of the posture or motion of the user; the guide sound generation unit compares the detected posture or motion of the user with the reference and controls the mode of the guide sound in real time based on the comparison result.
(5) The audio training device of any of (1) to (4) further comprises a musical sound generation unit that generates musical sounds, and a mixer that mixes the guide sound generated by the guide sound generation unit with the musical sound generated by the musical sound generation unit and outputs the mixed sound to the sound emitting unit.
(6) In the audio training device of (4) or (5), the guide sound generation unit controls the mode of the guide sound based on the comparison result so as to indicate the deviation between the detected posture or motion of the user and the reference.
(7) In the audio training device of any of (1) to (6), the sound emitting unit and the sensor are built into the mounting unit.
(8) In the audio training device of any of (1) to (7), the mounting unit is an earphone.
(9) An audio training method of the present disclosure comprises: a detection step of detecting the posture or motion of a user; a generation step of generating a guide sound determined in real time based on the detected posture or motion of the user; and a sound emission step of emitting the guide sound during the motion of the user.
(10) In the audio training method of (9), the generation step controls at least one of the beat, volume, left-right balance, and localization of the guide sound based on the detected posture or motion of the user.
(11) In the audio training method of (9), the generation step generates a pulse sound as the guide sound and controls at least one of the interval and the volume of the pulse sound based on the detected posture or motion of the user.
(12) In the audio training method of any of (9) to (11), a reference of the posture or motion of the user is stored in a reference memory, and the generation step compares the detected posture or motion of the user with the reference and controls the mode of the guide sound in real time based on the comparison result.
(13) The audio training method of any of (9) to (12) further comprises a musical sound generation step of generating a musical sound, a mixing step of mixing the guide sound generated in the generation step with the musical sound generated in the musical sound generation step, and a sound emission step of emitting the mixed sound.
(14) In the audio training method of (12) or (13), the generation step controls the mode of the guide sound based on the comparison result so as to indicate the deviation between the detected posture or motion of the user and the reference.
 This application is based on Japanese Patent Application No. 2016-109687, filed on June 1, 2016, the contents of which are incorporated herein by reference.
 The present disclosure is useful in that a predetermined motion such as a golf swing can be taught audibly, in real time, while the motion is being performed.
1 Audio training device
2 (2L, 2R) Mounting unit
15L, 15R Motion sensor

Claims (14)

  1.  An audio training device comprising:
      a mounting unit worn near a user's ear canal, the mounting unit having a sound emitting unit that emits a guide sound to the user and a sensor that detects a posture or motion of the user; and
      a guide sound generation unit that generates the guide sound, the guide sound being determined in real time based on the detected posture or motion of the user.
  2.  The audio training device according to claim 1, wherein the guide sound generation unit controls at least one of a beat, a volume, a left-right balance, and a localization of the guide sound based on the detected posture or motion of the user.
  3.  The audio training device according to claim 1, wherein the guide sound generation unit generates a pulse sound as the guide sound and controls at least one of an interval and a volume of the pulse sound based on the detected posture or motion of the user.
  4.  The audio training device according to any one of claims 1 to 3, further comprising a reference memory that stores a reference of the posture or motion of the user,
      wherein the guide sound generation unit compares the detected posture or motion of the user with the reference and controls a mode of the guide sound in real time based on a result of the comparison.
  5.  The audio training device according to any one of claims 1 to 4, further comprising:
      a musical sound generation unit that generates a musical sound; and
      a mixer that mixes the guide sound generated by the guide sound generation unit with the musical sound generated by the musical sound generation unit and outputs the mixed sound to the sound emitting unit.
  6.  The audio training device according to claim 4 or 5, wherein the guide sound generation unit controls the mode of the guide sound based on the result of the comparison so as to indicate a deviation between the detected posture or motion of the user and the reference.
  7.  The audio training device according to any one of claims 1 to 6, wherein the sound emitting unit and the sensor are built into the mounting unit.
  8.  The audio training device according to any one of claims 1 to 7, wherein the mounting unit is an earphone.
  9.  An audio training method comprising:
      a detection step of detecting a posture or motion of a user;
      a generation step of generating a guide sound determined in real time based on the detected posture or motion of the user; and
      a sound emission step of emitting the guide sound during the motion of the user.
  10.  The audio training method according to claim 9, wherein the generation step controls at least one of a beat, a volume, a left-right balance, and a localization of the guide sound based on the detected posture or motion of the user.
  11.  The audio training method according to claim 9, wherein the generation step generates a pulse sound as the guide sound and controls at least one of an interval and a volume of the pulse sound based on the detected posture or motion of the user.
  12.  The audio training method according to any one of claims 9 to 11, wherein a reference of the posture or motion of the user is stored in a reference memory, and
      the generation step compares the detected posture or motion of the user with the reference and controls a mode of the guide sound in real time based on a result of the comparison.
  13.  The audio training method according to any one of claims 9 to 12, further comprising:
      a musical sound generation step of generating a musical sound;
      a mixing step of mixing the guide sound generated in the generation step with the musical sound generated in the musical sound generation step; and
      a sound emission step of emitting the mixed sound.
  14.  The audio training method according to claim 12 or 13, wherein the generation step controls the mode of the guide sound based on the result of the comparison so as to indicate a deviation between the detected posture or motion of the user and the reference.
PCT/JP2017/020514 2016-06-01 2017-06-01 Audio training device and audio training method WO2017209260A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-109687 2016-06-01
JP2016109687A JP2017213240A (en) 2016-06-01 2016-06-01 Voice instruction device

Publications (1)

Publication Number Publication Date
WO2017209260A1 true WO2017209260A1 (en) 2017-12-07

Family

ID=60477614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/020514 WO2017209260A1 (en) 2016-06-01 2017-06-01 Audio training device and audio training method

Country Status (2)

Country Link
JP (1) JP2017213240A (en)
WO (1) WO2017209260A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04309383A (en) * 1991-04-08 1992-10-30 Kubota Corp Ball hitting action training machine
JPH05115585A (en) * 1991-10-28 1993-05-14 Sugino Mach Ltd Golf swing analyzing device
JP2009125507A (en) * 2007-11-27 2009-06-11 Panasonic Electric Works Co Ltd Golf improvement support device
JP2009233092A (en) * 2008-03-27 2009-10-15 Yamaha Corp Exercise support system and program
US20110021318A1 (en) * 2009-07-20 2011-01-27 Joanna Lumsden Audio feedback for motor control training
JP2015150134A (en) * 2014-02-13 2015-08-24 株式会社ユピテル Sway detector, sway detection system, and sway detection program


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Ongaku wa Mochiron Workout no Kiroku Nado nimo Tsukaeru Wireless Earphone", THE DASH '', GIGAZINE, 20 March 2014 (2014-03-20), Retrieved from the Internet <URL:http://gigazine.net/news/20140320-the-dash> [retrieved on 20170726] *
"Sony, PGS ya Shinpakusu Sensor o Tosai shita Bosui Earphone Ittaigata Walkman o CES ni Shutten", GIZMODO· JAPAN, 12 January 2015 (2015-01-12), XP055445004, Retrieved from the Internet <URL:http://www.gizmodo.jp/2015/01/sony_smart_b_trainer.html> [retrieved on 20170726] *
KOJI SHIROTA: "Improvement of the Golf Swing Using Biofeedback", HOSEI UNIVERSITY GRADUATE SCHOOL KOGAKU KENKYUKA KIYO, vol. 55, 24 March 2014 (2014-03-24), pages 1 - 4, XP055444999 *
TAKANOBU OZAWA: "CES]CES Unveiled - Parrot, Sports Muke Koseino Earphone 'Zik Sport'/ Headphone ya Wearable Device ga Tasu Shutten", PHILE WEB, 5 January 2015 (2015-01-05), XP055445001, Retrieved from the Internet <URL:http://www.phileweb.com/news/d-av/201501/05/36151.html> [retrieved on 20170726] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019024550A (en) * 2017-07-25 2019-02-21 株式会社クオンタム Detection device, detection system, processing device, detection method and detection program
JP7011416B2 (en) 2017-07-25 2022-01-26 株式会社クオンタム Detection device, detection system, processing device and detection program

Also Published As

Publication number Publication date
JP2017213240A (en) 2017-12-07

Similar Documents

Publication Publication Date Title
JP6322830B2 (en) Information processing apparatus, information processing program, information processing system, and information processing method
JP4626087B2 (en) Musical sound control system and musical sound control device
JP3867515B2 (en) Musical sound control system and musical sound control device
JP6461850B2 (en) Simulation system and program
KR101415944B1 (en) Virtual golf simulation device and method for providing stereophonic sound for weather condition
US10878718B2 (en) System and method for synchronizing audio, movement, and patterns
JP6419932B1 (en) Program for supporting performance of musical instrument in virtual space, method executed by computer to support selection of musical instrument, and information processing apparatus
JP2005103241A (en) Input device, game system, program, and information storage medium
US8656307B2 (en) Information storage medium, computer terminal, and change method
US6224386B1 (en) Sound field simulation method and sound field simulation apparatus
JPWO2019163260A1 (en) Information processing equipment, information processing methods, and programs
US11120780B2 (en) Emulation of at least one sound of a drum-type percussion instrument
WO2017209260A1 (en) Audio training device and audio training method
JP2005034195A (en) Lesson support system and method
JP2021119993A (en) Sway detector and sway detection program
JP2019200257A (en) Learning support system and program
JP4731168B2 (en) Program, information storage medium, and game system
JP7476930B2 (en) Vibration Sensor
US20210319715A1 (en) Information processing apparatus, information processing method, and program
JP2017136308A (en) Vocal training device
JP2018064216A (en) Force sense data development apparatus, electronic apparatus, force sense data development method and control program
JP4420713B2 (en) PROGRAM, INFORMATION STORAGE MEDIUM, AND GAME DEVICE
KR101471712B1 (en) Volume control apparatus
KR101875140B1 (en) A system of decrease for impact sound by golf ball
TWI816120B (en) Tempo practice module

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17806810

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 17806810

Country of ref document: EP

Kind code of ref document: A1