WO2021230067A1 - Information processing device and information processing method - Google Patents

Information processing device and information processing method

Info

Publication number: WO2021230067A1
Authority: WO (WIPO (PCT))
Application number: PCT/JP2021/016706
Prior art keywords: unit, sensor, tap, information processing, detection result
Other languages: French (fr), Japanese (ja)
Inventor: Kei Takahashi (高橋 慧)
Original assignee: Sony Group Corporation (ソニーグループ株式会社)
Application filed by Sony Group Corporation
Publication of WO2021230067A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones

Definitions

  • The present technology relates to an information processing apparatus and an information processing method, and more particularly to an information processing apparatus and an information processing method capable of identifying vibration generated by a user's operation.
  • Patent Document 1 describes a small terminal that a user can operate by gesture, voice, and button operation.
  • Vibration generated by the user tapping around an earphone can be detected by an acceleration sensor mounted on the earphone, and the earphone recognizes the user's operation from that detection.
  • However, the vibration generated by the user tapping around the earphone may resemble noise, such as the vibration caused by the user walking or chewing, or the vibration of the sound output from the earphone itself.
  • When the vibration is detected only by the acceleration sensor mounted on the earphone, it is therefore difficult to distinguish the vibration generated by the user's operation from the noise.
  • The present technology was made in view of such a situation, and makes it possible to identify the vibration generated by a user's operation.
  • The information processing apparatus of the first aspect of the present technology includes two terminals equipped with sensors that detect vibration, including both vibration representing a user operation and noise. One terminal includes a receiving unit that receives the detection result of a first sensor mounted on the other terminal, and a determination unit that determines, based on the detection result received by the receiving unit, whether the detection result of a second sensor mounted on the one terminal is noise.
  • In the information processing method of the first aspect of the present technology, in an information processing apparatus including two terminals equipped with sensors for detecting vibration, one terminal receives the detection result of the first sensor mounted on the other terminal and determines, based on the received detection result, whether the detection result of the second sensor mounted on the one terminal is noise.
  • The information processing device of the second aspect of the present technology includes a sensor unit that detects vibration, including both vibration representing a user operation and noise, and a recognition unit that recognizes the user's operation based on the detection result of the sensor unit and a prediction result of the noise detected by the sensor unit.
  • That is, in the second aspect of the present technology, the user's operation is recognized based on the detection result of the sensor unit and the prediction result of the noise detected by the sensor unit.
  • FIG. 1 is a diagram showing an example of the appearance of the earphone (inner-ear headphone) 10 according to an embodiment of the present technology.
  • The earphone 10 is an acoustic output device that is attached to the user's ear and lets the user listen to the sound output from a built-in driver.
  • The earphone 10 has a left ear terminal 10L and a right ear terminal 10R, and is a left-right independent earphone in which the two terminals are not physically connected.
  • The left ear terminal 10L and the right ear terminal 10R are connected via a wireless communication path such as NFMI (Near Field Magnetic Induction).
  • Each of the left ear terminal 10L and the right ear terminal 10R is equipped with a processing device such as a CPU (Central Processing Unit), an acceleration sensor, and a sound output device.
  • The user can operate the earphone 10 by tapping around the ear where the earphone 10 is worn. Normally, such an operation is performed on either the left ear side or the right ear side at a given time.
  • When the left ear terminal 10L or the right ear terminal 10R detects the vibration of a tap with the acceleration sensor, it outputs a sound effect to notify the user that the tap has been detected. Since the operation relies on vibration detection, the earphone 10 can naturally be operated not only by tapping around the ear but also by tapping the main body of the earphone 10 itself.
  • The earphone 10 is an example of an information processing device to which the present technology is applied, and may be configured as a true wireless earphone.
  • FIG. 2 is a block diagram showing a hardware configuration example of the earphone 10.
  • The left ear terminal 10L includes a CPU 101L, a ROM (Read Only Memory) 102L, a RAM (Random Access Memory) 103L, a bus 104L, an input/output I/F unit 105L, a sound output unit 106L, a sensor unit 107L, a communication unit 108L, a storage unit 109L, and a power supply unit 110L.
  • The CPU 101L, the ROM 102L, and the RAM 103L are connected to one another by the bus 104L. The input/output I/F unit 105L is also connected to the bus 104L.
  • The sound output unit 106L, the sensor unit 107L, the communication unit 108L, and the storage unit 109L are connected to the input/output I/F unit 105L.
  • The sound output unit 106L reproduces, for example, music data acquired from an external music playback device and outputs sound. The sound output unit 106L also outputs a sound effect indicating that an operation has been detected.
  • The sensor unit 107L is composed of an IMU (Inertial Measurement Unit) 121L.
  • The IMU 121L is composed of an acceleration sensor, a gyro sensor, and the like, and detects the acceleration, angular acceleration, and so on of the left ear terminal 10L and outputs them as sensor data.
  • The communication unit 108L is composed of a left/right communication unit 131L and an external communication unit 132L.
  • The left/right communication unit 131L is configured as a communication module that supports short-range wireless communication such as NFMI. It communicates with the right ear terminal 10R and exchanges music data, sensor data, and the like.
  • The external communication unit 132L is configured as a communication module compatible with wireless communication such as Bluetooth (registered trademark), wireless LAN (Local Area Network), or cellular communication (for example, LTE-Advanced or 5G), or with wired communication. It communicates with an external device and exchanges sound signals, sensor data, and the like.
  • The external device here includes, for example, a smartphone, a tablet terminal, a personal computer, a server, and a music playback device. Servers include those provided by music distribution services that distribute music over the Internet.
  • The storage unit 109L is composed of, for example, non-volatile memory, semiconductor memory including volatile memory, and the like. Music data and the like acquired from an external device are recorded in the storage unit 109L.
  • The power supply unit 110L has a battery and supplies power to each unit of the left ear terminal 10L.
  • The right ear terminal 10R has the same configuration as the left ear terminal 10L. Blocks corresponding to each component of the left ear terminal 10L are indicated by the same number followed by "R", and duplicate description is omitted. When it is not necessary to distinguish between the left ear terminal 10L and the right ear terminal 10R, the "L" or "R" suffix is omitted.
  • FIG. 3 is a block diagram showing a functional configuration example of the earphone 10.
  • The information processing unit 141 is realized by the CPU 101 of FIG. 2 executing a predetermined program.
  • The configuration shown in FIG. 3 is provided in each of the left ear terminal 10L and the right ear terminal 10R.
  • The "other terminal" described later is the right ear terminal 10R when the information processing unit 141 of FIG. 3 belongs to the left ear terminal 10L, and the left ear terminal 10L when it belongs to the right ear terminal 10R.
  • When the configurations provided in the left ear terminal 10L and the right ear terminal 10R need to be distinguished, they are described with "L" or "R" appended to the same reference numeral.
  • The information processing unit 141 is composed of a one-ear tap detection unit 151, a transmission control unit 152, a reception control unit 153, a tap determination unit 154, a sound control unit 155, and an execution unit 156.
  • The one-ear tap detection unit 151 acquires sensor data from the IMU 121 and detects taps based on the sensor data. That is, it detects the presence or absence of a tap operation from the vibration generated by the user's actions, which include the tap operation and other movements. Movements other than tapping include, for example, walking and chewing.
  • Here, the "tap operation" refers to the user's action (for example, the user's fingertip striking the main body of the earphone 10), and a "tap" refers to the vibration detected by the earphone 10 as being caused by the user's tap operation.
  • The one-ear tap detection unit 151 supplies event information indicating that a tap has been detected to the transmission control unit 152, and supplies the sensor data acquired from the IMU 121 or the event information to the tap determination unit 154.
  • The one-ear tap detection unit 151 also supplies an output request for the operation recognition sound, which is a sound effect indicating that a tap has been detected, to the sound control unit 155.
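  • The publication states only that taps are detected from the IMU's acceleration data. As an illustrative sketch, one-ear tap detection could be a simple threshold on the acceleration norm; the window shape and the ACCEL_THRESHOLD value below are assumptions, not details from the publication.

```python
import numpy as np

ACCEL_THRESHOLD = 3.0  # assumed value; the publication gives no number


def detect_tap(accel_window: np.ndarray) -> bool:
    """Return True if the acceleration window looks like a tap.

    accel_window: shape (N, 3) array of x/y/z accelerometer samples.
    """
    # Use the norm of the acceleration vector, as the publication does
    # when computing peak timings in the second embodiment.
    norm = np.linalg.norm(accel_window, axis=1)
    return bool(norm.max() > ACCEL_THRESHOLD)
```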
  • The transmission control unit 152 is supplied from the IMU 121 with the same sensor data that the one-ear tap detection unit 151 acquires.
  • The transmission control unit 152 controls the left/right communication unit 131 to transmit the event information supplied from the one-ear tap detection unit 151 and the sensor data supplied from the IMU 121 to the other terminal.
  • The reception control unit 153 acquires the event information and sensor data transmitted from the other terminal via the left/right communication unit 131 and supplies them to the tap determination unit 154. The event information transmitted from the other terminal indicates that a tap was detected at the other terminal.
  • Based on the sensor data supplied from the one-ear tap detection unit 151 and the sensor data supplied from the reception control unit 153, the tap determination unit 154 determines whether the event in which a tap was detected by the one-ear tap detection unit 151 is a tap event or a tap-like event.
  • A tap event is an event indicating that a tap operation has been detected. A tap-like event is an event indicating that some action other than the tap operation has caused the detected vibration.
  • In other words, the tap determination unit 154 is a determination unit that determines whether the vibration detection by the IMU 121 is noise, such as the detection of vibration generated by an action other than the tap operation.
  • When the tap determination unit 154 determines that the event in which a tap was detected by the one-ear tap detection unit 151 is a tap-like event, it supplies an output request for the cancel sound to the sound control unit 155. The cancel sound is a sound indicating that the detection of the tap has been canceled.
  • When the tap determination unit 154 determines that the event is a tap event, it supplies an output request for the function execution sound, which is a sound indicating that the function assigned to the tap event is executed, to the sound control unit 155, and controls the execution unit 156 to execute that function.
  • The sound control unit 155 outputs from the sound output unit 106 the sound effects corresponding to the output requests, such as the operation recognition sound requested by the one-ear tap detection unit 151 and the cancel sound and function execution sound requested by the tap determination unit 154.
  • The execution unit 156 executes the function assigned to the tap event under the control of the tap determination unit 154. The tap event is assigned, for example, processing related to content reproduction, such as starting music playback or skipping to the next song.
  • Tap detection is performed separately on both terminals, and when a tap is detected, the operation recognition sound is immediately reproduced as a reaction to the tap operation. This makes it possible to notify the user that the tap operation produced an immediate reaction.
  • When the detection is later judged to be noise, the tap detection is canceled and the cancel sound is played. This makes it possible to notify the user that the tap was registered but has been canceled.
  • FIG. 4 is a diagram showing an example of the reproduction timing of the sound effects.
  • A in FIG. 4 shows the reproduction timing of the sound effects when a tap operation is performed by the user.
  • The operation recognition sound is reproduced at time t1, immediately after the tap is detected. The function execution sound is played at time t2, when a predetermined timeout period has elapsed from time t1, and the function assigned to the tap event is executed at the same timing.
  • B in FIG. 4 shows the reproduction timing of the sound effects when an action other than the tap operation is performed by the user.
  • As before, the operation recognition sound is reproduced at time t1, immediately after the tap is detected. Event information and sensor data are transmitted from the other terminal as information used to determine whether to cancel the tap, and the tap detection is canceled at time t3. The cancel sound is played at time t4, between time t3 and time t2. Even when the tap detection is canceled, the cancel sound may be omitted.
  • In this way, the operation recognition sound is played immediately after a tap is detected, and the function execution sound is played after the predetermined timeout period has elapsed. The operation recognition sound is always reproduced when a tap is detected on either the left ear terminal 10L or the right ear terminal 10R, whereas the function execution sound is played only when information for canceling the tap detection has not arrived from the other terminal before the timeout period elapses.
  • A tap is thus detected in each of the left ear terminal 10L and the right ear terminal 10R, and the information of both terminals is used to determine whether the event in which the tap was detected is a tap event or a tap-like event.
  • Two methods can be considered for exchanging the information used to determine whether the event in which a tap was detected is a tap event or a tap-like event.
  • First method: transmitting event information indicating that a tap has been detected from one terminal to the other terminal.
  • Second method: transmitting sensor data from one terminal to the other terminal.
  • FIG. 5 is a diagram showing the flow of information in the first method.
  • In the left ear terminal 10L, the sensor data of the IMU 121L is monitored by the one-ear tap detection unit 151L. When a tap is detected, the event information is transmitted to the tap determination unit 154R of the right ear terminal 10R.
  • Similarly, in the right ear terminal 10R, the sensor data of the IMU 121R is monitored by the one-ear tap detection unit 151R. When a tap is detected, the event information is supplied from the one-ear tap detection unit 151R to the tap determination unit 154R.
  • Based on the event information transmitted from the left ear terminal 10L and the event information supplied from the one-ear tap detection unit 151R, the tap determination unit 154R determines whether the event in which the tap was detected by the one-ear tap detection unit 151R is a tap event.
  • The function execution sound and the cancel sound are reproduced as described above based on the determination result of the tap determination unit 154R.
  • In step S101 of the first method (the flowchart of FIG. 6), the one-ear tap detection unit 151R of the right ear terminal 10R detects a tap based on the sensor data of the IMU 121R.
  • In step S102, the sound control unit 155R of the right ear terminal 10R reproduces the operation recognition sound and outputs it from the sound output unit 106R.
  • In step S103, the tap determination unit 154R determines whether event information indicating that a similar tap event has been detected has been received from the left ear terminal 10L within a certain period.
  • If it is determined in step S103 that the same event information has been received from the left ear terminal 10L within the certain period, the process proceeds to step S104. Since a tap has been detected by the one-ear tap detection unit 151R, receiving event information indicating that a tap was also detected at the left ear terminal 10L means that the same event information has been received.
  • In this case, the tap determination unit 154R determines that the event in which the tap was detected by the one-ear tap detection unit 151R is a tap-like event, that is, that the detection is invalid.
  • In step S104, the sound control unit 155R reproduces the cancel sound and outputs it from the sound output unit 106R. The cancel sound may be output from the sound output units 106 of both terminals at the same time. After the cancel sound is output, the process ends.
  • If it is determined in step S103 that the same event information has not been received from the left ear terminal 10L within the certain period, the tap determination unit 154R determines that the event in which the tap was detected by the one-ear tap detection unit 151R is a tap event, that is, that the detection is valid.
  • In step S105, the sound control unit 155R reproduces the function execution sound and outputs it from the sound output unit 106R. The function execution sound may be output from the sound output units 106 of both terminals at the same time.
  • In step S106, the execution unit 156R executes a predetermined function according to the tap event. After the predetermined function is executed, the process ends.
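  • The timeout-based cancellation of steps S101 through S106 can be sketched as follows. This is a hypothetical simplification: the timeout value, the message format, and the link.poll helper are assumptions rather than details from the publication.

```python
import time

TIMEOUT_S = 0.3  # assumed timeout; the publication does not give a value


def handle_local_tap(link, play_sound, execute_function):
    """First method: cancel the tap if the other terminal reports one too.

    link.poll() is a hypothetical non-blocking receive returning event
    information sent by the other terminal, or None.
    """
    play_sound("operation_recognition")           # step S102
    deadline = time.monotonic() + TIMEOUT_S
    while time.monotonic() < deadline:            # step S103
        if link.poll() == "tap_detected":
            # Both terminals saw the vibration: treat it as a tap-like
            # event (walking, chewing, ...) and cancel (step S104).
            play_sound("cancel")
            return
        time.sleep(0.005)
    # Only this terminal saw the vibration: a valid tap event.
    play_sound("function_execution")              # step S105
    execute_function()                            # step S106
```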
  • FIG. 7 is a diagram showing the flow of information in the second method.
  • In the left ear terminal 10L, the value of the sensor data of the IMU 121L is transmitted to the tap determination unit 154R of the right ear terminal 10R. The transmission is performed, for example, when a condition is satisfied, such as the value of the sensor data of the IMU 121L exceeding a predetermined threshold after a predetermined filter is applied. Alternatively, the value of the sensor data of the IMU 121L may be transmitted to the tap determination unit 154R constantly.
  • In the right ear terminal 10R, the sensor data of the IMU 121R is monitored by the one-ear tap detection unit 151R and supplied from the one-ear tap detection unit 151R to the tap determination unit 154R.
  • Based on the sensor data transmitted from the left ear terminal 10L and the sensor data supplied from the one-ear tap detection unit 151R, the tap determination unit 154R determines whether the event in which the tap was detected by the one-ear tap detection unit 151R is a tap event.
  • The function execution sound and the cancel sound are reproduced as described above based on the determination result of the tap determination unit 154R.
  • In step S151 of the second method (the flowchart of FIG. 8), the one-ear tap detection unit 151R of the right ear terminal 10R detects a tap based on the sensor data of the IMU 121R.
  • In step S152, the sound control unit 155R of the right ear terminal 10R reproduces the operation recognition sound and outputs it from the sound output unit 106R.
  • In step S153, the reception control unit 153R receives the value of the sensor data transmitted from the left ear terminal 10L via the left/right communication unit 131R.
  • In step S154, the tap determination unit 154R performs the tap determination process. Specifically, the tap determination unit 154R calculates the similarity between the sensor data value of the IMU 121R and the sensor data value of the IMU 121L transmitted from the left ear terminal 10L and, based on the calculated similarity, determines whether the event in which the tap was detected by the one-ear tap detection unit 151R is a tap event or a tap-like event. The determination based on the similarity of the sensor data values is described later.
  • If it is determined in step S155, using the sensor data of the left ear terminal 10L and the right ear terminal 10R in the tap determination process of step S154, that the event in which the tap was detected by the one-ear tap detection unit 151R was not a tap event, that is, that it was a tap-like event, the process proceeds to step S156.
  • In step S156, the sound control unit 155R reproduces the cancel sound and outputs it from the sound output unit 106R. The cancel sound may be output from the sound output units 106 of both terminals at the same time. After the cancel sound is output, the process ends.
  • If it is determined in step S155 that the event in which the tap was detected by the one-ear tap detection unit 151R was a tap event, the process proceeds to step S157.
  • In step S157, the sound control unit 155R reproduces the function execution sound and outputs it from the sound output unit 106R. The function execution sound may be output from the sound output units 106 of both terminals at the same time.
  • In step S158, the execution unit 156R executes a predetermined function according to the tap event. After the predetermined function is executed, the process ends.
  • Vibrations generated by actions other than the tap operation are often detected simultaneously by the sensors of both the left ear terminal 10L and the right ear terminal 10R. For example, when the user walks or chews, the entire head vibrates. In contrast, the vibration due to a tap is detected as a large vibration only at either the left ear terminal 10L or the right ear terminal 10R.
  • Therefore, when similar vibrations are detected at both terminals, the tap determination unit 154 determines that the tap detection by the one-ear tap detection unit 151 is a tap-like event.
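  • The publication does not specify how the similarity in step S154 is computed. As one concrete possibility, the sketch below uses a normalized cross-correlation between the acceleration-norm windows of the two terminals; the SIMILARITY_THRESHOLD value is likewise an assumption.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.7  # assumed; not specified in the publication


def is_tap_like(local_norm: np.ndarray, remote_norm: np.ndarray) -> bool:
    """Second method: judge the event tap-like if both IMUs saw similar
    vibration at the same time.

    local_norm / remote_norm: acceleration-norm windows of equal length,
    one from each terminal, covering the moment of detection.
    """
    a = local_norm - local_norm.mean()
    b = remote_norm - remote_norm.mean()
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    if denom == 0.0:
        return False
    similarity = float(np.dot(a, b)) / denom  # normalized cross-correlation
    # High similarity means both terminals vibrated together, which is
    # characteristic of walking or chewing rather than a one-sided tap.
    return similarity > SIMILARITY_THRESHOLD
```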
  • FIG. 9 is a diagram showing an example of vibration caused by an action other than the tap operation, namely the waveform of the vibration generated by the user chewing during a meal.
  • The vertical axis represents acceleration and the horizontal axis represents time (the same applies to the graph of FIG. 10 described later). A predetermined band-pass filter is applied to the waveforms shown in FIG. 9.
  • A in FIG. 9 shows the waveform of the sensor data of the IMU 121L mounted on the left ear terminal 10L, and B in FIG. 9 shows the waveform of the sensor data of the IMU 121R mounted on the right ear terminal 10R.
  • FIG. 10 is a diagram showing another example of vibration caused by an action other than the tap operation, namely the waveform of the vibration generated by the user walking. A predetermined band-pass filter is applied to the waveforms shown in FIG. 10.
  • A in FIG. 10 shows the waveform of the sensor data of the IMU 121L mounted on the left ear terminal 10L, and B in FIG. 10 shows the waveform of the sensor data of the IMU 121R mounted on the right ear terminal 10R.
  • Because the earphone 10 can detect the user's chewing or walking in the sensor data of both terminals, it can distinguish the tap operation, which, unlike chewing or walking, is performed on only one of the left ear side and the right ear side.
  • As described above, in the earphone 10, the tap operation by the user is detected based on the information of the sensors mounted on each of the left ear terminal 10L and the right ear terminal 10R, and the detection of a tap caused by an action other than the tap operation is canceled after the operation recognition sound is reproduced.
  • Since the detection of taps caused by actions other than the tap operation is canceled based on the information of the sensors mounted on both terminals, the earphone 10 can detect the tap operation more accurately than when the detection is based on the information of the sensor mounted on only one terminal.
  • If sensor information were constantly exchanged between the terminals, the band available for transmitting music data would be narrowed, and sound interruption would occur when the radio wave condition is poor. In the earphone 10, however, the sensor information needs to be transmitted only when a tap is detected. By transmitting the sensor information only at such times and otherwise giving priority to the transmission of music data between the terminals, the sound interruption caused by transmitting sensor information can be reduced.
  • Since the operation recognition sound is reproduced immediately, the earphone 10 can quickly provide feedback on the user's operation, and the user can immediately confirm that the operation has been detected.
  • By hearing the cancel sound, the user can confirm that a false detection of a tap caused by a movement not intended as a tap has been canceled.
  • A cancel operation by the user may also be accepted between the time the operation recognition sound is played and the time the function execution sound is played. In that case, by performing the cancel operation before the function execution sound is played, the user can cancel the tap detection by the earphone 10.
  • FIG. 11 is a diagram showing an example of how the earphone 10 is used.
  • In normal use, the left ear terminal 10L is attached to the left ear of the user U and the right ear terminal 10R is attached to the right ear of the user U.
  • The right ear terminal 10R can also be used alone, attached to the right ear of the user U. When only one terminal is used in this way, the operation of the right ear terminal 10R is changed.
  • Specifically, the acceleration threshold at which a tap is judged to have been detected is set to a larger value. This makes it possible to reduce erroneous detections in which an action other than the tap operation is detected as a tap, even when the right ear terminal 10R is used alone.
  • The threshold may also be applied to an evaluation value obtained by performing various arithmetic processing or machine learning processing on the value of the acceleration sensor, as sketched below.
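  • A minimal sketch of this single-ear threshold switch follows; both numeric values are assumptions, since the publication gives no concrete thresholds.

```python
# When only one terminal is worn, the cross-terminal noise check is
# unavailable, so the tap threshold is raised.  Values are assumed.
TAP_THRESHOLD_BOTH_EARS = 3.0   # g
TAP_THRESHOLD_SINGLE_EAR = 4.5  # g


def tap_threshold(both_terminals_in_use: bool) -> float:
    """Return the acceleration (or evaluation value) threshold for taps."""
    if both_terminals_in_use:
        return TAP_THRESHOLD_BOTH_EARS
    return TAP_THRESHOLD_SINGLE_EAR
```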
  • The function assigned to the tap event in the right ear terminal 10R and the function assigned to the tap event in the left ear terminal 10L can be different functions. Furthermore, when the right ear terminal 10R is used alone, a function different from the one used when both terminals are worn may be assigned to its tap event.
  • <Second Embodiment> When a tap is detected using the sensor data of the IMU 121, the detection may be affected by noise, such as detecting an action other than the user's tap operation or detecting vibration generated by playing music.
  • Noise can be reduced by subtracting the acceleration value of the noise from the sensor data, or by excluding the timing at which the noise occurs from the tap detection period.
  • By reducing the noise, the earphone 10 can improve the accuracy of detecting the user's tap operation.
  • Two examples of reducing noise can be considered.
  • First example: recording the time variation of the noise caused by walking or chewing, and canceling that noise when a tap is detected.
  • Second example: predicting the vibration from the sound signal being reproduced, and canceling the predicted vibration.
  • FIG. 12 is a block diagram showing a functional configuration example of the earphone 10 in the first example.
  • The information processing unit 161 is realized by the CPU 101 of FIG. 2 executing a predetermined program.
  • The configuration shown in FIG. 12 can be provided in each of the left ear terminal 10L and the right ear terminal 10R, or in only one of them. The same applies to FIG. 18 described later.
  • The information processing unit 161 is composed of a tap detection unit 171 and an execution unit 172.
  • The tap detection unit 171 detects a tap based on the current sensor data of the IMU 121 and past sensor data. The tap detection unit 171 includes an acquisition unit 181, a calculation unit 182, and a determination unit 183.
  • The acquisition unit 181 acquires the sensor data from the IMU 121 and detects vibration peaks based on the sensor data. The acquisition unit 181 also acquires the history of vibration peaks detected in the past. For example, information representing the peaks previously detected by the acquisition unit 181 is stored as a history in the storage unit 109 of FIG. 2 and read out by the acquisition unit 181.
  • The calculation unit 182 calculates the intervals between peaks based on the history and the sensor data acquired by the acquisition unit 181.
  • The determination unit 183 determines whether to recognize the peak detected by the acquisition unit 181 as a tap based on the calculation result of the calculation unit 182. That is, the determination unit 183 functions as a recognition unit that recognizes taps based on the calculation result of the calculation unit 182.
  • The execution unit 172 executes processing according to the event in which a tap is detected by the tap detection unit 171.
  • The waveform of the vibration generated by the user's tap operation may resemble the waveform of the vibration generated when a walking user's foot lands.
  • The vibration waveform generated by the landing depends on the shoes worn by the user and the way the user walks. For example, when a user wearing shoes with hard soles such as high heels lands forcefully, a vibration similar to the one generated by the tap operation is produced.
  • FIG. 13 is a diagram showing an example of the vibration waveform generated by landing during walking.
  • The vibration generated by landing during walking is periodic. As shown in FIG. 13, its waveform is the waveform W11, which shows a peak at a constant period T.
  • The earphone 10 can therefore predict the landing timing during walking. Although walking habits differ from person to person, the earphone 10 can predict the next vibration based on the vibration detected in the past.
  • The earphone 10 records a periodic vibration pattern such as walking and predicts the timing of the next landing based on that pattern. Vibration detected around the time predicted to be a landing is not recognized as a tap.
  • FIG. 14 is a diagram showing an example of the waveform of the sensor data of the IMU 121 when the vibration generated by landing during walking is detected.
  • The solid line portion of the waveform W12 in FIG. 14 represents the vibration detected in the past by the IMU 121.
  • The earphone 10 applies a low-pass filter to the acceleration values serving as the sensor data of the IMU 121 and calculates the history of peaks within a certain period. The time at which the norm of the acceleration exceeds a predetermined threshold and reaches its maximum within the period is recorded as the peak timing.
  • In the example of FIG. 14, three peaks are detected: the interval from the first peak to the second peak is T0, and the interval from the second peak to the third peak is T1.
  • When a peak Pa that may be a tap is newly detected, the calculation unit 182 calculates the interval between the peak Pa and the nearest preceding peak. The broken line portion of the waveform W12 in FIG. 14 represents the vibration including the newly detected peak Pa; here, the interval between the peak Pa and the nearest peak is calculated to be Ta.
  • When the interval Ta is close to the intervals T0 and T1 of the peaks detected in the past, the determination unit 183 does not recognize the peak Pa as a tap but recognizes it as noise.
  • FIG. 15 is a diagram showing the waveform of the sensor data of the IMU 121 when the vibration generated by a tap operation performed after walking is detected.
  • The solid line portion of the waveform W13 in FIG. 15 represents the vibration detected in the past by the IMU 121, and the peak history is calculated in the same manner as for the waveform W12 of FIG. 14.
  • When a peak Pb that may be a tap is newly detected, the calculation unit 182 calculates the interval between the peak Pb and the nearest preceding peak. The broken line portion of the waveform W13 in FIG. 15 represents the vibration including the newly detected peak Pb; here, the interval is calculated to be Tb.
  • Since the interval Tb differs from the intervals of the peaks detected in the past, the determination unit 183 recognizes that the vibration including the peak Pb is likely to be a tap.
  • In this way, the tap detection unit 171 determines whether to recognize the vibration including a newly detected peak as a tap based on the intervals between the peaks detected in the past and the interval of the newly detected peak.
  • In step S201, the acquisition unit 181 acquires the history of vibration peaks detected in the past.
  • In step S202, the calculation unit 182 calculates the average T_ave of the intervals between the peaks of the vibrations detected in the past.
  • In step S203, the acquisition unit 181 detects a new peak Pa.
  • In step S204, the calculation unit 182 calculates the interval T_new between the peak Pa and the peak immediately before it.
  • In step S205, the determination unit 183 determines whether the difference between the average interval T_ave and the interval T_new is equal to or less than a threshold value (THRES).
  • If it is determined in step S205 that the difference between T_ave and T_new is equal to or less than the threshold, the process proceeds to step S206.
  • In step S206, the determination unit 183 does not recognize the vibration including the peak Pa as a tap but recognizes it as noise. The process then ends.
  • If it is determined in step S205 that the difference between T_ave and T_new exceeds the threshold, the process proceeds to step S207.
  • In step S207, the determination unit 183 recognizes the vibration including the peak Pa as a vibration that may be a tap, and the tap detection unit 171 subsequently executes the tap detection algorithm, detecting the tap based on the sensor data of the IMU 121. The execution unit 172 then executes the processing corresponding to the event in which the tap was detected by the tap detection unit 171.
  • As described above, whether the vibration including a newly detected peak was generated by the tap operation is recognized based on the intervals between the peaks of the vibrations detected in the past and the interval of the newly detected peak. If the vibration including the newly detected peak is recognized as noise, it is not detected as a tap.
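  • A sketch of this interval-based rejection (steps S201 through S206) follows; the THRES value and the exact peak representation are assumptions.

```python
import numpy as np

INTERVAL_THRES_S = 0.08  # assumed THRES value; not given in the publication


def is_walking_noise(peak_history_s: list, new_peak_s: float) -> bool:
    """Reject a new peak whose spacing matches the walking period.

    peak_history_s: timestamps (seconds) of previously detected peaks.
    new_peak_s: timestamp of the newly detected peak Pa (step S203).
    """
    if len(peak_history_s) < 2:
        return False  # no periodicity to compare against yet
    intervals = np.diff(peak_history_s)
    t_ave = float(intervals.mean())            # step S202
    t_new = new_peak_s - peak_history_s[-1]    # step S204
    # Step S205: a spacing close to the historical average looks like
    # another footstep, so the peak is treated as noise (step S206).
    return abs(t_ave - t_new) <= INTERVAL_THRES_S
```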
  • The tap detection unit 171 of FIG. 12 may be provided in the information processing unit 141 instead of the one-ear tap detection unit 151 of FIG. 3.
  • In this case, whether the event in which the tap detection unit 171 detected a tap is a tap event is determined by the tap determination unit 154: after the process of step S207 described above, the processing described with reference to FIGS. 6 and 8 is performed to determine whether the event is a tap event.
  • At this time, the event information or the sensor data need not be transmitted to the other terminal; instead, information indicating that vibration generated by what appears to be walking has been detected may be transmitted to the other terminal. As a result, the amount of data communicated between the left ear terminal 10L and the right ear terminal 10R can be reduced.
  • Alternatively, the vibration generated by a single landing during walking may be learned by a method such as machine learning, and whether a newly detected vibration peak is a peak generated by landing during walking may be recognized based on the learning result.
  • The sensor data used as teacher data may be labeled with the user's behavior detected by a sensor other than the IMU 121. For example, when the user's walking is detected, the sensor data for that period is labeled as walking sensor data.
  • The noise generated by walking may also be observed constantly by the earphone 10 using the sensor data of the IMU 121. The earphone 10 can record the sensor data of the IMU 121 as it is, or record only the acceleration values exceeding a threshold.
  • The result of learning the noise based on the recorded sensor data and the acceleration values exceeding the threshold is used in the gesture recognition process that detects taps.
  • The sensor data and the acceleration values exceeding the threshold may be uploaded to a device connected to the earphone 10 (such as a smartphone) or to a server provided as a cloud. In that case, the earphone 10 acquires the result of the noise learning performed externally and performs the gesture recognition process using that result.
  • The noise learning process of FIG. 17 is performed before the gesture recognition process.
  • In step S301 of the noise learning process, the acquisition unit 181 (FIG. 12) acquires the noise generated by walking. Specifically, the history of the noise detected and recorded in the past is acquired by the acquisition unit 181.
  • In step S302, the calculation unit 182 calculates a noise removal filter that removes the noise caused by walking based on the noise history acquired in step S301. The noise removal filter calculated in step S302 is used in step S352 of the gesture recognition process.
  • In step S351 of the gesture recognition process, the acquisition unit 181 acquires the sensor data of the IMU 121.
  • In step S352, the calculation unit 182 applies the noise removal filter to the sensor data of the IMU 121 acquired in step S351 and corrects the sensor data.
  • In step S353, the tap detection unit 171 executes the tap detection algorithm using the corrected sensor data.
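  • The publication does not specify the form of the noise removal filter. As one possible realization, the sketch below averages recorded walking-noise segments into a template (step S302) and subtracts it from incoming sensor data (step S352); the alignment and equal-length assumptions are illustrative.

```python
import numpy as np


def learn_noise_template(noise_segments: list) -> np.ndarray:
    """Step S302 (one possible realization): average recorded walking-noise
    segments into a single template.  Segments are assumed to be
    acceleration-norm windows of equal length, aligned on the landing peak.
    """
    return np.mean(np.stack(noise_segments), axis=0)


def apply_noise_removal(sensor_window: np.ndarray,
                        template: np.ndarray) -> np.ndarray:
    """Step S352: correct the sensor data by subtracting the learned
    template before the tap detection algorithm runs (step S353)."""
    return sensor_window - template
```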
  • FIG. 18 is a block diagram showing a functional configuration example of the earphone 10 in the second example.
  • In FIG. 18, the same components as those of the earphone 10 in FIG. 12 are designated by the same reference numerals, and duplicate explanations are omitted as appropriate.
  • The configuration of the information processing unit 161 shown in FIG. 18 differs from the configuration described with reference to FIG. 12 in that the tap detection unit 171 has a correction unit 185 instead of the calculation unit 182.
  • The acquisition unit 181 additionally acquires the music reproduction signal as the music data to be reproduced.
  • The determination unit 183 determines whether the power of the music reproduction signal acquired by the acquisition unit 181 is equal to or greater than a threshold value.
  • The correction unit 185 corrects the sensor data acquired by the acquisition unit 181 based on the music reproduction signal. Specifically, the correction unit 185 predicts the vibration generated by reproducing the music reproduction signal and corrects the sensor data by subtracting the acceleration value of the predicted vibration from the sensor data.
  • The tap detection unit 171 detects a tap based on the corrected sensor data. That is, the tap detection unit 171 functions as a recognition unit that recognizes taps based on the sensor data corrected by the correction unit 185.
  • FIG. 19 is a diagram schematically showing an example of the structure of the earphone 10. The left ear terminal 10L and the right ear terminal 10R have the same structure.
  • The diaphragm 191 provided in the earphone 10 vibrates to generate sound. Since the IMU 121 is provided in the vicinity of the diaphragm 191, the vibration of the diaphragm 191 may be detected by the IMU 121. Depending on the sound, the diaphragm 191 may generate vibration similar to the vibration generated by the tap operation.
  • FIG. 20 is a diagram showing an example of the waveform of a music reproduction signal. The vertical axis represents the power of the music reproduction signal, and the horizontal axis represents time.
  • Based on a music reproduction signal such as the waveform W21 of FIG. 20, the determination unit 183 predicts the peaks of the vibration generated by reproducing the signal. During the period around the peak timing, the tap detection unit 171 refrains from detecting taps. Alternatively, the correction unit 185 corrects the sensor data based on the predicted vibration.
  • That is, there are a first method of not detecting taps in the period around the timing at which a peak of the music reproduction signal is reproduced, and a second method of correcting the sensor data based on the music reproduction signal. The flow of each process is explained below.
  • In step S401, the acquisition unit 181 acquires the acceleration values as the sensor data of the IMU 121.
  • In step S402, the acquisition unit 181 acquires the music reproduction signal to be reproduced by the earphone 10.
  • In step S403, the determination unit 183 determines whether the power of the music reproduction signal over a certain past period is equal to or greater than the threshold value. For example, the determination unit 183 uses the power of the music reproduction signal reproduced during a certain period up to the timing at which the sensor data of the IMU 121 was acquired.
  • If it is determined in step S403 that the power of the music reproduction signal over the certain past period is equal to or greater than the threshold, the process ends; that is, tap detection by the tap detection unit 171 is suppressed (not performed).
  • If it is determined in step S403 that the power of the music reproduction signal over the certain past period is less than the threshold, the process proceeds to step S404, in which the information processing unit 161 executes the tap detection algorithm described above.
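  • A sketch of this first method follows. The power definition (mean square over the recent window) and the POWER_THRESHOLD value are assumptions; detect_tap stands in for the tap detection algorithm described above.

```python
import numpy as np

POWER_THRESHOLD = 0.1  # assumed threshold on mean-square signal power


def music_power(signal_window: np.ndarray) -> float:
    """Mean-square power of the recently reproduced music signal
    (one simple definition; the publication gives no formula)."""
    return float(np.mean(signal_window ** 2))


def run_tap_detection_gated(signal_window: np.ndarray,
                            accel_window: np.ndarray,
                            detect_tap) -> bool:
    """First method (steps S401 to S404): skip tap detection while loud
    music is playing, since the diaphragm vibration near the IMU could
    be mistaken for a tap."""
    if music_power(signal_window) >= POWER_THRESHOLD:  # step S403
        return False  # tap detection suppressed
    return detect_tap(accel_window)                    # step S404
```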
  • In step S451, the acquisition unit 181 acquires the acceleration values as the sensor data of the IMU 121.
  • In step S452, the acquisition unit 181 acquires the music reproduction signal to be reproduced by the earphone 10.
  • In step S453, the determination unit 183 determines whether the power of the music reproduction signal is equal to or greater than the threshold value. For example, the determination unit 183 uses the power of the music reproduction signal reproduced at the timing when the sensor data of the IMU 121 was acquired.
  • If it is determined in step S453 that the power of the music reproduction signal is equal to or greater than the threshold, the process proceeds to step S454, in which the correction unit 185 corrects the sensor data by subtracting from the acceleration values the acceleration values of the vibration predicted to be generated by reproducing the music reproduction signal.
  • If it is determined in step S453 that the power of the music reproduction signal is less than the threshold, the process of step S454 is skipped.
  • In step S455, the tap detection unit 171 executes the tap detection algorithm described above.
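  • A sketch of the second method follows. How the vibration is predicted from the music reproduction signal is not specified in the publication; the linear COUPLING_GAIN model below is purely an illustrative assumption.

```python
import numpy as np

COUPLING_GAIN = 0.02    # assumed signal-to-acceleration coupling factor
POWER_THRESHOLD = 0.1   # same assumed threshold as in the first method


def correct_sensor_data(accel_norm: np.ndarray,
                        signal_window: np.ndarray) -> np.ndarray:
    """Second method (steps S453 and S454): subtract the vibration
    predicted from the music reproduction signal before tap detection.

    accel_norm and signal_window are assumed to be time-aligned arrays
    of equal length.
    """
    if float(np.mean(signal_window ** 2)) >= POWER_THRESHOLD:  # step S453
        predicted_vibration = COUPLING_GAIN * np.abs(signal_window)
        return accel_norm - predicted_vibration                # step S454
    return accel_norm  # weak music: no correction needed
```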
  • As described above, the vibration generated by reproducing the music reproduction signal is predicted, and based on the predicted vibration, tap detection is suppressed or the sensor data of the IMU 121 is corrected.
  • As a result, the earphone 10 does not erroneously detect the vibration generated by playing music as a tap, and can reliably detect as a tap the vibration of a tap operation whose peaks differ from the vibration of the output sound.
  • FIG. 23 is a block diagram showing another hardware configuration example of the earphone 10.
  • In FIG. 23, the same components as those of the earphone 10 in FIG. 2 are designated by the same reference numerals, and duplicate explanations are omitted as appropriate.
  • The configuration of the earphone 10 shown in FIG. 23 differs from the configuration described with reference to FIG. 2 in that the sensor unit 107L of the left ear terminal 10L is provided with an electrostatic sensor 201L and the sensor unit 107R of the right ear terminal 10R is provided with an electrostatic sensor 201R.
  • The electrostatic sensors 201L and 201R are composed of, for example, sensors of the XY-coordinate detection type or the electrostatic button detection type, and output signals corresponding to the user's contact with them as sensor data.
  • FIG. 24 is a block diagram showing a functional configuration example of the earphone 10.
  • The information processing unit 211 is realized by the CPU 101 of FIG. 23 executing a predetermined program.
  • The configuration shown in FIG. 24 can be provided in each of the left ear terminal 10L and the right ear terminal 10R, or in only one of them.
  • The information processing unit 211 is composed of a tap detection unit 221, a tap determination unit 222, and an execution unit 223.
  • The tap detection unit 221 acquires sensor data from the sensor unit 107 and detects the main body tap and the face tap based on the sensor data of the IMU 121. The tap detection unit 221 also detects the main body tap based on the sensor data of the electrostatic sensor 201. The main body tap and the face tap are described later.
  • The tap detection result based on the sensor data of the IMU 121 and the tap detection result based on the sensor data of the electrostatic sensor 201 are supplied to the tap determination unit 222.
  • The tap determination unit 222 distinguishes between the main body tap and the face tap based on the detection results supplied from the tap detection unit 221 and, based on the identification result, controls the execution unit 223 to execute the function assigned to the main body tap or the function assigned to the face tap.
  • The execution unit 223 executes the function assigned to the main body tap or the function assigned to the face tap under the control of the tap determination unit 222.
  • FIG. 25 is a diagram showing an example of the main body tap and the face tap.
  • The main body tap means that the user taps the housing of the earphone 10. A in FIG. 25 shows a situation in which the user U is tapping the housing of the left ear terminal 10L attached to the left ear.
  • The main body tap is detected based on the sensor data of the electrostatic sensor 201 installed in the area A11 on part of the housing of the earphone 10. Since the electrostatic sensor 201 is installed in only part of the housing, the user U may tap a part other than the area A11 when attempting a main body tap. Therefore, the main body tap is detected using the sensor data of the IMU 121 together with the sensor data of the electrostatic sensor 201.
  • The face tap means that the user taps around the ear where the earphone 10 is worn. B in FIG. 25 shows a situation in which the user U is tapping around the left ear to which the left ear terminal 10L is attached. The area A12 around the left ear is defined as the area where a face tap can be detected.
  • FIG. 26 is a diagram showing the distributions of the tap intensities of the main body tap and the face tap. The vertical axis represents frequency and the horizontal axis represents tap intensity. The waveform W31 represents the distribution of the tap intensity of the main body tap, and the waveform W32 represents that of the face tap. The tap intensity represents the intensity of the vibration detected by the IMU 121 in response to a tap.
  • As shown in FIG. 26, the tap intensity of the main body tap is high and that of the face tap is low. Accordingly, when a strong vibration is detected by the IMU 121, the earphone 10 judges that a main body tap has been performed; when a weak vibration is detected, it judges that a face tap has been performed.
  • In addition, the earphone 10 can identify the main body tap based on the sensor data of the electrostatic sensor 201.
  • The process of FIG. 27 is started, for example, when a tap-like vibration is detected by the IMU 121.
  • In step S501, the tap detection unit 221 acquires the acceleration values as sensor data from the IMU 121 and detects a main body tap or a face tap based on the sensor data of the IMU 121.
  • In step S502, the tap detection unit 221 acquires the sensor data from the electrostatic sensor 201 and detects a main body tap based on the sensor data of the electrostatic sensor 201.
  • In step S503, the tap determination unit 222 determines whether a tap has been detected by the electrostatic sensor 201. For example, it is determined that a tap has been detected when the user's contact is detected by the electrostatic sensor 201.
  • If it is determined in step S503 that a tap has been detected by the electrostatic sensor 201, the process proceeds to step S504.
  • In step S504, the tap determination unit 222 determines that a main body tap has been performed by the user. The execution unit 223 then executes the function assigned to the main body tap.
  • After it is determined in step S503 that a tap has been detected by the electrostatic sensor 201, it may additionally be determined whether a main body tap has been detected based on the sensor data of the IMU 121. In that case, the process of step S504 is performed only when it is determined that a main body tap has been detected based on the sensor data of the IMU 121; otherwise, it is determined that the electrostatic sensor 201 detected noise, and the process ends. In this way, when the electrostatic sensor 201 detects contact by the user's hair, for example, the contact can be prevented from being judged as a main body tap.
  • If it is determined in step S503 that no tap has been detected by the electrostatic sensor 201, the process proceeds to step S505.
  • In step S505, the tap determination unit 222 determines whether a main body tap has been detected by the IMU 121. For example, when the tap intensity measured based on the sensor data of the IMU 121 is higher than a predetermined threshold, it is determined that a main body tap has been detected by the IMU 121.
  • If it is determined in step S505 that a main body tap has been detected by the IMU 121, the process proceeds to step S504: it is determined that a main body tap has been performed by the user as described above, and the function assigned to the main body tap is executed.
  • If it is determined in step S505 that no main body tap has been detected by the IMU 121, the process proceeds to step S506.
  • In step S506, the tap determination unit 222 determines whether a face tap has been detected by the IMU 121. For example, when the tap intensity measured based on the sensor data of the IMU 121 is lower than the predetermined threshold, it is determined that a face tap has been detected by the IMU 121.
  • If it is determined in step S506 that no face tap has been detected by the IMU 121, for example because no vibration was detected, the process ends.
  • If it is determined in step S506 that a face tap has been detected by the IMU 121, the process proceeds to step S507.
  • In step S507, the tap determination unit 222 determines that a face tap has been performed by the user. The execution unit 223 then executes the function assigned to the face tap.
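  • The branching of steps S503 through S507 can be summarized in a short sketch; the intensity threshold and helper names are assumptions, not values from the publication.

```python
from enum import Enum, auto

BODY_TAP_INTENSITY = 3.0  # assumed threshold separating body and face taps


class TapKind(Enum):
    BODY = auto()
    FACE = auto()
    NONE = auto()


def classify_tap(electrostatic_touched: bool, imu_intensity: float) -> TapKind:
    """Combine the electrostatic sensor 201 and the IMU 121.

    electrostatic_touched: contact detected by the electrostatic sensor.
    imu_intensity: tap intensity measured from the IMU sensor data
    (0.0 if no vibration was detected).
    """
    if electrostatic_touched:                        # step S503
        return TapKind.BODY                          # step S504
    if imu_intensity > BODY_TAP_INTENSITY:           # step S505
        return TapKind.BODY                          # strong vibration
    if 0.0 < imu_intensity <= BODY_TAP_INTENSITY:    # step S506
        return TapKind.FACE                          # weak vibration: face tap
    return TapKind.NONE                              # nothing detected
```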
  • As described above, the main body tap and the face tap are identified based on the sensor data of the electrostatic sensor 201 and the sensor data of the IMU 121.
  • The earphone 10 can thus identify a main body tap based on the sensor data of the IMU 121 even when the user touches an area other than the one where the electrostatic sensor 201 is installed, and can identify a main body tap based on the sensor data of the electrostatic sensor 201 even when the user taps the area where the electrostatic sensor 201 is installed with a weak force.
  • An image representing the above-mentioned information may be displayed on the screen of an external device such as a smartphone, and vibration or sound representing the information may be output from an external device. The sound, vibration, and image representing the information may also be output in combination.
  • The information processing unit 141 in the first embodiment, the information processing unit 161 in the second embodiment, and the information processing unit 211 in the third embodiment have been described as being provided in the earphone 10, but part or all of these information processing units may be provided in a device external to the earphone 10.
  • For example, the tap determination unit 154 may be provided in a smartphone connected to the earphone 10 by wireless or wired communication. In this case, the smartphone determines whether the event in which a tap was detected by the one-ear tap detection unit 151 is a tap event, and the determination result is transmitted to the earphone 10.
  • The series of processes described above can be executed by hardware or by software. When it is executed by software, the programs constituting the software are installed on a computer embedded in dedicated hardware, a general-purpose personal computer, or the like.
  • The installed program is provided by being recorded on removable media such as an optical disc (CD-ROM (Compact Disc-Read Only Memory), DVD (Digital Versatile Disc), etc.) or semiconductor memory, or via a wired or wireless transmission medium such as a local area network, the Internet, or digital broadcasting. The program can also be installed in advance in the ROM 102 or the storage unit 109 shown in FIGS. 3 and 24.
  • The program executed by the computer may be a program in which the processing is performed in chronological order following the sequence described in this specification, or a program in which the processing is performed in parallel or at necessary timings, such as when a call is made.
  • The present technology can also take a cloud computing configuration in which one function is shared and processed jointly by multiple devices via a network.
  • Each step described in the above flowcharts can be executed by one device or shared among multiple devices. Likewise, when one step includes multiple processes, those processes can be executed by one device or shared among multiple devices.
The present technology can also have the following configurations.

(1)
An information processing device including two terminals equipped with sensors that detect vibration including vibration representing a user's operation and noise, wherein one of the terminals includes:
a receiving unit that receives a detection result by a first sensor mounted on the other terminal; and
a determination unit that determines, based on the detection result received by the receiving unit, whether or not a detection result by a second sensor mounted on the one terminal is noise.

(2)
The information processing device according to (1) above, wherein the one terminal further includes an execution unit that executes processing according to the detection result by the second sensor when the determination unit determines that the detection result by the second sensor is not noise.

(3)
The information processing device according to (2) above, wherein the one terminal further includes a control unit that performs control to notify that the detection result by the second sensor is noise.

(4)
The information processing device according to (3) above, wherein the control unit performs control to output at least one of a sound, a vibration, and an image notifying that the detection result by the second sensor is noise.

(5)
The information processing device according to (3) or (4) above, wherein the control unit performs control to notify that the detection result by the second sensor is noise at a timing between the timing when the vibration is detected by the second sensor and the timing when the processing is executed by the execution unit.

(6)
The information processing device according to any one of (3) to (5) above, wherein the control unit further performs control to output a sound at the timing when the vibration is detected by the second sensor.

(7)
The information processing device according to any one of (3) to (6) above, wherein the control unit further performs control to output a sound at the timing when the processing is executed by the execution unit.

(8)
The information processing device according to any one of (1) to (7) above, wherein the determination unit calculates a degree of similarity between the detection result by the first sensor and the detection result by the second sensor, and determines whether or not the detection result by the second sensor is noise based on the calculated degree of similarity.

(9)
The information processing device according to (8) above, wherein the determination unit determines that the detection result by the second sensor is noise when the degree of similarity is higher than a predetermined threshold value.

(10)
The information processing device according to any one of (1) to (9) above, wherein the receiving unit receives event information indicating that the user's operation has been detected according to the sensor data of the first sensor, and the determination unit determines, based on the event information received by the receiving unit, whether or not the detection of the user's operation according to the sensor data of the second sensor is valid.

(11)
The information processing device according to any one of (1) to (9) above, wherein the receiving unit receives the sensor data of the first sensor as the detection result by the first sensor, and the determination unit determines whether or not the sensor data of the second sensor is noise based on the sensor data received by the receiving unit.

(12)
The information processing device according to any one of (1) to (11) above, further including a sound output unit that outputs a sound corresponding to a sound signal according to the operation of the user.

(13)
An information processing method for an information processing device including two terminals equipped with sensors that detect vibration, wherein one of the terminals:
receives a detection result by a first sensor mounted on the other terminal; and
determines, based on the received detection result, whether or not a detection result by a second sensor mounted on the one terminal is noise.

(14)
An information processing device including:
a sensor unit that detects vibration including vibration representing a user's operation and noise; and
a recognition unit that recognizes the user's operation based on a detection result by the sensor unit and a prediction result of noise detected by the sensor unit.

(15)
The information processing device according to (14) above, wherein the recognition unit recognizes the user's operation based on a result of comparing the peak interval of the vibration detected by the sensor unit with a noise period predicted based on periodic vibration detected in the past.

(16)
The information processing device according to (14) above, wherein the recognition unit corrects the detection result by the sensor unit based on a learning result of the noise detected by the sensor unit, and recognizes the user's operation based on the corrected detection result.

(17)
The information processing device according to any one of (14) to (16) above, further including a sound output unit that outputs a sound based on a sound signal, wherein the recognition unit recognizes the user's operation based on the detection result by the sensor unit and a prediction result of noise generated by the reproduction of the sound signal by the sound output unit.

(18)
The information processing device according to (17) above, wherein the recognition unit determines whether or not the detection result by the sensor unit is noise based on the vibration generated by the sound output from the sound output unit, and recognizes the user's operation based on the detection result determined not to be noise.

(19)
The information processing device according to (17) above, wherein the recognition unit corrects the detection result by the sensor unit based on the vibration generated by the sound output from the sound output unit, and recognizes the user's operation based on the corrected detection result.

(20)
The information processing device according to any one of (14) to (19) above, including two terminals on which the sensor unit is mounted, or one terminal on which the sensor unit is mounted, wherein the terminal has the recognition unit.

(21)
An information processing device including:
a contact sensor that detects contact by a user;
a vibration sensor that detects vibration; and
a determination unit that determines whether or not the user has touched a housing based on a detection result by the contact sensor and a detection result by the vibration sensor.

(25)
The information processing device according to (24) above, further including an execution unit that executes the above-mentioned function.

(26)
The information processing device according to any one of (21) to (25) above, including two terminals equipped with the contact sensor and the vibration sensor, or one terminal equipped with the contact sensor and the vibration sensor, wherein the terminal has the determination unit, and the housing is a housing of the terminal.

Abstract

The present technology relates to an information processing device and an information processing method for making it possible to identify a vibration generated by an operation by a user. The information processing device of the present technology is provided with two terminals equipped with sensors for detecting vibration that includes vibration representing an operation by a user and noise, wherein one terminal comprises a reception unit for receiving a result of detection by a first sensor with which the other terminal is equipped, and a determination unit for determining, on the basis of the result of detection received by the reception unit, whether the result of detection by a second sensor with which the one terminal is equipped is noise. The present technology may be applied, for example, to true wireless earbuds.

Description

Information processing device and information processing method
The present technology relates to an information processing device and an information processing method, and more particularly to an information processing device and an information processing method capable of identifying vibration generated by a user's operation.

In recent years, users have been able to operate earphones by various operation methods. For example, Patent Document 1 describes a small terminal that a user can operate by gesture, voice, and button operation.

Japanese Unexamined Patent Publication No. 2017-207890

When an acceleration sensor is mounted on an earphone, vibration generated by the user tapping around the earphone can be detected. The earphone recognizes the user's operation when the vibration is detected by the acceleration sensor.

Meanwhile, the vibration generated by the user tapping around the earphone may resemble noise such as vibration caused by the user walking or chewing, or vibration of the sound output from the earphone. Consequently, when the user's operation is recognized by the acceleration sensor mounted on the earphone detecting vibration, it has been difficult to distinguish the vibration generated by the user's operation from noise.

The present technology has been made in view of such a situation, and makes it possible to identify vibration generated by a user's operation.

An information processing device according to a first aspect of the present technology includes two terminals equipped with sensors that detect vibration including vibration representing a user's operation and noise, wherein one of the terminals has a receiving unit that receives a detection result by a first sensor mounted on the other terminal, and a determination unit that determines, based on the detection result received by the receiving unit, whether or not a detection result by a second sensor mounted on the one terminal is noise.

An information processing method according to the first aspect of the present technology is an information processing method for an information processing device including two terminals equipped with sensors that detect vibration, wherein one of the terminals receives a detection result by a first sensor mounted on the other terminal, and determines, based on the received detection result, whether or not a detection result by a second sensor mounted on the one terminal is noise.

An information processing device according to a second aspect of the present technology includes a sensor unit that detects vibration including vibration representing a user's operation and noise, and a recognition unit that recognizes the user's operation based on a detection result by the sensor unit and a prediction result of noise detected by the sensor unit.

In the first aspect of the present technology, in one of two terminals equipped with sensors that detect vibration including vibration representing a user's operation and noise, a detection result by a first sensor mounted on the other terminal is received, and based on the received detection result, it is determined whether or not a detection result by a second sensor mounted on the one terminal is noise.

In the second aspect of the present technology, a user's operation is recognized based on a detection result by a sensor unit that detects vibration including vibration representing the user's operation and noise, and a prediction result of noise detected by the sensor unit.
FIG. 1 is a diagram showing an example of the appearance of an earphone according to an embodiment of the present technology.
FIG. 2 is a block diagram showing a hardware configuration example of the earphone.
FIG. 3 is a block diagram showing a functional configuration example of the earphone.
FIG. 4 is a diagram showing an example of the playback timing of sound effects.
FIG. 5 is a diagram showing the flow of information in a first method.
FIG. 6 is a flowchart explaining the flow of processing executed by the earphone.
FIG. 7 is a diagram showing the flow of information in a second method.
FIG. 8 is a flowchart explaining the flow of processing executed by the earphone.
FIG. 9 is a diagram showing an example of vibration caused by a movement other than a tap operation.
FIG. 10 is a diagram showing another example of vibration caused by a movement other than a tap operation.
FIG. 11 is a diagram showing an example of how the earphone is used.
FIG. 12 is a block diagram showing a functional configuration example of the earphone in a first example.
FIG. 13 is a diagram showing an example of the waveform of vibration caused by landing during walking.
FIG. 14 is a diagram showing an example of the waveform of IMU sensor data when vibration caused by landing during walking is detected.
FIG. 15 is a diagram showing the waveform of IMU sensor data when vibration caused by a tap operation performed after walking is detected.
FIG. 16 is a flowchart explaining the flow of processing executed by an information processing unit.
FIG. 17 is a sequence diagram explaining the flow of noise learning processing and gesture recognition processing executed by the information processing unit.
FIG. 18 is a block diagram showing a functional configuration example of the earphone in a second example.
FIG. 19 is a diagram schematically showing an example of the structure of the earphone.
FIG. 20 is a diagram showing an example of the waveform of a music playback signal.
FIG. 21 is a flowchart explaining the flow of processing executed by the information processing unit.
FIG. 22 is a flowchart explaining the flow of processing executed by the information processing unit.
FIG. 23 is a block diagram showing another hardware configuration example of the earphone.
FIG. 24 is a block diagram showing a functional configuration example of the earphone.
FIG. 25 is a diagram showing an example of a main body tap and a face tap.
FIG. 26 is a diagram showing the distribution of tap strengths of main body taps and face taps.
FIG. 27 is a flowchart explaining the flow of processing executed by the earphone.
Hereinafter, modes for implementing the present technology will be described. The description will be given in the following order.
1. First embodiment
2. Second embodiment
3. Third embodiment
4. Modification examples
<1. First Embodiment>
- Appearance of the earphone
FIG. 1 is a diagram showing an example of the appearance of an earphone (inner-ear headphone) 10 according to an embodiment of the present technology.
The earphone 10 is a sound output device that is worn on the user's ear and used to listen to the sound output from a built-in driver.

In FIG. 1, the earphone 10 has a left ear terminal 10L and a right ear terminal 10R. The earphone 10 is a left-right independent earphone in which the left ear terminal 10L and the right ear terminal 10R are not physically connected. The left ear terminal 10L and the right ear terminal 10R are connected via a wireless communication path such as NFMI (Near Field Magnetic Induction).

Each of the left ear terminal 10L and the right ear terminal 10R is equipped with a processing device such as a CPU (Central Processing Unit), an acceleration sensor, and a sound output device.

The user can operate the earphone 10 by tapping around the ear on which the earphone 10 is worn. Normally, an operation of the earphone 10 by tapping around the ear is performed at a given time on either the left ear side or the right ear side. The left ear terminal 10L and the right ear terminal 10R detect the vibration of the tap with the acceleration sensor and notify the user that the tap has been detected by outputting a sound effect. Since the operation uses vibration detection, the earphone 10 can naturally be operated not only by tapping around the ear but also by tapping the main body of the earphone 10.

The earphone 10 is an example of an information processing device to which the present technology is applied. The earphone 10 may also be configured as true wireless earbuds.
- Configuration example of the earphone
FIG. 2 is a block diagram showing a hardware configuration example of the earphone 10.
As shown in FIG. 2, the left ear terminal 10L includes a CPU 101L, a ROM (Read Only Memory) 102L, a RAM (Random Access Memory) 103L, a bus 104L, an input/output I/F unit 105L, a sound output unit 106L, a sensor unit 107L, a communication unit 108L, a storage unit 109L, and a power supply unit 110L.

The CPU 101L, the ROM 102L, and the RAM 103L are connected to each other by the bus 104L.

The input/output I/F unit 105L is further connected to the bus 104L. The sound output unit 106L, the sensor unit 107L, the communication unit 108L, and the storage unit 109L are connected to the input/output I/F unit 105L.

The sound output unit 106L reproduces, for example, music data acquired from an external music playback device and outputs the sound. The sound output unit 106L also outputs a sound effect indicating that an operation has been detected.

The sensor unit 107L is composed of an IMU (Inertial Measurement Unit) 121L.

The IMU 121L is composed of an acceleration sensor, a gyro sensor, and the like. The IMU 121L detects the acceleration, angular acceleration, and the like of the left ear terminal 10L and outputs them as sensor data.

The communication unit 108L is composed of a left-right communication unit 131L and an external communication unit 132L.

The left-right communication unit 131L is configured as a communication module that supports short-range wireless communication such as NFMI. The left-right communication unit 131L communicates with the right ear terminal 10R and exchanges music data, sensor data, and the like.

The external communication unit 132L is configured as a communication module that supports wireless communication such as Bluetooth (registered trademark), wireless LAN (Local Area Network), or cellular communication (for example, LTE-Advanced or 5G), or wired communication. The external communication unit 132L communicates with external devices and exchanges sound signals, sensor data, and the like. The external devices here include, for example, smartphones, tablet terminals, personal computers, servers, and music playback devices. The servers include servers provided by music distribution services that distribute music over the Internet.

The storage unit 109L is composed of, for example, a semiconductor memory including a non-volatile memory or a volatile memory. Music data and the like acquired from an external device are recorded in the storage unit 109L.

The power supply unit 110L has a battery. The power supply unit 110L supplies power to each unit of the left ear terminal 10L.

The right ear terminal 10R has the same configuration as the left ear terminal 10L. In the right ear terminal 10R, blocks corresponding to the components of the left ear terminal 10L are denoted by the same numbers followed by "R", and duplicate descriptions are omitted. Further, when there is no need to distinguish whether a component is provided in the left ear terminal 10L or the right ear terminal 10R, "L" or "R" is omitted in the description.
- Functional configuration example of the earphone
FIG. 3 is a block diagram showing a functional configuration example of the earphone 10.
As shown in FIG. 3, in the earphone 10, an information processing unit 141 is realized by the CPU 101 of FIG. 2 executing a predetermined program.

Note that the configuration shown in FIG. 3 is provided in each of the left ear terminal 10L and the right ear terminal 10R. The "other terminal" referred to below is the right ear terminal 10R when the information processing unit 141 of FIG. 3 belongs to the left ear terminal 10L, and the left ear terminal 10L when it belongs to the right ear terminal 10R. Hereinafter, components provided in the left ear terminal 10L are denoted by the same reference numerals followed by "L", and components provided in the right ear terminal 10R are denoted by the same reference numerals followed by "R". The same applies to FIGS. 12, 18, and 24 described later.

The information processing unit 141 is composed of a one-ear tap detection unit 151, a transmission control unit 152, a reception control unit 153, a tap determination unit 154, a sound control unit 155, and an execution unit 156.

The one-ear tap detection unit 151 acquires sensor data from the IMU 121 and detects a tap based on the sensor data. Here, the one-ear tap detection unit 151 detects the presence or absence of a tap operation from vibration generated by the user's movements, including tap operations and other movements. Movements other than the tap operation include, for example, walking and chewing. Note that a tap operation refers to the user's movement (for example, a movement in which the user's fingertip touches the main body of the earphone 10), and a tap refers to the vibration detected in the earphone 10 as having been caused by the user's tap operation.

When a tap is detected, the one-ear tap detection unit 151 supplies event information indicating that the tap has been detected to the transmission control unit 152. The one-ear tap detection unit 151 also supplies the sensor data acquired from the IMU 121 or the event information to the tap determination unit 154.

Further, the one-ear tap detection unit 151 supplies an output request for an operation recognition sound to the sound control unit 155. The operation recognition sound is a sound effect indicating that a tap has been detected.

The transmission control unit 152 is supplied from the IMU 121 with the same information as the sensor data acquired by the one-ear tap detection unit 151. The transmission control unit 152 controls the left-right communication unit 131 to transmit the event information supplied from the one-ear tap detection unit 151 and the sensor data supplied from the IMU 121 to the other terminal.

The reception control unit 153 acquires the event information and sensor data transmitted from the other terminal via the left-right communication unit 131 and supplies them to the tap determination unit 154. Event information transmitted from the other terminal indicates that a tap has been detected in the other terminal.
The tap determination unit 154 determines, based on the sensor data supplied from the one-ear tap detection unit 151 and the sensor data supplied from the reception control unit 153, whether the event in which a tap was detected by the one-ear tap detection unit 151 is a tap event or a tap-like event.

A tap event is an event indicating that a tap operation has been detected. A tap-like event is an event indicating that a movement that causes vibration, other than a tap operation, has been detected. In other words, the tap determination unit 154 is a determination unit that determines whether or not the detection of vibration by the IMU 121 is noise, such as the detection of vibration caused by a movement other than a tap operation.

When the tap determination unit 154 determines that the event in which a tap was detected by the one-ear tap detection unit 151 is a tap-like event, it supplies an output request for a cancellation sound to the sound control unit 155. The cancellation sound is a sound indicating that the detection of the tap has been cancelled.

When the tap determination unit 154 determines that the event in which a tap was detected by the one-ear tap detection unit 151 is a tap event, it supplies an output request for a function execution sound to the sound control unit 155. The function execution sound is a sound indicating that the function assigned to the tap event is to be executed. The tap determination unit 154 also controls the execution unit 156 to execute the function assigned to the tap event.

The sound control unit 155 causes the sound output unit 106 to output sound effects in response to output requests such as those for the operation recognition sound supplied from the one-ear tap detection unit 151 and for the cancellation sound and function execution sound supplied from the tap determination unit 154.

The execution unit 156 executes the function assigned to the tap event under the control of the tap determination unit 154. Processing related to content playback, such as starting music playback or skipping to the next track, is assigned to the tap event, for example.
Incidentally, in order to determine a tap event using the sensor data of the left ear terminal 10L and the right ear terminal 10R, time is required for the wireless communication that transmits and receives the sensor data between the two terminals. In addition, since music data for stereo playback is constantly being transmitted between the two terminals, there is a possibility that the sensor data cannot be sent.

If the response to a tap operation takes a long time because the wireless communication takes time or the sensor data is not transmitted, the user is given the impression that the performance of the earphone 10 is poor. For this reason, it is desirable for the response sound to a tap operation to be played quickly. Playing the response sound only after the tap event has been determined based on the sensor data of both terminals is therefore undesirable in terms of response speed.

In the present technology, tap detection is performed separately in each of the two terminals, and when a tap is detected, the operation recognition sound is immediately played as the response sound to the tap operation. This makes it possible to notify the user that the earphone is responding immediately to the tap operation.

Further, in the present technology, if the event in which a tap was detected is determined to be a tap-like event before a predetermined timeout period elapses after the operation recognition sound is played, the detection of the tap is cancelled and the cancellation sound is played. This makes it possible to notify the user that the earphone responded to a tap but that the response was cancelled.

FIG. 4 is a diagram showing an example of the playback timing of the sound effects.

A of FIG. 4 shows the playback timing of the sound effects when a tap operation is performed by the user. As shown in A of FIG. 4, the operation recognition sound is played at time t1, immediately after the tap is detected.

The function execution sound is played at time t2, when the predetermined timeout period has elapsed from time t1. The function assigned to the tap event is executed at the timing of time t2.

B of FIG. 4 shows the playback timing of the sound effects when a movement other than a tap operation is performed by the user. As shown in B of FIG. 4, the operation recognition sound is played at time t1, immediately after the tap is detected.

When information used for determining to cancel the detection of the tap is received from the other terminal at time t3, which is between time t1 and time t2, the detection of the tap is cancelled. For example, event information or sensor data is transmitted from the other terminal as the information used for determining to cancel the tap.

The cancellation sound is played at time t4, which is between time t3 and time t2. Note that the earphone may also be configured not to play the cancellation sound even when the detection of the tap is cancelled.

As described above, in the earphone 10, when a tap operation is performed, the operation recognition sound is played immediately after the tap is detected, and the function execution sound is played after the predetermined timeout period has elapsed.

The operation recognition sound is always played when a tap is detected in either the left ear terminal 10L or the right ear terminal 10R. On the other hand, the function execution sound is played only when information for cancelling the detection of the tap has not been transmitted from the other terminal before the timeout period elapses.
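As a rough illustration of this immediate-feedback and cancellation timing, the following is a minimal sketch in Python. The callback names (play_sound, execute_function), the timeout value, and the use of a timer thread are all assumptions for illustration; the specification does not prescribe an implementation.

```python
# A minimal sketch of the feedback timing shown in FIG. 4, assuming
# hypothetical callbacks play_sound() and execute_function(). The timeout
# value is illustrative; the specification only calls it a predetermined time.
import threading

TIMEOUT_SEC = 0.3  # illustrative timeout between t1 and t2

class TapFeedback:
    def __init__(self, play_sound, execute_function):
        self.play_sound = play_sound
        self.execute_function = execute_function
        self.cancelled = False
        self.timer = None

    def on_tap_detected(self):            # time t1
        self.cancelled = False
        self.play_sound("operation_recognition")
        self.timer = threading.Timer(TIMEOUT_SEC, self.on_timeout)
        self.timer.start()

    def on_cancel_info_received(self):    # time t3: info from the other terminal
        self.cancelled = True
        if self.timer:
            self.timer.cancel()
        self.play_sound("cancel")         # time t4

    def on_timeout(self):                 # time t2
        if not self.cancelled:
            self.play_sound("function_execution")
            self.execute_function()
```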
- Operation of the earphone
Here, the operation of the earphone 10 having the above configuration will be described.
A tap is detected in each of the left ear terminal 10L and the right ear terminal 10R, and it is determined, using the information of both terminals, whether the event in which a tap was detected is a tap event or a tap-like event. Two methods are conceivable for exchanging the information used to determine whether the event in which a tap was detected is a tap event or a tap-like event.
1. First method: a method in which one terminal transmits event information indicating that a tap has been detected to the other terminal
2. Second method: a method in which one terminal transmits its sensor data to the other terminal
(About the first method)
An example will be described in which it is determined whether or not the detection of a tap is a tap event based on the event information transmitted by the first method.

Note that, in the following, a case where information is transmitted from the left ear terminal 10L to the right ear terminal 10R will be described. Conversely, information may be transmitted from the right ear terminal 10R to the left ear terminal 10L, and information may also be transmitted mutually between the two terminals.
FIG. 5 is a diagram showing the flow of information in the first method.

As shown on the left side of FIG. 5, in the left ear terminal 10L, the sensor data of the IMU 121L is monitored by a one-ear tap detection unit 151L. When a tap is detected by the one-ear tap detection unit 151L, event information is transmitted to a tap determination unit 154R of the right ear terminal 10R.

As shown on the right side of FIG. 5, in the right ear terminal 10R as well, the sensor data of the IMU 121R is monitored by a one-ear tap detection unit 151R. When a tap is detected by the one-ear tap detection unit 151R, event information is supplied from the one-ear tap detection unit 151R to the tap determination unit 154R.

The tap determination unit 154R determines whether or not the event in which a tap was detected by the one-ear tap detection unit 151R is a tap event, based on the event information transmitted from the left ear terminal 10L and the event information supplied from the one-ear tap detection unit 151R.

In the right ear terminal 10R, the function execution sound or the cancellation sound is played based on the determination result by the tap determination unit 154R, as described above.
The flow of processing executed by the earphone 10 (information processing unit 141) will be described with reference to the flowchart of FIG. 6.

In step S101, the one-ear tap detection unit 151R of the right ear terminal 10R detects a tap based on the sensor data of the IMU 121R.

In step S102, a sound control unit 155R of the right ear terminal 10R plays the operation recognition sound and causes a sound output unit 106R to output it.

In step S103, the tap determination unit 154R determines whether or not event information indicating that a similar tap event has been detected has been received from the left ear terminal 10L within a certain period.

If it is determined in step S103 that similar event information has been received from the left ear terminal 10L within the certain period, the process proceeds to step S104. Here, since a tap has been detected by the one-ear tap detection unit 151R, it is determined that similar event information has been received when event information likewise indicating that a tap has been detected is received from the left ear terminal 10L. In this case, the tap determination unit 154R determines that the event in which a tap was detected by the one-ear tap detection unit 151R is a tap-like event. That is, the tap determination unit 154R determines that the event in which a tap was detected by the one-ear tap detection unit 151R is invalid.

In step S104, the sound control unit 155R plays the cancellation sound and causes the sound output unit 106R to output it. The cancellation sound may be output simultaneously from the sound output units 106 of both terminals. After the cancellation sound is output, the process ends.

On the other hand, if it is determined in step S103 that similar event information has not been received from the left ear terminal 10L within the certain period, the process proceeds to step S105. In this case, the tap determination unit 154R determines that the event in which a tap was detected by the one-ear tap detection unit 151R is a tap event. That is, the tap determination unit 154R determines that the event in which a tap was detected by the one-ear tap detection unit 151R is valid.

In step S105, the sound control unit 155R plays the function execution sound and causes the sound output unit 106R to output it. The function execution sound may be output simultaneously from the sound output units 106 of both terminals.

In step S106, an execution unit 156R executes a predetermined function according to the tap event. After the predetermined function is executed, the process ends.
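The determination in steps S103 to S106 can be sketched as a simple coincidence test on event timestamps. The function and parameter names below, and the length of the "certain period", are illustrative assumptions, not values from the specification.

```python
# A minimal sketch of the first method (steps S101 to S106), assuming
# hypothetical timestamps in seconds: local_tap_time for the tap detected on
# this terminal, and remote_event_times for event information received from
# the other terminal. The window length is an illustrative assumption.
WINDOW_SEC = 0.2  # illustrative "certain period" for coincident events

def is_tap_event(local_tap_time: float, remote_event_times: list[float]) -> bool:
    """Return True for a valid tap event, False for a tap-like event (noise)."""
    for t in remote_event_times:
        if abs(t - local_tap_time) <= WINDOW_SEC:
            # Both terminals reported a tap at almost the same time, which
            # suggests head-wide vibration such as walking or chewing
            # (S103 -> S104: cancel the detection).
            return False
    # No similar event arrived from the other side within the period:
    # treat the detection as a valid tap (S105, S106).
    return True
```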
(About the second method)
Next, an example will be described in which it is determined whether or not the event in which a tap was detected is a tap event based on the sensor data transmitted by the second method.

FIG. 7 is a diagram showing the flow of information in the second method.
As shown on the left side of FIG. 7, in the left ear terminal 10L, the values of the sensor data of the IMU 121L are transmitted to the tap determination unit 154R of the right ear terminal 10R. The transmission of the sensor data to the tap determination unit 154R is performed, for example, when a condition is satisfied, such as the value of the sensor data of the IMU 121L exceeding a predetermined threshold value after a predetermined filter is applied. Note that the values of the sensor data of the IMU 121L may instead be constantly transmitted to the tap determination unit 154R.

As shown on the right side of FIG. 7, in the right ear terminal 10R, the sensor data of the IMU 121R is monitored by the one-ear tap detection unit 151R. When a tap is detected by the one-ear tap detection unit 151R, the sensor data of the IMU 121R is supplied from the one-ear tap detection unit 151R to the tap determination unit 154R.

The tap determination unit 154R determines whether or not the event in which a tap was detected by the one-ear tap detection unit 151R is a tap event, based on the sensor data transmitted from the left ear terminal 10L and the sensor data supplied from the one-ear tap detection unit 151R.

In the right ear terminal 10R, the function execution sound or the cancellation sound is played based on the determination result by the tap determination unit 154R, as described above.
The flow of processing executed by the earphone 10 (information processing unit 141) will be described with reference to the flowchart of FIG. 8.

In step S151, the one-ear tap detection unit 151R of the right ear terminal 10R detects a tap based on the sensor data of the IMU 121R.

In step S152, the sound control unit 155R of the right ear terminal 10R plays the operation recognition sound and causes the sound output unit 106R to output it.

In step S153, a reception control unit 153R receives the values of the sensor data transmitted from the left ear terminal 10L via a left-right communication unit 131R.

In step S154, the tap determination unit 154R performs tap determination processing. Specifically, the tap determination unit 154R calculates the degree of similarity between the values of the sensor data of the IMU 121R and the values of the sensor data of the IMU 121L transmitted from the left ear terminal 10L, and determines, based on the calculated degree of similarity, whether the event in which a tap was detected by the one-ear tap detection unit 151R is a tap event or a tap-like event. The determination based on the degree of similarity of the sensor data values will be described later.

If it is determined in step S155, through the tap determination processing of step S154 using the sensor data of the left ear terminal 10L and the sensor data of the right ear terminal 10R, that the event in which a tap was detected by the one-ear tap detection unit 151R was not a tap event, that is, that it was a tap-like event, the process proceeds to step S156.

In step S156, the sound control unit 155R plays the cancellation sound and causes the sound output unit 106R to output it. The cancellation sound may be output simultaneously from the sound output units 106 of both terminals. After the cancellation sound is output, the process ends.

On the other hand, if it is determined in step S155 that the event in which a tap was detected by the one-ear tap detection unit 151R in the tap determination processing of step S154 was a tap event, the process proceeds to step S157.

In step S157, the sound control unit 155R plays the function execution sound and causes the sound output unit 106R to output it. The function execution sound may be output simultaneously from the sound output units 106 of both terminals.

In step S158, the execution unit 156R executes a predetermined function according to the tap event. After the predetermined function is executed, the process ends.
(About determination based on the similarity of sensor data)
Vibration generated by movements other than a tap operation is often detected simultaneously by the sensors of both the left ear terminal 10L and the right ear terminal 10R. For example, walking and chewing cause the user's entire head to vibrate. In contrast, vibration due to a tap is detected as a large vibration only in either the left ear terminal 10L or the right ear terminal 10R.

For this reason, when the degree of similarity between the sensor data of the sensors mounted on the two terminals is higher than a predetermined threshold value, the tap determination unit 154 determines that the detection of a tap by the one-ear tap detection unit 151 is a tap-like event.
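As one possible realization of this determination, the sketch below compares short acceleration windows from the two terminals using a normalized correlation. The specification does not fix a particular similarity measure, so the measure and the threshold value here are assumptions for illustration.

```python
# A minimal sketch of the similarity test described above, assuming the two
# terminals exchange short acceleration windows sampled at the same rate.
# Normalized correlation is one plausible similarity measure; the threshold
# is illustrative.
import numpy as np

SIMILARITY_THRESHOLD = 0.7

def is_tap_like(left_window: np.ndarray, right_window: np.ndarray) -> bool:
    """Return True (tap-like / noise) when both sensors saw similar vibration."""
    # Zero-mean, unit-variance normalization of each window.
    l = (left_window - left_window.mean()) / (left_window.std() + 1e-9)
    r = (right_window - right_window.mean()) / (right_window.std() + 1e-9)
    # Sample correlation coefficient, roughly in [-1, 1].
    similarity = float(np.dot(l, r)) / len(l)
    # High similarity on both ears suggests head-wide vibration such as
    # walking or chewing rather than a one-sided tap.
    return similarity > SIMILARITY_THRESHOLD
```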
FIG. 9 is a diagram showing an example of vibration caused by a movement other than a tap operation.

FIG. 9 shows the waveform of vibration caused by the user chewing during a meal. In FIG. 9, the vertical axis represents acceleration, and the horizontal axis represents time. The same applies to the graphs of FIG. 10 described later. Note that a predetermined band-pass filter has been applied to the waveforms shown in FIG. 9.

A of FIG. 9 shows the waveform of the sensor data of the IMU 121L mounted on the left ear terminal 10L, and B of FIG. 9 shows the waveform of the sensor data of the IMU 121R mounted on the right ear terminal 10R.

Comparing A and B of FIG. 9, it can be seen that when the user is eating, vibration is detected simultaneously by the sensors mounted on the left ear terminal 10L and the right ear terminal 10R. Therefore, the earphone 10 can detect that the user has chewed, based on the sensor data of the sensors mounted on the left ear terminal 10L and the right ear terminal 10R.
FIG. 10 is a diagram showing another example of vibration caused by a movement other than a tap operation.

FIG. 10 shows the waveform of vibration caused by the user walking. Note that a predetermined band-pass filter has been applied to the waveforms shown in FIG. 10.

A of FIG. 10 shows the waveform of the sensor data of the IMU 121L mounted on the left ear terminal 10L, and the graph of B of FIG. 10 shows the waveform of the sensor data of the IMU 121R mounted on the right ear terminal 10R.

Comparing A and B of FIG. 10, it can be seen that when the user is walking or running, vibration is detected simultaneously by the sensors mounted on the left ear terminal 10L and the right ear terminal 10R. Therefore, the earphone 10 can detect that the user is walking or running, based on the sensor data of the sensors mounted on the left ear terminal 10L and the right ear terminal 10R.

As described above, since the earphone 10 can detect the user's chewing, walking, and the like based on the sensor data of both terminals, it can distinguish a tap operation, which, unlike chewing or walking, is performed on only one of the left ear side and the right ear side.
As described above, in the earphone 10, a tap operation by the user is detected based on the information of the sensors mounted on each of the left ear terminal 10L and the right ear terminal 10R. In addition, in the earphone 10, the detection of a tap caused by a movement other than a tap operation is cancelled after the operation recognition sound is played.

Since the detection of a tap caused by a movement other than a tap operation is cancelled based on the information of the sensors mounted on both terminals, the earphone 10 can improve the accuracy of tap operation detection compared to detecting a tap operation based on the information of the sensor mounted on only one terminal.

Even if a tap is detected for a movement other than a tap operation, the operation recognition sound is played; however, such an erroneous tap detection is immediately cancelled and no function is executed, so the user's sense of operation is not impaired.

If sensor information were constantly transmitted between the two terminals, the band used for transmitting music data would be narrowed, and sound interruptions would occur when the radio wave conditions are poor. In the earphone 10, sensor information needs to be transmitted only when a tap is detected. By transmitting sensor information only when a tap is detected, so that the transmission of music data between the two terminals is prioritized, it is possible to reduce the sound interruptions caused by transmitting sensor information.

Since the operation recognition sound is played immediately when a tap is detected, the earphone 10 can quickly provide feedback on the user's operation. By hearing the operation recognition sound, the user can immediately confirm that the operation has been detected. In addition, by hearing the cancellation sound, the user can confirm that an erroneous tap detection caused by a movement not intended as a tap has been cancelled.
・変形例
 操作認識音が再生されてから機能実行音が再生されるまでの間にユーザによるキャンセル操作が受け付けられるようにしてもよい。
-Modification example The cancel operation by the user may be accepted between the time when the operation recognition sound is played and the time when the function execution sound is played.
 ユーザは、誤ってタップ操作を行った場合やタップ動作以外の動作が誤検出されたと判断した場合、機能実行音が再生されるまでの間のタイミングでキャンセル操作を行うことによって、イヤホン10によるタップの検出を取り消すことができる。 When the user mistakenly performs a tap operation or determines that an operation other than the tap operation is erroneously detected, the user taps with the earphone 10 by performing a cancel operation at the timing until the function execution sound is played. Detection can be canceled.
 図11は、イヤホン10の使用方法の例を示す図である。 FIG. 11 is a diagram showing an example of how to use the earphone 10.
 図11のAの例では、左耳用端末10LがユーザUの左耳に装着され、右耳用端末10RがユーザUの右耳に装着されている。 In the example of A in FIG. 11, the left ear terminal 10L is attached to the left ear of the user U, and the right ear terminal 10R is attached to the right ear of the user U.
 図11のBの例では、右耳用端末10Rが単体で使用され、ユーザUの右耳に装着されている。 In the example of B in FIG. 11, the right ear terminal 10R is used alone and is attached to the right ear of the user U.
 図11のBに示すように、右耳用端末10Rと左耳用端末10Lのうちのいずれかの端末だけが使用される場合、上述したようなタップの検出に対するイヤホン10の挙動は一部変更となる。 As shown in B of FIG. 11, when only one of the right ear terminal 10R and the left ear terminal 10L is used, the behavior of the earphone 10 with respect to the tap detection as described above is partially changed. It becomes.
 例えば、右耳用端末10Rだけが使用される場合、左耳用端末10Lから送信されてくる情報に基づくタップの検出の取り消しを行うことができない。このため、タップ動作以外の動作を右耳用端末10Rにおいてタップとして検出してしまうなどの、ノイズの影響を受けやすくなる。 For example, when only the right ear terminal 10R is used, it is not possible to cancel the detection of the tap based on the information transmitted from the left ear terminal 10L. Therefore, an operation other than the tap operation is easily affected by noise, such as being detected as a tap on the right ear terminal 10R.
 このため、右耳用端末10Rの動作を変更する。変更の例として、例えば、右耳用端末10Rにおいては、タップが検出されたものとして判定する加速度のしきい値がより大きい値に設定される。これにより、右耳用端末10Rの片方だけを使用する場合であっても、タップ動作以外の動作をタップとして検出してしまうといった誤検出を減らすことが可能となる。なお、加速度センサの値に様々な算術処理や機械学習による処理を行った評価値に対してしきい値判定が行われるようにしてもよい。 Therefore, the operation of the right ear terminal 10R is changed. As an example of the change, for example, in the right ear terminal 10R, the threshold value of the acceleration determined that the tap is detected is set to a larger value. This makes it possible to reduce erroneous detection such that an operation other than the tap operation is detected as a tap even when only one of the right ear terminals 10R is used. It should be noted that the threshold value may be determined for the evaluation value obtained by performing various arithmetic processing or machine learning processing on the value of the acceleration sensor.
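The following is a minimal sketch of this threshold change; the concrete threshold values are illustrative assumptions, since the actual values used by the earphone 10 are not specified here.

```python
# When only one terminal is worn, the acceleration threshold for accepting a
# tap is raised, because cross-checking against the other terminal's sensor
# is unavailable and noise is more likely to slip through.
BOTH_EARS_THRESHOLD = 2.0   # illustrative acceleration thresholds
SINGLE_EAR_THRESHOLD = 3.5  # stricter when no cross-check is possible

def tap_threshold(both_terminals_in_use):
    return BOTH_EARS_THRESHOLD if both_terminals_in_use else SINGLE_EAR_THRESHOLD

def is_tap(acceleration_norm, both_terminals_in_use):
    return acceleration_norm > tap_threshold(both_terminals_in_use)

print(is_tap(2.8, both_terminals_in_use=True))   # True: accepted as a tap
print(is_tap(2.8, both_terminals_in_use=False))  # False: rejected as possible noise
```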
When both terminals are used, the function assigned to a tap event on the right ear terminal 10R and the function assigned to a tap event on the left ear terminal 10L can be different functions. On the other hand, when only the right ear terminal 10R is used, a function different from the one assigned when both terminals are in use may be assigned to the tap event on the right ear terminal 10R.
<2. Second Embodiment>
When a tap is detected using the sensor data of the IMU 121, the detection may be affected by noise, such as a motion other than the user's tap operation being detected, or vibration generated by music playback being detected.
Vibration corresponding to noise can be predicted based on the periodicity of noise detected in the past, the waveform of the music data to be played, and the like; the noise can then be reduced by subtracting the acceleration value of the noise from the sensor data, or by excluding the timings at which the noise occurs from the tap detection period. By reducing the noise, the earphone 10 can improve the accuracy of detecting the user's tap operation.
Two examples of ways to reduce noise are described below.
1. First example: the time variation of noise caused by walking or mastication is recorded, and the noise is canceled when a tap is detected.
2. Second example: vibration is predicted based on the sound signal to be played, and the predicted vibration is canceled.
- First example
FIG. 12 is a block diagram showing a functional configuration example of the earphone 10 in the first example.
As shown in FIG. 12, in the earphone 10, an information processing unit 161 is realized by the CPU 101 of FIG. 2 executing a predetermined program. The configuration shown in FIG. 12 can be provided in each of the left ear terminal 10L and the right ear terminal 10R, or in only one of them. The same applies to FIG. 18, described later.
The information processing unit 161 includes a tap detection unit 171 and an execution unit 172.
The tap detection unit 171 detects a tap based on the sensor data of the IMU 121 and past sensor data. The tap detection unit 171 includes an acquisition unit 181, a calculation unit 182, and a determination unit 183.
The acquisition unit 181 acquires the sensor data from the IMU 121 and detects vibration peaks based on the sensor data. The acquisition unit 181 also acquires a history of vibration peaks detected in the past. For example, information representing the peaks previously detected by the acquisition unit 181 is stored as a history in the storage unit 109 of FIG. 2 and is read out by the acquisition unit 181.
The calculation unit 182 calculates peak intervals based on the history and the sensor data acquired by the acquisition unit 181.
The determination unit 183 determines whether to recognize a peak detected by the acquisition unit 181 as a tap based on the calculation result from the calculation unit 182. That is, the determination unit 183 functions as a recognition unit that recognizes taps based on the calculation result from the calculation unit 182.
The execution unit 172 executes processing corresponding to the event that a tap has been detected by the tap detection unit 171.
A tap recognition method based on peak intervals will be described with reference to FIGS. 13 to 15.
In general, the waveform of the vibration generated by a user's tap operation may resemble the waveform of the vibration generated when a walking user's foot lands. The waveform of the vibration generated by landing depends on the shoes the user is wearing and on the user's gait. For example, when a user wearing hard-soled shoes such as high heels lands forcefully, vibration similar to that generated by a tap operation occurs.
FIG. 13 is a diagram showing an example of the waveform of vibration generated by landing during walking.
Since walking is a periodic motion, the vibration generated by landing during walking has periodicity. As shown in FIG. 13, the waveform of the vibration generated by landing during walking is a waveform W11 that shows peaks at a constant period T.
Because peaks are detected periodically, the earphone 10 can predict the timing of landings during walking. Although walking habits differ from person to person, the earphone 10 can predict the next vibration based on vibrations detected in the past.
The earphone 10 records a periodic vibration pattern such as walking and predicts the timing of the next landing based on the vibration pattern. The earphone 10 does not recognize, as a tap, vibration detected around the time predicted to be a landing.
FIG. 14 is a diagram showing an example of the waveform of the sensor data of the IMU 121 when vibration generated by landing during walking is detected.
The solid line portion of the waveform W12 in FIG. 14 represents vibration detected in the past by the IMU 121. The earphone 10 applies a low-pass filter to the acceleration values serving as the sensor data of the IMU 121 and calculates the history of peaks within a certain period.
For example, a time at which the norm of the acceleration exceeds a predetermined threshold value and takes the maximum value within a certain window is calculated as a peak timing. In FIG. 14, three peaks are detected. The interval from the first peak to the second peak is T0, and the interval from the second peak to the third peak is T1.
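One plausible realization of this peak extraction is sketched below: the sequence of acceleration norms is smoothed with a simple one-pole low-pass filter, and a sample is taken as a peak when it exceeds a threshold and is the maximum within a surrounding window. The filter coefficient, threshold, and window size are illustrative assumptions.

```python
def find_peaks(norms, alpha=0.5, thresh=1.5, half_window=2):
    # Smooth the acceleration norms with a one-pole low-pass filter.
    smoothed, prev = [], 0.0
    for x in norms:
        prev = alpha * x + (1 - alpha) * prev
        smoothed.append(prev)
    # A peak is a sample above the threshold that is also the maximum
    # within a window of +/- half_window samples around it.
    peaks = []
    for i, v in enumerate(smoothed):
        lo, hi = max(0, i - half_window), min(len(smoothed), i + half_window + 1)
        if v > thresh and v == max(smoothed[lo:hi]):
            peaks.append(i)   # index of the detected peak timing
    return peaks

# Two spikes in the input produce two peak timings (shifted by the filter lag).
print(find_peaks([0.2, 0.3, 4.0, 3.0, 0.4, 0.2, 4.2, 2.8, 0.3]))  # [3, 7]
```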
When a peak Pa that may be a tap is newly detected, the calculation unit 182 calculates the interval between the peak Pa and the most recent peak. The broken line portion of the waveform W12 in FIG. 14 represents the vibration including the newly detected peak Pa. In FIG. 14, the interval between the peak Pa and the most recent peak is calculated to be Ta.
When the interval Ta is close to the intervals T0 and T1 of the peaks detected in the past, the determination unit 183 recognizes the peak Pa not as a tap but as noise.
FIG. 15 is a diagram showing the waveform of the sensor data of the IMU 121 when vibration generated by a tap operation performed after walking is detected.
The solid line portion of the waveform W13 in FIG. 15 represents vibration detected in the past by the IMU 121. For the waveform W13, the peak history is calculated in the same manner as for the waveform W12 of FIG. 14.
When a peak Pb that may be a tap is newly detected, the calculation unit 182 calculates the interval between the peak Pb and the most recent peak. The broken line portion of the waveform W13 in FIG. 15 represents the vibration including the newly detected peak Pb. In FIG. 15, the interval between the peak Pb and the most recent peak is calculated to be Tb.
When the interval Tb is short compared with the intervals T0 and T1, the determination unit 183 recognizes the vibration including the peak Pb as likely to be a tap.
As described above, the tap detection unit 171 determines whether to recognize vibration including a newly detected peak as a tap based on the intervals between peaks detected in the past and the interval to the newly detected peak.
For vibrations generated by motions other than walking that also have periodicity and individual variation, such as mastication, whether to recognize vibration including a newly detected peak as a tap is likewise determined based on the peak intervals, in the same manner as for the vibration generated by walking described above.
The flow of processing executed by the information processing unit 161 (FIG. 12) will be described with reference to the flowchart of FIG. 16.
In step S201, the acquisition unit 181 acquires the history of vibration peaks detected in the past.
In step S202, the calculation unit 182 calculates the average T_ave of the intervals between the vibration peaks detected in the past.
In step S203, the acquisition unit 181 detects a new peak Pa.
In step S204, the calculation unit 182 calculates the interval T_new between the peak Pa and the peak immediately preceding it.
In step S205, the determination unit 183 determines whether the difference between the average interval T_ave and the interval T_new is equal to or less than a threshold value THRES, that is, whether |T_new - T_ave| ≤ THRES.
If it is determined in step S205 that the difference between the average interval T_ave and the interval T_new is equal to or less than the threshold value, the processing proceeds to step S206.
In step S206, the determination unit 183 recognizes the vibration including the peak Pa not as a tap but as noise. The processing then ends.
On the other hand, if it is determined in step S205 that the difference between the average interval T_ave and the interval T_new exceeds the threshold value, the processing proceeds to step S207.
In step S207, the determination unit 183 recognizes the vibration including the peak Pa as vibration that may be a tap, and the tap detection unit 171 then executes the tap detection algorithm. By executing the tap detection algorithm, the tap detection unit 171 detects a tap based on the sensor data of the IMU 121.
When a tap is detected by the tap detection unit 171, the execution unit 172 executes processing corresponding to the event that the tap detection unit 171 has detected a tap.
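A compact sketch of this flow is shown below, under the assumption that the peak history is a list of peak times in seconds and that THRES is a small allowed deviation from the walking period; the concrete values are illustrative.

```python
THRES = 0.08  # allowed deviation from the walking period, in seconds (illustrative)

def classify_new_peak(peak_history, new_peak_time):
    intervals = [b - a for a, b in zip(peak_history, peak_history[1:])]
    t_ave = sum(intervals) / len(intervals)   # average past interval (step S202)
    t_new = new_peak_time - peak_history[-1]  # interval to the new peak (step S204)
    if abs(t_new - t_ave) <= THRES:           # periodic: walking noise (S205, S206)
        return "noise"
    return "tap candidate"                    # aperiodic: run tap detection (S207)

history = [0.00, 0.62, 1.23, 1.85]        # roughly periodic landings
print(classify_new_peak(history, 2.47))   # close to the period -> "noise"
print(classify_new_peak(history, 2.05))   # off the period -> "tap candidate"
```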
As described above, in the earphone 10, whether vibration including a newly detected peak is vibration generated by a tap operation is recognized based on the intervals between vibration peaks detected in the past and the interval to the newly detected peak. When the vibration including the newly detected peak is recognized as noise, the noise is not detected as a tap.
This makes it possible for the earphone 10 to avoid erroneously detecting vibration generated by walking as a tap, while reliably detecting a tap operation performed at a timing different from a landing during walking.
Note that the tap detection unit 171 of FIG. 12 may be provided in the information processing unit 141 instead of the one-ear tap detection unit 151 of FIG. 3. In this case, when a tap is detected by the tap detection unit 171, whether the event that the tap detection unit 171 has detected a tap is a tap event is determined by the tap determination unit 154 based on the information transmitted from the other terminal.
In this case, after the processing of step S207 of FIG. 16 is performed, the processing described with reference to FIGS. 6 and 8 is performed, whereby it is determined whether the event that the tap detection unit 171 has detected a tap is a tap event.
Also in this case, after the processing of step S206 of FIG. 16 is performed, event information and sensor data may be withheld from the other terminal, or information indicating that vibration generated by a motion that appears to be walking has been detected may be transmitted to the other terminal. This makes it possible to reduce the amount of data communicated between the left ear terminal 10L and the right ear terminal 10R.
Although an example of recognizing whether vibration including a peak is a tap based on the periodicity of a motion such as walking has been described, the tap may instead be recognized based on the result of learning the vibration pattern of walking.
For example, the vibration generated by a single landing during walking is learned by a technique such as machine learning, and whether a newly detected vibration peak is a peak generated by a landing during walking is recognized based on the learning result.
When machine learning is performed, the sensor data serving as teacher data may be labeled with the user's behavior detected by a sensor other than the IMU 121.
For example, when it is detected that the user is walking based on the user's position and movement speed detected by a GPS (Global Positioning System) sensor mounted on the earphone 10, the sensor data for the period during which the user's walking was detected is labeled as walking sensor data.
Based on the sensor data labeled as walking sensor data, learning data obtained by learning the vibration generated by landing during walking is acquired.
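The labeling step might look like the following sketch, in which windows of IMU data recorded while the GPS-derived speed falls within a rough walking band are labeled as walking data; the speed bounds are illustrative assumptions.

```python
def label_window(imu_window, gps_speed_m_s):
    # Label an IMU window as walking data when the GPS speed falls
    # within a rough walking-speed band (illustrative bounds).
    walking = 0.5 <= gps_speed_m_s <= 2.5
    return {"samples": imu_window, "label": "walking" if walking else "other"}

print(label_window([0.1, 1.9, 0.2], gps_speed_m_s=1.4)["label"])  # walking
print(label_window([0.1, 0.1, 0.1], gps_speed_m_s=0.0)["label"])  # other
```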
Alternatively, the noise generated by walking may be constantly observed by the earphone 10 using the sensor data of the IMU 121. For example, the earphone 10 can record the sensor data of the IMU 121 as-is, or record only acceleration values exceeding a threshold value.
The learning result obtained by learning the noise based on the recorded sensor data or the acceleration values exceeding the threshold value is used in the gesture recognition processing that detects taps.
Note that the sensor data or the acceleration values exceeding the threshold value may be uploaded to a device connected to the earphone 10 (such as a smartphone) or to a server provided as a cloud service. In this case, the earphone 10 acquires the result of the noise learning performed externally and performs the gesture recognition processing using that learning result.
The flow of the noise learning processing and the gesture recognition processing executed by the information processing unit 161 will be described with reference to the sequence diagram of FIG. 17.
For example, the noise learning processing of FIG. 17 is performed before the gesture recognition processing is performed.
In step S301 of the noise learning processing, the acquisition unit 181 (FIG. 12) acquires the noise generated by walking. The history of noise detected and recorded in the past is acquired by the acquisition unit 181.
In step S302, the calculation unit 182 calculates, based on the noise history acquired in step S301, a noise removal filter that removes the noise caused by walking.
The noise removal filter calculated in step S302 is used in step S352 of the gesture recognition processing.
In step S351 of the gesture recognition processing, the acquisition unit 181 acquires the sensor data of the IMU 121.
In step S352, the calculation unit 182 applies the noise removal filter to the sensor data of the IMU 121 acquired in step S351 and corrects the sensor data of the IMU 121.
In step S353, the tap detection unit 171 executes the tap detection algorithm using the corrected sensor data.
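The description does not fix a particular filter design, but one plausible realization of the two phases of FIG. 17 is to average past walking-noise snippets into a template (steps S301 and S302) and subtract that template from incoming sensor data (step S352), as sketched below.

```python
def build_noise_template(noise_history):
    # Average the recorded noise snippets sample by sample (steps S301-S302).
    length = min(len(n) for n in noise_history)
    return [sum(n[i] for n in noise_history) / len(noise_history)
            for i in range(length)]

def apply_noise_filter(sensor_data, template):
    # Subtract the learned noise template from new data (steps S351-S352).
    return [s - t for s, t in zip(sensor_data, template)]

history = [[0.2, 1.0, 0.3], [0.1, 0.9, 0.4]]   # past walking-noise snippets
template = build_noise_template(history)        # [0.15, 0.95, 0.35]
print(apply_noise_filter([0.4, 1.1, 0.5], template))  # corrected data for S353
```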
- Second example
FIG. 18 is a block diagram showing a functional configuration example of the earphone 10 in the second example. In FIG. 18, the same components as in the configuration of the earphone 10 in FIG. 12 are given the same reference numerals. Duplicate explanations are omitted as appropriate.
The configuration of the information processing unit 161 shown in FIG. 18 differs from the configuration described with reference to FIG. 12 in that the tap detection unit 171 has a correction unit 185 instead of the calculation unit 182.
The acquisition unit 181 further acquires a music playback signal serving as the music data to be played.
The determination unit 183 determines whether the power of the music playback signal acquired by the acquisition unit 181 is equal to or greater than a threshold value.
The correction unit 185 corrects the sensor data acquired by the acquisition unit 181 based on the music playback signal acquired by the acquisition unit 181. Specifically, the correction unit 185 predicts the vibration generated when the music playback signal is reproduced, and corrects the sensor data by subtracting the acceleration value of the predicted vibration from the sensor data.
The tap detection unit 171 detects a tap based on the corrected sensor data. That is, the tap detection unit 171 functions as a recognition unit that recognizes taps based on the sensor data corrected by the correction unit 185.
FIG. 19 is a diagram schematically showing an example of the structure of the earphone 10.
Although only one terminal of the earphone 10 is shown in FIG. 19, the left ear terminal 10L and the right ear terminal 10R each have the same structure.
When music is played on the earphone 10, a diaphragm 191 provided in the earphone 10 vibrates to generate the vibration of the sound. Since the IMU 121 is provided in the vicinity of the diaphragm 191, the vibration of the diaphragm 191 may be picked up by the IMU 121.
Since the music playback signal contains various frequency components, when music is played on the earphone 10, the diaphragm 191 may generate sound vibration similar to the vibration generated by a tap operation.
FIG. 20 is a diagram showing an example of the waveform of a music playback signal.
In the graph of FIG. 20, the vertical axis represents the power of the music playback signal, and the horizontal axis represents time.
The determination unit 183 predicts, based on a music playback signal such as the waveform W21 of FIG. 20, the peaks of the vibration generated when the music playback signal is reproduced. During the periods before and after a peak timing, the tap detection unit 171 is prevented from detecting a tap. Alternatively, the correction unit 185 corrects the sensor data based on the predicted vibration.
In the following, the flow of processing is described for each of the first method, in which taps are not detected during the periods before and after the timing at which a peak of the music playback signal is reproduced, and the second method, in which the sensor data is corrected based on the music playback signal.
(About the first method)
The flow of processing executed by the information processing unit 161 (FIG. 18) will be described with reference to the flowchart of FIG. 21.
In step S401, the acquisition unit 181 acquires acceleration values as the sensor data of the IMU 121.
In step S402, the acquisition unit 181 acquires the music playback signal to be played on the earphone 10.
In step S403, the determination unit 183 determines whether the power of the music playback signal over a certain past period is equal to or greater than a threshold value. For example, the determination unit 183 makes the determination using the power of the music playback signal reproduced during a certain period up to the timing at which the sensor data of the IMU 121 was acquired.
If it is determined in step S403 that the power of the music playback signal over the certain past period is equal to or greater than the threshold value, the processing ends. That is, tap detection by the tap detection unit 171 is suppressed (not performed).
On the other hand, if it is determined in step S403 that the power of the music playback signal over the certain past period is less than the threshold value, the processing proceeds to step S404.
In step S404, the information processing unit 161 executes the tap detection algorithm described above.
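A minimal sketch of this first method follows, computing the playback power as the mean squared amplitude over a recent window and gating the tap detection on a threshold; the threshold value is illustrative.

```python
POWER_THRESHOLD = 0.25  # illustrative playback-power threshold

def signal_power(samples):
    # Mean squared amplitude over the window.
    return sum(s * s for s in samples) / len(samples)

def should_run_tap_detection(recent_playback_samples):
    # Step S403: suppress detection while loud audio may shake the IMU.
    return signal_power(recent_playback_samples) < POWER_THRESHOLD

print(should_run_tap_detection([0.05, -0.04, 0.06]))  # quiet -> True, run S404
print(should_run_tap_detection([0.9, -0.8, 0.85]))    # loud  -> False, suppress
```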
(About the second method)
The flow of processing executed by the information processing unit 161 (FIG. 18) will be described with reference to the flowchart of FIG. 22.
In step S451, the acquisition unit 181 acquires acceleration values as the sensor data of the IMU 121.
In step S452, the acquisition unit 181 acquires the music playback signal to be played on the earphone 10.
In step S453, the determination unit 183 determines whether the power of the music playback signal is equal to or greater than a threshold value. For example, the determination unit 183 makes the determination using the power of the music playback signal reproduced at the timing at which the sensor data of the IMU 121 was acquired.
If it is determined in step S453 that the power of the music playback signal is equal to or greater than the threshold value, the processing proceeds to step S454.
In step S454, the correction unit 185 corrects the sensor data by subtracting the acceleration value of the vibration predicted to be generated by reproducing the music playback signal from the acceleration value of the sensor data.
On the other hand, if it is determined in step S453 that the power of the music playback signal is less than the threshold value, the processing of step S454 is skipped.
In step S455, the tap detection unit 171 executes the tap detection algorithm described above.
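A sketch of this second method follows, under the simplifying assumption that the diaphragm-induced vibration can be approximated by scaling the playback signal with a fixed coupling gain; both the gain and the power threshold are illustrative.

```python
COUPLING_GAIN = 0.3     # how strongly playback vibration couples into the IMU
POWER_THRESHOLD = 0.25  # illustrative playback-power threshold

def correct_sensor_data(imu_samples, playback_samples):
    power = sum(s * s for s in playback_samples) / len(playback_samples)
    if power < POWER_THRESHOLD:   # step S453: quiet playback, no correction
        return imu_samples
    # Step S454: subtract the predicted playback-induced acceleration.
    predicted = [COUPLING_GAIN * s for s in playback_samples]
    return [a - p for a, p in zip(imu_samples, predicted)]

# Corrected data is then passed to the tap detection algorithm (step S455).
print(correct_sensor_data([0.5, -0.2, 0.4], [0.9, -0.8, 0.85]))
```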
As described above, in the earphone 10, the vibration generated by reproducing the music playback signal is predicted. Based on the predicted vibration, the earphone 10 suppresses tap detection or corrects the sensor data of the IMU 121.
This makes it possible for the earphone 10 to avoid erroneously detecting the vibration generated by music playback as a tap, while reliably detecting, as a tap, the vibration of a tap operation whose peaks differ from the vibration of the output sound.
Note that, as in the first example of FIG. 12, the tap detection unit 171 of FIG. 18 may be provided in the information processing unit 141 instead of the one-ear tap detection unit 151 of FIG. 3.
<3. Third Embodiment>
FIG. 23 is a block diagram showing another hardware configuration example of the earphone 10.
In FIG. 23, the same components as in the configuration of the earphone 10 in FIG. 2 are given the same reference numerals. Duplicate explanations are omitted as appropriate.
The configuration of the earphone 10 shown in FIG. 23 differs from the configuration described with reference to FIG. 2 in that an electrostatic sensor 201L is provided in the sensor unit 107L of the left ear terminal 10L and an electrostatic sensor 201R is provided in the sensor unit 107R of the right ear terminal 10R.
The electrostatic sensors 201L and 201R are composed of, for example, sensors of an X-Y coordinate detection type or an electrostatic button detection type. The electrostatic sensors 201L and 201R output, as sensor data, signals corresponding to the user's contact with them.
FIG. 24 is a block diagram showing a functional configuration example of the earphone 10.
As shown in FIG. 24, in the earphone 10, an information processing unit 211 is realized by the CPU 101 executing a predetermined program. The configuration shown in FIG. 24 can be provided in each of the left ear terminal 10L and the right ear terminal 10R, or in only one of them.
The information processing unit 211 includes a tap detection unit 221, a tap determination unit 222, and an execution unit 223.
The tap detection unit 221 acquires sensor data from the sensor unit 107 and detects body taps and face taps based on the sensor data of the IMU 121. The tap detection unit 221 also detects body taps based on the sensor data of the electrostatic sensor 201. Body taps and face taps are described later.
The tap detection result based on the sensor data of the IMU 121 and the tap detection result based on the sensor data of the electrostatic sensor 201 are supplied to the tap determination unit 222.
The tap determination unit 222 distinguishes between body taps and face taps based on the detection results supplied from the tap detection unit 221. Based on the identification result, the tap determination unit 222 controls the execution unit 223 to execute the function assigned to the body tap or the function assigned to the face tap.
The execution unit 223 executes the function assigned to the body tap or the function assigned to the face tap under the control of the tap determination unit 222.
FIG. 25 is a diagram showing examples of a body tap and a face tap.
A body tap is a tap on the housing of the earphone 10 by the user. A of FIG. 25 shows a situation in which the user U is tapping the housing of the left ear terminal 10L worn on the left ear.
A body tap is detected based on the sensor data of the electrostatic sensor 201 installed in a partial area A11 of the housing of the earphone 10. Since the electrostatic sensor 201 is installed in only a portion of the housing of the earphone 10, the user U may, when attempting a body tap, tap a portion other than the area A11.
For this reason, in the earphone 10, body taps are detected using the sensor data of the IMU 121 together with the sensor data of the electrostatic sensor 201.
On the other hand, a face tap is a tap by the user around the ear on which the earphone 10 is worn. B of FIG. 25 shows a situation in which the user U is tapping around the left ear on which the left ear terminal 10L is worn. For example, an area A12 around the left ear is the area in which a face tap can be detected.
FIG. 26 is a diagram showing the distributions of the tap strengths of body taps and face taps.
In FIG. 26, the vertical axis represents frequency, and the horizontal axis represents tap strength. The waveform W31 represents the distribution of the tap strength of body taps, and the waveform W32 represents the distribution of the tap strength of face taps. Here, the tap strength represents the intensity of the vibration detected by the IMU 121 in response to a tap.
Comparing the waveform W31 and the waveform W32, the tap strength of body taps is high, and the tap strength of face taps is low.
Since the tap strength of body taps is high, when strong vibration is detected by the IMU 121, the earphone 10 detects that a body tap has been performed. On the other hand, since the tap strength of face taps is low, when weak vibration is detected by the IMU 121, the earphone 10 detects that a face tap has been performed.
When the user taps the housing of the earphone 10 with a weak force, it is difficult to distinguish a body tap from a face tap based on the sensor data of the IMU 121 alone. Even in this case, if the user is in contact with the electrostatic sensor 201, the earphone 10 can identify the body tap based on the sensor data of the electrostatic sensor 201.
Next, the flow of processing executed by the earphone 10 (information processing unit 211) will be described with reference to the flowchart of FIG. 27.
The processing of FIG. 27 is started, for example, when tap-like vibration is detected by the IMU 121.
In step S501, the tap detection unit 221 acquires acceleration values as sensor data from the IMU 121 and detects a body tap or a face tap based on the sensor data of the IMU 121.
In step S502, the tap detection unit 221 acquires sensor data from the electrostatic sensor 201 and detects a body tap based on the sensor data of the electrostatic sensor 201.
In step S503, the tap determination unit 222 determines whether a tap has been detected by the electrostatic sensor 201.
If it is determined in step S503 that a tap has been detected by the electrostatic sensor 201, the processing proceeds to step S504. For example, when the user's contact is detected by the electrostatic sensor 201, it is determined that a tap has been detected.
In step S504, the tap determination unit 222 determines that a body tap has been performed by the user. The execution unit 223 then executes the function assigned to the body tap.
Note that, after it is determined in step S503 that a tap has been detected by the electrostatic sensor 201, whether a body tap has been detected may further be determined based on the sensor data of the IMU 121.
If it is determined that a body tap has been detected based on the sensor data of the IMU 121, the processing of step S504 is performed. If it is determined that no body tap has been detected based on the sensor data of the IMU 121, it is determined that the electrostatic sensor 201 has detected noise, and the processing ends.
This makes it possible to avoid, for example, judging contact by the user's hair as a body tap when the electrostatic sensor 201 detects such contact.
On the other hand, if it is determined in step S503 that no tap has been detected by the electrostatic sensor 201, the processing proceeds to step S505.
In step S505, the tap determination unit 222 determines whether a body tap has been detected by the IMU 121. For example, when the tap strength measured based on the sensor data of the IMU 121 is higher than a predetermined threshold value, it is determined that a body tap has been detected by the IMU 121.
If it is determined in step S505 that a body tap has been detected by the IMU 121, the processing proceeds to step S504, where, as described above, it is determined that a body tap has been performed by the user and the function assigned to the body tap is executed.
On the other hand, if it is determined in step S505 that no body tap has been detected by the IMU 121, the processing proceeds to step S506.
In step S506, the tap determination unit 222 determines whether a face tap has been detected by the IMU 121. For example, when the tap strength measured based on the sensor data of the IMU 121 is lower than a predetermined threshold value, it is determined that a face tap has been detected by the IMU 121.
If it is determined in step S506 that no face tap has been detected by the IMU 121, the processing ends. For example, when no vibration is detected by the IMU 121, it is determined that no face tap has been detected by the IMU 121.
On the other hand, if it is determined in step S506 that a face tap has been detected by the IMU 121, the processing proceeds to step S507.
In step S507, the tap determination unit 222 determines that a face tap has been performed by the user. The execution unit 223 then executes the function assigned to the face tap.
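The decision flow of FIG. 27 can be summarized in the following sketch, in which a touch on the electrostatic sensor or a strong IMU vibration is classified as a body tap and a weaker IMU vibration as a face tap; the strength thresholds are illustrative.

```python
BODY_TAP_STRENGTH = 3.0  # IMU strength above this -> body tap (illustrative)
FACE_TAP_STRENGTH = 1.0  # IMU strength above this -> face tap (illustrative)

def classify_tap(electrostatic_touch, imu_tap_strength):
    if electrostatic_touch:                     # steps S503 and S504
        return "body tap"
    if imu_tap_strength > BODY_TAP_STRENGTH:    # steps S505 and S504
        return "body tap"
    if imu_tap_strength > FACE_TAP_STRENGTH:    # steps S506 and S507
        return "face tap"
    return None                                 # no tap event

print(classify_tap(True, 0.5))   # body tap via the electrostatic sensor
print(classify_tap(False, 3.4))  # body tap via strong vibration
print(classify_tap(False, 1.6))  # face tap via weak vibration
```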
As described above, in the earphone 10, body taps and face taps are identified based on the sensor data of the electrostatic sensor 201 and the sensor data of the IMU 121.
This enables the earphone 10 to identify a body tap based on the sensor data of the IMU 121 even when the user touches an area other than the area where the electrostatic sensor 201 is installed. The earphone 10 can also identify a body tap based on the sensor data of the electrostatic sensor 201 even when the user taps the area where the electrostatic sensor 201 is installed with only a weak force.
<4. Modification Examples>
In the first embodiment, sound effects such as the operation recognition sound, the function execution sound, and the cancel sound are described as being played; however, other means may be used to feed back to the user information indicating that a tap has been detected, information indicating that a function has been executed, and information indicating that a tap detection has been canceled.
For example, an image representing the above information may be displayed on the screen of an external device such as a smartphone. Vibration or sound representing the information may be output from an external device. A combination of at least some of sound, vibration, and an image representing the information may also be output.
Although the information processing unit 141 in the first embodiment, the information processing unit 161 in the second embodiment, and the information processing unit 211 in the third embodiment have each been described as being provided in the earphone 10, some or all of the components of these information processing units may be provided in a device external to the earphone 10.
For example, the tap determination unit 154 of the information processing unit 141 may be provided in a smartphone connected to the earphone 10 by wireless or wired communication. In this case, whether the event that the one-ear tap detection unit 151 has detected a tap is a tap event is determined on the smartphone, and the determination result is transmitted to the earphone 10.
The series of processing described above can be executed by hardware or by software. When the series of processing is executed by software, the programs constituting the software are installed on a computer incorporated in dedicated hardware, a general-purpose personal computer, or the like.
The installed programs are provided by being recorded on removable media such as optical discs (CD-ROM (Compact Disc-Read Only Memory), DVD (Digital Versatile Disc), and the like) or semiconductor memory, or are provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital broadcasting. The programs can also be installed in advance in the ROM 102 or the storage unit 109 shown in FIGS. 2 and 23.
The programs executed by the computer may be programs whose processing is performed in time series in the order described in this specification, or programs whose processing is performed in parallel or at necessary timings, such as when a call is made.
The effects described in this specification are merely examples and are not limiting, and other effects may be obtained.
The embodiments of the present technology are not limited to the embodiments described above, and various modifications are possible without departing from the gist of the present technology.
For example, the present technology can take a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.
Each step described in the above flowcharts can be executed by one device or shared among a plurality of devices.
Further, when a plurality of processes are included in one step, the plurality of processes included in that one step can be executed by one device or shared among a plurality of devices.
<Examples of Configuration Combinations>
The present technology can also have the following configurations.
(1)
Equipped with two terminals equipped with sensors that detect vibrations including vibrations and noises that represent user operations.
On the other hand, the terminal
A receiving unit that receives the detection result of the first sensor mounted on the other terminal, and a receiving unit.
An information processing device having a determination unit for determining whether or not the detection result by the second sensor mounted on one of the terminals is noise based on the detection result received by the reception unit.
(2)
On the other hand, the terminal further includes an execution unit that executes processing according to the detection result by the second sensor when the determination unit determines that the detection result by the second sensor is not noise. The information processing device according to 1).
(3)
On the other hand, when the determination unit determines that the detection result by the second sensor is noise, the terminal controls to notify that the detection result by the second sensor is noise. The information processing apparatus according to (2) above, further comprising a unit.
(4)
The information processing device according to (3) above, wherein the control unit controls to output at least one notification by sound, vibration, and an image notifying that the detection result by the second sensor is noise. ..
(5)
The control unit has a timing between the timing when the vibration is detected by the second sensor and the timing when the processing is executed by the execution unit, and the detection result by the second sensor is noise. The information processing apparatus according to (3) or (4) above, which controls to notify the fact.
(6)
The information processing device according to any one of (3) to (5) above, wherein the control unit further controls to output a sound at the timing when the vibration is detected by the second sensor.
(7)
The information processing device according to any one of (3) to (6) above, wherein the control unit further controls to output a sound at a timing when the processing is executed by the execution unit.
(8)
The determination unit
The degree of similarity between the detection result by the first sensor and the detection result by the second sensor is calculated.
The information processing apparatus according to any one of (1) to (7) above, which determines whether or not the detection result by the second sensor is noise based on the calculated similarity.
(9)
The information processing device according to (8) above, wherein the determination unit determines that the detection result by the second sensor is noise when the similarity is higher than a predetermined threshold value.
(10)
As the detection result of the first sensor, the receiving unit receives event information indicating that the user's operation is detected according to the sensor data of the first sensor.
The determination unit determines whether or not it is effective to detect the user's operation according to the sensor data of the second sensor based on the event information received by the reception unit (1). ) To (9).
(11)
The receiving unit receives the sensor data of the first sensor as the detection result of the first sensor, and receives the sensor data of the first sensor.
The determination unit is described in any one of (1) to (9) above, which determines whether or not the sensor data of the second sensor is noise based on the sensor data received by the reception unit. Information processing device.
(12)
The information processing device according to any one of (1) to (11), further comprising a sound output unit that outputs a sound corresponding to a sound signal according to the operation of the user.
(13)
An information processing method for an information processing device including two terminals each equipped with a sensor that detects vibration, wherein one of the terminals receives a detection result from the first sensor mounted on the other terminal, and determines, based on the received detection result, whether the detection result from the second sensor mounted on the one terminal is noise.
(14)
An information processing device including: a sensor unit that detects vibration including both vibration representing a user's operation and noise; and a recognition unit that recognizes the user's operation based on a detection result from the sensor unit and a prediction result of the noise detected by the sensor unit.
(15)
The information processing device according to (14), wherein the recognition unit recognizes the user's operation based on a result of comparing the peak interval of the vibration detected by the sensor unit with a noise period predicted from periodic vibration detected in the past.
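A minimal sketch, assuming the periodic noise is something like footsteps, of the comparison described in (15); the median-based period estimate and the 50 ms tolerance are choices made for this illustration only.

```python
import numpy as np

def matches_noise_period(past_peak_times_s, new_peak_time_s: float,
                         tolerance_s: float = 0.05) -> bool:
    """True if the new vibration peak lands where the next peak of the
    periodic noise (e.g., the next footstep) is predicted to land."""
    peaks = np.asarray(past_peak_times_s, dtype=float)
    if peaks.size < 3:
        return False                           # too little history to estimate a period
    period = float(np.median(np.diff(peaks)))  # robust estimate of the noise period
    expected = peaks[-1] + period              # predicted time of the next noise peak
    return abs(new_peak_time_s - expected) < tolerance_s
```

A peak that matches the predicted period would be attributed to the periodic noise, while an off-period peak remains a candidate for a user tap.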
(16)
The information processing device according to (14), wherein the recognition unit corrects the detection result from the sensor unit based on a learning result of the noise detected by the sensor unit, and recognizes the user's operation based on the corrected detection result.
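One simple way to read the "learning result" of (16) is a running estimate of the background vibration level that is subtracted before tap detection. The sketch below is written under that assumption; the class name and the learning rate are invented for illustration.

```python
class LearnedNoiseCorrector:
    """Keeps an exponentially weighted estimate of the background
    vibration level and reports only the part that stands out above it."""

    def __init__(self, alpha: float = 0.01):
        self.alpha = alpha        # learning rate for the noise estimate
        self.noise_level = 0.0    # learned background level

    def update_and_correct(self, magnitude: float) -> float:
        # Learn slowly from everything the sensor sees ...
        self.noise_level += self.alpha * (magnitude - self.noise_level)
        # ... and pass on only the excess over the learned background.
        return max(0.0, magnitude - self.noise_level)
```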
(17)
The information processing device according to (14), further including a sound output unit that outputs a sound based on a sound signal, wherein the recognition unit recognizes the user's operation based on the detection result from the sensor unit and a prediction result of the noise generated when the sound signal is reproduced by the sound output unit.
(18)
The information processing device according to (17), wherein the recognition unit determines, based on the vibration generated when the sound output unit outputs a sound, whether the detection result from the sensor unit is noise, and recognizes the user's operation based on a detection result determined not to be noise.
(19)
The information processing device according to (17), wherein the recognition unit corrects the detection result from the sensor unit based on the vibration generated when the sound output unit outputs a sound, and recognizes the user's operation based on the corrected detection result.
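For (17) through (19), the playback-induced noise can in principle be predicted from the sound signal itself, since the terminal knows what it is reproducing. The sketch below assumes a crude model in which the low-frequency content of the audio frame, scaled by a per-device coupling factor, approximates the vibration the diaphragm injects into the sensor; the factor 0.002 and the cut at one eighth of the bands are placeholders, and the two frames are assumed time-aligned and of equal length.

```python
import numpy as np

def predict_playback_vibration(audio_frame: np.ndarray,
                               coupling: float = 0.002) -> np.ndarray:
    """Predict the vibration the diaphragm injects into the sensor from
    the audio frame currently being reproduced (crude low-pass model)."""
    spectrum = np.fft.rfft(audio_frame)
    spectrum[len(spectrum) // 8:] = 0              # keep only the lowest bands
    low = np.fft.irfft(spectrum, n=len(audio_frame))
    return coupling * np.abs(low)

def corrected_sensor_signal(imu_frame: np.ndarray,
                            audio_frame: np.ndarray) -> np.ndarray:
    # Correction in the style of (19): remove the predicted
    # playback-induced vibration, then run the usual tap detector
    # on what remains.
    return np.clip(imu_frame - predict_playback_vibration(audio_frame), 0.0, None)
```

The gating variant of (18) would instead compare the detection result against the predicted vibration and discard detections that the prediction fully explains.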
(20)
The information processing device according to any one of (14) to (19), including two terminals on which the sensor unit is mounted, or one terminal on which the sensor unit is mounted, wherein the terminal has the recognition unit.
(21)
An information processing device including: a contact sensor that detects contact by the user; a vibration sensor that detects vibration; and a determination unit that determines whether the user has touched a housing based on a detection result from the contact sensor and a detection result from the vibration sensor.
(22)
The information processing device according to (21), wherein the contact sensor is installed in a partial area of the housing.
(23)
The information processing device according to (21) or (22), wherein the determination unit determines that the user has touched the housing when the intensity of the vibration detected by the vibration sensor is higher than a predetermined threshold.
(24)
The information processing device according to any one of (21) to (23), wherein the determination unit further determines, based on the detection result from the contact sensor and the detection result from the vibration sensor, whether the user has touched a part of the user's body to which the housing is attached.
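A hedged sketch of the sensor fusion described in (21) through (24): the electrostatic contact sensor localizes the touch to the housing, while the vibration sensor confirms that a touch actually happened; strong vibration without contact suggests a touch on the body near the attachment point. The labels and the threshold are illustrative assumptions.

```python
def classify_touch(contact_active: bool, vibration_intensity: float,
                   vibration_threshold: float = 1.5) -> str:
    """Fuse the contact sensor with the vibration sensor:
    contact + strong vibration   -> the housing itself was touched;
    strong vibration, no contact -> the wearer touched the body part
                                    on which the housing is worn;
    otherwise                    -> no touch."""
    strong = vibration_intensity > vibration_threshold
    if contact_active and strong:
        return "housing"
    if strong:
        return "body_near_housing"
    return "none"
```

Each label could then be mapped to a different function, in the manner of (25).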
(25)
The information processing device according to (24), further including an execution unit that executes a function corresponding to the determination by the determination unit that the user has touched the housing, or a function corresponding to the determination that the user has touched a part of the user's body to which the housing is attached.
(26)
The information processing device according to any one of (21) to (25), including two terminals each equipped with the contact sensor and the vibration sensor, or one terminal equipped with the contact sensor and the vibration sensor, wherein the terminal has the determination unit, and the housing is a housing of the terminal.
10 earphone, 10L left-ear terminal, 10R right-ear terminal, 106 sound output unit, 107 sensor unit, 121 IMU, 141 information processing unit, 151 one-ear tap detection unit, 152 transmission control unit, 153 reception control unit, 154 tap determination unit, 155 sound control unit, 156 execution unit, 161 information processing unit, 171 tap detection unit, 172 execution unit, 181 acquisition unit, 182 calculation unit, 183 determination unit, 175 correction unit, 191 diaphragm, 201 electrostatic sensor, 211 information processing unit, 221 tap detection unit, 222 tap determination unit, 223 execution unit

Claims (20)

1. An information processing device comprising two terminals each equipped with a sensor that detects vibration including vibration representing a user's operation and noise, wherein one of the terminals has:
   a receiving unit that receives a detection result from the first sensor mounted on the other terminal; and
   a determination unit that determines, based on the detection result received by the receiving unit, whether a detection result from the second sensor mounted on the one terminal is noise.
2. The information processing device according to claim 1, wherein the one terminal further has an execution unit that executes processing corresponding to the detection result from the second sensor when the determination unit determines that the detection result from the second sensor is not noise.
3. The information processing device according to claim 2, wherein the one terminal further has a control unit that, when the determination unit determines that the detection result from the second sensor is noise, performs control for notifying that the detection result from the second sensor is noise.
4. The information processing device according to claim 3, wherein the control unit performs control for outputting at least one of a sound, a vibration, and an image notifying that the detection result from the second sensor is noise.
5. The information processing device according to claim 3, wherein the control unit performs control for notifying that the detection result from the second sensor is noise at a timing between the time when the vibration is detected by the second sensor and the time when the processing is executed by the execution unit.
6. The information processing device according to claim 3, wherein the control unit further performs control for outputting a sound at the timing when the vibration is detected by the second sensor.
7. The information processing device according to claim 3, wherein the control unit further performs control for outputting a sound at the timing when the processing is executed by the execution unit.
8. The information processing device according to claim 1, wherein the determination unit calculates a degree of similarity between the detection result from the first sensor and the detection result from the second sensor, and determines whether the detection result from the second sensor is noise based on the calculated degree of similarity.
9. The information processing device according to claim 8, wherein the determination unit determines that the detection result from the second sensor is noise when the degree of similarity is higher than a predetermined threshold.
10. The information processing device according to claim 1, wherein the receiving unit receives, as the detection result of the first sensor, event information indicating that the user's operation has been detected based on the sensor data of the first sensor, and the determination unit determines, based on the event information received by the receiving unit, whether the detection of the user's operation based on the sensor data of the second sensor is valid.
11. The information processing device according to claim 1, wherein the receiving unit receives, as the detection result of the first sensor, the sensor data of the first sensor, and the determination unit determines, based on the sensor data received by the receiving unit, whether the sensor data of the second sensor is noise.
12. The information processing device according to claim 1, further comprising a sound output unit that outputs a sound corresponding to a sound signal in accordance with the user's operation.
13. An information processing method for an information processing device comprising two terminals each equipped with a sensor that detects vibration, wherein one of the terminals:
   receives a detection result from the first sensor mounted on the other terminal; and
   determines, based on the received detection result, whether a detection result from the second sensor mounted on the one terminal is noise.
14. An information processing device comprising:
   a sensor unit that detects vibration including vibration representing a user's operation and noise; and
   a recognition unit that recognizes the user's operation based on a detection result from the sensor unit and a prediction result of the noise detected by the sensor unit.
15. The information processing device according to claim 14, wherein the recognition unit recognizes the user's operation based on a result of comparing the peak interval of the vibration detected by the sensor unit with a noise period predicted from periodic vibration detected in the past.
16. The information processing device according to claim 14, wherein the recognition unit corrects the detection result from the sensor unit based on a learning result of the noise detected by the sensor unit, and recognizes the user's operation based on the corrected detection result.
17. The information processing device according to claim 14, further comprising a sound output unit that outputs a sound based on a sound signal, wherein the recognition unit recognizes the user's operation based on the detection result from the sensor unit and a prediction result of the noise generated when the sound signal is reproduced by the sound output unit.
18. The information processing device according to claim 17, wherein the recognition unit determines, based on the vibration generated when the sound output unit outputs a sound, whether the detection result from the sensor unit is noise, and recognizes the user's operation based on a detection result determined not to be noise.
19. The information processing device according to claim 17, wherein the recognition unit corrects the detection result from the sensor unit based on the vibration generated when the sound output unit outputs a sound, and recognizes the user's operation based on the corrected detection result.
20. The information processing device according to claim 14, comprising two terminals on which the sensor unit is mounted, or one terminal on which the sensor unit is mounted, wherein the terminal has the recognition unit.
PCT/JP2021/016706 2020-05-11 2021-04-27 Information processing device and information processing method WO2021230067A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020083049 2020-05-11
JP2020-083049 2020-05-11

Publications (1)

Publication Number Publication Date
WO2021230067A1 true WO2021230067A1 (en) 2021-11-18

Family

ID=78525851

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/016706 WO2021230067A1 (en) 2020-05-11 2021-04-27 Information processing device and information processing method

Country Status (1)

Country Link
WO (1) WO2021230067A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012140818A1 * 2011-04-11 2012-10-18 Panasonic Corporation Hearing aid and method of detecting vibration
JP2014153729A * 2013-02-04 2014-08-25 Sharp Corp Input determination device and portable terminal
JP2014209329A * 2013-03-15 2014-11-06 Immersion Corporation Systems and methods for parameter modification of haptic effects
JP2016095776A * 2014-11-17 2016-05-26 Lapis Semiconductor Co., Ltd. Semiconductor device, portable terminal device and operation detection method
JP2016177343A * 2015-03-18 2016-10-06 Toyota InfoTechnology Center Co., Ltd. Signal processing apparatus, input device, signal processing method, and program
JP2018042241A * 2016-09-06 2018-03-15 Apple Inc. Wireless ear bud

Similar Documents

Publication Publication Date Title
US9374647B2 (en) Method and apparatus using head movement for user interface
US10292002B2 (en) Systems and methods for delivery of personalized audio
EP3892009B1 (en) Wearable audio device with head on/off state detection
CN101765035B (en) Music reproducing system and information processing method
EP2775738B1 (en) Orientation free handsfree device
JP5973465B2 (en) Audio processing device
US10206043B2 (en) Method and apparatus for audio pass-through
US20160014539A1 (en) Earphone and sound channel control method thereof
US20160323672A1 (en) Multi-channel speaker output orientation detection
CN111698607B (en) TWS earphone audio output control method, apparatus, device and medium
CN116324969A (en) Hearing enhancement and wearable system with positioning feedback
JP7436564B2 (en) Headphones and headphone status detection method
CN107948792B (en) Left and right sound channel determination method and earphone equipment
WO2021230067A1 (en) Information processing device and information processing method
EP4124065A1 (en) Acoustic reproduction method, program, and acoustic reproduction system
CN113196792A (en) Specific sound detection apparatus, method, and program
CN113393855A (en) Active noise reduction method and device, computer readable storage medium and processor
US9319809B2 (en) Hearing loss compensation apparatus including external microphone
JP6522105B2 (en) Audio signal reproduction apparatus, audio signal reproduction method, program, and recording medium
KR101661106B1 (en) The dangerous situation notification apparatus by using 2-channel sound input-output device standing on the basis headset
JP6194740B2 (en) Audio processing apparatus, audio processing method, and program
CN114647397A (en) Earphone play control method and device, electronic equipment and storage medium
WO2023017622A1 (en) Information processing device, information processing method, and program
US20220408178A1 (en) Method and electronic device for providing ambient sound when user is in danger
CN112585993B (en) Sound signal processing system and sound signal processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21803883

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21803883

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP