EP3913926A1 - Information processing apparatus, wearable device, information processing method, and storage medium - Google Patents
- Publication number
- EP3913926A1 (application number EP20740784.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- acoustic information
- user
- acoustic
- information
- sound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1041—Mechanical or electronic switches, or control elements
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/178—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
- G10K11/1785—Methods, e.g. algorithms; Devices
- G10K11/17857—Geometric disposition, e.g. placement of microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/178—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
- G10K11/1787—General system configurations
- G10K11/17873—General system configurations using a reference signal without an error signal, e.g. pure feedforward
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/178—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
- G10K11/1787—General system configurations
- G10K11/17879—General system configurations using both a reference signal and an error signal
- G10K11/17881—General system configurations using both a reference signal and an error signal the reference signal being an acoustic signal, e.g. recorded with a microphone
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for
- G10K15/04—Sound-producing devices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1008—Earpieces of the supra-aural or circum-aural type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K2210/00—Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
- G10K2210/10—Applications
- G10K2210/108—Communication systems, e.g. where useful sound is kept and noise is cancelled
- G10K2210/1081—Earphones, e.g. for telephones, ear protectors or headsets
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K2210/00—Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
- G10K2210/30—Means
- G10K2210/301—Computational
- G10K2210/3027—Feedforward
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K2210/00—Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
- G10K2210/30—Means
- G10K2210/321—Physical
- G10K2210/3219—Geometry of the configuration
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1016—Earpieces of the intra-aural type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2460/00—Details of hearing devices, i.e. of ear- or headphones covered by H04R1/10 or H04R5/033 but not provided for in any of their subgroups, or of hearing aids covered by H04R25/00 but not provided for in any of its subgroups
- H04R2460/01—Hearing devices using active noise cancellation
Definitions
- The present invention relates to an information processing device, a wearable device, an information processing method, and a storage medium.
- Patent Literature 1 discloses a headphone having a personal authentication function. Patent Literature 1 further discloses, as an example of the personal authentication function, a method for determining a person based on acoustic characteristics inside the ear.
- Acoustic characteristics acquired by a wearable device as described in Patent Literature 1 may change depending on the wearing state. Thus, differences in the wearing states may affect the accuracy of the matching based on acoustic characteristics.
- The present invention intends to provide an information processing device, a wearable device, an information processing method, and a storage medium which can improve the accuracy of biometric matching using acoustic information acquired by the wearable device.
- According to an aspect of the present invention, there is provided an information processing device including: a first acoustic information acquisition unit configured to acquire first acoustic information obtained by receiving, with a wearable device worn by a user, a sound wave emitted from a first sound source; a second acoustic information acquisition unit configured to acquire second acoustic information obtained by receiving, with the wearable device, a sound wave emitted from a second sound source different from the first sound source; and a third acoustic information acquisition unit configured to acquire third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another aspect, there is provided a wearable device including: a first acoustic information acquisition unit configured to acquire first acoustic information obtained by receiving, with the wearable device worn by a user, a sound wave emitted from a first sound source; a second acoustic information acquisition unit configured to acquire second acoustic information obtained by receiving, with the wearable device, a sound wave emitted from a second sound source different from the first sound source; and a third acoustic information acquisition unit configured to acquire third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another aspect, there is provided an information processing method including: acquiring first acoustic information obtained by receiving, with a wearable device worn by a user, a sound wave emitted from a first sound source; acquiring second acoustic information obtained by receiving, with the wearable device, a sound wave emitted from a second sound source different from the first sound source; and acquiring third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another aspect, there is provided a storage medium storing a program that causes a computer to perform: acquiring first acoustic information obtained by receiving, with a wearable device worn by a user, a sound wave emitted from a first sound source; acquiring second acoustic information obtained by receiving, with the wearable device, a sound wave emitted from a second sound source different from the first sound source; and acquiring third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to the present invention, an information processing device, a wearable device, an information processing method, and a storage medium which can improve the accuracy of biometric matching using acoustic information acquired by the wearable device can be provided.
- The information processing system of the present example embodiment performs biometric matching using a wearable device such as an earphone.
- Fig. 1 is a schematic diagram illustrating a general configuration of an information processing system according to the present example embodiment.
- The information processing system is provided with an information communication device 1 and an earphone 2, which may be connected to each other by wireless communication.
- the earphone 2 includes an earphone control device 20, a speaker 26, and a microphone 27.
- the earphone 2 is an acoustic device which can be worn on the ear of the user 3, and is typically a wireless earphone, a wireless headset or the like.
- the speaker 26 functions as a sound wave generation unit which emits a sound wave toward the ear canal of the user 3 when worn, and is arranged on the wearing surface side of the earphone 2.
- the microphone 27 is also arranged on the wearing surface side of the earphone 2 so as to receive sound waves echoed in the ear canal or the like of the user 3 when worn.
- the earphone control device 20 controls the speaker 26 and the microphone 27 and communicates with an information communication device 1.
- In this specification, sound such as sound waves and voices includes inaudible sound whose frequency or sound pressure level is outside the audible range.
- the information communication device 1 is, for example, a computer that is communicatively connected to the earphone 2, and performs a biometric matching based on an acoustic information.
- the information communication device 1 further controls the operation of the earphone 2, transmits audio data for generating sound waves emitted from the earphone 2, and receives audio data acquired from the sound waves received by the earphone 2.
- For example, when the earphone 2 is used as a music player, the information communication device 1 transmits compressed music data to the earphone 2.
- When the earphone 2 is used as a telephone device for business instructions at an event site, a hospital, or the like, the information communication device 1 transmits audio data of the business instructions to the earphone 2.
- In this case, audio data of the utterance of the user 3 may be transmitted from the earphone 2 to the information communication device 1.
- the general configuration is an example, and for example, the information communication device 1 and the earphone 2 may be connected by wire. Further, the information communication device 1 and the earphone 2 may be configured as an integrated device, and further another device may be included in the information processing system.
- Fig. 2 is a block diagram illustrating a hardware configuration example of the earphone control device 20.
- the earphone control device 20 includes a central processing unit (CPU) 201, a random access memory (RAM) 202, a read only memory (ROM) 203, and a flash memory 204.
- The earphone control device 20 also includes a speaker interface (I/F) 205, a microphone I/F 206, a communication I/F 207, and a battery 208. Note that the units of the earphone control device 20 are connected to each other via a bus, wiring, a driving device, or the like (not shown).
- the CPU 201 is a processor that has a function of performing a predetermined calculation according to a program stored in the ROM 203, the flash memory 204, or the like, and also controlling each unit of the earphone control device 20.
- the RAM 202 is composed of a volatile storage medium and provides a temporary memory area required for the operation of the CPU 201.
- the ROM 203 is composed of a non-volatile storage medium and stores necessary information such as a program used for the operation of the earphone control device 20.
- The flash memory 204 is a storage device composed of a non-volatile storage medium and is used for temporarily storing data, storing the operation program of the earphone control device 20, and the like.
- the communication I/F 207 is a communication interface based on standards such as Bluetooth (registered trademark) and Wi-Fi (registered trademark), and is a module for performing communication with the information communication device 1.
- the speaker I/F 205 is an interface for driving the speaker 26.
- the speaker I/F 205 includes a digital-to-analog conversion circuit, an amplifier, or the like.
- the speaker I/F 205 converts the audio data into an analog signal and supplies the analog signal to the speaker 26.
- the speaker 26 emits sound waves based on the audio data.
- the microphone I/F 206 is an interface for acquiring a signal from the microphone 27.
- the microphone I/F 206 includes an analog-to-digital conversion circuit, an amplifier, or the like.
- the microphone I/F 206 converts an analog signal generated by a sound wave received by the microphone 27 into a digital signal.
- the earphone control device 20 acquires audio data based on the received sound waves.
- the battery 208 is, for example, a secondary battery, and supplies electric power required for the operation of the earphone 2.
- the earphone 2 can operate wirelessly without being connected to an external power source by wire.
- the hardware configuration illustrated in Fig. 2 is an example, and devices other than these may be added or some devices may not be provided. Further, some devices may be replaced with another device having similar functions.
- the earphone 2 may further be provided with an input device such as a button so as to be able to receive an operation by the user 3, and further provided with a display device such as a display or an indicator lamp for providing information to the user 3.
- the hardware configuration illustrated in Fig. 2 can be appropriately changed.
- Fig. 3 is a block diagram illustrating a hardware configuration example of the information communication device 1.
- the information communication device 1 includes a CPU 101, a RAM 102, a ROM 103, and a hard disk drive (HDD) 104.
- the information communication device 1 also includes a communication I/F 105, an input device 106, and an output device 107. Note that, each unit of the information communication device 1 is connected to each other via a bus, wiring, a driving device, or the like (not shown).
- In Fig. 3, the units constituting the information communication device 1 are illustrated as an integrated device, but some of these functions may be provided by an external device.
- the input device 106 and the output device 107 may be external devices other than the unit constituting functions of a computer including the CPU 101 or the like.
- the CPU 101 is a processor that has a function of performing a predetermined calculation according to a program stored in the ROM 103, the HDD 104, or the like, and also controlling each unit of the information communication device 1.
- the RAM 102 is composed of a volatile storage medium and provides a temporary memory area required for the operation of the CPU 101.
- the ROM 103 is composed of a non-volatile storage medium and stores necessary information such as a program used for the operation of the information communication device 1.
- The HDD 104 is a storage device composed of a non-volatile storage medium and is used for temporarily storing data sent to and received from the earphone 2, storing the operation program of the information communication device 1, and the like.
- the communication I/F 105 is a communication interface based on standards such as Bluetooth (registered trademark) and Wi-Fi (registered trademark), and is a module for performing communication with the other devices such as the earphone 2.
- the input device 106 is a keyboard, a pointing device, or the like, and is used by the user 3 to operate the information communication device 1.
- Examples of the pointing device include a mouse, a trackball, a touch panel, and a pen tablet.
- the output device 107 is, for example, a display device.
- the display device is a liquid crystal display, an organic light emitting diode (OLED) display, or the like, and is used for displaying information, graphical user interface (GUI) for operation input, or the like.
- the input device 106 and the output device 107 may be integrally formed as a touch panel.
- the hardware configuration illustrated in Fig. 3 is an example, and devices other than these may be added or some devices may not be provided. Further, some devices may be replaced with other devices having similar functions. Further, some of the functions of the present example embodiment may be provided by another device via a network, or the functions of the present example embodiment may be realized by being distributed to a plurality of devices.
- the HDD 104 may be replaced with a solid state drive (SSD) using a semiconductor memory, or may be replaced with a cloud storage.
- Fig. 4 is a functional block diagram of the earphone 2 and the information communication device 1 according to the present example embodiment.
- The information communication device 1 includes a first acoustic information acquisition unit 121, a second acoustic information acquisition unit 122, a third acoustic information acquisition unit 123, and a determination unit 124. Since the configuration of the earphone 2 is the same as that of Fig. 1, a description thereof will be omitted.
- the CPU 101 performs predetermined arithmetic processing by loading programs stored in the ROM 103, the HDD 104 or the like into the RAM 102 and executing them.
- the CPU 101 controls each part of the information communication device 1 such as the communication I/F 105 based on the program.
- The CPU 101 thereby realizes the functions of the first acoustic information acquisition unit 121, the second acoustic information acquisition unit 122, the third acoustic information acquisition unit 123, and the determination unit 124. Details of the specific processing performed by each functional block will be described later.
- the functions of the functional blocks described in the information communication device 1 may be provided in the earphone control device 20 instead of the information communication device 1. That is, the above-described functions may be realized by the information communication device 1, may be realized by the earphone control device 20, or may be realized by cooperation between the information communication device 1 and the earphone control device 20.
- The information communication device 1 and the earphone control device 20 may each be referred to more generally as an information processing device. In the following description, unless otherwise specified, it is assumed that the functional blocks for acquisition and determination of acoustic information are provided in the information communication device 1 as illustrated in Fig. 4.
- Fig. 5 is a flowchart illustrating the biometric matching process performed by the information communication device 1 according to the present example embodiment. The operation of the information communication device 1 will be described with reference to Fig. 5.
- the biometric matching process of Fig. 5 is executed, for example, when the user 3 starts using the earphone 2 by operating the earphone 2.
- the biometric matching process of Fig. 5 may be executed every time a predetermined time elapses when the power of the earphone 2 is turned on.
- In step S101, the first acoustic information acquisition unit 121 instructs the earphone control device 20 to emit an inspection sound.
- The earphone control device 20 transmits an inspection signal to the speaker 26, and the speaker 26 emits an inspection sound generated based on the inspection signal into the ear canal of the user 3.
- the speaker 26 may be referred to as a first sound source more generally.
- the frequency band of the inspection sound at least partially overlaps the frequency band of the voice of the user 3, that is, the frequency band of the audible sound.
- In step S102, the microphone 27 receives the echo sound (ear acoustic sound) in the ear canal and converts it into an electric signal.
- the microphone 27 transmits an electric signal based on the ear acoustic sound to an earphone control device 20, and the earphone control device 20 transmits the signal to the information communication device 1.
- In step S103, the first acoustic information acquisition unit 121 acquires the first acoustic information based on the echo sound in the ear canal.
- the first acoustic information includes a transmission characteristic of the ear canal of the user 3.
- the acquired first acoustic information is stored in the HDD 104.
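The patent leaves the extraction method open; one common way to obtain a transmission characteristic from an emitted inspection signal and its ear-canal echo is frequency-domain deconvolution. The sketch below is illustrative only (the function name, FFT length, and regularization constant are assumptions, not from the patent):

```python
import numpy as np

def ear_canal_transfer(inspection_signal, echo_signal, n_fft=1024, eps=1e-12):
    """Estimate the ear-canal transmission characteristic as the ratio of
    the echo spectrum to the inspection-signal spectrum (a simple
    frequency-domain deconvolution)."""
    x = np.fft.rfft(inspection_signal, n_fft)
    y = np.fft.rfft(echo_signal, n_fft)
    # Magnitude transfer characteristic; eps avoids division by zero.
    return np.abs(y) / (np.abs(x) + eps)
```

In practice the echo would be averaged over several inspection sounds to suppress noise before the division.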
- In step S104, the second acoustic information acquisition unit 122 instructs the earphone control device 20 to prompt the user 3 to speak.
- Specifically, the second acoustic information acquisition unit 122 generates notification information for prompting the user 3 to speak.
- The notification information is, for example, audio information used for controlling the speaker 26 to emit a message such as "Please speak." or "Please say XXX (specific keyword)." through the earphone control device 20. In this way, the user 3 is notified of a message prompting utterance. If the information communication device 1 or the earphone 2 includes a display device that the user 3 can watch, the above message may be displayed on the display device.
- The reason for asking the user to utter a specific keyword is to reduce the influence of differences in frequency characteristics (formants) caused by differences in the words uttered by the user 3.
- The vocal cords, lungs, oral cavity, nasal cavity, and the like of the user 3 serve as the sound source for the acquisition by the second acoustic information acquisition unit 122. Therefore, the sound-emitting organ of the user 3 may be referred to more generally as the second sound source.
- In step S105, the microphone 27 receives the sound based on the voice of the user 3 and converts it into an electric signal.
- the microphone 27 transmits the electric signal based on the voice of a user 3 to the earphone control device 20, and the earphone control device 20 transmits the signal to the information communication device 1.
- In step S106, the second acoustic information acquisition unit 122 acquires the second acoustic information based on the voice of the user 3.
- The second acoustic information includes a transmission characteristic of the voice from the sound-emitting organ of the user 3 to the earphone 2 and a frequency characteristic (voiceprint) of the voice of the user 3.
- the acquired second acoustic information is stored in the HDD 104. Note that, the order of acquisition of the first acoustic information in steps S101 to S103 and the order of acquisition of the second acoustic information in steps S104 to S106 may be reversed, and at least a part of them may be performed in parallel.
- In step S107, the third acoustic information acquisition unit 123 reads the first acoustic information and the second acoustic information from the HDD 104 and generates the third acoustic information based on them.
- This processing may be to subtract or divide the first acoustic information from the second acoustic information.
- this processing may be to subtract or divide the second acoustic information from the first acoustic information.
- the third acoustic information is generated and acquired by subtracting or dividing one of the first acoustic information and the second acoustic information from the other.
- the third acoustic information is used for the biometric matching of the user 3.
- step S108 the determination unit 124 determines whether or not the user 3 is a registrant by matching the third acoustic information including the biological information of the user 3 against the biological information of the registrant previously recorded in the HDD 104. If it is determined that the user 3 is the registrant (YES in step S109), the process proceeds to step S110. If it is determined that the user 3 is not the registrant (NO in step S109), the process proceeds to step S111.
- step S110 the information communication device 1 transmits a control signal indicating that the use of the earphone 2 by the user 3 is permitted to the earphone 2.
- Thus, the earphone 2 enters a state in which use by the user 3 is permitted.
- step S111 the information communication device 1 transmits a control signal indicating that the use of the earphone 2 by the user 3 is not permitted to the earphone 2.
- the non-permission state may be, for example, a state in which no sound is emitted from the speaker 26 of the earphone 2.
- The control in steps S110 and S111 may be performed on the information communication device 1 side instead of on the earphone 2 side.
- For example, the communication connection between the information communication device 1 and the earphone 2 may be maintained or disconnected to switch between the permission state and the non-permission state.
- the inspection sound generated by the speaker 26 in step S101 will be described in more detail with specific examples.
- a signal including a predetermined range of frequency components such as a chirp signal, a maximum length sequence (M-sequence signal), or white noise may be used.
- the frequency range of the inspection sound can be used for the wearing determination.
- Fig. 6 is a graph showing characteristics of the chirp signal.
- Fig. 6 shows the relationship between intensity and time, the relationship between frequency and time, and the relationship between intensity and frequency.
- a chirp signal is a signal whose frequency continuously changes with time.
- Fig. 6 shows an example of a chirp signal in which the frequency increases linearly with time.
- Fig. 7 is a graph showing characteristics of an M-sequence signal or white noise. Since the M-sequence signal generates a pseudo noise close to white noise, the characteristics of the M-sequence signal and the white noise are substantially the same.
- Fig. 7, like Fig. 6, shows the relationship between intensity and time, the relationship between frequency and time, and the relationship between intensity and frequency. As shown in Fig. 7, the M-sequence signal or white noise is a signal that evenly includes frequency components over a wide range.
- The chirp signal, the M-sequence signal, or the white noise has a frequency characteristic covering a wide range. Therefore, by using these signals as inspection sounds, it is possible to obtain an echo sound including a wide range of frequency components in subsequent step S102.
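As an illustration, a linear chirp of the kind plotted in Fig. 6 can be synthesized as follows. This is a minimal sketch: the sweep range, duration, and sample rate are arbitrary illustrative values, not parameters taken from the embodiment.

```python
import math

def linear_chirp(f0, f1, duration, sample_rate):
    """Synthesize a chirp whose frequency rises linearly from f0 to f1 [Hz]."""
    n = int(duration * sample_rate)
    # Instantaneous phase of a linear sweep: 2*pi*(f0*t + (f1 - f0)*t^2 / (2*duration)),
    # so the instantaneous frequency f0 + (f1 - f0)*t/duration increases linearly with time.
    return [math.sin(2 * math.pi * (f0 * t + (f1 - f0) * t * t / (2 * duration)))
            for t in (i / sample_rate for i in range(n))]

# Hypothetical inspection sound: 100 Hz to 8 kHz over 0.5 s at 44.1 kHz.
inspection_sound = linear_chirp(100.0, 8000.0, 0.5, 44100)
```

An M-sequence or white-noise inspection sound could be substituted here; only the generator function would change.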
- the process of generating the third acoustic information in step S107 will be described in more detail with a specific example.
- the signal acquired by the first acoustic information acquisition unit 121 (first acoustic information) can be expressed by the following Equation (1).
- X is a function representing the frequency spectrum of the inspection sound emitted from the speaker 26 to the ear canal.
- Y si,wj is a function representing the frequency spectrum of the echo sound obtained by the microphone 27.
- These frequency spectra are obtained, for example, by converting input/output signals in time sequences into frequency domains by Fourier transformation.
- C si is a function of the frequency domain representing a transmission characteristic of the i-th user's ear acoustic sound. Since the shape of the ear canal is unique to each person, C si is a function different from one user to another. In other words, C si is biological information that may be used to identify a person.
- G wj is a function of the frequency domain representing a change in the transmission characteristic due to a difference in the wearing state. Since G wj changes to a different function each time the earphone 2 is worn again, it may act as noise with respect to C si.
- the echo sound obtained by the microphone 27 includes a mixture of the transmission characteristics of the ear acoustic sound and changes in the transmission characteristics depending on the wearing state, and in Equation (1), these can be separated into the form of the product of C si and G wj .
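The conversion from a time sequence to a frequency spectrum mentioned above can be sketched with a naive discrete Fourier transform. In practice an FFT library would be used; this is only an illustration of the conversion itself.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT: per-bin magnitudes of a real-valued time-domain frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n)]

# A pure tone occupying exactly one DFT bin (period 8 samples, frame length 8).
tone = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = magnitude_spectrum(tone)
```

Applying such a transform to the inspection sound and to the echo sound yields the functions X and Y si,wj of Equation (1).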
- the signal acquired by the second acoustic information acquisition unit 122 (second acoustic information) can be expressed by the following Equation (2) .
- U nk,t is a function indicating the frequency spectrum of the voice emitted by the user 3.
- V si,wj,nk,t is a function representing the frequency spectrum of the sound acquired by the microphone 27.
- These frequency spectra are obtained, for example, by converting input/output signals in time sequences into frequency domains by Fourier transformation.
- t (where t is a real number) is an argument indicating the time.
- D si is a function of the frequency domain indicating the transmission characteristic of the i-th user's voice. Since the voiceprint is unique to the user, U nk,t is a function that differs from one user to another. Since the transmission characteristic of the voice depends on the shape of the user's skull, oral cavity, or the like, D si is also a function different from one user to another.
- In Equation (2), G wj is common with Equation (1). This is because, when the user 3 wears the earphone 2 and then acquires the first acoustic information and the second acoustic information in the same wearing state without putting on and taking off the earphone 2, G wj indicating the wearing state remains the same.
- Next, an operation of taking the logarithm of both sides of Equation (1) and Equation (2) is performed. These equations are converted into the following Equation (3) and Equation (4), respectively.
- In Equation (3) and Equation (4), the base of the logarithm is omitted; any base may be used.
- log Y s1,w1 = log G w1 + log C s1 + log X ... (3)
- log V s1,w1,n1,t = log G w1 + log D s1 + log U n1,t ... (4)
- When Equation (3) is subtracted from Equation (4) and the known term log X is added to both sides, the term log G w1 common to both equations is canceled, and the following Equation (5) is obtained.
- log V s1,w1,n1,t − log Y s1,w1 + log X = log D s1 + log U n1,t − log C s1 ... (5)
- The terms on the left side are observation signals acquired by the microphone 27 or known signals.
- the terms of the right side are different functions depending on a user and may be used as biological information. Since the right side is equal to the left side from Equation (5), biological information can be calculated from the observation signal. In this way, the left side of Equation (5) is calculated from the first acoustic information represented by Equation (1) and the second acoustic information represented by Equation (2) by the above-described calculation, and can be used as the third acoustic information for the biometric matching.
- The third acoustic information does not include the term G wj indicating the effect of the difference in the wearing state, and is therefore robust against noise caused by the wearing state. Accordingly, using the third acoustic information for the biometric matching improves the accuracy of the biometric matching.
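The cancellation of the wearing-state term can be checked numerically. The sketch below uses arbitrary illustrative per-bin magnitudes for G, C, X, D, and U (not values from the embodiment) and confirms that the log-difference of the two measurements no longer depends on G.

```python
import math

# Hypothetical per-frequency-bin magnitudes (illustrative values only).
G = [1.2, 0.8, 1.5]   # wearing-state characteristic, common to both measurements
C = [2.0, 3.0, 1.0]   # ear acoustic (ear canal) characteristic of the user
X = [1.0, 2.0, 0.5]   # known inspection-sound spectrum
D = [0.5, 2.5, 4.0]   # transmission characteristic of the user's voice
U = [3.0, 1.0, 2.0]   # spectrum of the uttered keyword

Y = [g * c * x for g, c, x in zip(G, C, X)]   # Equation (1): Y = G * C * X
V = [g * d * u for g, d, u in zip(G, D, U)]   # Equation (2): V = G * D * U

# Taking logs and subtracting cancels log G; adding the known log X leaves
# only the user-dependent terms log D + log U - log C.
third = [math.log(v) - math.log(y) + math.log(x) for v, y, x in zip(V, Y, X)]
```

Recomputing `third` with any other choice of `G` gives the same result, which is exactly the robustness against the wearing state described above.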
- the frequency band of the first acoustic information and the frequency band of the second acoustic information at least partially overlap.
- the first acoustic information includes the frequency band of the audible sound included in the voice of the user 3.
- the determination process in step S108 will be described in detail with reference to a specific example using a feature amount extraction technique.
- The determination unit 124 calculates a feature amount by a predetermined algorithm from frequency characteristics included in the third acoustic information. Thereafter, the determination unit 124 compares the feature amount of the third acoustic information with the feature amount of the registrant extracted by a similar technique to calculate a matching score indicating the similarity between the feature amounts. When there is a plurality of registrants, the same processing is performed for each of the plurality of registrants.
- the determination unit 124 determines whether or not the user 3 is a registrant based on whether or not the matching score exceeds a predetermined threshold. When there is a plurality of registrants, if the matching score exceeds a predetermined threshold for any one of the plurality of registrants, it is determined that the user 3 is a registrant.
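One common way to realize such a score-and-threshold decision is a cosine similarity between feature vectors. The sketch below is an assumption: the patent does not specify the feature-extraction algorithm, the score function, or the threshold value.

```python
import math

def cosine_score(a, b):
    """Matching score: cosine similarity of two feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_registrant(probe, enrolled_features, threshold=0.9):
    """True if the probe exceeds the threshold against any enrolled registrant."""
    return any(cosine_score(probe, f) > threshold for f in enrolled_features)
```

With multiple registrants, this mirrors the description above: the score is computed per registrant, and a single score above the threshold suffices.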
- The accuracy of the biometric matching is improved by generating the third acoustic information used for the biometric matching from the first acoustic information and the second acoustic information, which are based on sound sources different from each other. Therefore, an information processing device capable of improving the accuracy of a biometric matching using acoustic information acquired by a wearable device is provided.
- the information processing system of the present example embodiment is different from the first example embodiment in the content of a process for determining whether or not a user is a registrant.
- differences from the first example embodiment will be mainly described, and the description of the common parts will be omitted or simplified.
- Fig. 8 is a functional block diagram of the earphone 2 and the information communication device 1 according to the present example embodiment.
- the present example embodiment differs from the first example embodiment in that the determination unit 124 further uses not only the third acoustic information acquired by the third acoustic information acquisition unit 123 but also the first acoustic information acquired by the first acoustic information acquisition unit 121 to make a determination.
- Fig. 9 is a flowchart illustrating a biometric matching process according to the present example embodiment performed by the information communication device 1. Since the only difference from the first example embodiment is that step S108 is replaced with step S112, only step S112 will be described here.
- step S112 the determination unit 124 determines whether or not the user 3 is the registrant by matching information obtained by integrating the first acoustic information and the third acoustic information against biological information of the registrant previously recorded in the HDD 104. If it is determined that the user 3 is the registrant (YES in step S109), the process proceeds to step S110. If it is determined that the user 3 is not the registrant (NO in step S109), the process proceeds to step S111.
- the integration of the first acoustic information and the third acoustic information in step S112 will be described in more detail.
- the first acoustic information is information mainly based on the ear acoustic sound of the user 3
- The third acoustic information is information obtained by arithmetically processing the ear acoustic sound of the user 3 and the sound of the voice together. Therefore, the first acoustic information and the third acoustic information include different pieces of biological information.
- performing a two-factor matching using two different kinds of biological information improves the accuracy of the matching. Therefore, in the present example embodiment, the first acoustic information and the third acoustic information are integrated in step S112, and the two-factor matching is performed by using the integrated result.
- the matching accuracy can be further improved.
- a specific example of integration of acoustic information will be described. Assume a case in which, as described in the first example embodiment, a technique is used in the matching in the determination unit 124 for extracting feature amounts from the acoustic information and calculating a matching score indicating the similarity of the feature amounts. In this case, the determination unit 124 calculates a first matching score based on the first acoustic information and a second matching score based on the third acoustic information. Thereafter, the determination unit 124 calculates a third matching score obtained by combining the first matching score and the second matching score by addition, averaging, linear combination, multiplication, or the like.
- the determination unit 124 determines whether or not the user 3 is a registrant based on whether or not the third matching score exceeds a predetermined threshold.
- Alternatively, the first matching based on the first acoustic information and the second matching based on the third acoustic information may be performed, and then the logical product or logical sum of the result of the first matching and the result of the second matching may be used as the final matching result to perform the determination.
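The score combination and the logical-product/logical-sum alternatives described above can be sketched as follows. The weights and thresholds are illustrative assumptions; the patent leaves them unspecified.

```python
def fused_score(s1, s2, w1=0.5, w2=0.5):
    """Linear combination of two matching scores; equal weights reduce to averaging."""
    return w1 * s1 + w2 * s2

def two_factor_decision(s1, s2, t1, t2, mode="and"):
    """Combine the two individual matching results by logical product ('and')
    or logical sum ('or')."""
    m1, m2 = s1 > t1, s2 > t2
    return (m1 and m2) if mode == "and" else (m1 or m2)
```

A multiplicative fusion (`s1 * s2`) would fit the same pattern; only the combining expression changes.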
- A matching result indicating that the determination is impossible may be output.
- The acoustic information used in addition to the third acoustic information as described above is preferably the first acoustic information, which mainly includes information based on the ear acoustic sound of the user 3.
- the second acoustic information and the third acoustic information may be integrated by using the second acoustic information instead of the first acoustic information.
- the first acoustic information, the second acoustic information, and the third acoustic information may be integrated.
- the matching accuracy is further improved.
- the information processing system of the present example embodiment is different from the first example embodiment in that it has a function of a noise cancellation.
- The differences from the first example embodiment will be mainly described, and the description of the common parts will be omitted or simplified.
- Fig. 10 is a schematic diagram illustrating the general configuration of an information processing system according to the present example embodiment.
- An earphone 2 includes a plurality of microphones 27 and 28 arranged at positions different from each other.
- the microphone 28 is controlled by an earphone control device 20.
- The microphone 28 is arranged on the back side opposite to the wearing surface of the earphone 2 so as to receive sound waves from the outside when the earphone 2 is worn.
- the external sound may be noise. Therefore, in the present example embodiment, a plurality of microphones 27 and 28 are arranged in the earphone 2, and the earphone 2 has a function of reducing the influence of external environmental sound by performing noise cancellation using the sound acquired by the microphone 28.
- Here, the environmental sound includes not only the sound generated outside the user 3 but also sounds that may act as noise in the matching, such as the portion of the sound emitted by the user 3 or the speaker 26 that echoes outside.
- the microphone 27 and the microphone 28 are sometimes referred to as a first microphone and a second microphone, respectively.
- Noise cancellation will be described.
- the environmental sound is incident on the microphones 27 and 28 with almost the same phase. Therefore, at least a part of the environmental sound can be canceled by superimposing the sound obtained by the microphone 27 on the sound obtained by the microphone 28 in reversed phase. Since the intensity of the sound received by the microphones 27 and 28 may be different from each other due to attenuation by the housing of the earphone 2 or the like, one or both sounds may be amplified or attenuated before being superimposed.
- The processing for superimposing the sound in the reversed phase may be signal processing performed digitally on the obtained audio data, or may be performed as an analog process by causing the speaker 26 to emit a sound in the phase reversed from that of the sound obtained by the microphone 28.
- Fig. 11 is a functional block diagram of the information processing device 4 according to the fourth example embodiment.
- the information processing device 4 includes a first acoustic information acquisition unit 421, a second acoustic information acquisition unit 422, and a third acoustic information acquisition unit 423.
- the first acoustic information acquisition unit 421 acquires a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user.
- the second acoustic information acquisition unit 422 acquires a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device.
- the third acoustic information acquisition unit 423 acquires a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to the present example embodiment, there is provided the information processing device 4 capable of improving the accuracy of the biometric matching using the acoustic information acquired by the wearable device.
- the present invention is not limited to the example embodiments described above, and may be suitably modified within the scope of the present invention.
- an example in which a part of the configuration of one embodiment is added to another embodiment or an example in which a part of the configuration of another embodiment is replaced is also an example embodiment of the present invention.
- In the example embodiments described above, the earphone 2 is exemplified as an example of a wearable device; however, the present invention is not limited to a device worn on the ear as long as acoustic information necessary for processing can be acquired.
- the wearable device may be a bone conduction type acoustic device.
- the second acoustic information is obtained by receiving the voice emitted by the user 3, but the present invention is not limited thereto. That is, the second sound source for generating the sound wave for acquiring the second acoustic information may be other than the voice emitting organ of the user 3. For example, when a second speaker different from the speaker 26 is separately provided in the earphone 2 or another device, the second speaker may be a second sound source.
- For example, the earphone of the right ear may be an earphone 2 having the function of the ear acoustic sound matching described in the first to third example embodiments, and the earphone of the left ear may be an earphone having the second speaker described above.
- the same processing as the first to third example embodiment can be performed.
- When the voice is used, the usable frequency is limited to the range of voices that a human can emit; in this example, since the voice is not used, there is no such restriction, and it is possible to use a non-audible sound in a frequency band such as an ultrasonic band, for example.
- Each of the example embodiments also includes a processing method that stores, in a storage medium, a program that causes the configuration of each of the example embodiments to operate so as to implement the function of each of the example embodiments described above, reads the program stored in the storage medium as a code, and executes the program in a computer. That is, the scope of each of the example embodiments also includes a computer readable storage medium. Further, each of the example embodiments includes not only the storage medium in which the computer program described above is stored but also the computer program itself. Further, one or two or more components included in the example embodiments described above may be a circuit such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like configured to implement the function of each component.
- As the storage medium, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, a compact disk (CD)-ROM, a magnetic tape, a nonvolatile memory card, or a ROM can be used.
- The scope of each of the example embodiments includes not only an example that performs a process by an individual program stored in the storage medium but also an example that operates on an operating system (OS) to perform a process in cooperation with another software or a function of an add-in board.
- a service implemented by the function of each of the example embodiments described above may be provided to a user in a form of software as a service (SaaS).
- An information processing device comprising:
- the information processing device according to supplementary note 1, wherein the first acoustic information includes a transmission characteristic of an ear canal of the user.
- The information processing device according to supplementary note 1 or 2, wherein the first sound source is a speaker provided in the wearable device.
- the information processing device according to any one of supplementary notes 1 to 3, wherein the wearable device is an earphone worn on an ear of the user.
- the information processing device according to any one of supplementary notes 1 to 4, wherein the second acoustic information includes a transmission characteristic of a voice emitted by the user.
- the information processing device according to any one of supplementary notes 1 to 5, wherein the second sound source is a sound emitting organ of the user.
- the information processing device according to any one of supplementary notes 1 to 7, wherein the third acoustic information acquisition unit generates and acquires the third acoustic information by subtracting or dividing one of the first acoustic information and the second acoustic information from the other.
- the information processing device according to any one of supplementary notes 1 to 8, wherein a frequency band of the first acoustic information and a frequency band of the second acoustic information at least partially overlap.
- the information processing device according to any one of supplementary notes 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information.
- the information processing device according to any one of supplementary notes 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information and at least one of the first acoustic information and the second acoustic information.
- a wearable device comprising:
- An information processing method comprising:
- a storage medium storing a program that causes a computer to perform:
Abstract
Description
- The present invention relates to an information processing device, a wearable device, an information processing method, and a storage medium.
- Patent Literature 1 discloses a headphone having a personal authentication function. Patent Literature 1 further discloses, as an example of the personal authentication function, a method for determining a person based on acoustic characteristics inside the ear.
- PTL 1: Japanese Patent Application Laid-open No. 2004-65363
- Acoustic characteristics acquired by a wearable device as described in Patent Literature 1 may change depending on the wearing state. Thus, differences in the wearing states may affect the accuracy of the matching based on acoustic characteristics.
- The present invention intends to provide an information processing device, a wearable device, an information processing method, and a storage medium which can improve the accuracy of a biometric matching using acoustic information acquired by the wearable device.
- According to one example aspect of the invention, provided is an information processing device including a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user, a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device and a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another example aspect of the invention, provided is a wearable device including a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by the wearable device worn by a user, a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device, and a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another example aspect of the invention, provided is an information processing method including acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by the wearable device worn by a user, acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device, and acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to another example aspect of the invention, provided is a storage medium storing a program that causes a computer to perform acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by the wearable device worn by a user, acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device, and acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- According to the present invention, an information processing device, a wearable device, an information processing method, and a storage medium which can improve the accuracy of a biometric matching using acoustic information acquired by the wearable device can be provided.
- [Fig. 1] Fig. 1 is a schematic diagram illustrating a general configuration of an information processing system according to a first example embodiment.
- [Fig. 2] Fig. 2 is a block diagram illustrating a hardware configuration of an earphone according to the first example embodiment.
- [Fig. 3] Fig. 3 is a block diagram illustrating a hardware configuration of an information communication device according to the first example embodiment.
- [Fig. 4] Fig. 4 is a functional block diagram of an earphone and an information communication device according to the first example embodiment.
- [Fig. 5] Fig. 5 is a flowchart illustrating a wearing determination process performed by the information communication device according to the first example embodiment.
- [Fig. 6] Fig. 6 is a graph showing a characteristic of a chirp signal.
- [Fig. 7] Fig. 7 is a graph showing a characteristic of an M-sequence signal or a white noise.
- [Fig. 8] Fig. 8 is a functional block diagram of an earphone and an information communication device according to a second example embodiment.
- [Fig. 9] Fig. 9 is a flowchart illustrating a wearing determination process performed by the information communication device according to the second example embodiment.
- [Fig. 10] Fig. 10 is a schematic diagram illustrating a general configuration of an information processing system according to a third example embodiment.
- [Fig. 11] Fig. 11 is a functional block diagram of an information communication device according to a fourth example embodiment.
- Exemplary example embodiments of the present invention will be described below with reference to the drawings. Throughout the drawings, the same components or corresponding components are labeled with the same references, and the description thereof may be omitted or simplified.
- An information processing system according to the present example embodiment will be described. The information processing system of the present example embodiment is a system for performing a biometric matching by a wearable device such as an earphone.
-
Fig. 1 is a schematic diagram illustrating a general configuration of an information processing system according to the present example embodiment. The information processing system is provided with aninformation communication device 1 and anearphone 2 which may be connected to each other by wireless communication. - The
earphone 2 includes anearphone control device 20, aspeaker 26, and amicrophone 27. Theearphone 2 is an acoustic device which can be worn on the ear of theuser 3, and is typically a wireless earphone, a wireless headset or the like. Thespeaker 26 functions as a sound wave generation unit which emits a sound wave toward the ear canal of theuser 3 when worn, and is arranged on the wearing surface side of theearphone 2. Themicrophone 27 is also arranged on the wearing surface side of theearphone 2 so as to receive sound waves echoed in the ear canal or the like of theuser 3 when worn. Theearphone control device 20 controls thespeaker 26 and themicrophone 27 and communicates with aninformation communication device 1. - Note that, in the present specification, "sound" such as sound waves and voices includes inaudible sounds whose frequency or sound pressure level is outside the audible range.
- The
information communication device 1 is, for example, a computer that is communicatively connected to theearphone 2, and performs a biometric matching based on an acoustic information. Theinformation communication device 1 further controls the operation of theearphone 2, transmits audio data for generating sound waves emitted from theearphone 2, and receives audio data acquired from the sound waves received by theearphone 2. As a specific example, when theuser 3 listens to music using theearphone 2, theinformation communication device 1 transmits compressed data of music to theearphone 2. When theearphone 2 is a telephone device for business command at an event site, a hospital or the like, theinformation communication device 1 transmits audio data of the business instruction to theearphone 2. In this case, the audio data of the utterance of theuser 3 may be transmitted from theearphone 2 to theinformation communication device 1. - Note that, the general configuration is an example, and for example, the
information communication device 1 and the earphone 2 may be connected by wire. Further, the information communication device 1 and the earphone 2 may be configured as an integrated device, and another device may further be included in the information processing system. -
Fig. 2 is a block diagram illustrating a hardware configuration example of the earphone control device 20. The earphone control device 20 includes a central processing unit (CPU) 201, a random access memory (RAM) 202, a read only memory (ROM) 203, and a flash memory 204. The earphone control device 20 also includes a speaker interface (I/F) 205, a microphone I/F 206, a communication I/F 207, and a battery 208. Note that the units of the earphone control device 20 are connected to each other via a bus, wiring, a driving device, or the like (not shown). - The
CPU 201 is a processor that has a function of performing predetermined calculations according to a program stored in the ROM 203, the flash memory 204, or the like, and also controlling each unit of the earphone control device 20. The RAM 202 is composed of a volatile storage medium and provides a temporary memory area required for the operation of the CPU 201. The ROM 203 is composed of a non-volatile storage medium and stores necessary information such as a program used for the operation of the earphone control device 20. The flash memory 204 is a storage device composed of a non-volatile storage medium, used for temporarily storing data, storing an operation program of the earphone control device 20, or the like. - The communication I/
F 207 is a communication interface based on standards such as Bluetooth (registered trademark) and Wi-Fi (registered trademark), and is a module for communicating with the information communication device 1. - The speaker I/
F 205 is an interface for driving the speaker 26. The speaker I/F 205 includes a digital-to-analog conversion circuit, an amplifier, or the like. The speaker I/F 205 converts audio data into an analog signal and supplies the analog signal to the speaker 26. Thus, the speaker 26 emits sound waves based on the audio data. - The microphone I/
F 206 is an interface for acquiring a signal from the microphone 27. The microphone I/F 206 includes an analog-to-digital conversion circuit, an amplifier, or the like. The microphone I/F 206 converts an analog signal generated by a sound wave received by the microphone 27 into a digital signal. Thus, the earphone control device 20 acquires audio data based on the received sound waves. - The
battery 208 is, for example, a secondary battery, and supplies electric power required for the operation of the earphone 2. Thus, the earphone 2 can operate wirelessly without being connected to an external power source by wire. - Note that the hardware configuration illustrated in
Fig. 2 is an example; devices other than these may be added, or some devices may not be provided. Further, some devices may be replaced with other devices having similar functions. For example, the earphone 2 may further be provided with an input device such as a button so as to be able to receive an operation by the user 3, and may further be provided with a display device such as a display or an indicator lamp for providing information to the user 3. Thus, the hardware configuration illustrated in Fig. 2 can be changed as appropriate. -
Fig. 3 is a block diagram illustrating a hardware configuration example of the information communication device 1. The information communication device 1 includes a CPU 101, a RAM 102, a ROM 103, and a hard disk drive (HDD) 104. The information communication device 1 also includes a communication I/F 105, an input device 106, and an output device 107. Note that the units of the information communication device 1 are connected to each other via a bus, wiring, a driving device, or the like (not shown). - In
Fig. 3, each unit constituting the information communication device 1 is illustrated as an integrated device, but some of these functions may be provided by an external device. For example, the input device 106 and the output device 107 may be external devices other than the units constituting the functions of a computer including the CPU 101 or the like. - The
CPU 101 is a processor that has a function of performing predetermined calculations according to a program stored in the ROM 103, the HDD 104, or the like, and also controlling each unit of the information communication device 1. The RAM 102 is composed of a volatile storage medium and provides a temporary memory area required for the operation of the CPU 101. The ROM 103 is composed of a non-volatile storage medium and stores necessary information such as a program used for the operation of the information communication device 1. The HDD 104 is a storage device composed of a non-volatile storage medium, used for temporarily storing data sent to and received from the earphone 2, storing an operation program of the information communication device 1, or the like. - The communication I/
F 105 is a communication interface based on standards such as Bluetooth (registered trademark) and Wi-Fi (registered trademark), and is a module for communicating with other devices such as the earphone 2. - The
input device 106 is a keyboard, a pointing device, or the like, and is used by the user 3 to operate the information communication device 1. Examples of the pointing device include a mouse, a trackball, a touch panel, and a pen tablet. - The
output device 107 is, for example, a display device. The display device is a liquid crystal display, an organic light emitting diode (OLED) display, or the like, and is used for displaying information, a graphical user interface (GUI) for operation input, or the like. The input device 106 and the output device 107 may be integrally formed as a touch panel. - Note that, the hardware configuration illustrated in
Fig. 3 is an example; devices other than these may be added, or some devices may not be provided. Further, some devices may be replaced with other devices having similar functions. Further, some of the functions of the present example embodiment may be provided by another device via a network, or the functions of the present example embodiment may be realized by being distributed across a plurality of devices. For example, the HDD 104 may be replaced with a solid state drive (SSD) using a semiconductor memory, or may be replaced with cloud storage. Thus, the hardware configuration illustrated in Fig. 3 can be changed as appropriate. -
Fig. 4 is a functional block diagram of the earphone 2 and the information communication device 1 according to the present example embodiment. The information communication device 1 includes a first acoustic information acquisition unit 121, a second acoustic information acquisition unit 122, a third acoustic information acquisition unit 123, and a determination unit 124. Since the configuration of the earphone 2 is the same as that of Fig. 1, a description thereof will be omitted. - The
CPU 101 performs predetermined arithmetic processing by loading programs stored in the ROM 103, the HDD 104, or the like into the RAM 102 and executing them. The CPU 101 controls each part of the information communication device 1, such as the communication I/F 105, based on the program. Thus, the CPU 101 realizes the functions of the first acoustic information acquisition unit 121, the second acoustic information acquisition unit 122, the third acoustic information acquisition unit 123, and the determination unit 124. Details of the specific processing performed by each functional block will be described later. - Note that, in
Fig. 4, some or all of the functions of the functional blocks described for the information communication device 1 may be provided in the earphone control device 20 instead of the information communication device 1. That is, the above-described functions may be realized by the information communication device 1, may be realized by the earphone control device 20, or may be realized by cooperation between the information communication device 1 and the earphone control device 20. The information communication device 1 and the earphone control device 20 may each be referred to more generally as an information processing device. In the following description, unless otherwise specified, it is assumed that each of the functional blocks for acquisition and determination of acoustic information is provided in the information communication device 1 as illustrated in Fig. 4. -
Fig. 5 is a flowchart illustrating a biometric matching process performed by the information communication device 1 according to the present example embodiment. The operation of the information communication device 1 will be described with reference to Fig. 5. - The biometric matching process of
Fig. 5 is executed, for example, when the user 3 starts using the earphone 2 by operating the earphone 2. Alternatively, the biometric matching process of Fig. 5 may be executed every time a predetermined time elapses while the power of the earphone 2 is turned on. - In step S101, the first acoustic
information acquisition unit 121 instructs the earphone control device 20 to emit an inspection sound. The earphone control device 20 transmits an inspection signal to the speaker 26, and the speaker 26 emits an inspection sound generated based on the inspection signal toward the ear canal of the user 3. The speaker 26 may be referred to more generally as a first sound source. - In the processing to be described later, since arithmetic processing is performed between the acoustic information based on the echo sound of the inspection sound and the acoustic information based on the voice of the
user 3, the frequency band of the inspection sound at least partially overlaps the frequency band of the voice of the user 3, that is, the frequency band of audible sound. - In step S102, the
microphone 27 receives the echo sound (ear acoustic sound) in the ear canal and converts it into an electric signal. The microphone 27 transmits the electric signal based on the ear acoustic sound to the earphone control device 20, and the earphone control device 20 transmits the signal to the information communication device 1. - In step S103, the first acoustic
information acquisition unit 121 acquires first acoustic information based on the echo sound in the ear canal. The first acoustic information includes a transmission characteristic of the ear canal of the user 3. The acquired first acoustic information is stored in the HDD 104. - In step S104, the second acoustic
information acquisition unit 122 instructs the earphone control device 20 to urge the user 3 to speak. An example of processing for urging the user 3 to speak will be described. The second acoustic information acquisition unit 122 generates notification information to urge the user 3 to speak. The notification information is, for example, audio information used for controlling the speaker 26, through the earphone control device 20, to emit a message voice such as "Please speak." or "Please say XXX (specific keyword)." In this way, the user 3 is notified of the message urging utterance. If the information communication device 1 or the earphone 2 includes a display device that the user 3 can watch, the above message may be displayed on the display device. The reason for notifying the user to utter a specific keyword is to reduce the influence of differences in frequency characteristics (formants) due to differences in the words uttered by the user 3. - In other words, the vocal cords, lungs, oral cavity, nasal cavity, or the like of the
user 3 serve as a sound source for the acquisition by the second acoustic information acquisition unit 122. Therefore, the sound emitting organ of the user 3 may be referred to more generally as the second sound source. - In step S105, the
microphone 27 receives the sound based on the voice of the user 3 and converts it into an electric signal. The microphone 27 transmits the electric signal based on the voice of the user 3 to the earphone control device 20, and the earphone control device 20 transmits the signal to the information communication device 1. - In step S106, the second acoustic
information acquisition unit 122 acquires second acoustic information based on the voice of the user 3. The second acoustic information includes a transmission characteristic of the voice from the sound emitting organ of the user 3 to the earphone 2 and a frequency characteristic (voiceprint) of the voice of the user 3. The acquired second acoustic information is stored in the HDD 104. Note that the acquisition of the first acoustic information in steps S101 to S103 and the acquisition of the second acoustic information in steps S104 to S106 may be performed in reverse order, and at least parts of them may be performed in parallel. - In step S107, the third acoustic
information acquisition unit 123 reads the first acoustic information and the second acoustic information from the HDD 104, and generates the third acoustic information based on them. This processing may subtract one of the first acoustic information and the second acoustic information from the other, or divide one by the other. In other words, the third acoustic information is generated and acquired by subtracting or dividing one of the first acoustic information and the second acoustic information from or by the other. The third acoustic information is used for the biometric matching of the user 3. - In step S108, the
determination unit 124 determines whether or not the user 3 is a registrant by matching the third acoustic information, which includes the biological information of the user 3, against the biological information of the registrant previously recorded in the HDD 104. If it is determined that the user 3 is the registrant (YES in step S109), the process proceeds to step S110. If it is determined that the user 3 is not the registrant (NO in step S109), the process proceeds to step S111. - In step S110, the
information communication device 1 transmits, to the earphone 2, a control signal indicating that use of the earphone 2 by the user 3 is permitted. Thus, the earphone 2 enters a state in which use by the user 3 is permitted. - In step S111, the
information communication device 1 transmits, to the earphone 2, a control signal indicating that use of the earphone 2 by the user 3 is not permitted. Thus, the earphone 2 enters a state in which use by the user 3 is not permitted. The non-permission state may be, for example, a state in which no sound is emitted from the speaker 26 of the earphone 2. Note that the control in steps S110 and S111 may be performed not on the earphone 2 side but on the information communication device 1 side. For example, the permission state and the non-permission state may be switched by changing the communication connection between the information communication device 1 and the earphone 2. - The inspection sound generated by the
speaker 26 in step S101 will be described in more detail with specific examples. As an example of the inspection signal used for generating the inspection sound, a signal including frequency components over a predetermined range, such as a chirp signal, a maximum length sequence signal (M-sequence signal), or white noise, may be used. Thus, the frequency range of the inspection sound can be used for the wearing determination. -
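As an illustrative sketch of these inspection-signal types (not part of the embodiment; the sample rate, duration, and frequency limits below are assumed values), they can be generated numerically as follows:

```python
import numpy as np
from scipy.signal import chirp, max_len_seq

FS = 48_000          # assumed sample rate in Hz
DURATION = 0.5       # assumed inspection-sound length in seconds
t = np.arange(int(FS * DURATION)) / FS

# Chirp signal: frequency rises linearly with time (cf. Fig. 6)
chirp_sig = chirp(t, f0=100, t1=DURATION, f1=8_000, method="linear")

# M-sequence: pseudo-random binary sequence whose spectrum is close to
# white noise (cf. Fig. 7); mapped from {0, 1} to {-1.0, +1.0}
m_seq = max_len_seq(15)[0][: len(t)].astype(float) * 2.0 - 1.0

# White noise: flat expected spectrum over the whole frequency range
white = np.random.default_rng(0).standard_normal(len(t))
```

Any of the three arrays could serve as the inspection signal, since each spreads its energy over a wide frequency range.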
Fig. 6 is a graph showing the characteristics of the chirp signal. Fig. 6 shows the relationship between intensity and time, the relationship between frequency and time, and the relationship between intensity and frequency, respectively. A chirp signal is a signal whose frequency continuously changes with time. Fig. 6 shows an example of a chirp signal in which the frequency increases linearly with time. -
Fig. 7 is a graph showing the characteristics of an M-sequence signal or white noise. Since the M-sequence signal generates pseudo noise close to white noise, the characteristics of the M-sequence signal and white noise are substantially the same. Fig. 7, like Fig. 6, shows the relationship between intensity and time, the relationship between frequency and time, and the relationship between intensity and frequency. As shown in Fig. 7, an M-sequence signal or white noise evenly includes components over a wide range of frequencies. - The chirp signal, the M-sequence signal, and the white noise all have frequency characteristics that cover a wide range. Therefore, by using these signals as inspection sounds, it is possible to obtain an echo sound including a wide range of frequency components in the subsequent step S102. The first acoustic information obtained in this way can be expressed by the following Equation (1).
[Math. 1]
Ys1,w1 = Gw1 · Cs1 · X (1)
- Here, X is a function representing the frequency spectrum of the inspection sound emitted from the
speaker 26 to the ear canal. Ysi,wj is a function representing the frequency spectrum of the echo sound obtained by the microphone 27. These frequency spectra are obtained, for example, by converting input/output signals in time sequence into the frequency domain by Fourier transformation. si (i = 1, 2, ...) is an argument representing a person to be matched, and since s1 is used in Equation (1), Equation (1) relates to the first user. wj (j = 1, 2, ...) is an argument representing the wearing state of the earphone 2, and since w1 is used, Equation (1) relates to the first wearing state. - Csi is a function of the frequency domain representing the transmission characteristic of the i-th user's ear acoustic sound. Since the shape of the ear canal is unique to each person, Csi is a function different from one user to another. In other words, Csi is biological information that may be used to identify a person. Gwj is a function of the frequency domain representing a change in the transmission characteristic due to a difference in the wearing state. Since Gwj changes to a different function each
time the earphone 2 is worn again, it may act as noise with respect to Csi. The echo sound obtained by the microphone 27 includes a mixture of the transmission characteristic of the ear acoustic sound and the change in the transmission characteristic depending on the wearing state; in Equation (1), these are separated into the form of the product of Csi and Gwj. -
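As a hedged illustration of how these frequency-domain quantities might be computed (the function name, FFT length, and epsilon guard are assumptions of this sketch, not part of the embodiment), the combined characteristic Gwj · Csi of Equation (1) can be estimated from the emitted signal and the received echo:

```python
import numpy as np

def transfer_characteristic(x, y, n_fft=4096):
    """Estimate the frequency-domain characteristic H = Y / X relating the
    emitted inspection sound x to the received echo y. In the notation of
    Equation (1), H corresponds to the product Gw1 * Cs1."""
    X = np.fft.rfft(x, n_fft)          # spectrum of the emitted sound
    Y = np.fft.rfft(y, n_fft)          # spectrum of the received echo
    eps = 1e-12                        # guard against division by zero
    return Y / (X + eps)
```

With a wide-band inspection signal such as a chirp or M-sequence, every frequency bin of X carries energy, so the division is well conditioned.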
Similarly, the second acoustic information can be expressed by the following Equation (2).
[Math. 2]
Vs1,w1,n1,t = Gw1 · Ds1 · Un1,t (2)
- Here, Unk,t is a function indicating the frequency spectrum of the voice emitted by the
user 3. Vsi,wj,nk,t is a function representing the frequency spectrum of the sound acquired by the microphone 27. These frequency spectra are obtained, for example, by converting input/output signals in time sequence into the frequency domain by Fourier transformation. nk (k = 1, 2, ...) is an argument representing various situations included in the voice, such as the content of the utterance and the characteristics of the voiceprint depending on the speaker. Equation (2) is for the first situation because n1 is used. t (where t is a real number) is an argument indicating the time. When the user 3 utters a sentence or phrase, the frequency spectrum changes with time. For example, when the user 3 utters "ABC", the frequency spectrum at the moment the user utters "A" is different from the frequency spectrum at the moment the user utters "B". Also, even when the user 3 utters the same word multiple times, the frequency spectrum may differ depending on the time. Thus, the time t may also be an argument necessary to specify the frequency spectrum of the voice. Dsi is a function of the frequency domain indicating the transmission characteristic of the i-th user's voice. Since the voiceprint is unique to the user, Unk,t is a function that differs from one user to another. Since the transmission characteristic of the voice depends on the shape of the user's skull, oral cavity, or the like, Dsi is also a function different from one user to another. In other words, Unk,t and Dsi are biological information that may be used to identify a person. In Equation (2), Gwj is common to Equation (1). This is because, when the user 3 wears the earphone 2 and the first acoustic information and the second acoustic information are then acquired in the same wearing state without putting on and taking off the earphone 2, Gwj indicating the wearing state remains the same. - Here, an operation for converting both sides into logarithms is performed for Equation (1) and Equation (2). 
These equations are converted into the following Equation (3) and Equation (4), respectively. In Equation (3) and Equation (4), the value of the base of the logarithm is omitted; any base may be used.
[Math. 3]
log Ys1,w1 = log Gw1 + log Cs1 + log X (3)
[Math. 4]
log Vs1,w1,n1,t = log Gw1 + log Ds1 + log Un1,t (4)
By subtracting Equation (3) from Equation (4) and moving the known term log X to the left side, the following Equation (5) is obtained.
[Math. 5]
log Vs1,w1,n1,t - log Ys1,w1 + log X = log Ds1 + log Un1,t - log Cs1 (5)
- The terms on the left side are observation signals or known signals acquired by the
microphone 27. The terms on the right side are functions that differ from one user to another and may be used as biological information. Since the right side is equal to the left side in Equation (5), biological information can be calculated from the observation signals. In this way, the left side of Equation (5) is calculated from the first acoustic information represented by Equation (1) and the second acoustic information represented by Equation (2) by the above-described calculation, and can be used as the third acoustic information for the biometric matching. As can be understood from Equation (5), the third acoustic information does not include the term Gwj indicating the effect of the difference in the wearing state. Therefore, the third acoustic information is robust against noise due to the wearing state, and the accuracy of the biometric matching is improved by using the third acoustic information for the biometric matching. -
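A minimal sketch of this computation (assuming magnitude spectra and an epsilon guard, which the embodiment does not specify) is:

```python
import numpy as np

def third_acoustic_information(V, Y, X, eps=1e-12):
    """Compute the left side of Equation (5):
        log V - log Y + log X = log D + log U - log C
    V: spectrum of the received voice (second acoustic information)
    Y: spectrum of the received echo of the inspection sound
       (first acoustic information)
    X: known spectrum of the emitted inspection sound
    The wearing-state term Gwj cancels out of this combination."""
    return (np.log(np.abs(V) + eps)
            - np.log(np.abs(Y) + eps)
            + np.log(np.abs(X) + eps))
```

Because Gwj appears in both V and Y, it cancels in the subtraction, which is exactly why the result is robust against the wearing state.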
- In the above example, in order to perform addition and subtraction between the first acoustic information and the second acoustic information, typically, the frequency band of the first acoustic information and the frequency band of the second acoustic information at least partially overlap. When the second acoustic information is based on the voice of the
user 3, the first acoustic information includes the frequency band of the audible sound included in the voice of the user 3. - The determination process in step S108 will be described in detail with reference to a specific example using a feature amount extraction technique. The
determination unit 124 calculates a feature amount by a predetermined algorithm from the frequency characteristics included in the third acoustic information. Thereafter, the determination unit 124 compares the feature amount of the third acoustic information with the feature amount of the registrant extracted by a similar technique to calculate a matching score indicating the similarity between the feature amounts. When there are a plurality of registrants, the same processing is performed for each of the plurality of registrants. The determination unit 124 determines whether or not the user 3 is a registrant based on whether or not the matching score exceeds a predetermined threshold. When there are a plurality of registrants, it is determined that the user 3 is a registrant if the matching score exceeds the predetermined threshold for any one of the plurality of registrants.
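One possible sketch of this determination, assuming cosine similarity as the matching score and illustrative names throughout (the embodiment does not fix the feature-extraction algorithm):

```python
import numpy as np

def cosine_score(a, b):
    """Matching score as the cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_registrant(probe_feature, registrant_features, threshold=0.9):
    """Step S108 sketch: accept if the probe matches any enrolled registrant."""
    return any(cosine_score(probe_feature, r) > threshold
               for r in registrant_features)
```

The threshold trades off false acceptance against false rejection and would be tuned on enrollment data in practice.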
- The information processing system of the present example embodiment is different from the first example embodiment in the content of a process for determining whether or not a user is a registrant. In the following, differences from the first example embodiment will be mainly described, and the description of the common parts will be omitted or simplified.
-
Fig. 8 is a functional block diagram of the earphone 2 and the information communication device 1 according to the present example embodiment. The present example embodiment differs from the first example embodiment in that the determination unit 124 further uses not only the third acoustic information acquired by the third acoustic information acquisition unit 123 but also the first acoustic information acquired by the first acoustic information acquisition unit 121 to make a determination. -
Fig. 9 is a flowchart illustrating a biometric matching process according to the present example embodiment performed by the information communication device 1. Since the difference from the first example embodiment is only that step S108 is replaced with step S112, step S112 will be described here. - In step S112, the
determination unit 124 determines whether or not the user 3 is the registrant by matching information obtained by integrating the first acoustic information and the third acoustic information against biological information of the registrant previously recorded in the HDD 104. If it is determined that the user 3 is the registrant (YES in step S109), the process proceeds to step S110. If it is determined that the user 3 is not the registrant (NO in step S109), the process proceeds to step S111. - The integration of the first acoustic information and the third acoustic information in step S112 will be described in more detail. The first acoustic information is information mainly based on the ear acoustic sound of the
user 3, and the third acoustic information is information obtained by arithmetic processing between the ear acoustic sound and the voice of the user 3. Therefore, the first acoustic information and the third acoustic information include different biological information. In general, performing a two-factor matching using two different kinds of biological information improves the accuracy of the matching. Therefore, in the present example embodiment, the first acoustic information and the third acoustic information are integrated in step S112, and the two-factor matching is performed using the integrated result. Thus, the matching accuracy can be further improved. - A specific example of integration of acoustic information will be described. Assume a case in which, as described in the first example embodiment, a technique is used in the matching in the
determination unit 124 for extracting feature amounts from the acoustic information and calculating a matching score indicating the similarity of the feature amounts. In this case, the determination unit 124 calculates a first matching score based on the first acoustic information and a second matching score based on the third acoustic information. Thereafter, the determination unit 124 calculates a third matching score obtained by combining the first matching score and the second matching score by addition, averaging, linear combination, multiplication, or the like. Thereafter, the determination unit 124 determines whether or not the user 3 is a registrant based on whether or not the third matching score exceeds a predetermined threshold. By using this technique, a two-factor matching that integrates a plurality of kinds of biological information is realized, and the matching accuracy is further improved.
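The score-level combination above, together with the decision-level combination discussed in this example embodiment, might be sketched as follows (function names, weight, and fusion rules are illustrative assumptions):

```python
def fused_score(score1, score2, w=0.5):
    """Score-level fusion: linear combination of the ear-acoustic matching
    score and the third-acoustic-information matching score. Addition,
    averaging, and multiplication are analogous one-line variants."""
    return w * score1 + (1.0 - w) * score2

def decision_fusion(match1, match2, rule="and"):
    """Decision-level fusion of two independent matching results.
    "and": logical product, "or": logical sum, "strict": None (judgment
    impossible) when the two results disagree."""
    if rule == "and":
        return match1 and match2
    if rule == "or":
        return match1 or match2
    if rule == "strict":
        return match1 if match1 == match2 else None
    raise ValueError(rule)
```

The "and" rule lowers false acceptance at the cost of more rejections, while "or" does the opposite; the strict rule defers the decision when the two factors conflict.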
- In general, it is known that higher matching accuracy can be obtained with the matching method using the ear acoustic sound than with the matching method using the frequency characteristic of the voice (voiceprint). Therefore, the acoustic information added to the third acoustic information as described above is preferably the first acoustic information including information mainly based on the ear acoustic sound of the
user 3. However, in a case where sufficient accuracy can be obtained by the matching method using the voiceprint, the second acoustic information and the third acoustic information may be integrated by using the second acoustic information instead of the first acoustic information.
- The information processing system of the present example embodiment is different from the first example embodiment in that it has a function of a noise cancellation. In the following, the difference from first example embodiment will be mainly described, and the description of the common parts will be omitted or simplified.
-
Fig. 10 is a schematic diagram illustrating the general configuration of an information processing system according to the present example embodiment. In the present example embodiment, the earphone 2 includes a plurality of microphones 27 and 28. The microphone 28 is controlled by the earphone control device 20. The microphone 28 is arranged on the back side opposite to the wearing surface of the earphone 2 so as to receive sound waves from the outside when worn. - In the acquisition of the first acoustic information or the second acoustic information, the external sound may be noise. Therefore, in the present example embodiment, a plurality of
microphones 27 and 28 are provided in the earphone 2, and the earphone 2 has a function of reducing the influence of external environmental sound by performing noise cancellation using the sound acquired by the microphone 28. Here, the environmental sound includes not only sounds generated outside the user 3 but also sounds which may be noise in the matching, such as components of the sounds emitted by the user 3 or the speaker 26 that echo outside. The microphone 27 and the microphone 28 are sometimes referred to as a first microphone and a second microphone, respectively. - Noise cancellation will be described. The environmental sound is incident on the
microphones 27 and 28, and noise cancellation can be realized by superimposing the sound obtained by the microphone 28, in reversed phase, on the sound obtained by the microphone 27. Since the intensity of the sound received by the microphones 27 and 28 differs depending on the earphone 2 or the like, one or both sounds may be amplified or attenuated before being superimposed. - The processing for superimposing the sound in the reversed phase may be signal processing performed digitally on the obtained audio data, or may be performed as an analog process by emitting the sound by the
speaker 26 in the phase reversed with respect to the sound obtained by the microphone 28.
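A minimal digital sketch of this reversed-phase superimposition, under the simplifying assumptions that the microphone 28 captures only the environmental noise and that the gain between the two microphone positions is known or calibrated:

```python
import numpy as np

def cancel_environmental_noise(inner, outer, gain=1.0):
    """Superimpose the outer-microphone (microphone 28) signal in reversed
    phase on the inner-microphone (microphone 27) signal. `gain` compensates
    for attenuation between the two positions (assumed calibrated)."""
    n = min(len(inner), len(outer))
    return inner[:n] - gain * outer[:n]
```

In a real device the path from the outer to the inner microphone would also involve delay and frequency-dependent attenuation, so a calibrated filter would replace the single gain factor.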
-
Fig. 11 is a functional block diagram of the information processing device 4 according to the fourth example embodiment. The information processing device 4 includes a first acoustic information acquisition unit 421, a second acoustic information acquisition unit 422, and a third acoustic information acquisition unit 423. The first acoustic information acquisition unit 421 acquires first acoustic information obtained by receiving, by a wearable device worn by a user, a sound wave emitted from a first sound source. The second acoustic information acquisition unit 422 acquires second acoustic information obtained by receiving, by the wearable device, a sound wave emitted from a second sound source that is different from the first sound source. The third acoustic information acquisition unit 423 acquires third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information. - According to the present example embodiment, there is provided an
information processing device 4 capable of improving the accuracy of the biometric matching using the acoustic information acquired by the wearable device. - The present invention is not limited to the example embodiments described above and may be suitably modified within the scope of the present invention. For example, an example in which a part of the configuration of one example embodiment is added to another example embodiment, or in which a part of the configuration of one example embodiment is replaced with a part of another, is also an example embodiment of the present invention.
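The combination performed by the third acoustic information acquisition unit 423, which derives the third acoustic information by subtracting or dividing one of the first and second acoustic information from the other (see supplementary note 8 below), can be sketched as follows. This assumes the two inputs are magnitude spectra sampled over the same overlapping frequency band; `eps` is a hypothetical guard against division by zero, not a value from the document. On a logarithmic (dB) scale the same division becomes a subtraction.

```python
def third_acoustic_information(first, second, eps=1e-12):
    # Divide the first acoustic information by the second, element-wise
    # over the overlapping frequency band, yielding a third acoustic
    # information in which components common to both (e.g. a shared
    # transmission path) are reduced.
    return [a / max(b, eps) for a, b in zip(first, second)]
```

For example, dividing the spectrum [2.0, 4.0, 8.0] by [1.0, 2.0, 4.0] yields [2.0, 2.0, 2.0].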
- In the above example embodiment, although the
earphone 2 is exemplified as an example of a wearable device, the present invention is not limited to a device worn on the ear as long as acoustic information necessary for processing can be acquired. For example, the wearable device may be a bone conduction type acoustic device. - In the example embodiment described above, it is assumed that the second acoustic information is obtained by receiving the voice emitted by the
user 3, but the present invention is not limited thereto. That is, the second sound source that generates the sound wave for acquiring the second acoustic information may be other than the voice emitting organ of the user 3. For example, when a second speaker different from the speaker 26 is separately provided in the earphone 2 or in another device, the second speaker may serve as the second sound source. When the wearable device is a pair of earphones worn on both ears of the user 3, for example, the earphone for the right ear may be an earphone 2 having the ear acoustic matching function described in the first to third example embodiments, and the earphone for the left ear may be an earphone having the second speaker described above. In this example, by emitting a sound wave for acquiring the second acoustic information from the second speaker in the left ear and receiving the sound wave with the microphone 27 in the right ear, the same processing as in the first to third example embodiments can be performed. In the first to third example embodiments, the usable frequency range is limited to the range of voices that a human can emit, but in this example, since the voice is not used, there is no such restriction, and a non-audible sound in a frequency band such as an ultrasonic band can be used. By using a non-audible sound, the sound wave used for the matching can be made difficult for the user 3 to perceive, and comfort in use is improved. - The scope of each of the example embodiments also includes a processing method that stores, in a storage medium, a program that causes the configuration of each of the example embodiments to operate so as to implement the functions of the example embodiments described above, reads the program stored in the storage medium as code, and executes the program in a computer. That is, the scope of each of the example embodiments also includes a computer readable storage medium.
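As an illustration of the non-audible probe mentioned above, the following sketch generates an ultrasonic-band tone. The frequency, duration, and sample rate are illustrative assumptions, not values taken from this document; 21 kHz merely sits above the commonly cited ~20 kHz limit of human hearing while staying below the Nyquist frequency of a 48 kHz system.

```python
import math

def inaudible_probe(freq_hz=21000.0, duration_s=0.05, rate_hz=48000):
    # Sinusoidal probe above the audible range; all parameter values
    # here are illustrative assumptions.
    n = int(duration_s * rate_hz)
    return [math.sin(2.0 * math.pi * freq_hz * i / rate_hz) for i in range(n)]
```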
Further, each of the example embodiments includes not only the storage medium in which the computer program described above is stored but also the computer program itself. Further, one or more components included in the example embodiments described above may be a circuit such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like configured to implement the function of each component.
- As the storage medium, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, a compact disk (CD)-ROM, a magnetic tape, a nonvolatile memory card, or a ROM can be used. Further, the scope of each of the example embodiments includes not only an example that performs a process by an individual program stored in the storage medium but also an example that operates on an operating system (OS) to perform a process in cooperation with other software or with a function of an add-in board.
- Further, a service implemented by the function of each of the example embodiments described above may be provided to a user in a form of software as a service (SaaS).
- It should be noted that the above-described example embodiments are merely examples of embodying the present invention, and the technical scope of the present invention should not be interpreted as being limited by them. That is, the present invention can be implemented in various forms without departing from its technical idea or main features.
- The whole or part of the example embodiments disclosed above can be described as, but not limited to, the following supplementary notes.
- An information processing device comprising:
- a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user;
- a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and
- a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- The information processing device according to
supplementary note 1, wherein the first acoustic information includes a transmission characteristic of an ear canal of the user. - The information processing device according to supplementary note 1 or 2, wherein the first sound source is a speaker provided in the wearable device.
- The information processing device according to any one of
supplementary notes 1 to 3, wherein the wearable device is an earphone worn on an ear of the user. - The information processing device according to any one of
supplementary notes 1 to 4, wherein the second acoustic information includes a transmission characteristic of a voice emitted by the user. - The information processing device according to any one of
supplementary notes 1 to 5, wherein the second sound source is a sound emitting organ of the user. - The information processing device according to any one of
supplementary notes 1 to 4, - wherein the first sound source is a speaker provided in the wearable device worn on an ear of the user, and
- wherein a second sound source is a speaker provided in the wearable device or another wearable device worn on the other ear of the user.
- The information processing device according to any one of
supplementary notes 1 to 7, wherein the third acoustic information acquisition unit generates and acquires the third acoustic information by subtracting or dividing one of the first acoustic information and the second acoustic information from the other. - The information processing device according to any one of
supplementary notes 1 to 8, wherein a frequency band of the first acoustic information and a frequency band of the second acoustic information at least partially overlap. - The information processing device according to any one of
supplementary notes 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information. - The information processing device according to any one of
supplementary notes 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information and at least one of the first acoustic information and the second acoustic information. - The information processing device according to any one of
supplementary notes 1 to 11, - wherein the wearable device includes a first microphone and a second microphone arranged at different positions each other, and
- wherein at least one of the first acoustic information acquisition unit and the second acoustic information acquisition unit acquires an acoustic information in which at least a part of an environmental sound is canceled based on a sound wave received by the first microphone and a sound wave received by the second microphone.
- A wearable device comprising:
- a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by the wearable device worn by a user;
- a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and
- a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- An information processing method comprising:
- acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user;
- acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and
- acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- A storage medium storing a program that causes a computer to perform:
- acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user;
- acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and
- acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
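The determination described in supplementary notes 10 and 11 could, for example, be realized by comparing the acquired third acoustic information against a feature enrolled for the registrant. The following sketch uses a mean absolute difference and a fixed threshold; both the metric and the threshold value are illustrative assumptions, as the document does not specify a particular matching score.

```python
def is_registrant(candidate, template, threshold=0.1):
    # Mean absolute difference between the candidate's third acoustic
    # information and the registrant's enrolled template; the user is
    # accepted when the score falls below the (illustrative) threshold.
    if len(candidate) != len(template):
        raise ValueError("feature lengths differ")
    score = sum(abs(c - t) for c, t in zip(candidate, template)) / len(template)
    return score < threshold
```

A matching candidate close to the template is accepted; a distant one is rejected.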
- This application is based upon and claims the benefit of priority from
Japanese Patent Application No. 2019-004003, filed on January 15, 2019.
- 1
- information communication device
- 2
- earphone
- 3
- user
- 4
- information processing device
- 20
- earphone control device
- 26
- speaker
- 27, 28
- microphone
- 101, 201
- CPU
- 102, 202
- RAM
- 103, 203
- ROM
- 104
- HDD
- 105, 207
- communication I/F
- 106
- input device
- 107
- output device
- 121, 421
- first acoustic information acquisition unit
- 122, 422
- second acoustic information acquisition unit
- 123, 423
- third acoustic information acquisition unit
- 124
- determination unit
- 204
- flash memory
- 205
- speaker I/F
- 206
- microphone I/F
- 208
- battery
Claims (15)
- An information processing device comprising: a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user; a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- The information processing device according to claim 1, wherein the first acoustic information includes a transmission characteristic of an ear canal of the user.
- The information processing device according to claim 1 or 2, wherein the first sound source is a speaker provided in the wearable device.
- The information processing device according to any one of claims 1 to 3, wherein the wearable device is an earphone worn on an ear of the user.
- The information processing device according to any one of claims 1 to 4, wherein the second acoustic information includes a transmission characteristic of a voice emitted by the user.
- The information processing device according to any one of claims 1 to 5, wherein the second sound source is a sound emitting organ of the user.
- The information processing device according to any one of claims 1 to 4, wherein the first sound source is a speaker provided in the wearable device worn on an ear of the user, and wherein a second sound source is a speaker provided in the wearable device or another wearable device worn on the other ear of the user.
- The information processing device according to any one of claims 1 to 7, wherein the third acoustic information acquisition unit generates and acquires the third acoustic information by subtracting or dividing one of the first acoustic information and the second acoustic information from the other.
- The information processing device according to any one of claims 1 to 8, wherein a frequency band of the first acoustic information and a frequency band of the second acoustic information at least partially overlap.
- The information processing device according to any one of claims 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information.
- The information processing device according to any one of claims 1 to 9 further comprising a determination unit configured to determine whether the user is a registrant or not based on the third acoustic information and at least one of the first acoustic information and the second acoustic information.
- The information processing device according to any one of claims 1 to 11, wherein the wearable device includes a first microphone and a second microphone arranged at positions different from each other, and wherein at least one of the first acoustic information acquisition unit and the second acoustic information acquisition unit acquires an acoustic information in which at least a part of an environmental sound is canceled based on a sound wave received by the first microphone and a sound wave received by the second microphone.
- A wearable device comprising: a first acoustic information acquisition unit configured to acquire a first acoustic information obtained by receiving a sound wave emitted from a first sound source by the wearable device worn by a user; a second acoustic information acquisition unit configured to acquire a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and a third acoustic information acquisition unit configured to acquire a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- An information processing method comprising: acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user; acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
- A storage medium storing a program that causes a computer to perform: acquiring a first acoustic information obtained by receiving a sound wave emitted from a first sound source by a wearable device worn by a user; acquiring a second acoustic information obtained by receiving a sound wave emitted from a second sound source that is different from the first sound source by the wearable device; and acquiring a third acoustic information used for biometric matching of the user based on the first acoustic information and the second acoustic information.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019004003 | 2019-01-15 | ||
PCT/JP2020/000195 WO2020149175A1 (en) | 2019-01-15 | 2020-01-07 | Information processing apparatus, wearable device, information processing method, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3913926A1 (en) | 2021-11-24 |
EP3913926A4 (en) | 2022-03-16 |
Family
ID=71613856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP20740784.2A Withdrawn EP3913926A4 (en) | 2019-01-15 | 2020-01-07 | Information processing apparatus, wearable device, information processing method, and storage medium |
Country Status (6)
Country | Link |
---|---|
US (1) | US20220093120A1 (en) |
EP (1) | EP3913926A4 (en) |
JP (1) | JP7131636B2 (en) |
CN (1) | CN113475095A (en) |
BR (1) | BR112021013445A2 (en) |
WO (1) | WO2020149175A1 (en) |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004065363A (en) | 2002-08-02 | 2004-03-04 | Sony Corp | Individual authentication device and method, and signal transmitter |
JP4411959B2 (en) | 2003-12-18 | 2010-02-10 | ソニー株式会社 | Audio collection / video imaging equipment |
JP4937661B2 (en) * | 2006-07-31 | 2012-05-23 | ナップエンタープライズ株式会社 | Mobile personal authentication method and electronic commerce method |
US9118488B2 (en) * | 2010-06-17 | 2015-08-25 | Aliphcom | System and method for controlling access to network services using biometric authentication |
EP3285497B1 (en) * | 2015-04-17 | 2021-10-27 | Sony Group Corporation | Signal processing device and signal processing method |
JP6855381B2 (en) * | 2015-10-21 | 2021-04-07 | 日本電気株式会社 | Personal authentication device, personal authentication method and personal authentication program |
JP6943248B2 (en) | 2016-08-19 | 2021-09-29 | 日本電気株式会社 | Personal authentication system, personal authentication device, personal authentication method and personal authentication program |
US10460095B2 (en) * | 2016-09-30 | 2019-10-29 | Bragi GmbH | Earpiece with biometric identifiers |
JP6835956B2 (en) * | 2017-04-28 | 2021-02-24 | 日本電気株式会社 | Personal authentication device, personal authentication method and personal authentication program |
WO2018213746A1 (en) * | 2017-05-19 | 2018-11-22 | Plantronics, Inc. | Headset for acoustic authentication of a user |
JP2019004003A (en) | 2017-06-13 | 2019-01-10 | 日東電工株式会社 | Electromagnetic wave absorber and electromagnetic wave absorber equipped molded article |
GB201801526D0 (en) * | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
-
2020
- 2020-01-07 JP JP2020566380A patent/JP7131636B2/en active Active
- 2020-01-07 BR BR112021013445-0A patent/BR112021013445A2/en unknown
- 2020-01-07 WO PCT/JP2020/000195 patent/WO2020149175A1/en unknown
- 2020-01-07 EP EP20740784.2A patent/EP3913926A4/en not_active Withdrawn
- 2020-01-07 CN CN202080016555.9A patent/CN113475095A/en active Pending
- 2020-01-07 US US17/421,512 patent/US20220093120A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
BR112021013445A2 (en) | 2021-10-19 |
EP3913926A4 (en) | 2022-03-16 |
WO2020149175A1 (en) | 2020-07-23 |
US20220093120A1 (en) | 2022-03-24 |
JP7131636B2 (en) | 2022-09-06 |
JPWO2020149175A1 (en) | 2021-10-28 |
CN113475095A (en) | 2021-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20240080605A1 (en) | Information processing device, wearable device, information processing method, and storage medium | |
US11937040B2 (en) | Information processing device, information processing method, and storage medium | |
JP2010011447A (en) | Hearing aid, hearing-aid processing method and integrated circuit for hearing-aid | |
KR101535112B1 (en) | Earphone and mobile apparatus and system for protecting hearing, recording medium for performing the method | |
US10783903B2 (en) | Sound collection apparatus, sound collection method, recording medium recording sound collection program, and dictation method | |
EP2903002A1 (en) | Method, device, and program for voice masking | |
EP3913926A1 (en) | Information processing apparatus, wearable device, information processing method, and storage medium | |
JP2012063614A (en) | Masking sound generation device | |
EP3070709A1 (en) | Sound masking apparatus and sound masking method | |
JP4785563B2 (en) | Audio processing apparatus and audio processing method | |
KR20110018766A (en) | Sound source playing apparatus for compensating output sound source signal and method of performing thereof | |
KR102038464B1 (en) | Hearing assistant apparatus | |
KR102353771B1 (en) | Apparatus for generating test sound based hearing threshold and method of the same | |
JP2021022883A (en) | Voice amplifier and program | |
JP2011170113A (en) | Conversation protection degree evaluation system and conversation protection degree evaluation method | |
EP3900630A1 (en) | Information processing apparatus, wearable-type device, information processing method, and storage medium | |
JP2020071306A (en) | Voice transmission environment evaluation system and sensibility stimulus presentation device | |
CN116017250A (en) | Data processing method, device, storage medium, chip and hearing aid device | |
JP7315045B2 (en) | Information processing device, wearable device, information processing method, and storage medium | |
JP5691180B2 (en) | Maska sound generator and program | |
Aharonson et al. | Harnessing Music to Enhance Speech Recognition | |
US20220039779A1 (en) | Information processing device, wearable device, information processing method, and storage medium | |
JP2014202777A (en) | Generation device and generation method and program for masker sound signal | |
JP2021135361A (en) | Sound processing device, sound processing program and sound processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20210816 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
A4 | Supplementary search report drawn up and despatched |
Effective date: 20220211 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G10L 17/00 20130101ALI20220207BHEP Ipc: G06F 21/32 20130101ALI20220207BHEP Ipc: H04R 1/10 20060101AFI20220207BHEP |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
|
18W | Application withdrawn |
Effective date: 20230126 |