CN112420031A - Equipment control method and device

Equipment control method and device

Info

Publication number
CN112420031A
Authority
CN
China
Prior art keywords
microphone
audio
signal
action
signal type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911209307.1A
Other languages
Chinese (zh)
Inventor
王乐临
余晓伟
李海婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN112420031A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G10L2015/225 Feedback of the input speech
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/27 Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques using neural networks
    • G10L25/48 Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques specially adapted for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Telephone Function (AREA)
  • Selective Calling Equipment (AREA)

Abstract

The application provides a device control method and apparatus. The method is applied to a first device having at least one microphone and comprises the following steps: collecting an audio signal through the microphone; determining a signal type and an audio feature of the audio signal, the signal type including at least wind noise; determining whether the signal type and the audio feature meet a preset condition; and when they meet the preset condition, triggering a control command corresponding to the preset condition, the control command being used to control a second device. By implementing the embodiments of the application, a device can be controlled through a microphone without the user making a sound, improving the user experience.

Description

Equipment control method and device
Technical Field
The present application relates to the field of electronic device control, and in particular, to a device control method and apparatus.
Background
With the development of smartphones, portable audio devices have become increasingly common, and many of them are equipped with microphones. For example, mobile phones, tablet computers, and earphones generally have a microphone, which is mainly used to collect the user's voice for calls, recording, and the like.
To expand the applications of the microphone, an existing scheme uses the microphone for voice control of a device. The user speaks a voice command into the microphone; the microphone collects the command, speech recognition technology identifies its semantics to form a control command, and the device is then controlled accordingly.
However, in situations where silence is required, the user may be unable or unwilling to speak. In such cases the device cannot be controlled by voice, which inconveniences the user.
Disclosure of Invention
The application provides a device control method and apparatus that allow a device to be controlled through a microphone without the user making a sound, addressing this pain point and improving the user experience.
In a first aspect, the present application provides a device control method, applied to a first device having at least one microphone, comprising: collecting an audio signal through the at least one microphone; determining a signal type and an audio feature of the audio signal, the signal type including at least wind noise; determining whether the signal type and the audio feature meet a preset condition; and when they meet the preset condition, triggering a control command corresponding to the preset condition.
The audio signal may be caused by a silent operation of the user, i.e., an operation triggered by the user's hand that produces no sound perceptible to the human ear. For example, if the minimum sound intensity audible to the human ear is taken as 1 decibel (dB), the sound caused by the silent operation may be below 1 dB. The silent operation may be, for example, at least one of: rubbing at least one microphone, clicking at least one microphone, fanning near at least one microphone, or blowing near at least one microphone. The audio signal resulting from the silent operation may be a wind noise signal or similar to one.
The signal type may be the type of audio signal caused by wind noise. In one particular classification, the signal types may be divided into: a signal type caused by rubbing the microphone, a signal type caused by clicking the microphone, a signal type caused by fanning at the microphone, and a signal type caused by blowing at the microphone. The signal type directly reflects the action type of the user's silent operation, which may be divided into action types such as rubbing the microphone, clicking the microphone, fanning at the microphone, or blowing at the microphone.
The audio features may be target features of the audio signal that reflect the action characteristics of the silent operation, such as action frequency, action strength, action rhythm, and action sequence.
A combination of the signal type and the audio feature has a mapping relationship with a control command; when the signal type and the audio feature meet a preset condition (for example, their combination matches a preset combination), the control command corresponding to that preset condition (preset combination) can be triggered.
It can be seen that, in this embodiment of the application, when the microphone collects an audio signal (a wind noise signal) caused by a silent operation such as rubbing or clicking the microphone, or fanning or blowing near it, the first device can identify the signal type and audio feature of that signal, thereby recognizing the silent operation, and can then control the second device according to the control command corresponding to the combination of signal type and audio feature. The second device can thus be controlled without the user making a sound, expanding the ways in which a user can control a device. The input form of the silent operation is simple and convenient, and improves the concealment of operations such as having the second device dial an emergency call. Moreover, since the first device already has a microphone, the control function reuses that microphone; no additional sensor is needed, keeping cost and device power consumption low. The application can therefore greatly improve the user experience.
Based on the first aspect, in a possible embodiment, the first device is communicatively connected to a second device, and the control command is used to control the second device.
For example, the first device may be a wearable device with a microphone, such as an earphone, smart glasses, a smart watch, or a smart bracelet, and the second device may be a mobile terminal to be controlled, such as a smartphone, tablet computer, or laptop computer. Without the user making a sound, the first device can control the second device using this method, greatly improving the user experience.
Based on the first aspect, in a possible embodiment, the control command is used to control the first device. In such an arrangement, the first device controls itself based on the methods described herein. For example, taking the first device as a smartphone, the smartphone may include one or more microphones, for example a top microphone at the top of the phone and a bottom microphone at the bottom. Without making a sound, the user can perform a silent operation on at least one of these microphones as needed to trigger control of the smartphone, expanding the user's means of controlling it and improving the user experience.
Based on the first aspect, in a possible embodiment, determining the signal type of the audio signal includes: obtaining the signal type by detecting one or more of a time-domain feature and a spectral feature of the audio signal. The time-domain feature represents the time-domain pulse signal of the audio signal caused by the silent operation, and can be expressed as amplitude versus time; the spectral feature represents the spectral density of the signal, and can be expressed, for example, as amplitude versus frequency. This algorithmic approach identifies the signal type quickly and at low cost.
For example, different silent operations produce audio signals with different time-domain characteristics, where the time-domain characteristic represents the time-domain pulse signal caused by the silent operation. After the audio signal is collected, its signal type can be determined by a detection algorithm based on those time-domain characteristics.
As another example, since the audio signal caused by a silent operation may be, or resemble, a wind noise signal, the first device may use a wind noise detection algorithm to check whether the signal collected by the microphone contains wind noise characteristics. For example, a digital-signal-processing method computes the power spectral density of the collected signal's spectrum and identifies wind noise characteristics from it. If present, the audio signal was caused by a silent operation; otherwise it was not (e.g., it is the user's speech or background environmental noise). Signal types corresponding to different silent operations can then be distinguished by a detection algorithm. For instance, because rubbing or clicking is applied directly to the microphone's sound pickup, the resulting audio signal carries more energy than one caused by fanning or blowing near the microphone. The first device can therefore distinguish the signal types of audio (noise) signals caused by silent operations by signal energy: when the energy is below a set threshold, the signal type is attributed to fanning/blowing near the microphone; otherwise it is attributed to rubbing or clicking the sound pickup. In this way, the first device distinguishes the different signal types of silent operations.
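The following is a minimal Python sketch of the two-stage detection just described; it is an illustration under assumptions, not the patented implementation. The function name, the 500 Hz wind band, and all thresholds are invented for the example; a real device would tune them to its microphone.

```python
# Illustrative only: wind noise concentrates its power at low frequencies,
# so a frame is treated as a silent-operation candidate when most of its
# power spectral density lies in the low band; overall energy then splits
# contact gestures (rub/click at the sound pickup) from air gestures
# (fan/blow near the microphone). All thresholds are assumptions.
import numpy as np
from scipy.signal import welch

def classify_silent_operation(audio, fs=16000, wind_band_hz=500,
                              wind_ratio_thresh=0.7,
                              contact_energy_thresh=0.01):
    """Return 'contact', 'air', or None for one frame of microphone samples."""
    freqs, psd = welch(audio, fs=fs, nperseg=512)
    low_ratio = psd[freqs <= wind_band_hz].sum() / (psd.sum() + 1e-12)
    if low_ratio < wind_ratio_thresh:
        return None  # not wind-noise-like: speech, music, background noise
    energy = np.mean(np.asarray(audio, dtype=float) ** 2)
    # gestures applied directly to the sound pickup excite more energy
    return 'contact' if energy >= contact_energy_thresh else 'air'
```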
Based on the first aspect, in a possible embodiment, determining the signal type of the audio signal includes: obtaining the signal type from the audio signal and a neural network model, where the neural network model characterizes the mapping relationship between the audio signal and the signal type. A machine learning approach can improve the accuracy and efficiency of signal-type recognition.
For example, a model may be trained in advance on a large amount of training data using machine learning. The training data comprises audio signals caused by silent operations (such as rubbing, clicking, fanning, or blowing at a microphone) together with their signal types, so the model captures the mapping between audio signal and signal type. The first device obtains and stores the model; when it captures an audio signal, it feeds the signal to the model and obtains the signal type the model outputs.
The machine learning model may specifically be one of the following: a Neural Network (NN) model, Deep Neural Network (DNN) model, Factorization-machine-supported Neural Network (FNN) model, Convolutional Neural Network (CNN) model, Inner Product-based Neural Network (IPNN) model, Outer Product-based Neural Network (OPNN) model, Neural Factorization Machine (NFM) model, and so on.
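As a hedged sketch of this machine-learning route, the snippet below defines a small feed-forward network that maps a fixed-length feature vector extracted from a captured frame to one of four signal types. The four-class label set, the 40-dimensional feature size, and the layer widths are assumptions for illustration; the patent does not fix an architecture, and training on labeled silent-operation recordings is omitted.

```python
import torch
import torch.nn as nn

SIGNAL_TYPES = ['rub', 'click', 'fan', 'blow']  # assumed label set

# small classifier over a 40-dim audio feature vector (e.g. MFCC statistics)
model = nn.Sequential(
    nn.Linear(40, 64),
    nn.ReLU(),
    nn.Linear(64, len(SIGNAL_TYPES)),
)

def predict_signal_type(features: torch.Tensor) -> str:
    """features: a (40,) tensor extracted from one audio frame."""
    with torch.no_grad():
        logits = model(features)
    return SIGNAL_TYPES[int(logits.argmax())]

# usage with a dummy feature vector (the model is untrained here)
print(predict_signal_type(torch.randn(40)))
```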
Based on the first aspect, in a possible embodiment, determining the signal type and the audio feature of the audio signal includes: obtaining the audio feature by performing feature extraction on the audio signal.
The audio features include, for example, one or more of: the frequency, energy, duration, and intensity of wind noise pulses in the audio signal, and the order in which different signal types are collected. The frequency of the wind noise pulses reflects how often, and how quickly, pulses are excited in the audio signal. The energy of a pulse represents how strongly it is excited, and its duration represents how long it lasts. The order in which different signal types are collected may be the order in which a single microphone collects audio signals from silent operations of different types (e.g., rubbing, clicking, fanning, or blowing at the microphone), or the order in which different microphones (e.g., two microphones) collect audio signals from silent operations of the same or different types.
The audio features reflect the action characteristics of the silent operation; that is, the action characteristics (e.g., one or more of action frequency, action strength, action rhythm, and action sequence) can be identified from the audio features (e.g., one or more of pulse frequency, energy, duration, intensity, and the order in which different signal types are collected).
For example, the audio signal picked up by the first device's microphone may contain audio produced by silent operations of several action types. The signal can be segmented using voice activation detection techniques, with each segment representing one action type. The speed of actions of different types can be distinguished by measuring the duration of the audio corresponding to an action within a segment and the interval between the start times of actions in different segments. Counting the action types across segments yields the action frequency of each type; counting the action types and their successive frequencies over a period of time further yields the action rhythm or action sequence of the silent operation. In this way the first device obtains the action characteristics corresponding to the silently operated audio signal, as illustrated in the sketch below.
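A sketch of that segmentation step, assuming a simple energy gate in place of a full voice activation detector; the frame length and gate level are invented values. Segment count, durations, and inter-segment gaps correspond to the action count, speed, and rhythm discussed above.

```python
import numpy as np

def segment_actions(audio, fs=16000, frame_ms=20, gate=0.005):
    """Split a capture into active segments; return (segments, durations, gaps)."""
    frame = int(fs * frame_ms / 1000)
    n_frames = len(audio) // frame
    segments, start = [], None
    for i in range(n_frames):
        active = np.mean(audio[i * frame:(i + 1) * frame] ** 2) > gate
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start * frame / fs, i * frame / fs))
            start = None
    if start is not None:
        segments.append((start * frame / fs, n_frames * frame / fs))
    durations = [e - s for s, e in segments]            # action speed
    gaps = [segments[i + 1][0] - segments[i][1]
            for i in range(len(segments) - 1)]          # action rhythm
    return segments, durations, gaps
```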
Based on the first aspect, in a possible embodiment, the preset condition is one or more preset combinations of a signal type and an audio feature; determining whether the signal type and the audio feature meet the preset condition includes: determining whether a combination of the signal type and one or more of the pulse frequency, energy, duration, and intensity and the order of collecting different signal types matches a preset combination.
Each preset combination has a mapping relationship with at least one control command.
Triggering the control command corresponding to the preset condition when the signal type and the audio feature meet it includes: when the combination of signal type and audio feature matches a target preset combination among the preset combinations, triggering the control command corresponding to that target preset combination.
In this embodiment of the application, action types correspond to signal types, and action characteristics correspond to signal characteristics. A preset combination of signal type and audio feature can therefore be embodied as a combination of action type and action characteristic, and the mapping between such combinations and control commands can be stored in advance, so that the corresponding control command can be obtained from the mapping. The control command corresponding to a given combination of signal type and audio feature can thus be determined quickly, reducing control latency and improving the user experience.
Based on the first aspect, in a possible embodiment, the first device is, for example, a headset (such as a wireless headset), the second device is, for example, a smart device, and the two are communicatively coupled. The control command is used to perform at least one of the following controls on the second device: making a call, answering or hanging up a call, sending a message, playing or pausing music, playing or pausing video, switching tracks, adjusting the volume, locking or unlocking the screen, or turning a designated function mode on or off. The designated function mode may be, for example, one or more of a mute mode, a vibrate mode, an airplane mode, a power saving mode, an active noise cancellation (ANC) function, a listening (hear-through) mode, and the like.
Based on the first aspect, in a possible embodiment, the first device is, for example, a smart device; the control command is used to perform at least one of the following controls on the first device: making a call, answering or hanging up a call, sending a message, playing or pausing music, playing or pausing video, switching tracks, adjusting the volume, locking or unlocking the screen, or turning a designated function mode on or off. The designated function mode may be, for example, one or more of a mute mode, a vibrate mode, an airplane mode, a power saving mode, an active noise cancellation (ANC) function, a listening (hear-through) mode, and the like.
Based on the first aspect, in a possible embodiment, the control command corresponding to the signal type and the audio feature may also be determined by: determining the action type corresponding to the signal type, where the action type represents the type of user action that caused the noise signal; determining the action characteristic corresponding to the audio feature, where the action characteristic includes one or more of the action frequency, action strength, action rhythm, and action sequence of the user action; and determining the control command corresponding to the action type and action characteristic according to a user-preset mapping (e.g., a mapping table) between action type/characteristic combinations and control commands, as sketched below.
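A sketch of such a mapping table; each entry pairs an action type with a recognized action characteristic and returns a control command. All entries are invented examples, not combinations claimed in the patent.

```python
# keys: (action type, recognized action characteristic); values: commands
COMMAND_MAP = {
    ('blow', 'count>=2'):     'ANSWER_CALL',
    ('blow', 'duration>=2s'): 'HANG_UP_CALL',
    ('rub', 'count>=3'):      'NEXT_TRACK',
    ('click', 'count>=2'):    'PLAY_PAUSE',
    ('fan', 'count>=2'):      'VOLUME_UP',
}

def lookup_command(action_type, action_characteristic):
    """Return the mapped control command, or None when no preset matches."""
    return COMMAND_MAP.get((action_type, action_characteristic))
```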
Based on the first aspect, in a possible embodiment, the action type includes: fanning at the at least one microphone;
the action characteristics include at least one of: fanning at the at least one microphone once or multiple times, a number of fanning actions greater than or equal to a first count threshold, a fanning duration greater than or equal to a first duration threshold, a fanning intensity greater than or equal to a first intensity threshold, and an interval between fanning actions greater than or equal to a first time interval.
Based on the first aspect, in a possible embodiment, the action type includes: rubbing the at least one microphone;
the action characteristics include at least one of: rubbing the at least one microphone once or multiple times, a number of rubs greater than or equal to a second count threshold, a rubbing duration greater than or equal to a second duration threshold, a rubbing intensity greater than or equal to a second intensity threshold, and an interval between rubs greater than or equal to a second time interval.
Based on the first aspect, in a possible embodiment, the action type includes: blowing at the at least one microphone;
the action characteristics include at least one of: blowing at the at least one microphone once or multiple times, a number of blows greater than or equal to a third count threshold, a blowing duration greater than or equal to a third duration threshold, a blowing intensity greater than or equal to a third intensity threshold, and an interval between blows greater than or equal to a third time interval.
Based on the first aspect, in a possible embodiment, the action type includes: clicking the at least one microphone;
the action characteristics include at least one of: clicking the at least one microphone once or multiple times, a number of clicks greater than or equal to a fourth count threshold, a clicking duration greater than or equal to a fourth duration threshold, a clicking intensity greater than or equal to a fourth intensity threshold, and an interval between clicks greater than or equal to a fourth time interval.
Based on the first aspect, in a possible embodiment, when the first device has sensors other than the microphone, those sensors can assist the first device in detecting whether the audio signal collected by the microphone contains a noise signal corresponding to a silent operation to be recognized. This improves recognition accuracy and reliability, further improves the user experience, and avoids the power consumption caused by misrecognition.
Based on the first aspect, in a possible embodiment, the first device may also use information provided by the second device, other terminal devices, or a server to turn the device control function provided by this application on or off, improving the reliability and accuracy of the device and the user experience.
In a second aspect, the present application provides an apparatus for device control, applied to a first device having at least one microphone, comprising: a collection module configured to collect an audio signal through the at least one microphone; a signal processing module configured to determine a signal type and an audio feature of the audio signal, the signal type including at least wind noise; and a control module configured to determine whether the signal type and the audio feature meet a preset condition and, when they do, trigger the control command corresponding to the preset condition. The functional modules of the apparatus are specifically used to implement the method described in the first aspect.
The beneficial effects of this aspect are as described for the first aspect.
Based on the second aspect, in a possible embodiment, the first device is communicatively connected to a second device, and the control command is used to control the second device.
Based on the second aspect, in a possible embodiment, the control command is used for controlling the first device.
Based on the second aspect, in a possible embodiment, the signal processing module is configured to: obtain the signal type by detecting one or more of time-domain and spectral features of the audio signal; and obtain the audio feature by performing feature extraction on the audio signal.
Based on the second aspect, in a possible embodiment, the signal processing module is configured to: obtain the signal type from the audio signal and a neural network model, where the neural network model characterizes the mapping relationship between the audio signal and the signal type; and obtain the audio feature by performing feature extraction on the audio signal.
Based on the second aspect, in a possible embodiment, the audio features include one or more of: the frequency, energy, duration, and intensity of wind noise pulses in the audio signal, and the order in which different signal types are collected.
Based on the second aspect, in a possible embodiment, the preset condition is at least one preset combination of a signal type and an audio feature; the control module is specifically configured to determine whether a combination of the signal type and one or more of the frequency, energy, duration, and intensity of the wind noise pulses and the order of collecting different signal types matches the preset combination.
Based on the second aspect, in a possible embodiment, each preset combination has a mapping relationship with at least one control command;
the control module is specifically configured to trigger the control command corresponding to a target preset combination when the combination of signal type and audio feature matches that target preset combination.
Based on the second aspect, in a possible embodiment, the first device is, for example, a headset (such as a wireless headset), the second device is, for example, a smart device, and the two are communicatively coupled. The control command is used to perform at least one of the following controls on the second device: making a call, answering or hanging up a call, sending a message, playing or pausing music, playing or pausing video, switching tracks, adjusting the volume, locking or unlocking the screen, or turning a designated function mode on or off. The designated function mode may be, for example, one or more of a mute mode, a vibrate mode, an airplane mode, a power saving mode, an active noise cancellation (ANC) function, a listening (hear-through) mode, and the like.
Based on the second aspect, in a possible embodiment, the first device is, for example, a smart device; the control command is used to perform at least one of the following controls on the first device: making a call, answering or hanging up a call, sending a message, playing or pausing music, playing or pausing video, switching tracks, adjusting the volume, locking or unlocking the screen, or turning a designated function mode on or off. The designated function mode may be, for example, one or more of a mute mode, a vibrate mode, an airplane mode, a power saving mode, an active noise cancellation (ANC) function, a listening (hear-through) mode, and the like.
In a third aspect, an embodiment of the present application provides an apparatus, where the apparatus is a first apparatus, and the apparatus includes: at least one microphone; one or more processors; a memory; and one or more computer programs; wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions which, when executed by the first device, cause the first device to perform the method as described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to perform the method in the first aspect or any possible implementation manner of the first aspect.
Optionally, as an implementation, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, and when the instructions are executed, the processor is configured to execute the first aspect or the method in any possible implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium storing program code for execution by a device, the program code including instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.
In a sixth aspect, an embodiment of the present application provides a computer program product, which may be a software installation package; the computer program product includes program instructions, and when it is executed by an electronic device, a processor of the electronic device performs the method in the first aspect or any possible implementation thereof.
In summary, when the microphone collects an audio signal caused by a silent operation of rubbing or clicking the microphone, or fanning or blowing near it, the first device can identify the signal type and audio feature of the signal, determine the corresponding action type and action characteristic, and thereby recognize the silent operation and control the second device according to the control command corresponding to the combination of action type and action characteristic. The second device can thus be controlled without the user making a sound, expanding the ways in which a user can control a device. The input form of the silent operation is simple and convenient, and improves the concealment of operations such as having the second device dial an emergency call. Moreover, since the first device already has a microphone, the control function reuses that microphone; no additional sensor is needed, keeping cost and device power consumption low. The application can therefore greatly improve the user experience.
Drawings
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
fig. 2 is an exemplary schematic diagram of a wireless communication headset according to an embodiment of the present application;
fig. 3 is an exemplary schematic diagram of an in-ear earphone for wireless communication according to an embodiment of the present application;
fig. 4 is an exemplary diagram of a neckband earphone for wireless communication according to an embodiment of the present application;
fig. 5 is an exemplary schematic diagram of a smart phone according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an exemplary apparatus provided by an embodiment of the present application;
fig. 7 is a schematic flowchart of an apparatus control method according to an embodiment of the present application;
fig. 8 is a schematic diagram of a scenario of the silent operation of rubbing a microphone according to an embodiment of the present application;
fig. 9 is a schematic diagram of a scenario of the silent operation of clicking a microphone according to an embodiment of the present application;
fig. 10 is a schematic diagram of a scenario of the silent operation of fanning at a microphone according to an embodiment of the present application;
fig. 11 is a schematic diagram of a scene of silent operation of blowing air into a microphone according to an embodiment of the present application;
fig. 12 is a schematic view of a scene implemented by the method of the present application according to an embodiment of the present application;
fig. 13 is a schematic view of a scene implemented by the method of the present application according to an embodiment of the present application;
fig. 14 is a schematic flowchart of an apparatus control method according to an embodiment of the present application;
fig. 15 is a schematic diagram of a time-domain pulse signal produced by rubbing a microphone according to an embodiment of the present application;
fig. 16 is a schematic diagram of a time-domain pulse signal produced by fanning at a microphone according to an embodiment of the present application;
fig. 17 is a schematic diagram of a time-domain pulse signal of a human voice according to an embodiment of the present application;
fig. 18 is a schematic flowchart of an apparatus control method according to an embodiment of the present application;
FIG. 19 is a schematic diagram of a user interface provided by an embodiment of the present application;
fig. 20 is a schematic structural diagram of a first device and a schematic structural diagram of a system composed of the first device and a second device according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described below with reference to the drawings. The terminology used in the description of the embodiments section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application. The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between different objects and not for limiting a particular order. In the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.
As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is to be understood that the terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only, and is not intended to be limiting of the application.
The present application provides a microphone-based device control method that can be applied to a device having a microphone. A microphone as referred to herein is a device for collecting sound signals; it may also be called a mic, a transmitter, a sound pick-up, a receiver, a sound sensor, an acoustic sensor, an audio acquisition device, or some other suitable term. The technical solution is described herein mainly taking a microphone as an example.
Herein, a device to which the device control method described in the present application is applied may also be referred to as a first device, and a device controlled by the first device may be referred to as a second device.
In some aspects, the first device and the second device are different devices; that is, the first device may control the second device based on the methods described herein. The two devices can communicate with each other; for example, the first device may control the second device through wireless fidelity (Wi-Fi), Bluetooth, infrared, or cellular (2G/3G/4G/5G) communication.
Wherein at least one of the first device and the second device may also be referred to as a User Equipment (UE), a wearable device, a mobile unit, a subscriber unit, a wireless unit, a remote unit, a mobile device, a wireless communication device, a remote device, a mobile subscriber station, a terminal device, an access terminal, a mobile terminal, a wireless terminal, a smart terminal, a remote terminal, a handset, a user agent, a mobile client, a client, or some other suitable terminology.
For example, the first device may be a wearable device with a microphone, such as an earphone, smart glasses, a smart watch, or a smart bracelet; a mobile terminal with a microphone, such as a smartphone, tablet computer, or notebook computer; a smart home device with a microphone, such as a speaker, smart television, smart air conditioner, or smart refrigerator; or a vehicle-mounted device with a microphone, such as an electric bicycle or automobile device. The embodiments of the present application do not specifically limit the form of the first device.
For example, the second device may be a mobile terminal to be controlled, such as a smartphone, tablet computer, or notebook computer; a smart home device to be controlled, such as a speaker, smart television, smart air conditioner, or smart refrigerator; or a vehicle-mounted device to be controlled, such as an electric bicycle or automobile device. The embodiments of the present application do not specifically limit the form of the second device.
For ease of understanding, the following description mainly takes the first device as an earphone and the second device as a mobile terminal (e.g., a mobile phone) as an example.
Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture provided in an embodiment of the present application, where the system architecture includes a mobile terminal (a smart phone is an example of the mobile terminal in the figure) and an earphone having a microphone, and a communication connection may be established between the earphone and the mobile terminal.
In terms of communication mode, the headset to which the present application applies may be wireless or wired. A wireless headset connects to the mobile terminal wirelessly and can be further divided by the radio frequency band it uses into infrared wireless headsets, meter-wave wireless headsets (e.g., FM headsets), decimeter-wave wireless headsets (e.g., Bluetooth headsets), and so on. A wired headset connects to the mobile terminal through a wire (e.g., a cable), and can be divided by cable shape into round-cable headsets, flat-cable headsets, and the like.
In terms of how it is worn, the earphone applying the present application may be an in-ear earphone, a half-in-ear earphone, an ear-hook earphone, a neck-hook earphone, a headphone (on-ear or around-ear), a bone conduction earphone, and the like.
In terms of acoustic structure, the earphone applying the present application may be a closed earphone, an open earphone, a semi-open earphone, and the like.
In terms of noise reduction, the earphone to which the present application applies may have an active noise cancellation (ANC) function, a passive noise reduction function, or no noise reduction.
This document mainly describes the scheme taking a wireless headset (e.g., a Bluetooth headset) with one or more microphones as an example.
For example, fig. 2 shows a wireless communication (e.g., bluetooth headset) headset that includes a microphone that extends beyond the earpiece device of the headset to facilitate manipulation of the microphone by a user as desired.
For another example, fig. 3 shows an in-ear earphone for wireless communication (e.g., a Bluetooth headset), in which a microphone may be built into the earphone body, making the in-ear earphone more compact and portable. Each in-ear earphone includes, for example, one microphone; a pair of in-ear earphones (one for the left ear and one for the right) thus includes two microphones, referred to as the left microphone and the right microphone respectively.
For another example, fig. 4 shows a wireless communication earphone that includes two microphones, a left microphone connected to the left earpiece and a right microphone connected to the right earpiece. The user can operate at least one of the two microphones as needed, which extends the ways in which the user can operate the microphones.
It should be noted that, in some other embodiments of the present application, the first device and the second device may be the same device, that is, in such embodiments, the first device may implement self-control based on the method described in the present application. For example, as shown in fig. 5, taking the first device as a smartphone as an example, the smartphone may include one or more microphones, for example, a top microphone disposed at the top of the smartphone and a bottom microphone disposed at the bottom of the smartphone. The user may operate at least one of the two microphones as desired. Specific implementation processes of the schemes can refer to implementation modes of the schemes when the first device and the second device are different devices, and detailed descriptions are not provided herein.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an exemplary apparatus 100 provided in an embodiment of the present application. In some embodiments, device 100 may be the first device described herein. As shown in fig. 6, device 100 includes one or more processors 110, one or more memories 120, a communication interface 130, an audio collection circuit, and an audio playback circuit. The audio collection circuit may further include a microphone 140 and an analog-to-digital converter (ADC) 150; the audio playback circuit may further include a speaker 160 and a digital-to-analog converter (DAC) 170. These components may communicate over one or more communication buses and are described below in turn:
processor 110 is the control center of device 100 and may also be referred to as a control unit, a controller, a microcontroller, or some other suitable terminology. The processor 110 connects the various components of the device 100 using various interfaces and lines, and in a possible embodiment, the processor 110 may also include one or more processing cores.
The memory 120 may be coupled to the processor 110 or coupled to the processor 110 via a bus for storing various software programs and/or sets of instructions and data. In particular implementations, memory 120 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 120 may also store one or more computer programs comprising program instructions for the methods described herein. The memory 120 may also store a communication program that may be used to communicate with at least one second device. The memory 120 may also store relevant data/code to execute the various functional modules of the first device as in the embodiment of fig. 20.
In particular embodiments of the present application, the processor 110 may be configured to call program instructions in the memory 120 to perform the functions of the first device side as in the embodiments of fig. 7, fig. 14, or fig. 18. Alternatively, the processor 110 may perform control of the second device by running various functional modules stored in the memory 120 and invoking data (e.g., captured audio signals) stored in the memory 120.
The communication interface 130 is used for communicating with the second device, in either a wired or a wireless mode. When the communication is wired, the communication interface 130 may connect to the second device through a cable. When the communication is wireless, the communication interface 130 receives and transmits radio frequency signals, and the supported wireless modes may be, for example, at least one of Bluetooth, wireless fidelity (Wi-Fi), infrared, or cellular (2G/3G/4G/5G) communication. In particular implementations, the communication interface 130 may include, but is not limited to: an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chip, a SIM card, a storage medium, and the like. In some embodiments, the communication interface 130 may be implemented on a separate chip.
The microphone 140 may be used to collect an audio signal (a sound signal; at this stage an analog signal), and the analog-to-digital converter 150 converts the analog signal collected by the microphone 140 into a digital signal and sends it to the processor 110 for processing.
In this embodiment of the application, the microphone 140 may collect an audio signal that is not recognizable by human ears in a real-life environment. For example, the user performs a silent operation around the microphone 140, i.e., an operation triggered by the user's hand that produces no sound perceptible to the human ear, such as fanning near the sound-collecting portion of the microphone 140, or touching/rubbing/lightly clicking that portion (the part of the microphone 140 used to collect the audio signal). The microphone 140 collects the audio signal caused by the silent operation and transmits it, through the analog-to-digital converter 150, to the processor 110. The processor 110 identifies the collected signal and, according to the characteristics of the recognized signal and preset rules, sends a control command to the second device through the communication interface 130, completing control of the second device: for example, making a call, answering or hanging up a call, sending a message, playing or pausing music or video, switching tracks, adjusting the volume, locking or unlocking the screen, or turning a function mode on or off. A sketch of this flow follows.
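The loop below sketches that flow end to end under stated assumptions: every function is a hypothetical stand-in (read_adc_frame for samples delivered by converter 150, classify_frame for the detection step, send_command for communication interface 130); none of the names come from the patent.

```python
import numpy as np

def read_adc_frame(n=1024):
    return np.zeros(n)  # stand-in for one PCM frame from the ADC

def classify_frame(frame):
    return None  # stand-in: 'rub', 'click', 'fan', 'blow', or None

def send_command(command):
    print('sending', command)  # stand-in for the wireless link to the second device

def control_loop(preset=None, max_frames=100):
    """Poll frames, recognize silent operations, fire mapped commands."""
    preset = preset or {'blow': 'ANSWER_CALL'}  # example mapping only
    for _ in range(max_frames):  # bounded so the sketch terminates
        kind = classify_frame(read_adc_frame())
        if kind in preset:
            send_command(preset[kind])
```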
The communication interface 130 may further receive data (a digital signal) from the second device and transmit it to the processor 110 for processing; the processor 110 sends the processed data to the digital-to-analog converter 170, which converts it into an analog signal and passes it to the speaker 160 for playback, so that the user hears the played sound. The analog signal may be, for example, music or voice.
Those skilled in the art will appreciate that the device 100 is merely one example provided by embodiments of the present application. In particular implementations of the present application, the apparatus 100 may have more or fewer components than shown, may combine two or more components, or may have a different configuration implementation of components.
For example, in one implementation, when the device 100 is an active noise reduction earphone, the device further includes a noise reduction processing circuit (ANC circuit) (not shown) for implementing an active noise reduction function of the device 100. The processor 110 and the noise reduction processing circuit may be integrated on one processor chip or may be implemented on two separate processor chips.
It should be noted that, in an alternative case, the above-mentioned components of the apparatus 100 may also be coupled together.
It should be understood that in various embodiments of the present application, the term "coupled" refers to being interconnected in a particular way, including being directly connected or indirectly connected through other devices, such as various interfaces, transmission lines, or buses, etc., which are usually electrical communication interfaces, but not excluding possible mechanical interfaces or other forms of interfaces, and the embodiments of the present application are not limited thereto.
Based on the above description, some device control methods provided by the embodiments of the present application are given below.
For convenience, the method embodiments below are described as a series of action steps, but those skilled in the art should understand that the specific implementation of the technical solution of the present application is not limited by the order of the described steps.
Referring to fig. 7, fig. 7 is a flowchart illustrating a method for controlling a device according to an embodiment of the present disclosure, and in some implementations, the method is applicable to a first device having at least one microphone. The method includes, but is not limited to, the steps of:
s101, detecting a silent operation of a user around (or near) at least one microphone of a first device.
In the present application, a "silent operation" is an operation triggered by the user's hand that produces no sound perceptible to the human ear. For example, if the minimum sound intensity audible to the human ear is taken as 1 decibel (dB), the sound caused by the user's operation may be below 1 dB.
In some embodiments, the user may trigger a silent operation by contacting the first device. For example, the user may use a finger to trigger a silent operation on the sound-collecting portion of at least one microphone of the first device, i.e., the part of the microphone used for collecting audio signals. The silent operation may include, for example, at least one of: rubbing the microphone, or clicking (touching, bumping, tapping) the microphone.
Taking silent operations performed by the user on an in-ear headphone as an example, the in-ear headphone may comprise at least one microphone. Referring to fig. 8 and 9, fig. 8 and 9 illustrate exemplary scenarios of several silent operations performed by the user on the in-ear headphone.
The silent operation shown in fig. 8 is rubbing the microphone: the user may rub the sound collecting portion (sound-transmitting hole) of the microphone with a finger, for example by touching the microphone and moving a small distance along it, moving a finger back and forth across it, or tracing loops or curves over it. In this way, the first device may detect a silent operation triggered by the user. Fig. 8 (1) shows a scene in which the user rubs the microphone along the longitudinal direction of its sound collecting portion, and fig. 8 (2) shows a scene in which the user rubs the microphone along the lateral direction of its sound collecting portion. Each arrow in the drawing indicates one rubbing operation; (1) and (2) in fig. 8 each take 3 rubs as an example, and the direction of each rub may be the same or different. In the embodiments of the present application, "rubbing the microphone" may specifically be one or a combination of the following implementation forms:
A single rub of the at least one microphone. For example, the first device may detect this operation each time the microphone is rubbed once.
Rubbing the at least one microphone a number of times greater than or equal to a second count threshold, where the direction of each rub may be the same or different, and the second count threshold represents the number of rubs that triggers the first device to detect the silent operation. For example, in one implementation, the second count threshold is 2, and the first device may detect the operation when the microphone has been rubbed exactly 2 times; in another implementation, the second count threshold is 2, and the first device may detect the operation when the microphone has been rubbed more than 2 times.
Rubbing the at least one microphone continuously for a duration greater than or equal to a second duration threshold, where the direction of each rub may be the same or different, and the second duration threshold represents the operation duration that triggers the first device to detect the silent operation. For example, in one implementation, the second duration threshold is 2 seconds, and the first device may detect the operation when the sound collecting portion of the microphone has been rubbed for exactly 2 seconds; in another implementation, the second duration threshold is 2 seconds, and the first device may detect the operation when the sound collecting portion of the microphone has been rubbed for more than 2 seconds.
Rubbing the at least one microphone with an intensity greater than or equal to a second intensity threshold. The location of each rub may be the same or different, and the second intensity threshold represents the rubbing intensity that triggers the first device to detect the silent operation; the intensity may be characterized, for example, by the sound intensity, amplitude, or energy of the audio signal caused by the rubbing action. For example, in one implementation, the first device may detect the operation when the rubbing intensity equals the second intensity threshold; in another implementation, the first device may detect the operation when the rubbing intensity exceeds the second intensity threshold.
Rubbing the at least one microphone at intervals greater than or equal to a second time interval. The position of each rub may be the same or different, and the second time interval represents the interval between two rubs that triggers the first device to detect the silent operation. For example, in one implementation, the second time interval is 1 second, and the first device may detect the operation when the user rubs the at least one microphone at 1-second intervals; in another implementation, the second time interval is 1 second, and the first device may detect the operation when the user rubs the at least one microphone at intervals exceeding 1 second.
It should be noted that the above examples describe rubbing the microphone with a finger; it should be understood that, in a possible implementation, the microphone may also be rubbed with other body parts (for example, the palm) or other tools (for example, gloves or pens) to implement the silent operation.
It should be further noted that the above examples are only used to explain the embodiments of the present application; the second count threshold, second duration threshold, second intensity threshold, and second time interval may also take other values, which is not limited in the present application. A minimal sketch of this threshold logic is given below.
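For illustration only, the following minimal Python sketch shows one way such count, duration, intensity, and interval thresholds could be combined to decide whether detected rub events constitute a silent operation. The event representation, function names, and threshold values are assumptions made for the sketch, not values fixed by the embodiments.

```python
from dataclasses import dataclass

@dataclass
class RubThresholds:
    # Hypothetical values; the embodiments do not fix these numbers.
    count: int = 2            # second count threshold (number of rubs)
    duration_s: float = 2.0   # second duration threshold (continuous rubbing)
    intensity: float = 0.5    # second intensity threshold (normalized energy)
    interval_s: float = 1.0   # second time interval (gap between rubs)

def is_silent_operation(events, th=RubThresholds()):
    """events: list of (start_time_s, end_time_s, intensity) rub events,
    ordered by start time. Returns True if any configured condition holds."""
    if not events:
        return False
    # Condition 1: the number of rubs reaches the count threshold.
    if len(events) >= th.count:
        return True
    # Condition 2: a single continuous rub lasts long enough.
    if any(end - start >= th.duration_s for start, end, _ in events):
        return True
    # Condition 3: any rub is strong enough.
    if any(i >= th.intensity for _, _, i in events):
        return True
    # Condition 4: two consecutive rubs are spaced widely enough.
    pairs = zip(events, events[1:])
    if any(b[0] - a[1] >= th.interval_s for a, b in pairs):
        return True
    return False

# Example: two quick rubs of moderate intensity.
print(is_silent_operation([(0.0, 0.3, 0.4), (0.5, 0.8, 0.4)]))  # True (count)
```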
Fig. 9 illustrates the silent operation of clicking the microphone, i.e., the user may touch, bump, or tap the sound collecting portion of the microphone with a finger. In this way, the first device may detect a silent operation triggered by the user. Each circle in the figure represents one click operation (one click is taken as an example in the figure). In the embodiments of the present application, "clicking the microphone" may specifically be one or a combination of the following implementation forms:
A single click of the at least one microphone. For example, the first device may detect this operation each time the microphone is clicked once.
Clicking the at least one microphone a number of times greater than or equal to a fourth count threshold, where the position of each click may be the same or different, and the fourth count threshold represents the number of clicks that triggers the first device to detect the silent operation. For example, in one implementation, the fourth count threshold is 3, and the first device may detect the operation when the microphone has been clicked exactly 3 times; in another implementation, the fourth count threshold is 3, and the first device may detect the operation when the microphone has been clicked more than 3 times.
Clicking the at least one microphone continuously for a duration greater than or equal to a fourth duration threshold, where the location of each click may be the same or different, and the fourth duration threshold represents the operation duration that triggers the first device to detect the silent operation. For example, in one implementation, the fourth duration threshold is 2 seconds, and the first device may detect the operation when the microphone has been clicked for exactly 2 seconds; in another implementation, the fourth duration threshold is 2 seconds, and the first device may detect the operation when the microphone has been clicked for more than 2 seconds.
Clicking the at least one microphone with an intensity greater than or equal to a fourth intensity threshold. The position of each click may be the same or different, and the fourth intensity threshold represents the click intensity that triggers the first device to detect the silent operation; the intensity may be characterized, for example, by the sound intensity, amplitude, or energy of the audio signal caused by the click action. For example, in one implementation, the first device may detect the operation when the click intensity equals the fourth intensity threshold; in another implementation, the first device may detect the operation when the click intensity exceeds the fourth intensity threshold.
Clicking the at least one microphone at intervals greater than or equal to a fourth time interval. The position of each click may be the same or different, and the fourth time interval represents the interval between two clicks that triggers the first device to detect the silent operation. For example, in one implementation, the fourth time interval is 1 second, and the first device may detect the operation when the user clicks the at least one microphone at 1-second intervals; in another implementation, the fourth time interval is 1 second, and the first device may detect the operation when the user clicks the at least one microphone at intervals exceeding 1 second.
It should be noted that the above examples describe clicking the microphone with a finger; it should be understood that, in a possible implementation, the microphone may also be clicked with other body parts (such as a knuckle) or other tools (such as a pen or a tree branch) to implement the silent operation.
It should be further noted that the above examples are only used to explain the embodiments of the present application; the fourth count threshold, fourth duration threshold, fourth intensity threshold, and fourth time interval may also take other values, which is not limited in the present application. A sketch of how click pulses might be counted in the sampled audio follows.
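As a further illustration, a click on the sound collecting portion typically appears as a short, high-energy pulse in the sampled audio. The sketch below counts such pulses with a simple amplitude-threshold crossing; the 0.6 amplitude threshold and the 0.1-second merge gap are invented for the example and are not prescribed by the embodiments.

```python
import numpy as np

def count_click_pulses(samples, rate_hz, amp_threshold=0.6, min_gap_s=0.1):
    """Count short pulses whose absolute amplitude crosses amp_threshold.
    Crossings closer together than min_gap_s are merged into one pulse."""
    above = np.abs(samples) >= amp_threshold
    # Indices where the signal first rises above the threshold.
    onsets = np.flatnonzero(above & ~np.roll(above, 1))
    if onsets.size == 0:
        return 0
    count, last = 1, onsets[0]
    for idx in onsets[1:]:
        if (idx - last) / rate_hz >= min_gap_s:
            count += 1
            last = idx
    return count

# Example: two synthetic 5 ms pulses, 0.5 s apart, at a 16 kHz rate.
rate = 16000
sig = np.zeros(rate)
sig[1000:1080] = 0.9
sig[9000:9080] = 0.9
print(count_click_pulses(sig, rate))  # 2
```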
In still other embodiments, the user may trigger the silent operation without contacting the first device. For example, the user may trigger a silent operation at a location a short distance from the first device (i.e., in the vicinity of the first device). Such a silent operation may include, for example, at least one of: fanning air near the at least one microphone of the first device, or blowing into the at least one microphone of the first device.
Again taking silent operations performed by the user on an in-ear headphone as an example, the in-ear headphone may comprise at least one microphone. Referring to fig. 10 and 11, fig. 10 and 11 illustrate exemplary scenarios of several further silent operations performed by the user on the in-ear headphone.
Fig. 10 shows the silent operation of fanning the microphone, i.e., the user may fan air toward the sound collecting portion of the microphone with the palm of the hand in the vicinity of the microphone. In this way, the first device may detect a silent operation triggered by the user. Each arrow in the drawing indicates one fanning operation (three fanning operations are taken as an example in the drawing). In the embodiments of the present application, "fanning the microphone" may specifically be one or a combination of the following implementation forms:
A single fan toward the at least one microphone. For example, the first device may detect this operation each time the sound collecting portion of the microphone is fanned once.
Fanning the at least one microphone a number of times greater than or equal to a first count threshold. The position of each fan may be the same or different, and the first count threshold represents the number of fans that triggers the first device to detect the silent operation. For example, in one implementation, the first count threshold is 3, and the first device may detect the operation when the microphone has been fanned exactly 3 times; in another implementation, the first count threshold is 3, and the first device may detect the operation when the microphone has been fanned more than 3 times.
Fanning the at least one microphone continuously for a duration greater than or equal to a first duration threshold. The position of each fan may be the same or different, and the first duration threshold represents the operation duration that triggers the first device to detect the silent operation. For example, in one implementation, the first duration threshold is 2 seconds, and the first device may detect the operation when the sound collecting portion of the microphone has been fanned for exactly 2 seconds; in another implementation, the first duration threshold is 2 seconds, and the first device may detect the operation when the sound collecting portion of the microphone has been fanned for more than 2 seconds.
Fanning the at least one microphone with an intensity greater than or equal to a first intensity threshold. The position of each fan may be the same or different, and the first intensity threshold represents the fanning intensity that triggers the first device to detect the silent operation; the intensity may be characterized, for example, by the sound intensity, amplitude, or energy of the audio signal caused by the fanning action. For example, in one implementation, the first device may detect the operation when the fanning intensity toward the sound collecting portion of the microphone equals the first intensity threshold; in another implementation, the first device may detect the operation when that intensity exceeds the first intensity threshold.
Fanning the at least one microphone at intervals greater than or equal to a first time interval. The position of each fan may be the same or different, and the first time interval represents the interval between two fans that triggers the first device to detect the silent operation. For example, in one implementation, the first time interval is 1 second, and the first device may detect the operation when the user fans the microphone at 1-second intervals; in another implementation, the first time interval is 1 second, and the first device may detect the operation when the user fans the microphone at intervals exceeding 1 second.
It should be noted that the above examples describe fanning with the palm; it should be understood that, in a possible implementation, other tools (e.g., a sheet of paper or a small fan) may also be used to fan the microphone to implement the silent operation.
It should be further noted that the above examples are only used to explain the embodiments of the present application; the first count threshold, first duration threshold, first intensity threshold, and first time interval may also take other values, which is not limited in the present application.
Fig. 11 shows the silent operation of blowing into the microphone, i.e., the user may blow or exhale air toward the sound collecting portion of the microphone with the mouth in the vicinity of the microphone. In this way, the first device may detect a silent operation triggered by the user. In the embodiments of the present application, "blowing into the microphone" may specifically be one or a combination of the following implementation forms:
A single blow into the at least one microphone. For example, the first device may detect this operation each time air is blown toward the sound collecting portion of the microphone.
Blowing into the at least one microphone a number of times greater than or equal to a third count threshold. The position of each blow may be the same or different, and the third count threshold represents the number of blows that triggers the first device to detect the silent operation. For example, in one implementation, the third count threshold is 2, and the first device may detect the operation when the microphone has been blown into exactly 2 times; in another implementation, the third count threshold is 2, and the first device may detect the operation when the microphone has been blown into more than 2 times.
Blowing into the at least one microphone continuously for a duration greater than or equal to a third duration threshold. The location of each blow may be the same or different, and the third duration threshold represents the operation duration that triggers the first device to detect the silent operation. For example, in one implementation, the third duration threshold is 1 second, and the first device may detect the operation when air has been blown into the microphone for exactly 1 second; in another implementation, the third duration threshold is 1 second, and the first device may detect the operation when air has been blown into the microphone for more than 1 second.
Blowing into the at least one microphone with an intensity greater than or equal to a third intensity threshold. The position of each blow may be the same or different, and the third intensity threshold represents the blowing intensity that triggers the first device to detect the silent operation; the intensity may be characterized, for example, by the sound intensity, amplitude, or energy of the audio signal caused by blowing. For example, in one implementation, the first device may detect the operation when the blowing intensity toward the sound collecting portion of the microphone equals the third intensity threshold; in another implementation, the first device may detect the operation when that intensity exceeds the third intensity threshold.
Blowing into the at least one microphone at intervals greater than or equal to a third time interval. The position of each blow may be the same or different, and the third time interval represents the interval between two blows that triggers the first device to detect the silent operation. For example, in one implementation, the third time interval is 1 second, and the first device may detect the operation when the user blows into the sound collecting portion of the microphone at 1-second intervals; in another implementation, the third time interval is 1 second, and the first device may detect the operation when the user blows into the sound collecting portion of the microphone at intervals exceeding 1 second.
It should be noted that the above examples are only used to explain the embodiments of the present application; the third count threshold, third duration threshold, third intensity threshold, and third time interval may also take other values, which is not limited in the present application.
It should be further noted that the first, second, third, and fourth count thresholds may have the same or different values; the first, second, third, and fourth duration thresholds may have the same or different values; the first, second, third, and fourth intensity thresholds may have the same or different values; and the first, second, third, and fourth time intervals may have the same or different values. The present application is not limited in any of these respects.
It should be noted that, in addition to the silent operations listed above, the present application may also vary these silent operations or combine two or more of them; for example, when the first device includes two or more microphones, a silent operation may also be formed by a sequence of operations on different microphones. In addition, based on the technical idea of the present application, silent operations of other implementation forms can be derived.
S102, in response to the silent operation, controlling the second device.
Specifically, in response to the silent operation detected in S101, the first device generates a corresponding control command (also referred to as a control instruction) for instructing the second device to perform a certain function. The first device sends the control command to the second device, and the second device executes the relevant function according to the control command.
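Purely as a sketch, S102 can be pictured as a lookup from the detected silent operation to a command code, followed by transmission over the existing connection. The command codes and the send_to_second_device placeholder below are hypothetical; the embodiments do not define a wire format.

```python
# Hypothetical command codes; the embodiments do not define these names.
COMMANDS = {
    "rub_microphone": "VOLUME_UP",
    "double_fan": "ANSWER_CALL",
    "double_click": "ANSWER_CALL",
}

def send_to_second_device(command: str) -> None:
    # Placeholder for the real transport (e.g., a Bluetooth link write).
    print(f"-> second device: {command}")

def on_silent_operation(op: str) -> None:
    """S102: respond to a detected silent operation by issuing a command."""
    command = COMMANDS.get(op)
    if command is not None:
        send_to_second_device(command)

on_silent_operation("rub_microphone")  # -> second device: VOLUME_UP
```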
For example, in the case where the second device is a smartphone, the first device may, through the control command, control the second device to perform at least one of: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing a video, switching songs, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on/off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power saving mode, an active noise reduction (ANC) function, a listening (hear-through) mode, and the like, which is not limited in this application.
In order to better understand the scheme of the present application, the scenarios shown in fig. 12 and 13 are described below as examples.
Referring to fig. 12, in one possible application scenario, taking the first device as an in-ear headphone and the second device as a smartphone as an example, the in-ear headphone may include at least one microphone and establishes a communication connection with the smartphone, and the user may listen to music on the smartphone through the in-ear headphone. As shown in (1) in fig. 12, a music playing interface is presented on the smartphone, and the speaker of the headphone plays the music. When the user wants to turn up the volume but it is inconvenient or undesirable to operate the smartphone directly, the user may choose to rub the microphone of the in-ear headphone with a finger, as shown in (2) in fig. 12. After the in-ear headphone detects this silent operation, it is triggered to generate a "volume up" control command and controls the smartphone to increase the volume based on the control command, as in the interface with increased music playing volume presented in (3) in fig. 12. In a specific implementation, the playing volume may be designed to increase with the duration of rubbing the microphone, or to increase by a preset amount each time the microphone is rubbed, which is not limited in the present application. In this way, the user controls the smartphone to adjust the music playing volume using the microphone without making a sound; the operation is simple and convenient, and the user experience is improved.
Referring to fig. 13, in yet another possible application scenario, again taking the first device as an in-ear headphone and the second device as a smartphone as an example, the in-ear headphone may include at least one microphone and establishes a communication connection with the smartphone; for example, the user may listen to music on the smartphone through the headphone (not shown). As shown in (1) in fig. 13, the smartphone receives an incoming call and presents an incoming-call interface. If it is inconvenient or undesirable for the user to operate the smartphone directly, the user may choose to fan the microphone of the in-ear headphone twice with the palm, as shown in (2) in fig. 13. After the in-ear headphone detects this silent operation, it is triggered to generate an "answer the call" control command and controls the smartphone to answer the call based on the control command, as in the call interface presented in (3) in fig. 13. Subsequently, the user can continue to use the microphone for the voice call. In this way, the user controls the smartphone to answer the call using the microphone without making a sound; the operation is simple and convenient, and the user experience is improved.
It can be seen that, in the embodiments of the present application, the first device can control the second device by detecting a silent operation in which the user rubs or clicks the microphone, or fans or blows near the microphone. The second device can therefore be controlled without the user making a sound, which expands the ways in which a user can control a device. The input form of the silent operation is simple and convenient, and it improves the discreetness of operations such as having the second device dial an emergency call. In addition, since the first device is already provided with a microphone, the control function can be realized by reusing the microphone on the device without additional sensor components, so that the cost is low and the power consumption of the device is low. The present application can therefore greatly improve the user experience.
Referring to fig. 14, fig. 14 is a specific flowchart of a device control method provided by an embodiment of the present application. In some implementations, the method is applicable to a first device having at least one microphone. The method includes, but is not limited to, the following steps:
S201, collecting an audio signal through at least one microphone.
Herein, the audio signal may include a wind noise signal caused by a silent operation of the user.
In a specific embodiment of the present application, when the user performs a silent operation such as rubbing the microphone, clicking the microphone, fanning the microphone, or blowing into the microphone on the at least one microphone, the audio signal acquired by the first device may be a wind noise signal caused by the silent operation of the user, i.e., a signal with wind-noise-like characteristics.
For example, by rubbing or clicking the sound collecting portion of the microphone with a finger, the user may cause the sound collecting portion to vibrate and/or generate an airflow disturbance, which drives the microphone diaphragm to vibrate and generate an audio signal.
For another example, the user may generate airflow by fanning near the microphone with the palm, which in turn drives the microphone diaphragm to vibrate and generate the audio signal.
For another example, the user may blow air near the microphone with the mouth, generating airflow that drives the microphone diaphragm to vibrate and generate the audio signal.
S202, determining whether a control command corresponding to the audio signal exists. If a corresponding control command exists, continue to S203; if no corresponding control command exists, the flow ends.
For example, the control command comprises at least one of the following: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing a video, switching songs, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on/off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power saving mode, an active noise reduction (ANC) function, a listening (hear-through) mode, and the like, which is not limited in this application.
In one possible implementation of the present application, model training may be performed in advance using a machine learning method. The training data used includes audio signal data caused by silent operations (e.g., audio signals caused by rubbing, clicking, fanning, or blowing into the microphone) and control labels (used to indicate the control command corresponding to the audio signal data); that is, the model represents a mapping relationship between audio signals and control commands. The first device obtains and saves the model. When the first device acquires an audio signal, the audio signal can be input to the model to obtain the control command output by the model.
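One way to picture this end-to-end variant is sketched below: a small classifier maps fixed-length audio feature vectors directly to control labels. scikit-learn is used here only as a stand-in for whatever model the device would actually carry, and the feature vectors and labels are synthetic.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in training set: 8-dim feature vectors extracted from
# audio caused by silent operations, with control labels as targets.
X = np.vstack([rng.normal(0.0, 1.0, (50, 8)),    # e.g., "rub" signatures
               rng.normal(3.0, 1.0, (50, 8))])   # e.g., "blow" signatures
y = np.array(["VOLUME_UP"] * 50 + ["ANSWER_CALL"] * 50)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                      random_state=0).fit(X, y)

# At run time: feature-extract the captured audio, then query the model.
captured = rng.normal(3.0, 1.0, (1, 8))
print(model.predict(captured))  # likely ['ANSWER_CALL']
```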
In another possible implementation of the present application, a mapping relationship between combinations of the signal type and audio features of audio signals caused by silent operations of the user and control commands may be preconfigured, and the control command corresponding to a captured audio signal may then be determined according to this mapping relationship.
The signal type may include wind noise; that is, the signal type may be the type of an audio signal caused by wind noise. For example, in one particular classification, the signal types may be divided into: a signal type caused by rubbing the microphone, e.g., the user rubs across the sound collecting portion of the microphone with a finger, causing the sound collecting portion to vibrate and/or generating an airflow disturbance that drives the microphone diaphragm to vibrate and generate the audio signal; a signal type caused by clicking the microphone, e.g., the user clicks the sound collecting portion of the microphone with a finger, causing it to vibrate and/or generating an airflow disturbance that drives the microphone diaphragm to vibrate and generate the audio signal; a signal type caused by fanning the microphone, e.g., the user fans air near the microphone with the palm, generating airflow that drives the microphone diaphragm to vibrate and generate the audio signal; and a signal type caused by blowing into the microphone, e.g., the user blows air near the microphone with the mouth, generating airflow that drives the microphone diaphragm to vibrate and generate the audio signal. Of course, in other embodiments, there may be more or fewer classifications than illustrated, or signal types caused by other silent operations, which are not limited herein.
Audio features include, but are not limited to: the frequency of the triggered pulse signal, the duration of the pulse signal, the energy of the pulse signal, the order in which different signal types are collected, the sound duration characteristic, the sound intensity characteristic, and the like. The audio features of the audio signal can reflect the action characteristics of the silent operation, which include one or more of action frequency, action strength, action rhythm, and action sequence. The action frequency represents the number of times the action in the silent operation is performed, reflecting, for example, the number and speed of actions. The action strength represents the force with which the action in the silent operation is performed. The action rhythm may represent the time interval between two actions, e.g., the interval between two fanning actions; the action rhythm may also be formed by a combination of action frequency and action strength. The action sequence refers to the order in which actions of different action types are performed when the silent operation involves multiple action types, and/or the order in which silent operations are performed on different microphones when there are two or more microphones.
In yet another possible implementation of the present application, a machine learning method may be adopted in advance to train a model on a large amount of training data. The training data includes audio signal data caused by silent operations (e.g., audio signals caused by rubbing, clicking, fanning, or blowing into the microphone) and signal type labels (indicating the signal type of the corresponding silent operation); that is, the model characterizes a mapping relationship between audio signals and signal types. The first device obtains and saves the model. When the first device captures an audio signal, the audio signal may be input to the model to obtain the signal type output by the model. In addition, the first device may perform feature extraction on the audio signal of that signal type through an algorithm to obtain the audio features. The control command corresponding to the audio signal may then be determined according to the preconfigured mapping relationship between combinations of signal type and audio features and control commands.
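The two-stage variant can be sketched as: (1) classify the signal type, (2) extract audio features, and (3) look up the combination in the preconfigured mapping. Everything below, including the classifier stub, the toy pulse-count feature, and the mapping entries, is illustrative rather than part of the embodiments.

```python
def classify_signal_type(audio):
    # Stage 1: stand-in for the trained model that outputs a signal type.
    return "fan" if max(abs(s) for s in audio) < 0.5 else "rub"

def extract_count(audio, threshold=0.2):
    # Stage 2: a toy feature: count threshold crossings as "pulses".
    pulses, inside = 0, False
    for s in audio:
        if abs(s) >= threshold and not inside:
            pulses, inside = pulses + 1, True
        elif abs(s) < threshold:
            inside = False
    return pulses

# Stage 3: preconfigured (signal type, pulse count) -> control command.
MAPPING = {("fan", 1): "TOGGLE_PLAY", ("rub", 2): "VOLUME_UP"}

def audio_to_command(audio):
    key = (classify_signal_type(audio), extract_count(audio))
    return MAPPING.get(key)  # None means no command exists; the flow ends

print(audio_to_command([0.0, 0.3, 0.0]))       # TOGGLE_PLAY
print(audio_to_command([0.9, 0.0, 0.9, 0.0]))  # VOLUME_UP
```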
In a specific implementation, the model referred to in the embodiments herein may be one of the following: a Neural Network (NN) model, a Deep Neural Network (DNN) model, a Factorization-machine-supported Neural Network (FNN) model, a Convolutional Neural Network (CNN) model, an Inner Product-based Neural Network (IPNN) model, an Outer Product-based Neural Network (OPNN) model, a Neural Factorization Machine (NFM) model, and so forth.
S203, controlling the second device according to the control command.
In a specific embodiment, the control command is used to instruct the second device to perform a certain function. The first device sends the control command to the second device, and the second device executes the relevant function according to the control command. For example, in the case where the second device is a smartphone, the first device may, through the control command, control the second device to perform at least one of: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing a video, switching songs, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on/off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power saving mode, an active noise reduction (ANC) function, a listening (hear-through) mode, and the like, which is not limited in this application.
It should be noted that, in a possible embodiment, the control command may also be used to instruct the first device itself to perform a certain function. For example, in a scenario where the first device is a mobile phone, the first device may, according to the control command, perform at least one of: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing a video, switching songs, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on/off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power saving mode, an active noise reduction (ANC) function, a listening (hear-through) mode, and the like, which is not limited in this application.
It can be seen that, in the embodiments of the present application, the first device collects, through the microphone, the audio signal caused by a silent operation in which the user rubs or clicks the microphone, or fans or blows near the microphone, and determines from the audio signal the control command that the silent operation is intended to trigger, so that the second device can be controlled according to the control command corresponding to the silent operation. The second device can therefore be controlled without the user making a sound, which expands the ways in which a user can control a device. The input form of the silent operation is simple and convenient, and it improves the discreetness of operations such as having the second device dial an emergency call. In addition, since the first device is already provided with a microphone, the control function can be realized by reusing the microphone on the device without additional sensor components, so that the cost is low and the power consumption of the device is low. The present application can therefore greatly improve the user experience.
Referring to fig. 18, fig. 18 is a detailed flowchart of another device control method provided by an embodiment of the present application. In some implementations, the method may be applied to a first device having at least one microphone. The method includes, but is not limited to, the following steps:
S301, the first device stores locally in advance the mapping relationship between combinations of the action type and action characteristics of silent operations and control commands.
That is, when the action type and the action characteristic conform to a preset condition (i.e., form a preset combination), that preset combination has a mapping relationship with a control command.
The action type of the silent operation indicates the operation type of the action in the silent operation; for example, the action type may be rubbing the microphone, clicking the microphone, fanning the microphone, blowing into the microphone, or the like.
The action characteristics of the silent operation include one or more of action frequency, action strength, action rhythm, and action sequence. The action frequency represents the number of times the action in the silent operation is performed, reflecting, for example, the number and speed of actions; the action strength represents the force with which the action in the silent operation is performed; the action rhythm may represent the time interval between two actions, or may be formed by a combination of action frequency and action strength; the action sequence refers to the order in which actions of different action types are performed when the silent operation involves multiple action types, and/or the order in which silent operations are performed on different microphones when there are two or more microphones.
The control command is used to control the second device, and includes at least one of the following: controlling the second device to make a call, answer/hang up a call, send a message, play/pause music, play/pause a video, switch songs, adjust the volume, lock/unlock the screen, or turn a designated function mode on/off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power saving mode, an active noise reduction (ANC) function, a listening (hear-through) mode, and the like.
For example, in one possible implementation scenario, taking the first device as an earphone and the second device as a smartphone, when the first device includes only a single microphone, or only one particular microphone of the first device is configured for silent operation, the mapping relationship information between combinations of action type and action characteristic and control commands may be configured as shown in table 1 below. The mapping relationship information may be obtained by the first device in advance from another device (for example, the second device or another terminal device) or from a server, or may be a factory default setting.
TABLE 1

Preset combination of action type and action characteristic | Control command
Fan once near the microphone | Toggle start/stop
Click the microphone once | Decrease the volume
Duration of rubbing the microphone | How much the volume is decreased
Number of times the microphone is rubbed | How much the volume is increased
Click the microphone twice | Answer a call
Fan twice near the microphone | Hang up a call
Click the microphone once and then rub the microphone once | Dial an emergency call
Table 1 is only for explaining the embodiments of the present application, and does not limit the present application.
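As an aid to reading table 1, the rules could hypothetically be encoded as follows: the discrete combinations map directly to commands, while the rub duration and rub count map to a volume change amount. The command names, step size, and proportional rule are assumptions made for the sketch.

```python
# Hypothetical encoding of table 1: discrete combinations map to commands.
DISCRETE_RULES = {
    ("fan", 1): "TOGGLE_START_STOP",
    ("click", 1): "VOLUME_DOWN",
    ("click", 2): "ANSWER_CALL",
    ("fan", 2): "HANG_UP",
    ("click_then_rub", 1): "DIAL_EMERGENCY",
}

def volume_delta_from_rub(duration_s=None, count=None, step=5):
    """Continuous rules from table 1: rub duration lowers the volume and
    rub count raises it, by an amount proportional to the input."""
    if duration_s is not None:
        return -step * int(duration_s)   # longer rub -> larger decrease
    if count is not None:
        return +step * count             # more rubs -> larger increase
    return 0

print(DISCRETE_RULES[("fan", 2)])      # HANG_UP
print(volume_delta_from_rub(count=3))  # 15
```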
For another example, in one possible implementation scenario in which the first device is an earphone and the second device is a smartphone, when the first device includes two microphones (e.g., a left microphone and a right microphone), or only two microphones of the first device are configured for silent operation, the mapping relationship information between combinations of action type and action characteristic and control commands may be configured as shown in table 2 below. The mapping relationship information may be obtained by the first device in advance from another device (for example, the second device or another terminal device) or from a server, or may be a factory default setting.
TABLE 2
[Table 2 appears as an image in the original publication; its contents are not reproduced here.]
Table 2 is only for explaining the embodiments of the present application, and does not limit the present application.
For another example, in one possible implementation scenario in which the first device is an earphone and the second device is a smartphone, the first device includes two microphones (e.g., a left microphone and a right microphone), or only two microphones of the first device are configured for silent operation, and the user may customize the mapping rules through the smartphone. For example, the headset is connected to the smartphone, and the smartphone displays a user interface (UI) for customizing the mapping rules, as shown in fig. 19. Through this UI, the user can reset, add, delete, or modify rules in the mapping relationship between combinations of action type and action characteristic and control commands. After the user sets the mapping relationship, the smartphone can send the mapping relationship information to the earphone for storage. In this embodiment the user customizes the rules through the UI, which is more flexible and convenient, better matches the user's personal habits, and improves the user experience.
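One conceivable realization of this customization flow is for the smartphone to serialize the edited rules and push them to the earphone over the existing connection. The JSON shape below is invented for illustration; the embodiments do not specify a serialization format.

```python
import json

# Hypothetical serialized form of user-edited mapping rules (fig. 19).
custom_rules = [
    {"action": "fan",   "count": 1, "command": "TOGGLE_START_STOP"},
    {"action": "click", "count": 2, "command": "ANSWER_CALL"},
]

payload = json.dumps(custom_rules).encode("utf-8")

# On the earphone side: parse and store the rules for later lookups.
def apply_rules(raw: bytes):
    rules = json.loads(raw.decode("utf-8"))
    return {(r["action"], r["count"]): r["command"] for r in rules}

table = apply_rules(payload)
print(table[("click", 2)])  # ANSWER_CALL
```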
In the embodiments of the present application, the action type corresponds to the signal type, and the action characteristic corresponds to the audio feature. Therefore, in the above mapping relationship, a combination (preset combination) of action type and action characteristic directly reflects a combination (preset combination) of signal type and audio feature, and the mapping relationship between combinations of action type and action characteristic and control commands directly reflects the mapping relationship between combinations of signal type and audio feature and control commands.
The embodiment of fig. 19 is only for explaining the aspects of the present application, and is not intended to limit the present application.
S302, collecting audio signals through at least one microphone.
The audio signal may include a wind noise signal (or noise signal) caused by a silent operation of the user.
In a specific embodiment of the present application, when the user performs a silent operation such as rubbing the microphone, clicking the microphone, fanning the microphone, or blowing into the microphone on the at least one microphone, the audio signal acquired by the first device may be a wind noise signal caused by the silent operation of the user, i.e., a signal with wind-noise-like characteristics.
For example, by rubbing or clicking the sound collecting portion of the microphone with a finger, the user may cause the sound collecting portion to vibrate and/or generate an airflow disturbance, which drives the microphone diaphragm to vibrate and generate an audio signal.
For another example, the user may generate airflow by fanning near the microphone with the palm, which in turn drives the microphone diaphragm to vibrate and generate the audio signal.
For another example, the user may blow air near the microphone with the mouth, generating airflow that drives the microphone diaphragm to vibrate and generate the audio signal.
S303, determining the signal type and the audio features of the audio signal.
The signal type may include wind noise; that is, the signal type may be the type of an audio signal caused by wind noise. For example, in one particular classification, the signal types may be divided into: a signal type caused by rubbing the microphone, a signal type caused by clicking the microphone, a signal type caused by fanning the microphone, and a signal type caused by blowing into the microphone. Of course, in other embodiments, there may be more or fewer classifications than illustrated, or signal types caused by other silent operations, which are not limited herein.
In a specific embodiment, the first device may obtain the signal type of the captured audio signal by detecting the audio signal. Specifically, the signal type may be obtained by detecting one or more of the time-domain features and spectral features of the audio signal. The time-domain feature represents the time-domain pulse signal of the audio signal caused by the silent operation and can be expressed as the variation of amplitude with time; the spectral feature represents the spectral density of the audio signal caused by the silent operation and can be expressed, for example, as the variation of amplitude with frequency.
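For illustration, the two kinds of features could be computed along these lines: a short-window RMS envelope as the time-domain (amplitude versus time) feature, and a magnitude spectrum as the frequency-domain (amplitude versus frequency) feature. The window length and the use of an FFT here are choices made for the sketch, not requirements of the embodiments.

```python
import numpy as np

def time_and_spectral_features(samples, rate_hz):
    """Illustrative features: an amplitude envelope (time domain) and a
    magnitude spectrum (frequency domain) for a captured audio frame."""
    samples = np.asarray(samples, dtype=float)
    # Time domain: short-window RMS envelope (amplitude vs. time).
    win = max(1, rate_hz // 100)                  # roughly 10 ms windows
    pad = (-len(samples)) % win
    frames = np.pad(samples, (0, pad)).reshape(-1, win)
    envelope = np.sqrt((frames ** 2).mean(axis=1))
    # Frequency domain: magnitude spectrum (amplitude vs. frequency).
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate_hz)
    return envelope, freqs, spectrum

# Example: a slow sinusoid sampled for one second at 16 kHz.
env, freqs, spec = time_and_spectral_features(
    np.sin(np.linspace(0, 100, 16000)), 16000)
print(len(env), freqs[np.argmax(spec)])
```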
In the embodiments of the present application, many different methods may be adopted to determine the signal type corresponding to an audio signal. The signal type may be determined using a traditional pattern recognition method based on features extracted from the audio signal, or using a method based on neural networks or deep learning. For example, suppose the captured audio signal is caused by the action of fanning toward the microphone; such a signal may be referred to as a wind noise signal. Whether a traditional pattern recognition method or a neural-network- or deep-learning-based method is adopted, a large number of wind noise signals caused by fanning toward a microphone are collected first. The time-frequency characteristics of the collected audio signals are affected by the distance between the fanning action and the microphone, the strength of the fanning action, the frequency of the fanning action, and the like.

If a traditional pattern-recognition-based method is adopted, a large number of wind noise signals caused by fanning toward a microphone are collected to form a training data set; feature extraction is performed on the wind noise signals in the training data set to form feature vectors, and a decision threshold corresponding to each feature is obtained by training on the feature vectors of the training data set, for determining whether a signal type belongs to wind noise. When determining the signal type, feature extraction is performed on the collected audio signal to obtain its feature vector, and whether the collected audio signal is a wind noise signal is determined from the feature vector and the decision thresholds corresponding to the features.

If a method based on neural networks or deep learning is adopted, a large number of wind noise signals caused by fanning toward a microphone are likewise collected to form a training data set, and a network for determining whether an input signal is a wind noise signal is trained on that data set. When determining the signal type, the collected audio signal is used as the input of the network, and the decision result indicates whether the collected audio signal is a wind noise signal.
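The traditional pattern-recognition variant might look like the following sketch, in which per-feature decision thresholds are derived from a labelled training set and a new feature vector is compared against them. The two features and the percentile rule are assumptions; a real system would use richer feature vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: feature vectors [low-band energy ratio, total energy]
# extracted from recorded wind noise (fanning) vs. other audio.
wind = np.column_stack([rng.uniform(0.7, 1.0, 200),
                        rng.uniform(0.1, 0.5, 200)])
other = np.column_stack([rng.uniform(0.0, 0.5, 200),
                         rng.uniform(0.3, 1.0, 200)])

# "Training": derive a decision threshold from the wind-noise set, e.g.,
# a low percentile so that most wind-noise samples pass the test.
low_band_th = np.percentile(wind[:, 0], 5)

def is_wind_noise(feature_vec):
    # Decide on the low-band energy ratio feature alone.
    return feature_vec[0] >= low_band_th

print(is_wind_noise([0.85, 0.2]), is_wind_noise([0.2, 0.8]))  # True False
```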
When the captured audio signal is caused by the action of rubbing the microphone, this type of signal may be referred to as a friction signal. Training and signal type determination can be performed by methods similar to those described above, although the extracted signal features may differ from those used to decide whether a signal is a wind noise signal.
In one possible implementation of the present application, the audio signal caused by the silent operation may be a wind noise signal or similar to a wind noise signal, and the first device may use a wind noise detection algorithm to detect whether the audio signal collected by the microphone includes a signal with wind noise characteristics. For example, a method based on digital signal processing is used to calculate the power spectral density of the spectrum of the acquired audio signal, and the characteristics of the power spectral density are used to identify whether the audio signal contains a signal with wind noise characteristics. If so, the audio signal is caused by a silent operation; otherwise, it is not (it may be, for example, the user's speech or background environmental noise). In this way, the first device distinguishes audio signals caused by silent operations from audio signals caused by other sources.
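A power-spectral-density check of this kind can be sketched with scipy: wind noise concentrates its energy at low frequencies, so a large low-band energy ratio is treated as wind-like. The 400 Hz band edge and the 0.7 ratio are illustrative choices, not values from the embodiments.

```python
import numpy as np
from scipy.signal import welch

def looks_like_wind_noise(samples, rate_hz, band_hz=400.0, ratio_th=0.7):
    """Estimate the PSD and test whether most energy sits below band_hz."""
    freqs, psd = welch(samples, fs=rate_hz, nperseg=min(1024, len(samples)))
    low = psd[freqs <= band_hz].sum()
    return low / psd.sum() >= ratio_th

# Example: low-frequency rumble vs. a 2 kHz tone.
rate = 16000
t = np.arange(rate) / rate
rumble = (np.sin(2 * np.pi * 50 * t)
          + 0.1 * np.random.default_rng(2).normal(size=rate))
tone = np.sin(2 * np.pi * 2000 * t)
print(looks_like_wind_noise(rumble, rate), looks_like_wind_noise(tone, rate))
# Expected: True False
```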
The signal types corresponding to different silent operations can then be distinguished by a detection algorithm. For example, because rubbing or clicking is applied very close to the sound collecting position of the microphone, the resulting audio signal has more energy than an audio signal caused by fanning or blowing toward the microphone. To distinguish the action type of the silent operation, the first device may further detect the energy of the signal having wind noise characteristics, thereby determining whether the signal is caused by rubbing or clicking the sound collecting position of the microphone, or by fanning or blowing near the microphone. For example, the first device may distinguish the signal type of the audio signal caused by the silent operation by the magnitude of the signal energy: if the energy is less than a set threshold, the signal type is considered to be caused by fanning or blowing near the microphone; otherwise, it is considered to be caused by rubbing or clicking the sound collecting portion of the microphone. In this way, the first device distinguishes between the different signal types of silent operations.
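This energy-based discrimination might be sketched as a single threshold on mean signal energy, as below; the threshold value is hypothetical and would in practice be calibrated per device.

```python
import numpy as np

def classify_wind_like_signal(samples, energy_th=0.05):
    """Contact gestures (rub/click at the sound collecting portion) tend to
    inject more energy than airflow gestures (fan/blow from a distance)."""
    energy = float(np.mean(np.square(samples)))
    return "rub_or_click" if energy >= energy_th else "fan_or_blow"

# Low-energy airflow-like noise vs. high-energy contact-like noise.
print(classify_wind_like_signal(
    0.02 * np.random.default_rng(3).normal(size=1000)))  # fan_or_blow
print(classify_wind_like_signal(
    0.5 * np.random.default_rng(4).normal(size=1000)))   # rub_or_click
```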
In another possible implementation of the present application, a machine learning method may be adopted in advance to train a model on a large amount of training data. The training data includes audio signal data caused by silent operations (such as audio signals caused by rubbing, clicking, fanning, or blowing into the microphone) and signal types; that is, the model can represent the mapping relationship between audio signals and signal types. The first device obtains and saves the model. When the first device captures an audio signal, the audio signal may be input to the model to obtain the signal type output by the model.
This is explained below by way of an example. In one scenario, the user's hand-waving (fanning) motion near the microphone of the first device generates airflow that drives the microphone diaphragm to vibrate and generate an audio signal similar to a wind noise signal; that is, the audio features of the signal exhibit wind noise characteristics. In one implementation, the first device may use a wind noise detection algorithm to detect whether the audio signal collected by the microphone includes a signal with wind noise characteristics. For example, the first device calculates the power spectral density of the spectrum of the audio signal using a method based on digital signal processing, and identifies from the characteristics of the power spectral density whether the audio signal contains a signal with wind noise characteristics. In another implementation, a deep learning method may be adopted in advance, training a model on a large amount of training data, where the training data consists of audio signals collected by a microphone while fanning near the microphone, together with wind noise labels (indicating whether the corresponding audio signal has wind noise characteristics). The first device may then use the model to identify whether the audio signal picked up by the microphone is a signal with wind noise characteristics. If a signal with wind noise characteristics exists, the first device can also detect the energy of that signal and determine which silent operation caused it. For example, the first device may distinguish the type of the audio signal by the energy of the signal: if the energy is less than the set threshold, the signal is caused by fanning near the microphone; otherwise, it is caused by rubbing the sound collecting portion of the microphone. In this example, since the detected energy is less than the set threshold, the signal type of the audio signal is determined to be the type caused by fanning near the microphone.
The audio feature may be a target feature in the audio signal. In a specific embodiment, the first device may obtain the audio feature by performing feature extraction on the audio signal.
The audio features include, for example: the frequency of pulses in the audio signal, the energy of the pulses, the duration of the pulses, the order in which different signal types are collected, the sound duration characteristic, and the sound intensity characteristic. The frequency of the pulses represents how often and how fast pulses are excited in the audio signal. The energy of a pulse represents how much energy the pulse carries in the audio signal. The duration of a pulse represents how long the pulse lasts in the audio signal. The order in which different signal types are collected may be the order in which a single microphone collects audio signals caused by silent operations of different types (e.g., action types such as rubbing, clicking, fanning, or blowing into the microphone), or the order in which different microphones (e.g., two microphones) collect audio signals caused by silent operations of the same or different types. The sound duration characteristic represents the duration of the audio signal caused by the silent operation, and the sound intensity characteristic represents the energy of the audio signal caused by the silent operation.
The audio features reflect the action characteristics of the silent operation; that is, the action characteristics of the silent operation (e.g., one or more of action frequency, action strength, action rhythm, and action sequence) can be identified from the audio features (e.g., one or more of the frequency, energy, and duration of pulses in the audio signal, and the order in which different signal types are acquired).
S304, determining the action type corresponding to the signal type, and determining the action characteristics corresponding to the audio features.
In the embodiments of the present application, the audio signal may be caused by the user performing an action such as rubbing the microphone, clicking the microphone, fanning the microphone, or blowing into the microphone on the at least one microphone, so that the signal type of the audio signal corresponds one-to-one to the action type; the action type of the silent operation may therefore be determined according to the signal type. The action type represents the operation type of the action in the silent operation; for example, the action corresponding to the action type may be rubbing the microphone, clicking the microphone, fanning the microphone, blowing into the microphone, or the like.
Similarly, the audio features of the audio signal reflect the action characteristics of the silent operation; that is, the audio features and the action characteristics also correspond to each other, so the action characteristics of the silent operation can be determined from the audio features. The action characteristics of the silent operation include one or more of action frequency, action strength, action rhythm, and action sequence. Wherein:
the operation frequency represents the number of times of performing an operation in the silent operation, and the operation frequency represents the number of times of the operation, the speed of the operation, and the like. For example, rubbing the microphone's sound pick-up portion by hand quickly once, twice, three times (i.e., the frequency of motion is one, two, three times, respectively), etc.; for another example, the microphone is quickly clicked by hand once, twice, three times (i.e., the action frequency is one, two, three times, respectively), and so on; also for example, fanning once, twice, three times (i.e., the frequency of motion is one, two, three times, respectively) near the microphone, etc.; for another example, the wind may be blown once, twice, three times (i.e., the action frequency is one, two, three times, respectively) near the microphone, and the like, and the action frequency is not particularly limited in the present application.
The action strength represents the force with which the action in the silent operation is performed. For example, the microphone may be rubbed lightly or rubbed hard, touched lightly or pressed firmly, fanned gently or fanned vigorously, blown softly or blown sharply; the action strength is not specifically limited in this application.
The action rhythm may represent the time interval between two actions, e.g., the interval between two fanning actions; it may also be formed by combining action frequency and action strength. For example, one action rhythm may be "rub the microphone lightly - rub the microphone hard - rub the microphone lightly - rub the microphone hard ...", and another may be "fan toward the microphone gently - fan toward the microphone vigorously ..."; the action rhythm is not specifically limited in this application.
The action sequence means the order in which actions of different action types are performed when the silent operation involves a plurality of action types, and/or the order in which silent operations are performed on different microphones when there are two or more microphones. For example, one action sequence is fanning once near the microphone and then rubbing the sound pick-up portion of the microphone by hand. For another example, when the device has a left microphone and a right microphone, one action sequence is rubbing the sound pick-up portion of the left microphone and then rubbing the sound pick-up portion of the right microphone; the action sequence is not specifically limited in this application.
In this way, since the action characteristics and the audio features of the silent operation correspond one to one, an action characteristic that follows a certain pattern on the time axis produces an audio feature that follows the same pattern on the time axis, so the corresponding action characteristic can be determined. In one example, the action frequency may be determined from the frequency of pulses in the audio signal, the action strength from the energy of the pulses, the action rhythm from the duration of the pulse signals, and the action sequence from the order in which the different signal types are acquired.
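As an illustration of this correspondence, the sketch below maps the pulse tuples produced by the earlier extraction sketch onto action characteristics; the energy cut-off separating "light" from "hard" is an assumed calibration value, not a value given in the patent.

```python
def to_action_features(pulses):
    """pulses: list of (start_s, duration_s, energy) tuples."""
    count = len(pulses)                              # action frequency
    strengths = ['hard' if e > 0.5 else 'light'      # action strength
                 for _, _, e in pulses]              # (assumed 0.5 cut-off)
    intervals = [round(pulses[i + 1][0] - pulses[i][0], 2)
                 for i in range(count - 1)]          # action rhythm
    return {'frequency': count, 'strength': strengths, 'rhythm': intervals}
```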
In one possible implementation of the present application, a time-domain feature of the audio features is taken as an example; the time-domain feature represents the time-domain pulse signal of the audio signal caused by the silent operation. Referring to fig. 15, 16, and 17 together, fig. 15 illustratively shows a time-domain pulse signal (audio signal) diagram caused by the silent operation of "rubbing the microphone", fig. 16 illustratively shows a time-domain pulse signal (audio signal) diagram caused by the silent operation of "fanning toward the microphone", and fig. 17 illustratively shows a time-domain pulse signal (audio signal) diagram caused by the user speaking (i.e., not a silent operation). The first device can recognize whether the current audio signal is caused by a silent operation or by something else (such as speech) based on the waveform characteristics of the different pulse signals (such as the amplitude, frequency, and energy of the waveform). If the current audio signal is identified as caused by a silent operation, audio features of the audio signal can be further identified, such as the frequency at which the pulse signal is triggered, the duration of the pulse signal, and the energy of the pulse signal.
For example, the first device recognizes that the pulses of the audio signal shown in fig. 15 are caused by rubbing the microphone, and further identifies the frequency at which the pulse signal is triggered (the time intervals between two rubbing actions, marked as the rubbing frequency in the figure, corresponding to the action frequency), the duration of the pulse signal (the total time of the rubbing actions, marked as the rubbing duration in the figure), and the energy of the pulse signal (the intensity of the rubbing action, marked as the amplitude in the figure, corresponding to the action strength), thereby identifying the action characteristics corresponding to the audio features of the audio signal.
For another example, the first device recognizes that the pulses of the audio signal shown in fig. 16 are caused by fanning toward the microphone, and further identifies the frequency at which the pulse signal is triggered (the time intervals between two fanning actions, marked as the fanning frequency in the figure, corresponding to the action frequency), the duration of the pulse signal (the total time of the fanning actions, marked as the fanning duration in the figure), and the energy of the pulse signal (the intensity of the fanning action, marked as the amplitude in the figure, corresponding to the action strength), thereby identifying the action characteristics corresponding to the audio features of the audio signal.
It should be noted that the audio signals shown in fig. 15, 16, and 17 are only used to explain the scheme of the present application, and do not limit the present application.
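As a rough illustration of how the waveforms of figs. 15-17 might be told apart, the heuristic below (reusing extract_pulse_features from the sketch above) assumes that silent-operation pulses are brief bursts separated by silence, whereas speech keeps the signal active most of the time; the 0.5 activity ratio is an assumption, not a value from the patent.

```python
import numpy as np

def is_silent_operation(signal: np.ndarray, sr: int,
                        frame_ms: int = 10, threshold: float = 0.02) -> bool:
    pulses = extract_pulse_features(signal, sr, frame_ms, threshold)
    active_time = sum(dur for _, dur, _ in pulses)   # time spent inside pulses
    total_time = len(signal) / sr
    # pulse-like and mostly silent -> silent operation; sustained -> speech
    return bool(pulses) and active_time / total_time < 0.5
```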
In one possible implementation of the present application, the audio signal collected by the microphone of the first device may include audio signals generated by silent operations of a plurality of action types. The audio signal may be segmented by a voice activity detection (VAD) technique, with each segment representing one action type. The speed of actions of different action types can be distinguished by detecting the duration of the audio signal corresponding to the action to be recognized within a segment and the interval between the start times of such audio signals in different segments. By counting the action types across the segments, the action frequency of each action type is obtained. By counting the action types in different segments and the consecutive action frequency of each over a period of time, the action rhythm or action sequence of the silent operation can further be obtained. In this way, the first device obtains the action characteristics corresponding to the audio signal of the silent operation.
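The following minimal sketch illustrates the counting step described above, assuming each segment has already been assigned an action type (for example, by the signal-type detection of S303); all names and values are illustrative.

```python
from collections import Counter

def summarize_segments(segments):
    """segments: list of (start_s, end_s, action_type), in time order."""
    sequence = [a for _, _, a in segments]            # action sequence
    frequency = Counter(sequence)                     # per-type action frequency
    gaps = [round(segments[i + 1][0] - segments[i][1], 2)
            for i in range(len(segments) - 1)]        # inter-segment gaps (rhythm)
    return sequence, frequency, gaps

# e.g. two rubs then one fan within about two seconds:
seq, freq, gaps = summarize_segments(
    [(0.0, 0.3, 'rub'), (0.8, 1.1, 'rub'), (1.9, 2.2, 'fan')])
# seq == ['rub', 'rub', 'fan'], freq counts {'rub': 2, 'fan': 1}, gaps == [0.5, 0.8]
```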
In another possible implementation of the present application, a machine learning method may be used in advance to train a model on a large amount of training data. The training data comprises audio features of audio signals caused by silent operations, together with action-characteristic labels (each label indicating the action characteristic of the corresponding silent operation); that is, the model characterizes the mapping between audio features and action characteristics. The first device obtains and stores the model. When the first device collects an audio signal, it extracts the audio features and inputs them into the model, thereby obtaining the action characteristics output by the model.
In a specific implementation, the model in the embodiments herein may be one of the following: a Neural Network (NN) model, a Deep Neural Network (DNN) model, a Factorization-machine-supported Neural Network (FNN) model, a Convolutional Neural Network (CNN) model, an Inner Product-based Neural Network (IPNN) model, an Outer Product-based Neural Network (OPNN) model, a Neural Factorization Machine (NFM) model, and so on.
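As a hedged illustration of the offline training step, the sketch below fits a small neural network that maps audio-feature vectors to action-characteristic labels; the feature layout, labels, and tiny training set are assumptions made only for the example, and the patent leaves the concrete model choice open.

```python
from sklearn.neural_network import MLPClassifier
import numpy as np

# Assumed feature layout per row: [pulse_count, mean_pulse_energy, mean_pulse_duration_s]
X_train = np.array([[1, 0.10, 0.25], [2, 0.12, 0.22],
                    [1, 0.55, 0.30], [3, 0.11, 0.20]])
y_train = ['rub_once_light', 'rub_twice_light',
           'rub_once_hard', 'rub_thrice_light']     # action-characteristic labels

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                      random_state=0).fit(X_train, y_train)
print(model.predict([[1, 0.52, 0.28]]))  # likely 'rub_once_hard' on this toy data
```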
S305, determining whether a control command corresponding to the audio signal exists.
Having determined through S304 the action type and action characteristic corresponding to the audio signal, the first device may determine whether the combination of the action type and the action characteristic satisfies a preset condition, for example, one or more preset combinations of action type and action characteristic. For example, by querying the mapping between preset combinations (combinations of action type and action characteristic) and control commands stored in S301, the first device determines whether the combination of the action type and the action characteristic matches a target preset combination among the preset combinations, that is, whether the combination exists in the mapping table preset in S301. If so, the combination matches the target preset combination and has a corresponding control command in the mapping table, and S306 may then be executed. Otherwise, the combination matches no preset combination and has no corresponding control command, and the flow may end.
For example, suppose the mapping table preset by the first device is as shown in fig. 19. If the action type determined by the first device from the audio signal is rubbing the left microphone, and the determined action characteristic is rubbing once, then the combination of the action type and the action characteristic exists in the mapping table, so the control command "lower the volume" will subsequently be triggered. If the action type determined from the audio signal is rubbing the right microphone and the determined action characteristic is rubbing twice, the combination does not exist in the mapping table, and the flow may end.
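A minimal sketch of this table lookup, mirroring the fig. 19 example just described, might look as follows; the keys and command names are illustrative.

```python
MAPPING_TABLE = {
    ('rub_left_mic', 1): 'volume_down',   # fig. 19 example: rub left mic once
    ('rub_left_mic', 2): 'volume_up',
    ('fan_mic', 1): 'play_pause',
}

def lookup_command(action_type: str, action_frequency: int):
    """Return the control command for the combination, or None."""
    return MAPPING_TABLE.get((action_type, action_frequency))

assert lookup_command('rub_left_mic', 1) == 'volume_down'   # S306 follows
assert lookup_command('rub_right_mic', 2) is None           # flow ends
```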
In this application, since the action type corresponds to the signal type and the action characteristic corresponds to the audio feature, the embodiments of the present application can equivalently be described as follows: having determined through S303 the signal type and audio features corresponding to the audio signal, the first device may determine whether the combination of the signal type and the audio features satisfies a preset condition, for example, one or more preset combinations of signal type and audio features. For example, by querying the mapping between preset combinations (combinations of signal type and audio features) and control commands stored in S301, the first device determines whether the combination matches a target preset combination, that is, whether the combination exists in the mapping table preset in S301. If so, the combination has a corresponding control command in the mapping table, and S306 may then be executed. Otherwise, the combination matches no preset combination and has no corresponding control command, and the flow may end.
S306, triggering control of the second device according to the control command.
In a particular embodiment, the control command instructs the second device to perform a certain function. The first device sends the control command to the second device, and the second device executes the corresponding function according to the command. For example, when the second device is a smartphone, the first device may use the control command to make it perform at least one of: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing video, switching tracks, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on or off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power-saving mode, an active noise cancellation (ANC) function, a listening mode (heartbeat), and the like, which is not limited in this application.
It is noted that, in a possible embodiment, the control command may also instruct the first device to perform a certain function itself. For example, in a scenario where the first device is a mobile phone, the first device may, according to the control command, perform at least one of: making a call, answering/hanging up a call, sending a message, playing/pausing music, playing/pausing video, switching tracks, adjusting the volume, locking/unlocking the screen, and turning a designated function mode on or off. The designated function mode may be, for example, a mute mode, a vibration mode, an airplane mode, a power-saving mode, an active noise cancellation (ANC) function, a listening mode (heartbeat), and the like, which is not limited in this application.
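By way of illustration only, S306 could be realized as below, serializing the command and sending it over an existing connection to the second device; the transport, port, and message format are assumptions, since the patent does not specify them.

```python
import json
import socket

def send_control_command(cmd: str, host: str, port: int = 9000) -> None:
    """Serialize the command and push it to the second device (assumed transport)."""
    msg = json.dumps({'type': 'control', 'command': cmd}).encode('utf-8')
    with socket.create_connection((host, port), timeout=1.0) as s:
        s.sendall(msg)

# e.g. send_control_command('volume_down', '192.0.2.10')
```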
It can be seen that, in the embodiment of the present application, the first device prestores the mapping between the action types and action characteristics of silent operations and control commands. When the microphone collects an audio signal caused by a silent operation such as the user rubbing or clicking the microphone, or fanning or blowing near it, the first device can identify the signal type and audio features of the audio signal, and then determine the corresponding action type and action characteristic, thereby recognizing the silent operation and controlling the second device according to the control command corresponding to the combination of the action type and the action characteristic. The second device can thus be controlled without the user making a sound, expanding the ways in which a user can control a device. The input mode of the silent operation is simple and convenient, and it improves the concealment of operations such as having the second device dial an emergency call. In addition, the first device used in this application already has a microphone, so the control function can be realized by reusing the microphone on the device, without additional sensor components; the cost is low, and so is the power consumption of the device. The application can therefore greatly improve the user experience.
It should be noted that the above examples are intended to illustrate some embodiments of the present application. In practical applications, they can be further extended, varied, or refined based on the technical content of the above embodiments.
For example, in a possible embodiment of the present application, when another sensor is also provided in the first device, that sensor may be used to help the first device detect whether the audio signal collected by the microphone contains a noise signal corresponding to the silent operation to be identified, improving recognition accuracy. For example, suppose the first device has a motion sensor and two microphones and is configured to detect silent operations with one of the microphones. The first device can then determine from the detection result of the motion sensor whether the user is stationary or moving. If the user is detected to be stationary, one microphone detects an audio signal corresponding to a silent operation, and the other microphone detects no such signal at the same time, the first device may determine that the user is performing a silent operation in order to control the second device. Otherwise, if both microphones detect an audio signal, a misidentification may have occurred, and the first device may skip the subsequent control operation. By combining sensors of different types already present in the first device, the control of the second device is made more reliable, the user experience is further improved, and the power consumed by misidentification is saved.
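The sensor-assisted check described above can be summarized in a few lines; the predicate names are illustrative assumptions.

```python
def validate_silent_operation(is_stationary: bool,
                              left_mic_pulse: bool,
                              right_mic_pulse: bool) -> bool:
    """Accept only when the user is still and exactly one mic saw the pulse."""
    one_mic_only = left_mic_pulse != right_mic_pulse   # XOR: exactly one mic
    return is_stationary and one_mic_only
```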
For another example, in a possible embodiment of the present application, the first device may also use information provided by the second device, other terminal devices, or a server to turn the device control method provided by this application on or off, improving its reliability. For example, the first device may obtain weather forecast information from the second device; if the forecast indicates that the wind in the current environment exceeds a preset level, the user can be prompted to turn off the device control method, preventing strong wind from degrading the accuracy of audio signal recognition, thereby further improving the reliability of the scheme and the user experience.
Referring to fig. 20, fig. 20 is a schematic structural diagram of a first device 40 and of a system composed of the first device 40 and a second device 50. The first device and the second device may be communicatively connected, where the connection is wireless or wired. The first device 40 includes an acquisition module 401, a signal processing module 402, and a control module 403. In some embodiments, these modules may take the form of software code; in a specific implementation, the data/code of the acquisition module 401, the signal processing module 402, and the control module 403 may be stored in the memory 120 shown in fig. 6 and executed on the processor 110 shown in fig. 6. Wherein:
an acquisition module 401, configured to acquire an audio signal through the at least one microphone;
a signal processing module 402, configured to determine a signal type and an audio feature of the audio signal, the signal type comprising at least wind noise;
a control module 403, configured to determine whether the signal type and the audio feature meet a preset condition, and, when the signal type and the audio feature meet the preset condition, trigger a control command corresponding to the preset condition, wherein the control command is used to control a second device.
In a specific embodiment, the acquisition module 401, the signal processing module 402, and the control module 403 may cooperate to perform the functions on the first device side in the embodiments shown in fig. 7, fig. 14, or fig. 18. For the specific implementation of each functional module, refer to the description of the relevant steps in the above method embodiments; for brevity, the details are not repeated here.
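Purely as an illustration of how the modules of fig. 20 might compose, the sketch below chains the earlier example functions (extract_pulse_features, to_action_features, send_control_command) into one pipeline; the class structure and the assumed classify_action_type callback are illustrative and not prescribed by the patent.

```python
class FirstDevice:
    """Illustrative composition of modules 401 (acquisition),
    402 (signal processing), and 403 (control) from fig. 20."""

    def __init__(self, mapping_table, classify_action_type):
        self.mapping_table = mapping_table        # S301: preset combinations
        self.classify = classify_action_type      # signal type -> action type

    def on_audio(self, signal, sr):
        pulses = extract_pulse_features(signal, sr)    # modules 401/402
        if not pulses:
            return                                     # nothing pulse-like
        action_type = self.classify(signal, sr)        # S304 (assumed given)
        feats = to_action_features(pulses)             # S304
        cmd = self.mapping_table.get(                  # S305: table lookup
            (action_type, feats['frequency']))
        if cmd is not None:
            send_control_command(cmd, 'second-device.local')   # S306
```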
Embodiments of the present application also provide a computer-readable storage medium storing instructions which, when executed on a computer or processor, cause the computer or processor to perform one or more steps of any of the methods described above. If the constituent modules of the signal processing apparatus are implemented as software functional units and sold or used as independent products, they may be stored in the computer-readable storage medium.
Based on such understanding, the embodiments of the present application also provide a computer program product containing instructions which, when run on a computer or processor, cause the computer or processor to execute any of the methods provided by the embodiments of the present application. The technical solution of the present application, or the part of it that contributes over the prior art, may in essence be embodied wholly or partly in the form of a software product stored in a storage medium and including instructions for causing a computer device, or a processor therein, to execute all or part of the steps of the methods according to the embodiments of the present application.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, without the essence of the corresponding technical solutions departing from the scope of the technical solutions of the embodiments of the present application. For example, some specific operations in an apparatus embodiment may refer to the previous method embodiments.

Claims (19)

1. A device control method applied to a first device having at least one microphone, comprising:
acquiring an audio signal by the at least one microphone;
determining a signal type and an audio feature of the audio signal; the signal type comprises at least wind noise;
judging whether the signal type and the audio feature meet a preset condition;
and when the signal type and the audio feature meet the preset condition, triggering a control command corresponding to the preset condition.
2. The method of claim 1, wherein the first device is communicatively coupled to a second device, and wherein the control command is used to control the second device.
3. The method of claim 1, wherein the control command is used to control the first device.
4. The method of any of claims 1-3, wherein determining the signal type and the audio feature of the audio signal comprises:
obtaining the signal type by detecting one or more of time domain characteristics and frequency spectrum characteristics of the audio signal;
and obtaining the audio feature by performing feature extraction on the audio signal.
5. The method of any of claims 1-3, wherein determining the signal type and the audio feature of the audio signal comprises:
obtaining the signal type according to the audio signal and a neural network model; wherein the neural network model characterizes a mapping relationship between the audio signal and the signal type;
and obtaining the audio feature by performing feature extraction on the audio signal.
6. The method of any of claims 1-5, wherein the audio feature comprises: one or more of a frequency, an energy, a duration characteristic, and a sound intensity characteristic of pulses in the audio signal, and an order in which different signal types are collected.
7. The method according to claim 6, wherein the preset condition is at least one preset combination comprising a signal type and an audio feature;
the judging whether the signal type and the audio feature meet the preset condition comprises:
and judging whether a combination formed by the signal type and one or more of the frequency, the energy, the duration characteristic, the sound intensity characteristic, and the order of collecting different signal types matches the preset combination.
8. The method of claim 7, wherein each predetermined combination has a mapping relationship with at least one control command;
the triggering, when the signal type and the audio feature meet the preset condition, a control command corresponding to the preset condition comprises:
when the combination of the signal type and the audio feature matches a target preset combination among the preset combinations, triggering the control command corresponding to the target preset combination.
9. The method according to any one of claims 1-8, wherein the control command is used to perform at least one of the following controls:
controlling the second device to make a call, answer/hang up a call, send a message, play/pause music, play/pause a video, switch a track, adjust a volume, lock/unlock a screen, or turn a designated function mode on/off.
10. An apparatus for device control, applied to a first device having at least one microphone, comprising:
the acquisition module is used for acquiring the audio signals through the at least one microphone;
a signal processing module for determining a signal type and an audio feature of the audio signal; the signal type comprises at least wind noise;
the control module is used for judging whether the signal type and the audio feature meet a preset condition; and when the signal type and the audio feature meet the preset condition, triggering a control command corresponding to the preset condition.
11. The apparatus of claim 10, wherein the first device is communicatively coupled to a second device, and wherein the control command is configured to control the second device.
12. The apparatus of claim 10, wherein the control command is used to control the first device.
13. The apparatus of any one of claims 10-12, wherein the signal processing module is configured to:
obtaining the signal type by detecting one or more of time domain characteristics and frequency spectrum characteristics of the audio signal;
and obtaining the audio feature by performing feature extraction on the audio signal.
14. The apparatus of any one of claims 10-12, wherein the signal processing module is configured to:
obtaining the signal type according to the audio signal and a neural network model; wherein the neural network model characterizes a mapping relationship between the audio signal and the signal type;
and obtaining the audio feature by performing feature extraction on the audio signal.
15. The apparatus according to any of claims 10-14, wherein the audio feature comprises: one or more of a frequency, an energy, a duration characteristic, and a sound intensity characteristic of pulses in the audio signal, and an order in which different signal types are collected.
16. The apparatus of claim 15, wherein the predetermined condition is at least one predetermined combination comprising a signal type and an audio characteristic;
the control module is specifically configured to judge whether a combination formed by the signal type and one or more of the frequency, the energy, the duration characteristic, and the sound intensity characteristic of the wind noise pulses, and the order of collecting different signal types, matches the preset combination.
17. The apparatus of claim 16, wherein each predetermined combination has a mapping relationship with at least one control command;
the control module is specifically configured to trigger, when the combination of the signal type and the audio feature matches a target preset combination among the preset combinations, a control command corresponding to the target preset combination.
18. The apparatus according to any one of claims 10-17, wherein the control command is used to perform at least one of the following controls:
controlling the second device to make a call, answer/hang up a call, send a message, play/pause music, play/pause a video, switch a track, adjust a volume, lock/unlock a screen, or turn a designated function mode on/off.
19. A device, the device being a first device, comprising: at least one microphone; one or more processors; a memory; and one or more computer programs; wherein the one or more computer programs are stored in the memory and comprise instructions which, when executed by the first device, cause the first device to perform the method according to any one of claims 1-9.
CN201911209307.1A 2019-08-23 2019-11-30 Equipment control method and device Pending CN112420031A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910786484 2019-08-23
CN201910786484X 2019-08-23

Publications (1)

Publication Number Publication Date
CN112420031A true CN112420031A (en) 2021-02-26

Family

ID=74843996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911209307.1A Pending CN112420031A (en) 2019-08-23 2019-11-30 Equipment control method and device

Country Status (1)

Country Link
CN (1) CN112420031A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120004913A1 (en) * 2010-07-01 2012-01-05 Samsung Electronics Co., Ltd. Method and apparatus for controlling operation of portable terminal using microphone
KR101919474B1 (en) * 2017-07-14 2018-11-19 부전전자 주식회사 Earphone with microphone performing button function
US20180348853A1 (en) * 2015-12-01 2018-12-06 Samsung Electronics Co., Ltd. Method and apparatus using frictional sound
CN208227260U (en) * 2018-05-08 2018-12-11 国光电器股份有限公司 A kind of smart bluetooth earphone and bluetooth interactive system
CN109782919A (en) * 2019-01-30 2019-05-21 维沃移动通信有限公司 A kind of control method and terminal device of terminal device
CN110111776A (en) * 2019-06-03 2019-08-09 清华大学 Interactive voice based on microphone signal wakes up electronic equipment, method and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113329302A (en) * 2021-06-30 2021-08-31 歌尔科技有限公司 Earphone control method, electronic equipment and earphone
CN113329302B (en) * 2021-06-30 2023-01-24 歌尔科技有限公司 Earphone control method, electronic equipment and earphone

Similar Documents

Publication Publication Date Title
US11705878B2 (en) Intelligent audio output devices
CN108710615B (en) Translation method and related equipment
CN107580113B (en) Reminding method, device, storage medium and terminal
CN110493678B (en) Earphone control method and device, earphone and storage medium
CN108922537B (en) Audio recognition method, device, terminal, earphone and readable storage medium
US11605372B2 (en) Time-based frequency tuning of analog-to-information feature extraction
CN108668009B (en) Input operation control method, device, terminal, earphone and readable storage medium
CN109151211B (en) Voice processing method and device and electronic equipment
US20220335924A1 (en) Method for reducing occlusion effect of earphone, and related apparatus
CN107886969B (en) Audio playing method and audio playing device
WO2021114953A1 (en) Voice signal acquisition method and apparatus, electronic device, and storage medium
CN108803859A (en) Information processing method, device, terminal, earphone and readable storage medium storing program for executing
CN111510814A (en) Noise reduction mode control method and device, electronic equipment and storage medium
CN112532266A (en) Intelligent helmet and voice interaction control method of intelligent helmet
CN110364156A (en) Voice interactive method, system, terminal and readable storage medium storing program for executing
CN105812567A (en) Mobile terminal control method and device
CN110097875A (en) Interactive voice based on microphone signal wakes up electronic equipment, method and medium
CN109121046B (en) Plugging hole treatment method and related product
CN108959273A (en) Interpretation method, electronic device and storage medium
US11297429B2 (en) Proximity detection for wireless in-ear listening devices
CN112399297A (en) Earphone, voice awakening method thereof and computer storage medium
CN113194383A (en) Sound playing method and device, electronic equipment and readable storage medium
CN108429956B (en) Wireless earphone, control operation method and related product
CN112420031A (en) Equipment control method and device
CN108154886A (en) Noise suppressing method and device, electronic device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination