WO2021003955A1 - Method and device for controlling earphone playback state, mobile terminal, and storage medium - Google Patents

Method and device for controlling earphone playback state, mobile terminal, and storage medium

Info

Publication number
WO2021003955A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
preset
earphone
prompt
mobile terminal
Prior art date
Application number
PCT/CN2019/121190
Other languages
English (en)
French (fr)
Inventor
温桂龙
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021003955A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/10: Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1091: Details not provided for in groups H04R1/1008 - H04R1/1083
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00: Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/10: Details of earpieces, attachments therefor, earphones or monophonic headphones covered by H04R1/10 but not provided for in any of its subgroups

Definitions

  • This application relates to the technical field of terminal control, and in particular to a method for controlling the playback state of earphones, a mobile terminal, and a storage medium.
  • The main purpose of this application is to provide a method for controlling the playback state of earphones, a device for controlling the playback state of earphones, a mobile terminal, and a storage medium, aiming to solve the technical problem that wearing earphones keeps the user from picking up useful sounds in the surrounding environment, which makes daily life inconvenient.
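  • To achieve the above purpose, this application proposes a method for controlling the playback state of earphones, the method including:
  • acquiring current state information of the mobile terminal and determining whether the state information indicates an earphone connection;
  • when the state information indicates an earphone connection, acquiring first ambient audio through the receiver;
  • determining whether a single audio matching a preset audio exists in the first ambient audio;
  • when the single audio corresponding to the preset audio exists in the first ambient audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
  • Optionally, the preset audio is a segment of prompt audio with semantics, and the step of adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio includes:
  • when the single audio corresponding to the prompt audio exists in the first ambient audio, sending a play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio.
  • Optionally, the preset audio is a segment of prompt audio with semantics, and after the step of sending the play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio, the method includes:
  • acquiring the current display state of the display screen of the mobile terminal and determining whether the display state is lit;
  • when the display state is lit, acquiring the semantic characters corresponding to the prompt audio and controlling the display screen of the mobile terminal to display the semantic characters.
  • Optionally, the semantic characters include station-entry semantic characters and arrival semantic characters, and after the step of sending the play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio, the method includes:
  • determining whether the semantic characters are preset station-entry characters;
  • if the semantic characters are the preset station-entry characters, acquiring second ambient audio through the receiver;
  • determining whether a single audio corresponding to a preset driving audio exists in the second ambient audio;
  • when a single audio corresponding to the preset driving audio exists in the second ambient audio, increasing the playback volume of the earphone.
  • Optionally, the method for controlling the playback state of earphones further includes:
  • acquiring, through the receiver, third ambient audio containing the prompt audio, performing semantic recognition on the third ambient audio, and generating text information;
  • receiving a selection operation made by the user based on the text information, and setting all or part of the text information as semantic characters according to the selection operation;
  • setting the audio corresponding to the semantic characters in the third ambient audio as the prompt audio.
  • Optionally, the step of adjusting the working state of the earphone according to the adjustment scheme corresponding to the preset audio includes:
  • when the single audio matching the preset audio exists in the first ambient audio, determining whether the noise reduction type of the earphone is active noise reduction;
  • when the noise reduction type of the earphone is active noise reduction, turning off the active noise reduction function and reducing the playback volume of the earphone;
  • when the noise reduction type of the earphone is not active noise reduction, playing the first ambient audio.
  • Optionally, the step of determining whether a single audio corresponding to a preset audio exists in the first ambient audio includes:
  • performing framing processing on the first ambient audio to generate speech frames;
  • performing feature extraction on the speech frames to obtain the Mel frequency cepstral coefficient feature vector corresponding to each speech frame;
  • inputting the Mel frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
  • determining, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, where the preset state network is a database constructed with a hidden Markov model.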
  • This application also provides a device for controlling the playback state of earphones, the control device including:
  • a first acquisition module configured to acquire current state information of the mobile terminal and determine whether the state information indicates an earphone connection;
  • a second acquisition module configured to acquire first ambient audio through the receiver of the mobile terminal when the state information indicates an earphone connection;
  • a judging module configured to determine whether a single audio matching a preset audio exists in the first ambient audio;
  • an adjustment module configured to, when the single audio corresponding to the preset audio exists in the first ambient audio, adjust the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
  • This application also provides a mobile terminal.
  • The mobile terminal includes a processor, a memory, a receiver, and computer-readable instructions stored in the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the steps of the method for controlling the playback state of earphones described above.
  • This application also provides a storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the method for controlling the playback state of earphones described above.
  • In the technical solution of this application, the first ambient audio is acquired through the receiver, the scene is determined from the first ambient audio, and the working state of the earphone is adjusted accordingly. This keeps the user from failing to react in time to external sounds and avoid danger, which could lead to accidents, and from ignoring external sounds, which makes daily life inconvenient.
  • FIG. 1 is a schematic diagram of the hardware structure of a mobile terminal involved in a solution of an embodiment of the application
  • FIG. 2 is a schematic flowchart of a first embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 3 is a schematic flowchart of a second embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 4 is a schematic flowchart of a third embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 5 is a schematic flowchart of a fourth embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 6 is a schematic flowchart of a fifth embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 7 is a schematic flowchart of a sixth embodiment of a method for controlling the playback state of a headset according to this application.
  • FIG. 8 is a schematic flowchart of a seventh embodiment of a method for controlling the playback state of a headset according to this application;
  • FIG. 9 is a schematic diagram of modules of the control device for the earphone playback state of this application.
  • the method for controlling the playback state of earphones is mainly applied to a mobile terminal.
  • the mobile terminal is a device with processing functions, and may be a mobile phone, a tablet computer, a smart wearable device, or a portable computer.
  • FIG. 1 is a schematic diagram of the hardware structure of the mobile terminal involved in the solution of the embodiment of the application.
  • In this embodiment, the mobile terminal may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • The communication bus 1002 provides connection and communication among these components.
  • The user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and a receiver (Receiver).
  • The network interface 1004 may optionally include a Wi-Fi interface, a SIM card interface, and a Bluetooth interface.
  • The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory.
  • The memory 1005 may optionally be a storage device independent of the aforementioned processor 1001.
  • The receiver is an electro-acoustic device that converts audio electrical signals into sound signals under no-sound-leakage conditions (or with a Type 3.2 high/low leakage ring per the ITU standard), and is used here to realize audio collection.
  • The mobile terminal communicates with the earphone over a wired or wireless connection and sends electrical signals that drive the diaphragm of the earphone to vibrate and produce sound.
  • Those skilled in the art will understand that the hardware structure shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • The memory 1005, as a storage medium in FIG. 1, may include an operating system, an audio playback module, and computer-readable instructions.
  • In FIG. 1, the audio playback module is mainly used to connect to the earphone and control the earphone's speaker to vibrate and produce sound, while the processor 1001 can call the computer-readable instructions stored in the memory 1005 and execute the steps of the method for controlling the playback state of earphones.
  • This application provides a method for controlling the playback state of the headset.
  • Referring to FIG. 2, in the first embodiment of this application, the method for controlling the playback state of earphones includes the following steps:
  • Step S100: acquire current state information of the mobile terminal and determine whether the state information indicates an earphone connection;
  • The method provided in this application runs on a mobile terminal. The processor of the mobile terminal obtains the current state information, which may include the working state of the mobile terminal's built-in speaker, whether an external playback device is connected, and the working state of that external playback device.
  • Step S200: when the state information indicates an earphone connection, acquire first ambient audio through the receiver of the mobile terminal;
  • An earphone connection means that the mobile terminal outputs sound through the earphone's speaker.
  • The earphone and the mobile terminal can be connected directly by wire, or wirelessly via Bluetooth or the like.
  • The first ambient audio is the audio information the receiver is currently collecting in real time. It may be a mixture of multiple single audios, such as vehicle driving noise, wind, arrival announcements, and pedestrians' voices.
  • The first ambient audio can be collected with a receiver built into the mobile terminal or with a receiver connected to the mobile terminal, which may be a standalone receiver or a receiver built into the earphone.
  • Step S300: determine whether a single audio matching a preset audio exists in the first ambient audio;
  • The preset audio is an audio file set and stored in advance by those skilled in the art. Multiple preset audios can be set according to actual needs, for example a car braking sound or an arrival announcement.
  • Step S400: when the single audio corresponding to the preset audio exists in the first ambient audio, adjust the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
  • When no single audio corresponding to the preset audio exists in the first ambient audio, no processing is performed.
  • Different adjustment schemes can be preset for different preset audios, including increasing the volume, reducing the volume, and turning off the active noise reduction function, so that the working state of the earphone can adapt to different scenarios.
  • For example, if the preset audio is a car horn, the acquired first ambient audio is compared with the car horn sound. When a single audio in the first ambient audio matches the car horn, a car horn is present in the user's environment. The adjustment scheme corresponding to the car horn is then invoked: the noise reduction function is turned off or the earphone playback volume is reduced, so that the user can hear the horn and take evasive action.
  • This application collects the first ambient audio through the receiver and compares it with the preset audio, so that the working state of the earphone is adjusted according to the first ambient audio. This avoids situations in which music playing in the earphone, or the earphone's noise reduction function, keeps the user from reacting in time to external sounds, whether to avoid danger or to respond socially.
  • FIG. 3 is a schematic flowchart of a second embodiment of a method for controlling the playback state of a headset in this application.
  • The preset audio is a segment of prompt audio with semantics, and step S400 includes:
  • Step S410: when the single audio corresponding to the prompt audio exists in the first ambient audio, send a play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio.
  • The preset audio is a prompt audio with semantics; that is, the prompt audio is a human voice or audio imitating a human voice.
  • Existing speech recognition software can convert the prompt audio into text.
  • For example, suppose the pre-stored prompt audio is "the next stop is station A". When "the next stop is station A" is announced in the environment, the receiver collects first ambient audio that includes the announcement. If the comparison finds a single audio in the first ambient audio that matches the prompt audio, the audio playback module drives the earphone's diaphragm to play the prompt "the next stop is station A". This keeps the user from missing the external announcement and therefore missing the stop.
  • FIG. 4 is a schematic flowchart of a third embodiment of a method for controlling the playback state of a headset according to this application. Based on the first embodiment, the step S410 includes:
  • Step S411: acquire the current display state of the display screen of the mobile terminal and determine whether the display state is lit;
  • Step S412: when the display state is lit, acquire the semantic characters corresponding to the prompt audio and control the display screen of the mobile terminal to display the semantic characters.
  • The semantic characters corresponding to the prompt audio are the text that expresses the meaning of the prompt audio. People are often absorbed in what is shown on their phones while riding and miss their stops. The display state of the screen is used to judge whether the user is currently reading the screen; if so, the screen is controlled to display the semantic characters as a further reminder.
  • FIG. 5 is a schematic flowchart of a fourth embodiment of a method for controlling the playback state of a headset according to this application. Based on the third embodiment, after the step S410, the method includes:
  • Step S414: if the semantic characters are the preset station-entry characters, acquire second ambient audio through the receiver;
  • Step S415: determine whether a single audio corresponding to a preset driving audio exists in the second ambient audio;
  • The preset driving audio is audio set in advance by those skilled in the art. Driving audio can be set for multiple kinds of vehicles, such as cars, subways, and airplanes.
  • Step S416: when a single audio in the second ambient audio matches the preset driving audio, increase the playback volume of the earphone.
  • The method may further include turning on the noise reduction function according to the noise reduction scheme corresponding to the matched preset driving audio.
  • The noise reduction processing can be performed using a formula whose parameters satisfy $\alpha \geq 1$ and $0 < \beta \ll 1$, where $P_S(w)$ is the frequency spectrum of the input noisy speech, $P_n(w)$ is the estimated noise spectrum, $\alpha$ is the subtraction factor, and $\beta$ is the lower-limit threshold parameter of the frequency spectrum.
  • The values of $\alpha$ and $\beta$ are determined according to the signal-to-noise ratio.
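  • The terms listed above (a noisy-speech spectrum, an estimated noise spectrum, a subtraction factor, and a spectral lower limit) match the shape of the common over-subtraction form of spectral subtraction. The sketch below assumes that form; the function name, the default parameter values, and the per-bin floor rule are illustrative assumptions, not the formula of this application.

```python
from typing import List, Sequence


def spectral_subtract(noisy: Sequence[float], noise: Sequence[float],
                      alpha: float = 2.0, beta: float = 0.01) -> List[float]:
    """Over-subtraction with a spectral floor, applied per frequency bin.

    noisy -- spectrum of the noisy input speech, P_S(w)
    noise -- estimated noise spectrum, P_n(w)
    alpha -- subtraction factor (assumed >= 1)
    beta  -- spectral floor parameter (assumed 0 < beta << 1)
    """
    if len(noisy) != len(noise):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for p_s, p_n in zip(noisy, noise):
        diff = p_s - alpha * p_n      # subtract the scaled noise estimate
        floor = beta * p_n            # never drop below a fraction of the noise
        cleaned.append(diff if diff > floor else floor)
    return cleaned
```

  • In this sketch a larger alpha removes more noise at the cost of more speech distortion, and beta keeps bins from going negative; both would be tuned from the measured signal-to-noise ratio.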
  • FIG. 6 is a schematic flowchart of a fifth embodiment of a method for controlling the playback state of headphones according to this application. Based on the third embodiment, the method for controlling the playback state of headphones further includes:
  • Step S420: acquire, through the receiver, third ambient audio containing the prompt audio, perform semantic recognition on the third ambient audio, and generate text information;
  • Step S430: receive a selection operation made by the user based on the text information, and set all or part of the text information as semantic characters according to the selection operation;
  • Step S440: set the audio corresponding to the semantic characters in the third ambient audio as the prompt audio.
  • The third ambient audio containing the prompt audio is collected through the receiver in advance and then processed with noise reduction and semantic recognition to generate the corresponding text information. Because the environment in which the third ambient audio is collected is often noisy, the user selects which part of the text information to set as semantic characters, which improves the recognition rate.
  • For example, the user stands at "station A" and records third ambient audio that contains the sound of the running train, passers-by talking, wind, and the announcement "the train is about to enter station A". Noise reduction is applied to the third ambient audio, and the processed audio is converted into text, yielding "the train is about to enter station A". If the generated text does not match what was actually announced, the recording quality of the third ambient audio is too low, and the user can record again.
  • The user manually selects "enter station A" as the semantic characters, and the part of the noise-reduced third ambient audio corresponding to "enter station A" is set as the prompt audio.
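  • Steps S430 and S440 can be sketched as a lookup from the user's selected words to a time span in the recording. The sketch assumes the recognizer returns per-word start and end times, which the text does not state; the `Word` tuple layout and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical recognizer output: (word, start_seconds, end_seconds).
Word = Tuple[str, float, float]


def prompt_span(words: List[Word],
                selected: List[str]) -> Optional[Tuple[float, float]]:
    """Return the (start, end) time of the first run of recognized words
    that equals the user's selection, or None if it does not occur."""
    n = len(selected)
    if n == 0:
        return None
    tokens = [w[0] for w in words]
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == selected:
            # Span from the first selected word's start to the last one's end.
            return (words[i][1], words[i + n - 1][2])
    return None
```

  • The returned span would then be cut from the noise-reduced recording and stored as the prompt audio; a None result corresponds to the case where the user should record again.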
  • FIG. 7 is a schematic flowchart of a sixth embodiment of a method for controlling the playback state of a headset according to this application. Based on the first embodiment, the step S400 includes:
  • Step S450: when a single audio in the first ambient audio matches the preset audio, acquire the noise reduction type of the earphone and determine whether the noise reduction type is active noise reduction;
  • Step S460: when the noise reduction type of the earphone is active noise reduction, turn off the active noise reduction function and reduce the playback volume of the earphone;
  • Step S470: when the noise reduction type of the earphone is not active noise reduction, play the first ambient audio.
  • Existing earphones use either active or passive noise reduction.
  • Active noise reduction uses a noise reduction system to generate sound waves that are equal to the external noise but opposite in phase, cancelling the noise.
  • Passive noise reduction uses materials and structure to block noise.
  • When the earphone uses active noise reduction, the active noise reduction function is turned off and the playback volume of the earphone is appropriately reduced, so that the user can hear the audio that noise reduction would otherwise suppress.
  • When the earphone uses passive noise reduction, especially a structure with strong sound insulation such as an over-ear earphone, the first ambient audio can be played directly through the earphone so that the user hears it.
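  • The branch in steps S450 to S470 can be sketched as follows. The `EarphoneState` class, the "active"/"passive" labels, and the size of the volume step are hypothetical stand-ins for whatever the earphone actually reports and accepts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EarphoneState:
    """Hypothetical model of the earphone's working state."""
    noise_reduction: str              # "active" or "passive"
    anc_enabled: bool = True
    volume: int = 10
    played: List[str] = field(default_factory=list)


def handle_match(state: EarphoneState, ambient_clip: str,
                 volume_step: int = 3) -> None:
    """Apply S460 or S470 once a preset audio has been matched (S450)."""
    if state.noise_reduction == "active":
        # S460: stop cancelling outside sound and lower the playback volume.
        state.anc_enabled = False
        state.volume = max(0, state.volume - volume_step)
    else:
        # S470: passive isolation cannot be switched off, so replay the
        # captured first ambient audio through the earphone instead.
        state.played.append(ambient_clip)
```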
  • FIG. 8 is a schematic flowchart of a seventh embodiment of a method for controlling the playback state of a headset according to this application. Based on the first embodiment, the step S300 includes:
  • Step S310: perform framing processing on the first ambient audio to generate speech frames;
  • The framing process divides the first ambient audio into fixed-length audio segments using a moving window function.
  • Each speech frame is 30 milliseconds long with a 5-millisecond overlap; that is, milliseconds 1-5 of the Nth speech frame coincide with milliseconds 26-30 of the (N-1)th speech frame, and milliseconds 26-30 of the Nth speech frame coincide with milliseconds 1-5 of the (N+1)th speech frame.
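  • A minimal sketch of the framing step, assuming a plain rectangular window and that a trailing partial frame is dropped (neither detail is specified in the text). The function name and parameters are illustrative.

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], sample_rate: int,
                 frame_ms: int = 30, overlap_ms: int = 5) -> List[List[float]]:
    """Split samples into fixed-length frames that overlap by overlap_ms."""
    frame_len = sample_rate * frame_ms // 1000
    hop = sample_rate * (frame_ms - overlap_ms) // 1000   # 25 ms by default
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and hop must be positive")
    last_start = len(samples) - frame_len
    # Frames start every `hop` samples; an incomplete final frame is dropped.
    return [list(samples[s:s + frame_len])
            for s in range(0, last_start + 1, hop)]
```

  • With a 1000 Hz sample rate each frame holds 30 samples and consecutive frames start 25 samples apart, so the last 5 samples of one frame are the first 5 of the next, as described above.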
  • Step S320: perform feature extraction on the speech frames to obtain the Mel frequency cepstral coefficient feature vector corresponding to each speech frame;
  • The feature vector is the Mel frequency cepstral coefficient feature vector, that is, the features in the first ambient audio that are useful for recognition.
  • Step S330: input the Mel frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
  • The preset phoneme model is a model pre-trained by those skilled in the art on a large amount of speech data. The preset phoneme model maps groups of speech frames to phonemes, and groups of phonemes then form words.
  • Step S340: determine, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, where the preset state network is a database constructed with a hidden Markov model (HMM).
  • The preset state network is a text network set by those skilled in the art according to actual needs, so that a group of phonemes can be matched to a corresponding word in the preset state network.
  • For example, if the preset state network includes "today", "tomorrow", and "the day after tomorrow", then whatever frame-level speech feature vectors are obtained, the final recognized word can only be one of "today", "tomorrow", or "the day after tomorrow", which improves comparison efficiency and accuracy.
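  • The effect of restricting recognition to a closed word list can be illustrated without an HMM. The sketch below is a deliberate simplification: it picks the closest entry by edit distance over phoneme symbols rather than decoding through a hidden Markov model, and the phoneme spellings in `STATE_NETWORK` are invented for illustration.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution or match
        prev = cur
    return prev[-1]


# Hypothetical closed vocabulary with made-up phoneme spellings.
STATE_NETWORK: Dict[str, List[str]] = {
    "today": ["t", "ah", "d", "ey"],
    "tomorrow": ["t", "ah", "m", "aa", "r", "ow"],
    "the day after tomorrow": ["dh", "ah", "d", "ey", "ae", "f", "t", "er",
                               "t", "ah", "m", "aa", "r", "ow"],
}


def closest_word(phonemes: Sequence[str],
                 network: Dict[str, List[str]] = STATE_NETWORK) -> str:
    """Map any phoneme sequence to one of the words in the network."""
    return min(network, key=lambda w: edit_distance(phonemes, network[w]))
```

  • Even a slightly misrecognized sequence still resolves to an entry in the list, which is the property the example in the text relies on.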
  • This application also provides a device for controlling the playback state of earphones, the device including:
  • a first acquisition module 10 configured to acquire current state information of the mobile terminal and determine whether the state information indicates an earphone connection;
  • a second acquisition module 20 configured to acquire first ambient audio through the receiver of the mobile terminal when the state information indicates an earphone connection;
  • a judgment module 30 configured to determine whether a single audio matching a preset audio exists in the first ambient audio;
  • an adjustment module 40 configured to, when the single audio corresponding to the preset audio exists in the first ambient audio, adjust the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
  • The preset audio is a segment of prompt audio with semantics, and the adjustment module 40 is further configured to send a play instruction containing the prompt audio to the earphone when the single audio corresponding to the prompt audio exists in the first ambient audio, so that the earphone plays the prompt audio.
  • The preset audio is a segment of prompt audio with semantics, and the adjustment module 40 includes:
  • a first acquiring unit configured to acquire the current display state of the display screen of the mobile terminal and determine whether the display state is lit;
  • a display unit configured to acquire the semantic characters corresponding to the prompt audio when the display state is lit, and control the display screen of the mobile terminal to display the semantic characters.
  • The semantic characters include station-entry semantic characters and arrival semantic characters, and the adjustment module 40 includes:
  • a first judging unit configured to determine whether the semantic characters are preset station-entry characters;
  • a second acquiring unit configured to acquire second ambient audio through the receiver if the semantic characters are the preset station-entry characters;
  • a second determining unit configured to determine whether a single audio corresponding to a preset driving audio exists in the second ambient audio;
  • a playback adjustment unit configured to increase the playback volume of the earphone when a single audio corresponding to the preset driving audio exists in the second ambient audio.
  • The control device includes:
  • a third acquisition module 50 configured to acquire, through the receiver, third ambient audio containing the prompt audio, perform semantic recognition on the third ambient audio, and generate text information;
  • a setting module 60 configured to receive a selection operation made by the user based on the text information, and set all or part of the text information as semantic characters according to the selection operation;
  • the audio corresponding to the semantic characters in the third ambient audio is set as the prompt audio.
  • The adjustment module 40 includes:
  • a second judging unit configured to determine whether the noise reduction type of the earphone is active noise reduction when a single audio matching the preset audio exists in the first ambient audio;
  • the playback adjustment unit, further configured to turn off the active noise reduction function and reduce the playback volume of the earphone when the noise reduction type of the earphone is active noise reduction;
  • when the noise reduction type of the earphone is not active noise reduction, the first ambient audio is played.
  • The judgment module 30 includes:
  • a framing processing unit configured to perform framing processing on the first ambient audio to generate speech frames;
  • a feature extraction unit configured to perform feature extraction on the speech frames to obtain the Mel frequency cepstral coefficient feature vector corresponding to each speech frame;
  • a feature processing unit configured to input the Mel frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
  • a matching processing unit configured to determine, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, wherein the preset state network is a database constructed with a hidden Markov model.
  • This application also provides a storage medium on which computer-readable instructions are stored.
  • When the computer-readable instructions are executed by a processor, the steps of the method for controlling the playback state of earphones are implemented.
  • For the method implemented when the computer-readable instructions are executed, refer to the embodiments of the method for controlling the playback state of earphones in this application, which are not repeated here.
  • The storage medium in this application may specifically be a non-volatile computer-readable storage medium.
  • The methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better choice.
  • The technical solution of this application, in essence or in the part that contributes over the existing technology, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Headphones And Earphones (AREA)

Abstract

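This application discloses a method for controlling the playback state of earphones, a device for controlling the playback state of earphones, a mobile terminal, and a storage medium. The method includes: acquiring current state information of the mobile terminal and determining whether the state information indicates an earphone connection; when the state information indicates an earphone connection, acquiring first ambient audio through a receiver; determining whether a single audio matching a preset audio exists in the first ambient audio; and when the single audio corresponding to the preset audio exists in the first ambient audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.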

Description

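Method and device for controlling earphone playback state, mobile terminal, and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on July 10, 2019, with application number 201910624028.5 and the invention title "Method and device for controlling earphone playback state, mobile terminal, and storage medium", the entire contents of which are incorporated into this application by reference.
Technical Field
This application relates to the technical field of terminal control, and in particular to a method for controlling the playback state of earphones, a mobile terminal, and a storage medium.
Background
With the development of mobile terminals, handheld mobile terminals have become a necessity of daily life, and the earphones used with them have become wearable devices that people wear for long periods. Because earphone playback is loud, it interferes with the user's ability to hear sounds in the surrounding environment. In particular, existing noise-cancelling earphones and over-ear earphones, which are designed to improve sound quality, effectively attenuate ambient sound. As a result, the user may fail to react in time to ambient sounds and avoid danger, which puts the user at risk, or may miss ambient sounds, which makes daily life inconvenient.
Summary
The main purpose of this application is to provide a method for controlling the playback state of earphones, a device for controlling the playback state of earphones, a mobile terminal, and a storage medium, aiming to solve the technical problem that wearing earphones keeps the user from picking up useful sounds in the surrounding environment, which makes daily life inconvenient.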
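To achieve the above purpose, this application proposes a method for controlling the playback state of earphones, the method including:
acquiring current state information of the mobile terminal and determining whether the state information indicates an earphone connection;
when the state information indicates an earphone connection, acquiring first ambient audio through a receiver;
determining whether a single audio matching a preset audio exists in the first ambient audio;
when the single audio corresponding to the preset audio exists in the first ambient audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
Optionally, the preset audio is a segment of prompt audio with semantics, and the step of adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio, when the single audio corresponding to the preset audio exists in the first ambient audio, includes:
when the single audio corresponding to the prompt audio exists in the first ambient audio, sending a play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio.
Optionally, the preset audio is a segment of prompt audio with semantics, and after the step of sending the play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio when the single audio corresponding to the prompt audio exists in the first ambient audio, the method includes:
acquiring the current display state of the display screen of the mobile terminal and determining whether the display state is lit;
when the display state is lit, acquiring the semantic characters corresponding to the prompt audio and controlling the display screen of the mobile terminal to display the semantic characters.
Optionally, the semantic characters include station-entry semantic characters and arrival semantic characters, and after the step of sending the play instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio when the single audio corresponding to the prompt audio exists in the first ambient audio, the method includes:
determining whether the semantic characters are preset station-entry characters;
if the semantic characters are the preset station-entry characters, acquiring second ambient audio through the receiver;
determining whether a single audio corresponding to a preset driving audio exists in the second ambient audio;
when a single audio corresponding to the preset driving audio exists in the second ambient audio, increasing the playback volume of the earphone.
Optionally, the method for controlling the playback state of earphones further includes:
acquiring, through the receiver, third ambient audio containing the prompt audio, performing semantic recognition on the third ambient audio, and generating text information;
receiving a selection operation made by the user based on the text information, and setting all or part of the text information as semantic characters according to the selection operation;
setting the audio corresponding to the semantic characters in the third ambient audio as the prompt audio.
Optionally, the step of adjusting the working state of the earphone according to the adjustment scheme corresponding to the preset audio, when the single audio matching the preset audio exists in the first ambient audio, includes:
when the single audio matching the preset audio exists in the first ambient audio, determining whether the noise reduction type of the earphone is active noise reduction;
when the noise reduction type of the earphone is active noise reduction, turning off the active noise reduction function and reducing the playback volume of the earphone;
when the noise reduction type of the earphone is not active noise reduction, playing the first ambient audio.
Optionally, the step of determining whether a single audio corresponding to a preset audio exists in the first ambient audio includes:
performing framing processing on the first ambient audio to generate speech frames;
performing feature extraction on the speech frames to obtain the Mel frequency cepstral coefficient feature vector corresponding to each speech frame;
inputting the Mel frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
determining, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, where the preset state network is a database constructed with a hidden Markov model.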
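This application also provides a device for controlling the playback state of earphones, the control device including:
a first acquisition module configured to acquire current state information of the mobile terminal and determine whether the state information indicates an earphone connection;
a second acquisition module configured to acquire first ambient audio through the receiver of the mobile terminal when the state information indicates an earphone connection;
a judging module configured to determine whether a single audio matching a preset audio exists in the first ambient audio;
an adjustment module configured to, when the single audio corresponding to the preset audio exists in the first ambient audio, adjust the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
This application also provides a mobile terminal. The mobile terminal includes a processor, a memory, a receiver, and computer-readable instructions stored in the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the steps of the method for controlling the playback state of earphones described above.
This application also provides a storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the method for controlling the playback state of earphones described above.
In the technical solution of this application, the first ambient audio is acquired through the receiver, the scene is determined from the first ambient audio, and the working state of the earphone is adjusted accordingly. This keeps the user from failing to react in time to external sounds and avoid danger, which could lead to accidents, and from ignoring external sounds, which makes daily life inconvenient.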
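Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware structure of the mobile terminal involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of a first embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 3 is a schematic flowchart of a second embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 4 is a schematic flowchart of a third embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 5 is a schematic flowchart of a fourth embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 6 is a schematic flowchart of a fifth embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 7 is a schematic flowchart of a sixth embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 8 is a schematic flowchart of a seventh embodiment of the method for controlling the playback state of an earphone of this application;
FIG. 9 is a schematic module diagram of the device for controlling the playback state of an earphone of this application.
The realization of the purpose, the functional features and the advantages of this application are further described below with reference to the embodiments and the accompanying drawings.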
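Detailed Description
It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The method for controlling the playback state of an earphone in the embodiments of this application is mainly applied to a mobile terminal. The mobile terminal is a device with processing capability and may be a mobile phone, a tablet computer, a smart wearable device or a portable computer.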
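Referring to FIG. 1, FIG. 1 is a schematic diagram of the hardware structure of the mobile terminal involved in the embodiments of this application. In the embodiments, the mobile terminal may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 implements connection and communication among these components. The user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and a receiver (Receiver). The network interface 1004 may optionally include a Wi-Fi interface, a SIM card interface and a Bluetooth interface. The memory 1005 may be high-speed RAM or non-volatile memory such as disk storage; optionally, the memory 1005 may also be a storage device independent of the processor 1001.
By the usual definition, the receiver is an electroacoustic device that converts an audio electrical signal into a sound signal under no-leak conditions (or with the Type 3.2 high/low-leak ear simulator of the ITU standard); in this application it is the component used for audio capture. The mobile terminal communicates with the earphone over a wired or wireless connection and sends an electrical signal that drives the diaphragm of the earphone to vibrate and produce sound.
Those skilled in the art will understand that the hardware structure shown in FIG. 1 does not limit the device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.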
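Still referring to FIG. 1, the memory 1005, as a storage medium, may include an operating system, an audio playback module and computer-readable instructions.
In FIG. 1, the audio playback module is mainly used to connect to the earphone and control the speaker of the earphone to vibrate and produce sound, while the processor 1001 can call the computer-readable instructions stored in the memory 1005 and execute the steps of the method for controlling the playback state of an earphone.
Based on the hardware structure of the terminal described above, the embodiments of the method for controlling the playback state of an earphone of this application are proposed.
This application provides a method for controlling the playback state of an earphone.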
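Referring to FIG. 2, in the first embodiment of this application, the method for controlling the playback state of an earphone comprises the following steps:
Step S100: obtain current state information of the mobile terminal, and determine whether the state information indicates that an earphone is connected;
The method provided by this application runs on a mobile terminal. The processor of the mobile terminal obtains the current state information, which may include the working state of the terminal's built-in speaker, whether an external playback device is connected, and the working state of that external playback device.
Step S200: when the state information indicates that an earphone is connected, obtain first ambient audio through the receiver of the mobile terminal;
Earphone connection means that the mobile terminal produces sound through the speaker of the earphone. The earphone may be connected to the mobile terminal directly by wire, or wirelessly, for example over Bluetooth. The first ambient audio is the audio currently being captured by the receiver in real time; it may be a mixture of several single audio components such as vehicle noise, wind, arrival announcements and pedestrians talking. The first ambient audio may be captured with a receiver installed in the mobile terminal or with a receiver connected to the mobile terminal, which may be a standalone receiver or a receiver built into the earphone.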
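Step S300: determine whether the first ambient audio contains a single audio component that matches preset audio;
The preset audio is an audio file set and stored in advance by those skilled in the art. Several kinds of preset audio may be set as needed, for example a car braking sound or an arrival announcement.
Step S400: when the first ambient audio contains a single audio component that matches the preset audio, adjust the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.
When the first ambient audio contains no single audio component that matches the preset audio, no action is taken.
Different adjustment schemes may be set in advance for different preset audio, including increasing the volume, lowering the volume and turning off the active noise reduction function, so that the working state of the earphone can adapt to different scenes.
For example, the preset audio is a car horn. The captured first ambient audio is compared with the car horn; if a single audio component of the first ambient audio matches it, a car horn is present in the user's environment. The adjustment scheme corresponding to the car horn is then invoked to turn off noise reduction or lower the playback volume of the earphone, so that the user can hear the horn and take evasive action.
This application captures first ambient audio through the receiver and compares it with the preset audio in order to adjust the working state of the earphone according to the first ambient audio. This avoids situations in which music playing in the earphone, or the earphone's noise reduction function, prevents the user from reacting to ambient sound in time, whether to avoid danger or to respond socially.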
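The flow of steps S100 to S400 can be sketched as follows. This is a minimal illustration rather than the application's implementation: the `EarphoneState` class, the `ADJUSTMENT_SCHEMES` table, the volume scale and the use of plain string labels in place of recognized single audio components are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass
class EarphoneState:
    """Hypothetical model of the earphone's working state."""
    volume: int = 8                 # assumed 0-15 volume scale
    noise_cancelling: bool = True


def on_car_horn(state: EarphoneState) -> None:
    """Assumed adjustment scheme for a car horn: disable noise reduction, lower volume."""
    state.noise_cancelling = False
    state.volume = max(0, state.volume - 4)


# Each preset audio (identified here by a label) maps to its adjustment scheme.
ADJUSTMENT_SCHEMES: Dict[str, Callable[[EarphoneState], None]] = {
    "car_horn": on_car_horn,
}


def control_step(earphone_connected: bool,
                 ambient_labels: Iterable[str],
                 state: EarphoneState) -> Optional[str]:
    """Run one pass of S100-S400 and return the matched preset label, if any."""
    if not earphone_connected:              # S100: no earphone, nothing to do
        return None
    for label in ambient_labels:            # S200/S300: components found in the ambient audio
        scheme = ADJUSTMENT_SCHEMES.get(label)
        if scheme is not None:
            scheme(state)                   # S400: apply the matching adjustment scheme
            return label
    return None                             # no match: leave the earphone unchanged
```

Calling `control_step(True, ["wind", "car_horn"], state)` applies the car-horn scheme to `state`; with no earphone connected, or with no matching label, the state is left untouched.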
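Further, referring to FIG. 3, FIG. 3 is a schematic flowchart of the second embodiment of the method for controlling the playback state of an earphone of this application. Based on the first embodiment, the preset audio is prompt audio with semantic content, and step S400 comprises:
Step S410: when the first ambient audio contains a single audio component that matches the prompt audio, send a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
In this embodiment the preset audio is prompt audio with semantic content, that is, a human voice or audio imitating a human voice, which existing speech recognition software can convert into text.
For example, the stored prompt audio is "The next stop is Station A". When "The next stop is Station A" is announced in the environment, the receiver captures first ambient audio containing the announcement. The comparison finds that a single audio component of the first ambient audio matches the prompt audio, so the audio playback module drives the diaphragm of the earphone to play the prompt "The next stop is Station A". This prevents the user from missing the announcement and therefore missing the stop.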
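Further, referring to FIG. 4, FIG. 4 is a schematic flowchart of the third embodiment of the method for controlling the playback state of an earphone of this application. Based on the second embodiment, the following steps are performed after step S410:
Step S411: obtain the current display state of the display screen of the mobile terminal, and determine whether the display state is lit;
Step S412: when the display state is lit, obtain the semantic characters corresponding to the prompt audio, and control the display screen of the mobile terminal to display the semantic characters.
Those skilled in the art set in advance the semantic characters corresponding to the prompt audio; the semantic characters are the text that expresses the meaning of the prompt audio. People travelling on public transport are often absorbed in what their phone is showing and miss their stop. The display state of the screen indicates whether the user is currently reading the screen; if so, the screen is made to display the semantic characters as a further reminder.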
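Further, referring to FIG. 5, FIG. 5 is a schematic flowchart of the fourth embodiment of the method for controlling the playback state of an earphone of this application. Based on the third embodiment, the following steps are performed after step S410:
Step S413: determine whether the semantic characters are preset station-entry characters;
Step S414: if the semantic characters are the preset station-entry characters, obtain second ambient audio through the receiver;
Step S415: determine whether the second ambient audio contains a single audio component that matches preset travel audio;
The preset travel audio is audio set in advance by those skilled in the art. Travel audio may be set for several means of transport, such as cars, subways and aircraft.
Step S416: when the second ambient audio contains a single audio component that matches the preset travel audio, increase the playback volume of the earphone.
When the second ambient audio contains a single audio component that matches the preset travel audio, the user has boarded the vehicle and the vehicle is moving. Increasing the playback volume of the earphone reduces the interference of vehicle noise with earphone playback. Optionally, after step S416, the method may further include: enabling the noise reduction function according to the noise reduction scheme corresponding to the matched preset travel audio. In this application, noise reduction may be performed using the following formulas: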
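Let

$$D(w) = P_S(w) - \alpha P_n(w);$$

$$P'_S(w) = \begin{cases} D(w), & D(w) > \beta P_n(w) \\ \beta P_n(w), & D(w) \le \beta P_n(w) \end{cases}$$

where $\alpha \ge 1$ and $0 < \beta < 1$; $P_S(w)$ is the spectrum of the noisy input speech, $P_n(w)$ is the spectrum of the estimated noise, and $D(w)$ is the difference spectrum obtained by subtracting one from the other; $\alpha$ is the subtraction factor and $\beta$ is the spectral floor threshold parameter. The values of $\alpha$ and $\beta$ are determined from the signal-to-noise ratio.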
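The subtraction rule above can be written out per frequency bin as in the following sketch. The function name and the default values of `alpha` and `beta` are assumptions chosen for illustration; the text only requires α ≥ 1 and 0 < β < 1 and says the actual values depend on the signal-to-noise ratio.

```python
from typing import List, Sequence


def spectral_subtract(noisy_power: Sequence[float],
                      noise_power: Sequence[float],
                      alpha: float = 2.0,
                      beta: float = 0.01) -> List[float]:
    """Apply D(w) = P_S(w) - alpha * P_n(w) with a floor of beta * P_n(w) per bin."""
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    if alpha < 1 or not (0 < beta < 1):
        raise ValueError("require alpha >= 1 and 0 < beta < 1")
    cleaned = []
    for p_s, p_n in zip(noisy_power, noise_power):
        d = p_s - alpha * p_n           # difference spectrum D(w)
        floor = beta * p_n              # spectral floor
        cleaned.append(d if d > floor else floor)
    return cleaned
```

The floor keeps every bin of the output strictly positive wherever the noise estimate is positive, which avoids the negative values that plain subtraction would produce in bins dominated by noise.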
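Further, referring to FIG. 6, FIG. 6 is a schematic flowchart of the fifth embodiment of the method for controlling the playback state of an earphone of this application. Based on the third embodiment, the method further comprises:
Step S420: obtain, through the receiver, third ambient audio containing the prompt audio, and perform semantic recognition on the third ambient audio to generate text information;
Step S430: receive a selection operation made by the user based on the text information, and set all or part of the text information as the semantic characters according to the selection operation;
Step S440: set the audio in the third ambient audio that corresponds to the semantic characters as the prompt audio.
Users can set the prompt audio themselves as needed. Third ambient audio containing the prompt audio is captured in advance through the receiver and then processed by noise reduction and semantic recognition to generate the corresponding text information. Because the environment in which the third ambient audio is captured may be noisy, the user decides through the selection operation which part of the text information to set as the semantic characters, which improves the recognition rate.
For example, the user records at "Station A" and obtains third ambient audio containing train noise, passers-by talking, wind and the announcement "The train is about to enter Station A". The third ambient audio is denoised and the denoised audio is converted into text, giving the text information "The train is about to enter Station A". If the generated text does not match what was actually said, the recording quality was too low and the user can record again. If the user manually selects "enter Station A" as the semantic characters, the part of the denoised third ambient audio that corresponds to "enter Station A" is set as the prompt audio.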
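Further, referring to FIG. 7, FIG. 7 is a schematic flowchart of the sixth embodiment of the method for controlling the playback state of an earphone of this application. Based on the first embodiment, step S400 comprises:
Step S450: when the first ambient audio contains a single audio component that matches the preset audio, obtain the noise reduction type of the earphone and determine whether it is active noise reduction;
Step S460: when the noise reduction type of the earphone is active noise reduction, turn off the active noise reduction function and lower the playback volume of the earphone;
Step S470: when the noise reduction type of the earphone is not active noise reduction, play the first ambient audio.
To keep out ambient sound and improve audio quality, existing earphones use active or passive noise reduction. Active noise reduction uses a noise reduction system to generate a sound wave equal in amplitude and opposite in phase to the ambient noise, cancelling the noise. Passive noise reduction uses materials and structure to block the noise. When the earphone uses active noise reduction, turning off the function and suitably lowering the playback volume lets the user hear the audio that noise reduction had been removing. When the earphone uses passive noise reduction, especially a well-insulated design such as an over-ear earphone, the first ambient audio is played through the earphone directly so that the user can hear it.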
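The branch in steps S450 to S470 can be sketched as below. The `Earphone` class, its field names, the volume scale and the `volume_step` parameter are hypothetical; the text does not say how the noise reduction type is queried or by how much the volume is lowered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Earphone:
    """Hypothetical stand-in for a connected earphone."""
    noise_reduction_type: str                  # "active" or "passive"
    anc_enabled: bool = False
    volume: int = 10                           # assumed 0-15 volume scale
    played: List[str] = field(default_factory=list)


def adjust_on_match(earphone: Earphone,
                    first_ambient_audio: str,
                    volume_step: int = 3) -> None:
    """Apply steps S450-S470 once a preset audio has been matched."""
    if earphone.noise_reduction_type == "active":          # S450: check the type
        earphone.anc_enabled = False                        # S460: turn ANC off
        earphone.volume = max(0, earphone.volume - volume_step)
    else:
        earphone.played.append(first_ambient_audio)         # S470: pass the ambient audio through
```

With an active noise reduction earphone the ambient sound reaches the ear once cancellation is off, so nothing needs to be replayed; with a passive design the captured audio has to be played back because the physical isolation cannot be switched off.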
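Further, referring to FIG. 8, FIG. 8 is a schematic flowchart of the seventh embodiment of the method for controlling the playback state of an earphone of this application. Based on the first embodiment, step S300 comprises:
Step S310: frame the first ambient audio to generate speech frames;
Framing divides the first ambient audio into fixed-length audio segments by means of a moving window function. Consecutive speech frames overlap to some extent. For example, with 30 ms frames and a 5 ms overlap, milliseconds 1-5 of the N-th frame are identical to milliseconds 26-30 of the (N-1)-th frame, and milliseconds 26-30 of the N-th frame are identical to milliseconds 1-5 of the (N+1)-th frame.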
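The 30 ms / 5 ms example can be sketched as a sliding slice over the sample sequence. The function name and parameters are assumptions for illustration; the sketch uses a rectangular window (plain slicing) and drops a trailing partial frame, neither of which is specified in the text.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float],
                      sample_rate: int,
                      frame_ms: int = 30,
                      overlap_ms: int = 5) -> List[Sequence[float]]:
    """Cut samples into fixed-length frames where neighbours share overlap_ms of audio."""
    frame_len = sample_rate * frame_ms // 1000
    hop = sample_rate * (frame_ms - overlap_ms) // 1000   # step between frame starts
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and hop must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(samples):              # a trailing partial frame is dropped
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames
```

At a 16 kHz sample rate this gives 480-sample frames that advance by 400 samples, so the last 80 samples of each frame reappear as the first 80 samples of the next.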
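Step S320: perform feature extraction on the speech frames to obtain a Mel-frequency cepstral coefficient (MFCC) feature vector for each speech frame;
Because a speech frame has little descriptive power in the time domain, it is processed, in line with the physiological characteristics of the human ear, by Fourier transform, triangular filtering, logarithmic transform and discrete cosine transform into a 13-dimensional feature vector. This is the MFCC feature vector, i.e. the features of the first ambient audio that are meaningful for recognition.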
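The chain of Fourier transform, triangular filtering, logarithm and discrete cosine transform can be sketched in plain Python as follows. Only the four processing stages and the 13-dimensional output come from the text; the Hamming window, the 26-filter bank, the common mel-scale formula and all function names are assumptions, and the naive DFT is used for readability rather than speed.

```python
import cmath
import math
from typing import List, Sequence


def hz_to_mel(hz: float) -> float:
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Hamming-window the frame and return the power of DFT bins 0..n/2."""
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters: int, num_bins: int, sample_rate: int) -> List[List[float]]:
    """Triangular filters whose edges are equally spaced on the mel scale."""
    nyquist = sample_rate / 2
    max_mel = hz_to_mel(nyquist)
    # Filter edges expressed as (fractional) DFT bin positions.
    edges = [mel_to_hz(max_mel * i / (num_filters + 1)) / nyquist * (num_bins - 1)
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if left < b <= centre:
                row.append((b - left) / (centre - left))     # rising slope
            elif centre < b < right:
                row.append((right - b) / (right - centre))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame: Sequence[float], sample_rate: int,
         num_filters: int = 26, num_coeffs: int = 13) -> List[float]:
    """Return the first num_coeffs DCT-II coefficients of the log mel energies."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(spectrum), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    count = len(log_energies)
    return [sum(e * math.cos(math.pi * k * (i + 0.5) / count)
                for i, e in enumerate(log_energies))
            for k in range(num_coeffs)]
```

For a 30 ms frame sampled at 8 kHz (240 samples), `mfcc(frame, 8000)` returns a list of 13 floats. A production system would normally use an FFT-based library implementation instead of this direct DFT.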
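Step S330: input the MFCC feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
The preset phoneme model is a model trained in advance by those skilled in the art on a large amount of speech data. It maps groups of speech frames to phonemes, and groups of phonemes then form words.
Step S340: determine, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, wherein the preset state network is a database built with a hidden Markov model (HMM).
The preset state network is a text network set by those skilled in the art as needed, in which sequences of phonemes are looked up to find the corresponding words. For example, if the preset state network contains "today", "tomorrow" and "the day after tomorrow", then whatever the frame-level speech feature vectors are, the resulting word can only be one of these three, which improves both the efficiency and the accuracy of the comparison.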
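The effect of restricting the output to a fixed vocabulary can be illustrated with a much simpler stand-in for the HMM-based state network: a closed lexicon searched by edit distance over phoneme sequences. This is not the HMM decoding the text describes; the `LEXICON` entries (pinyin-style phoneme spellings of the three example words) and the function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Hypothetical closed vocabulary: word -> phoneme sequence (pinyin-style, for illustration).
LEXICON: Dict[str, List[str]] = {
    "jintian": ["j", "in", "t", "ian"],     # "today"
    "mingtian": ["m", "ing", "t", "ian"],   # "tomorrow"
    "houtian": ["h", "ou", "t", "ian"],     # "the day after tomorrow"
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or keep
        prev = cur
    return prev[-1]


def decode(phonemes: Sequence[str]) -> str:
    """Return the lexicon word closest to the recognized phonemes."""
    return min(LEXICON, key=lambda word: edit_distance(phonemes, LEXICON[word]))


def matches_preset(phonemes: Sequence[str], preset_word: str) -> bool:
    """True if the closest lexicon word is the preset one."""
    return decode(phonemes) == preset_word
```

Because `decode` can only return a key of `LEXICON`, a slightly misrecognized sequence such as `["m", "ing", "d", "ian"]` still resolves to `"mingtian"`, which mirrors how the closed vocabulary bounds the possible outputs.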
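In addition, this application further provides a device for controlling the playback state of an earphone, the device comprising:
a first acquisition module 10, configured to obtain current state information of a mobile terminal and determine whether the state information indicates that an earphone is connected;
a second acquisition module 20, configured to obtain first ambient audio through a receiver of the mobile terminal when the state information indicates that an earphone is connected;
a judgment module 30, configured to determine whether the first ambient audio contains a single audio component that matches preset audio;
an adjustment module 40, configured to adjust, when the first ambient audio contains a single audio component that matches the preset audio, the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.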
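Further, the preset audio is prompt audio with semantic content; the adjustment module 40 is further configured to send, when the first ambient audio contains a single audio component that matches the prompt audio, a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
Further, the preset audio is prompt audio with semantic content; the adjustment module 40 comprises:
a first acquisition unit, configured to obtain the current display state of the display screen of the mobile terminal and determine whether the display state is lit;
a display unit, configured to obtain, when the display state is lit, the semantic characters corresponding to the prompt audio, and control the display screen of the mobile terminal to display the semantic characters.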
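Further, the semantic characters include station-entry semantic characters and arrival semantic characters, and the adjustment module 40 comprises:
a first judgment unit, configured to determine whether the semantic characters are preset station-entry characters;
a second acquisition unit, configured to obtain second ambient audio through the receiver if the semantic characters are the preset station-entry characters;
a second judgment unit, configured to determine whether the second ambient audio contains a single audio component that matches preset travel audio;
a playback adjustment unit, configured to increase the playback volume of the earphone when the second ambient audio contains a single audio component that matches the preset travel audio.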
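Further, the control device comprises:
a third acquisition module 50, configured to obtain, through the receiver, third ambient audio containing the prompt audio, and perform semantic recognition on the third ambient audio to generate text information;
a setting module 60, configured to receive a selection operation made by the user based on the text information, and set all or part of the text information as the semantic characters according to the selection operation;
and to set the audio in the third ambient audio that corresponds to the semantic characters as the prompt audio.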
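Further, the adjustment module 40 comprises:
a second judgment unit, configured to determine, when the first ambient audio contains a single audio component that matches the preset audio, whether the noise reduction type of the earphone is active noise reduction;
the playback adjustment unit is further configured to turn off the active noise reduction function and lower the playback volume of the earphone when the noise reduction type of the earphone is active noise reduction;
and to play the first ambient audio when the noise reduction type of the earphone is not active noise reduction.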
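Further, the judgment module 30 comprises:
a framing unit, configured to frame the first ambient audio to generate speech frames;
a feature extraction unit, configured to perform feature extraction on the speech frames to obtain an MFCC feature vector for each speech frame;
a feature processing unit, configured to input the MFCC feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
a matching unit, configured to determine, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, wherein the preset state network is a database built with a hidden Markov model.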
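In addition, this application further provides a storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the above method for controlling the playback state of an earphone.
For the method implemented when the computer-readable instructions are executed, refer to the embodiments of the method for controlling the playback state of an earphone of this application; details are not repeated here. In this application, the storage medium may specifically be a non-volatile computer-readable storage medium.
It should be noted that, in this document, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article or system. Without further limitation, an element qualified by the phrase "including a..." does not exclude the presence of additional identical elements in the process, method, article or system that includes it.
The numbering of the above embodiments of this application is for description only and does not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is preferable. On this understanding, the part of the technical solution of this application that is essential, or that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.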

Claims (20)

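  1. A method for controlling the playback state of an earphone, comprising:
    obtaining current state information of a mobile terminal, and determining whether the state information indicates that an earphone is connected;
    when the state information indicates that an earphone is connected, obtaining first ambient audio through a receiver;
    determining whether the first ambient audio contains a single audio component that matches preset audio;
    when the first ambient audio contains a single audio component that matches the preset audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio,
    wherein the step of determining whether the first ambient audio contains a single audio component that matches preset audio comprises:
    framing the first ambient audio to generate speech frames;
    performing feature extraction on the speech frames to obtain a Mel-frequency cepstral coefficient feature vector for each speech frame;
    inputting the Mel-frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
    determining, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, wherein the preset state network is a database built with a hidden Markov model.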
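  2. The method for controlling the playback state of an earphone according to claim 1, wherein the preset audio is prompt audio with semantic content; the step of adjusting, when the first ambient audio contains a single audio component that matches the preset audio, the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio comprises:
    when the first ambient audio contains a single audio component that matches the prompt audio, sending a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
  3. The method for controlling the playback state of an earphone according to claim 2, wherein after the step of sending, when the first ambient audio contains a single audio component that matches the prompt audio, a playback instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio, the method comprises:
    obtaining the current display state of the display screen of the mobile terminal, and determining whether the display state is lit;
    when the display state is lit, obtaining the semantic characters corresponding to the prompt audio, and controlling the display screen of the mobile terminal to display the semantic characters.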
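  4. The method for controlling the playback state of an earphone according to claim 3, wherein the method further comprises:
    obtaining, through the receiver, third ambient audio containing the prompt audio, and performing semantic recognition on the third ambient audio to generate text information;
    receiving a selection operation made by the user based on the text information, and setting all or part of the text information as the semantic characters according to the selection operation;
    setting the audio in the third ambient audio that corresponds to the semantic characters as the prompt audio.
  5. The method for controlling the playback state of an earphone according to claim 2, wherein the semantic characters comprise station-entry semantic characters and arrival semantic characters, and after the step of sending, when the first ambient audio contains a single audio component that matches the prompt audio, a playback instruction containing the prompt audio to the earphone so that the earphone plays the prompt audio, the method comprises:
    determining whether the semantic characters are preset station-entry characters;
    if the semantic characters are the preset station-entry characters, obtaining second ambient audio through the receiver;
    determining whether the second ambient audio contains a single audio component that matches preset travel audio;
    when the second ambient audio contains a single audio component that matches the preset travel audio, increasing the playback volume of the earphone.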
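  6. The method for controlling the playback state of an earphone according to claim 1, wherein the step of adjusting, when the first ambient audio contains a single audio component that matches the preset audio, the working state of the earphone according to the adjustment scheme corresponding to the preset audio comprises:
    when the first ambient audio contains a single audio component that matches the preset audio, determining whether the noise reduction type of the earphone is active noise reduction;
    when the noise reduction type of the earphone is active noise reduction, turning off the active noise reduction function and lowering the playback volume of the earphone;
    when the noise reduction type of the earphone is not active noise reduction, playing the first ambient audio.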
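  7. A device for controlling the playback state of an earphone, wherein the control device comprises:
    a first acquisition module, configured to obtain current state information of a mobile terminal and determine whether the state information indicates that an earphone is connected;
    a second acquisition module, configured to obtain first ambient audio through a receiver of the mobile terminal when the state information indicates that an earphone is connected;
    a judgment module, configured to determine whether the first ambient audio contains a single audio component that matches preset audio;
    an adjustment module, configured to adjust, when the first ambient audio contains a single audio component that matches the preset audio, the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio;
    wherein the judgment module comprises:
    a framing unit, configured to frame the first ambient audio to generate speech frames;
    a feature extraction unit, configured to perform feature extraction on the speech frames to obtain a Mel-frequency cepstral coefficient feature vector for each speech frame;
    a feature processing unit, configured to input the Mel-frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
    a matching unit, configured to determine, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with the preset audio, wherein the preset state network is a database built with a hidden Markov model.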
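  8. The device for controlling the playback state of an earphone according to claim 7, wherein the preset audio is prompt audio with semantic content; the adjustment module is further configured to send, when the first ambient audio contains a single audio component that matches the prompt audio, a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
  9. The device for controlling the playback state of an earphone according to claim 8, wherein the adjustment module comprises:
    a first acquisition unit, configured to obtain the current display state of the display screen of the mobile terminal and determine whether the display state is lit;
    a display unit, configured to obtain, when the display state is lit, the semantic characters corresponding to the prompt audio, and control the display screen of the mobile terminal to display the semantic characters.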
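  10. A mobile terminal, comprising a processor, a memory, a receiver, and computer-readable instructions stored in the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    obtaining current state information of the mobile terminal, and determining whether the state information indicates that an earphone is connected;
    when the state information indicates that an earphone is connected, obtaining first ambient audio through the receiver;
    framing the first ambient audio to generate speech frames;
    performing feature extraction on the speech frames to obtain a Mel-frequency cepstral coefficient feature vector for each speech frame;
    inputting the Mel-frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
    determining, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with preset audio, wherein the preset state network is a database built with a hidden Markov model;
    when the first ambient audio contains a single audio component that matches the preset audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.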
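  11. The mobile terminal according to claim 10, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    the preset audio is prompt audio with semantic content; when the first ambient audio contains a single audio component that matches the prompt audio, sending a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
  12. The mobile terminal according to claim 11, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining the current display state of the display screen of the mobile terminal, and determining whether the display state is lit;
    when the display state is lit, obtaining the semantic characters corresponding to the prompt audio, and controlling the display screen of the mobile terminal to display the semantic characters.
  13. The mobile terminal according to claim 12, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining, through the receiver, third ambient audio containing the prompt audio, and performing semantic recognition on the third ambient audio to generate text information;
    receiving a selection operation made by the user based on the text information, and setting all or part of the text information as the semantic characters according to the selection operation;
    setting the audio in the third ambient audio that corresponds to the semantic characters as the prompt audio.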
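  14. The mobile terminal according to claim 11, wherein the semantic characters comprise station-entry semantic characters and arrival semantic characters, and the computer-readable instructions, when executed by the processor, further implement the following steps:
    determining whether the semantic characters are preset station-entry characters;
    if the semantic characters are the preset station-entry characters, obtaining second ambient audio through the receiver;
    determining whether the second ambient audio contains a single audio component that matches preset travel audio;
    when the second ambient audio contains a single audio component that matches the preset travel audio, increasing the playback volume of the earphone.
  15. The mobile terminal according to claim 10, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    when the first ambient audio contains a single audio component that matches the preset audio, determining whether the noise reduction type of the earphone is active noise reduction;
    when the noise reduction type of the earphone is active noise reduction, turning off the active noise reduction function and lowering the playback volume of the earphone;
    when the noise reduction type of the earphone is not active noise reduction, playing the first ambient audio.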
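  16. A storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    obtaining current state information of a mobile terminal, and determining whether the state information indicates that an earphone is connected;
    when the state information indicates that an earphone is connected, obtaining first ambient audio through a receiver;
    framing the first ambient audio to generate speech frames;
    performing feature extraction on the speech frames to obtain a Mel-frequency cepstral coefficient feature vector for each speech frame;
    inputting the Mel-frequency cepstral coefficient feature vectors into a preset phoneme model to obtain aligned frame-level speech feature vectors;
    determining, through a preset state network, whether the aligned frame-level speech feature vectors are consistent with preset audio, wherein the preset state network is a database built with a hidden Markov model;
    when the first ambient audio contains a single audio component that matches the preset audio, adjusting the working state of the earphone connected to the mobile terminal according to the adjustment scheme corresponding to the preset audio.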
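  17. The storage medium according to claim 16, wherein the preset audio is prompt audio with semantic content; the computer-readable instructions, when executed by the processor, further implement the following steps:
    when the first ambient audio contains a single audio component that matches the prompt audio, sending a playback instruction containing the prompt audio to the earphone, so that the earphone plays the prompt audio.
  18. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining the current display state of the display screen of the mobile terminal, and determining whether the display state is lit;
    when the display state is lit, obtaining the semantic characters corresponding to the prompt audio, and controlling the display screen of the mobile terminal to display the semantic characters.
  19. The storage medium according to claim 18, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining, through the receiver, third ambient audio containing the prompt audio, and performing semantic recognition on the third ambient audio to generate text information;
    receiving a selection operation made by the user based on the text information, and setting all or part of the text information as the semantic characters according to the selection operation;
    setting the audio in the third ambient audio that corresponds to the semantic characters as the prompt audio.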
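  20. The storage medium according to claim 17, wherein the semantic characters comprise station-entry semantic characters and arrival semantic characters, and the computer-readable instructions, when executed by the processor, further implement the following steps:
    determining whether the semantic characters are preset station-entry characters;
    if the semantic characters are the preset station-entry characters, obtaining second ambient audio through the receiver;
    determining whether the second ambient audio contains a single audio component that matches preset travel audio;
    when the second ambient audio contains a single audio component that matches the preset travel audio, increasing the playback volume of the earphone.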
PCT/CN2019/121190 2019-07-10 2019-11-27 Method and device for controlling the playback state of an earphone, mobile terminal and storage medium WO2021003955A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910624028.5 2019-07-10
CN201910624028.5A CN110475170A (zh) 2019-07-10 2019-07-10 Method and device for controlling the playback state of an earphone, mobile terminal and storage medium

Publications (1)

Publication Number Publication Date
WO2021003955A1 true WO2021003955A1 (zh) 2021-01-14

Family

ID=68507273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/121190 WO2021003955A1 (zh) 2019-07-10 2019-11-27 Method and device for controlling the playback state of an earphone, mobile terminal and storage medium

Country Status (2)

Country Link
CN (1) CN110475170A (zh)
WO (1) WO2021003955A1 (zh)

Also Published As

Publication number Publication date
CN110475170A (zh) 2019-11-19


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19936577

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.05.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19936577

Country of ref document: EP

Kind code of ref document: A1