CN112017687A - Voice processing method, device and medium of bone conduction equipment - Google Patents

Voice processing method, device and medium of bone conduction equipment

Info

Publication number
CN112017687A
Authority
CN
China
Prior art keywords
voice signal
bone conduction
test
target
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010954775.8A
Other languages
Chinese (zh)
Other versions
CN112017687B (en)
Inventor
朱宗霞
安康
吴劼
舒开发
韩菲菲
杨征
李钉云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Goertek Techology Co Ltd
Original Assignee
Goertek Techology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Goertek Techology Co Ltd
Priority to CN202010954775.8A
Publication of CN112017687A
Application granted
Publication of CN112017687B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Details Of Audible-Bandwidth Transducers (AREA)

Abstract

The application discloses a voice processing method, device and medium for bone conduction equipment. The method comprises: acquiring a first test voice signal and a second test voice signal through a bone conduction microphone and an air conduction microphone, respectively, in the same voice test environment; determining the corresponding target transfer function; and, after an initial voice signal is obtained, synthesizing the initial voice signal into a target voice signal according to the target transfer function. With this technical solution, a single bone conduction microphone suffices to collect a full-band voice signal, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and favors product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good generality and portability. Finally, the low-frequency part still retains the advantages of private conversation and noise reduction.

Description

Voice processing method, device and medium of bone conduction equipment
Technical Field
The present application relates to the field of bone conduction technologies, and in particular, to a method, an apparatus, and a medium for processing speech of a bone conduction device.
Background
A conventional air conduction microphone collects a sound wave signal that propagates through the air and converts it into an electrical signal. Because the air conduction microphone relies on air as the propagation medium, noise in the air is easily picked up together with the wanted sound wave signal, which pollutes the signal and degrades what the receiving party hears. With the development of bone conduction technology, more and more bone conduction devices have become available, for example bone conduction earphones that incorporate a bone conduction microphone. Like a conventional air conduction microphone, the bone conduction microphone is used to pick up voice, but its principle is different: the facial bones of the user vibrate during speech, and the bone conduction microphone directly collects the vibration signals transmitted by the bones and converts them into electrical signals. Compared with a conventional air conduction microphone, the bone conduction microphone therefore has a better noise reduction effect.
However, during sound pickup the cut-off frequency of the bone conduction microphone is 2 kHz, so when the bone conduction microphone is used alone as the voice collecting device, the high-frequency part of the signal is attenuated. In the prior art, this problem is usually overcome by using the bone conduction microphone together with a conventional air conduction microphone. The device then needs two voice acquisition components, which increases cost and occupied space and hinders product miniaturization.
Therefore, how to improve the voice acquisition quality of a bone conduction device without increasing cost or occupied space is a problem to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a voice processing method, a voice processing device and a voice processing medium for bone conduction equipment.
In order to solve the above technical problem, the present application provides a speech processing method for bone conduction equipment, including:
determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone respectively in the same voice test environment;
acquiring an initial voice signal acquired by a current bone conduction microphone;
and synthesizing the initial voice signal into a target voice signal according to the target transfer function.
Preferably, the determining the target transfer functions corresponding to the first test voice signal and the second test voice signal specifically includes:
obtaining a plurality of pairs of the first test voice signal and the second test voice signal in advance;
extracting characteristic parameters corresponding to a plurality of pairs of the first test voice signals and the second test voice signals, and dividing the plurality of pairs of the characteristic parameters into a test group and a verification group;
inputting a plurality of pairs of characteristic parameters of the test group into a preset training model as training samples to obtain a current transfer function;
verifying a plurality of pairs of the characteristic parameters of the verification group by using a current transfer function;
and taking the current transfer function as the target transfer function under the condition that the obtained verification result meets the requirement.
Preferably, the synthesizing the initial speech signal into the target speech signal according to the target transfer function specifically includes:
acquiring a high-frequency voice signal and a middle-low frequency voice signal in the initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through the target transfer function;
and taking the medium-low frequency voice signal and the target high-frequency voice signal as the target voice signal.
Preferably, the training model is a deep learning model.
Preferably, the characteristic parameters include frequency, amplitude, and sound pressure.
Preferably, the first test speech signal and the second test speech signal comprise a high frequency speech signal and a mid-low frequency speech signal.
Preferably, the method further comprises the following steps:
presetting a judgment rule;
judging whether the target voice signal meets the judgment rule or not after the target voice signal is obtained;
if yes, outputting the target voice signal;
if not, adjusting the target transfer function, and returning to the step of synthesizing the initial voice signal into the target voice signal according to the target transfer function.
In order to solve the above technical problem, the present application further provides a speech processing apparatus for bone conduction equipment, including:
the determining module is used for determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone respectively in the same voice test environment;
the acquisition module is used for acquiring an initial voice signal acquired by a current bone conduction microphone;
and the synthesis module is used for synthesizing the initial voice signal into a target voice signal according to the target transfer function.
In order to solve the above technical problem, the present application further provides a speech processing apparatus of a bone conduction device, including a memory for storing a computer program;
a processor for implementing the steps of the speech processing method of the bone conduction device as described when executing the computer program.
In order to solve the above technical problem, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the voice processing method of the bone conduction device as described above.
According to the voice processing method of the bone conduction device provided by the present application, during testing, the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. During application, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a single bone conduction microphone suffices to collect a full-band voice signal, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and favors product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good generality and portability. Finally, the low-frequency part still retains the advantages of private conversation and noise reduction.
The voice processing device and the medium of the bone conduction equipment correspond to the method, and the effects are the same.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a speech processing method of a bone conduction device according to an embodiment of the present application;
fig. 2 is a schematic diagram of obtaining a target transfer function according to an embodiment of the present application;
fig. 3 is a flowchart of speech processing of another bone conduction device according to an embodiment of the present application;
fig. 4 is a structural diagram of a voice processing device of a bone conduction device according to an embodiment of the present application;
fig. 5 is a block diagram of a speech processing apparatus of another bone conduction device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the present application.
The core of the application is to provide a voice processing method, a voice processing device and a voice processing medium for bone conduction equipment.
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings.
It should be noted that the bone conduction device in the present application may be a bone conduction headset, such as a True Wireless Stereo (TWS) headset, or other device including a bone conduction microphone, and the device for voice acquisition only includes the bone conduction microphone, without adding an additional air conduction microphone. It will be appreciated that the device may comprise, in addition to the bone conduction microphone mentioned above, a processor for processing the speech signal, and of course, depending on the particular type of device, a loudspeaker, a bluetooth module, etc.
The speech processing method of the bone conduction device mainly involves two parts: the first is a testing process and the second is an application process. In the testing process, a tester performs voice tests with a bone conduction microphone fixture and an air conduction microphone fixture to obtain the first test voice signal and the second test voice signal. To keep the test samples balanced, thousands of tests are typically required, and the testers generally include men and women of different ages, so that a more comprehensive set of test samples is obtained.
Fig. 1 is a flowchart of a speech processing method of a bone conduction device according to an embodiment of the present application. As illustrated in fig. 1, the method comprises:
S10: determining the target transfer function corresponding to the first test voice signal and the second test voice signal.
The first test voice signal and the second test voice signal are acquired by the bone conduction microphone and the air conduction microphone, respectively, in the same voice test environment. Voice signals collected by the two microphones are only comparable when they are collected in the same voice test environment. Specifically, the tester can wear the bone conduction microphone fixture, with the air conduction microphone fixture placed in the same silent test room. When the tester speaks, the bone conduction microphone and the air conduction microphone acquire the tester's voice simultaneously: the bone conduction microphone converts the bone vibration signal into an electrical signal, which corresponds to the first test voice signal in the above step, and the air conduction microphone converts the sound wave signal into an electrical signal, which corresponds to the second test voice signal in the above step. If both microphones were ideal, the first test voice signal and the second test voice signal would be identical. However, because the bone conduction microphone suffers severe attenuation in the high frequency band, the present application determines how the signal of the air conduction microphone can be used to compensate for this attenuation, that is, it determines the target transfer function corresponding to the first test voice signal and the second test voice signal.
It is to be understood that the meaning of the target transfer function is the correspondence between the first test signal and the second test signal. In a specific implementation, the target transfer function is usually predetermined, i.e. the function is used directly during the application process, without being tested in the field.
S11: acquiring an initial voice signal collected by the current bone conduction microphone.
S12: synthesizing the initial voice signal into a target voice signal according to the target transfer function.
It is understood that step S10 pertains to the step in the test procedure, and S11 and S12 pertain to the step in the application procedure. In S10, the target transfer function is obtained, that is, the corresponding relationship between the speech signal collected by the bone conduction microphone and the speech signal collected by the air conduction microphone is obtained, so that when the initial speech signal collected by the current bone conduction microphone is obtained in the actual application process, the initial speech signal can be synthesized into the target speech signal through the target transfer function. Obviously, the initial voice signal is attenuated in the high-frequency part, so that the attenuation can be compensated after the synthesis through the target transfer function, namely, the bandwidth of the bone conduction microphone is expanded, the obtained target voice signal is more accurate, and the advantages of private conversation and noise reduction are still kept in the low-frequency part.
For example, suppose the sentence spoken by the user is "the capital of China is Beijing". Because the initial voice signal acquired by the bone conduction microphone is distorted in the high-frequency part, the sentence output without compensation is "the capital of the middle part is Beijing". After synthesis with the target transfer function, the output sentence is "the capital of China is Beijing", so that a full-band voice signal can be acquired accurately using only the bone conduction microphone.
In the speech processing method of the bone conduction device provided in this embodiment, during testing, the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. During application, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a single bone conduction microphone suffices to collect a full-band voice signal, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and favors product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good generality and portability. Finally, the low-frequency part still retains the advantages of private conversation and noise reduction.
Fig. 2 is a schematic diagram of obtaining a target transfer function according to an embodiment of the present application. As shown in fig. 2, the horizontal axis represents the signal corresponding to pure air conduction speech, i.e. the second test voice signal $y(t)$, and the vertical axis represents the signal corresponding to bone conduction speech, i.e. the first test voice signal $x(t)$. The path transfer functions corresponding to the two signals are defined as $h_{AC}(t)$ and $h_{BC}(t)$ respectively, and the source speech signal is $e(t)$. The first test voice signal can then be regarded as being obtained from the second test voice signal through a transfer function $h(t)$, with the following relationships:
$x(t) = e(t) \cdot h_{BC}(t) = \frac{y(t)}{h_{AC}(t)} \cdot h_{BC}(t) = y(t) \cdot h(t)$;
therefore, $h(t) = \frac{x(t)}{y(t)} = \frac{h_{BC}(t)}{h_{AC}(t)}$;
after the voice signal collected by the bone conduction microphone is obtained, its relationship to the voice signal corresponding to the air conduction microphone can be expressed as $y(t) = x(t) / h(t)$, thereby realizing speech synthesis.
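The relation x = y·h and its inverse y = x/h can be illustrated with a minimal sketch. It assumes the relation is read per frequency bin of a discrete Fourier transform, which the text does not state, and it estimates h by direct division rather than by the trained model described in this embodiment. The function names `estimate_transfer` and `reconstruct_air` and the threshold `eps` are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex time-domain samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def estimate_transfer(x_bone, y_air, eps=1e-9):
    """Per-bin H[k] = X[k] / Y[k]; bins where Y is (almost) zero are set to 0."""
    X, Y = dft(x_bone), dft(y_air)
    return [xk / yk if abs(yk) > eps else 0j for xk, yk in zip(X, Y)]


def reconstruct_air(x_bone, H, eps=1e-9):
    """Per-bin Y[k] = X[k] / H[k]; bins with H close to 0 are left at zero."""
    X = dft(x_bone)
    Y = [xk / hk if abs(hk) > eps else 0j for xk, hk in zip(X, H)]
    return [value.real for value in idft(Y)]
```

In use, `estimate_transfer` would be called once on a paired bone conduction and air conduction recording from the test stage, and `reconstruct_air` would then be applied to later bone conduction input alone. Frequency bins that carried no energy in the test recording cannot be compensated by this sketch.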
In the theoretical analysis, the key point is how to obtain the transfer function h(t). In this embodiment, the transfer function is obtained by training, that is, by training a training model on training samples, for which an artificial neural network algorithm may be used. Further, the training model is a deep learning model. Deep learning models are known in the prior art and are not described further in this embodiment.
As a preferred embodiment, the determining the target transfer functions corresponding to the first test speech signal and the second test speech signal specifically includes:
obtaining a plurality of pairs of first test voice signals and second test voice signals in advance;
extracting characteristic parameters corresponding to a plurality of pairs of first test voice signals and second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;
inputting a plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;
verifying a plurality of pairs of characteristic parameters of the verification group by using the current transfer function;
and taking the current transfer function as a target transfer function under the condition that the obtained verification result meets the requirement.
To ensure the accuracy of the transfer function, a plurality of pairs of first test voice signals and second test voice signals are required. A pair of first and second test voice signals refers to the signals acquired by a bone conduction microphone and an air conduction microphone, respectively, in the same test environment and from the same sound source. To verify the accuracy of the transfer function, in this embodiment one part of the pairs is used as test samples (the training samples of the test group) and another part is used as verification samples. Preferably, the characteristic parameters include frequency, amplitude and sound pressure. The test samples and verification samples do not overlap, and both should be as balanced as possible in terms of frequency, amplitude, sound pressure, and so on.
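The split, train and verify steps can be sketched as follows. This is only an illustration under stated assumptions: each pair is reduced to per-band amplitude features, and a least-squares gain per band is swapped in for the deep learning model named in this embodiment. The 25% verification share, the 5% error limit and all function names are hypothetical.

```python
import random


def split_pairs(pairs, validation_fraction=0.25, seed=0):
    """Shuffle the (bone, air) feature pairs and split them into two disjoint groups."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]  # (test/training group, verification group)


def fit_band_gains(train_pairs):
    """Least-squares gain per band so that bone[band] is close to gain * air[band]."""
    n_bands = len(train_pairs[0][0])
    gains = []
    for band in range(n_bands):
        num = sum(bone[band] * air[band] for bone, air in train_pairs)
        den = sum(air[band] ** 2 for bone, air in train_pairs)
        gains.append(num / den if den else 0.0)
    return gains


def validation_error(gains, val_pairs):
    """Mean relative error when air amplitudes are predicted as bone / gain."""
    errors = []
    for bone, air in val_pairs:
        for g, b, a in zip(gains, bone, air):
            if g and a:
                errors.append(abs(b / g - a) / abs(a))
    return sum(errors) / len(errors) if errors else float("inf")


def determine_target_transfer(pairs, max_error=0.05):
    """Return the fitted gains if verification passes, otherwise None."""
    train, val = split_pairs(pairs)
    gains = fit_band_gains(train)
    return gains if validation_error(gains, val) <= max_error else None
```

Returning `None` when verification fails leaves the caller free to collect more pairs or retrain, which the text leaves open.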
In the above embodiment, after the target transfer function is obtained, it is not limited whether all signals in the initial speech signal are involved in synthesis or part of signals are involved in synthesis. In this embodiment, considering that the signals of the bone conduction microphone at the middle and low frequency parts are accurate and only the signals at the high frequency part are attenuated, on the basis of the above embodiment, synthesizing the initial speech signal into the target speech signal according to the target transfer function specifically includes:
acquiring a high-frequency voice signal and a middle-low frequency voice signal in an initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through a target transfer function;
and taking the medium-low frequency voice signal and the target high-frequency voice signal as target voice signals.
In a specific implementation, the high-frequency voice signal may be the part of the signal above 2 kHz, and the part of the signal below 2 kHz is treated as the mid-low frequency voice signal.
In this embodiment, only the high-frequency speech signal in the initial speech signal is synthesized, and the low-and-medium-frequency speech signal still maintains the original state, so that the amount of calculation is reduced, and the synthesis speed is increased.
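A minimal sketch of compensating only the high band is given below. It assumes the target transfer function is available as one real, positive gain per DFT bin (the hypothetical parameter `gains_by_bin`) and uses the 2 kHz boundary mentioned above. Bins at or below the cutoff are copied through unchanged.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex time-domain samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def compensate_high_band(x_bone, gains_by_bin, sample_rate, cutoff_hz=2000.0):
    """Divide bins above cutoff_hz by their gain; leave mid-low bins untouched."""
    n = len(x_bone)
    spectrum = dft(x_bone)
    out = []
    for k, xk in enumerate(spectrum):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz and gains_by_bin[k] > 0:
            out.append(xk / gains_by_bin[k])
        else:
            out.append(xk)
    return [value.real for value in idft(out)]
```

Because the mid-low bins bypass the division entirely, any gain values stored for them are ignored, which mirrors the statement that the mid-low frequency signal keeps its original state.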
As a preferred embodiment, the first test speech signal and the second test speech signal comprise a high frequency speech signal and a mid-low frequency speech signal.
In order to ensure the balance of the test sample and further ensure the accuracy of the target transfer function, in this embodiment, there is a certain requirement for the sound source signal, that is, the first test speech signal and the second test speech signal include a high-frequency speech signal and a middle-low frequency speech signal. It will be appreciated that the inclusion of only high frequency speech signals or only medium and low frequency speech signals may result in the target transfer function being accurate only in the high frequency signal portion or the medium and low frequency signal portion and ultimately in an inaccurate target transfer function.
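One way to screen a test recording for this requirement is sketched below. It assumes a 2 kHz band boundary and a hypothetical minimum energy share (`min_share`) for each band; neither the check nor the threshold is specified in the text.

```python
import cmath
import math


def band_energies(signal, sample_rate, cutoff_hz=2000.0):
    """Return (mid_low_energy, high_energy) summed over the positive-frequency DFT bins."""
    n = len(signal)
    low = high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energy = abs(coeff) ** 2
        if k * sample_rate / n > cutoff_hz:
            high += energy
        else:
            low += energy
    return low, high


def covers_both_bands(signal, sample_rate, cutoff_hz=2000.0, min_share=0.01):
    """True if both bands hold at least min_share of the total energy."""
    low, high = band_energies(signal, sample_rate, cutoff_hz)
    total = low + high
    return total > 0 and low / total >= min_share and high / total >= min_share
```

A recording that fails this check could be excluded from the test samples or re-recorded with a sound source that covers both bands.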
Fig. 3 is a flowchart of speech processing of another bone conduction device according to an embodiment of the present application. As shown in fig. 3, on the basis of the above embodiment, the method further includes:
S20: presetting a judgment rule;
S21: judging whether the target voice signal meets the judgment rule; if so, entering S22, otherwise entering S23;
S22: outputting the target voice signal;
S23: adjusting the target transfer function and returning to S12.
In this embodiment, to prevent the output voice signal from being distorted by an inaccurate target transfer function, after the target voice signal is obtained it is checked against conventional language logic, that is, against the judgment rule mentioned in this embodiment. If the target voice signal does not meet the judgment rule, the current target transfer function is considered inaccurate and is adjusted further until the obtained target voice signal meets the judgment rule. For example, if the voice text corresponding to the target voice signal does not match the entry "Beijing", which is the only matching candidate contained in the rule base, the target voice signal does not satisfy the judgment rule and the target transfer function needs to be adjusted.
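The loop formed by S12 and S21 to S23 can be sketched as below. The synthesis step, the judgment rule and the adjustment step are passed in as callables because the text does not fix them. The `max_rounds` limit is an added assumption to keep the loop from running forever, whereas the text simply repeats until the rule is met.

```python
def synthesize_with_check(initial, transfer, synthesize, satisfies_rule, adjust,
                          max_rounds=5):
    """Repeat synthesis (S12) until the judgment rule (S21) is met.

    Returns (target_signal, transfer_used). Raises RuntimeError if the rule is
    still not met after max_rounds attempts (a hypothetical safeguard).
    """
    for _ in range(max_rounds):
        target = synthesize(initial, transfer)   # S12
        if satisfies_rule(target):               # S21
            return target, transfer              # S22: ready to output
        transfer = adjust(transfer)              # S23, then back to S12
    raise RuntimeError("no transfer function satisfied the judgment rule")
```

As a toy usage, `synthesize` could divide each sample by a scalar gain, `satisfies_rule` could require a peak amplitude of at most 1.0, and `adjust` could double the gain. A real rule would operate on recognized text, as in the example above.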
In order to make the voice processing process of the bone conduction device provided by the present invention more clear to those skilled in the art, a specific application scenario is given below for explanation, and a TWS headset is specifically taken as an example. Firstly, after a large number of tests are carried out in the test stage, a target transfer function is obtained, then the function is written into a memory of the TWS earphone, and meanwhile, a corresponding algorithm is written into the memory. The user wears the TWS earphone to carry out voice call, the MCU of the TWS earphone calls a target transfer function after acquiring an initial voice signal acquired by the bone conduction microphone, a high-frequency voice signal is synthesized by the target transfer function, then the middle-low frequency voice signal and the synthesized signal are output to the Bluetooth module as a final target voice signal, the Bluetooth module receives the signal and transmits the signal to the Bluetooth module of another call terminal, the processor of the other terminal analyzes the signal, and finally the corresponding signal is output to the loudspeaker, so that the other user can listen.
In the foregoing embodiments, the speech processing method of the bone conduction device is described in detail, and the present application also provides embodiments corresponding to the speech processing apparatus of the bone conduction device. It should be noted that the present application describes the embodiments of the apparatus portion from two perspectives, one from the perspective of the function module and the other from the perspective of the hardware.
Fig. 4 is a structural diagram of a speech processing apparatus of a bone conduction device according to an embodiment of the present application. As shown in fig. 4, the apparatus includes:
a determining module 10, configured to determine target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone respectively in the same voice test environment;
the acquiring module 11 is configured to acquire an initial voice signal acquired by a current bone conduction microphone;
and the synthesis module 12 is configured to synthesize the initial speech signal into a target speech signal according to the target transfer function.
Preferably, the apparatus further includes:
the setting module is used for presetting a judgment rule;
the judging module is used for judging whether the target voice signal meets a judging rule after the target voice signal is obtained, if so, the output module is triggered, and otherwise, the adjusting module is triggered;
the output module is used for outputting the target voice signal;
and an adjusting module for adjusting the target transfer function and triggering the synthesizing module 12.
Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.
In the speech processing apparatus of the bone conduction device provided in this embodiment, during testing, the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. During application, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a single bone conduction microphone suffices to collect a full-band voice signal, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and favors product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good generality and portability. Finally, the low-frequency part still retains the advantages of private conversation and noise reduction.
Fig. 5 is a block diagram of another speech processing apparatus of a bone conduction device according to an embodiment of the present application. As shown in fig. 5, the apparatus comprises a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the speech processing method of the bone conduction device as in the above embodiments when executing the computer program.
The speech processing device of the bone conduction device provided by the embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.
The processor 21 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 21 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 21 may also include a main processor and a coprocessor, where the main processor, also called a Central Processing Unit (CPU), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 21 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201, wherein after being loaded and executed by the processor 21, the computer program can implement the relevant steps of the voice processing method of the bone conduction device disclosed in any one of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202, data 203, and the like, and the storage manner may be a transient storage manner or a permanent storage manner. Operating system 202 may include, among others, Windows, Unix, Linux, and the like. Data 203 may include, but is not limited to, data required by the target transfer function, and the like.
In some embodiments, the speech processing device of the bone conduction apparatus may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the configuration shown in fig. 5 does not limit the speech processing apparatus of the bone conduction device, which may include more or fewer components than those shown.
The speech processing apparatus of the bone conduction device provided by the embodiment of the application comprises a memory and a processor, and when the processor executes a program stored in the memory, the following method can be realized: during the test process, signals are collected by a bone conduction microphone and an air conduction microphone in the same speech test environment to obtain a first test speech signal and a second test speech signal, and the target transfer function corresponding to the two signals is then determined; during the application process, the current bone conduction microphone acquires an initial speech signal, which is then synthesized into a target speech signal according to the target transfer function. With this technical solution, a single bone conduction microphone suffices to collect a full-band speech signal, and no additional microphone is needed to compensate for high-frequency attenuation; this reduces hardware cost and the internal space occupied in the bone conduction device, which favors product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can conveniently be written into the algorithm logic of the bone conduction device, giving good generality and portability. Finally, the low-frequency part retains the advantages of private conversation and noise reduction.
Finally, the application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps as set forth in the above-mentioned method embodiments.
It is to be understood that if the method in the above embodiments is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in whole or in part, may be embodied in the form of a software product that is stored in a storage medium and, when run, performs all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The foregoing describes in detail the speech processing method, apparatus, and medium of the bone conduction device provided in the present application. The embodiments in the specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method of speech processing for a bone conduction device, comprising:
determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone respectively in the same voice test environment;
acquiring an initial voice signal acquired by a current bone conduction microphone;
and synthesizing the initial voice signal into a target voice signal according to the target transfer function.
2. The speech processing method of bone conduction equipment according to claim 1, wherein the determining the target transfer functions corresponding to the first test speech signal and the second test speech signal specifically comprises:
obtaining a plurality of pairs of the first test voice signal and the second test voice signal in advance;
extracting characteristic parameters corresponding to a plurality of pairs of the first test voice signals and the second test voice signals, and dividing the plurality of pairs of the characteristic parameters into a test group and a verification group;
inputting a plurality of pairs of characteristic parameters of the test group into a preset training model as training samples to obtain a current transfer function;
verifying a plurality of pairs of the characteristic parameters of the verification group by using a current transfer function;
and taking the current transfer function as the target transfer function under the condition that the obtained verification result meets the requirement.
3. The speech processing method of a bone conduction device according to claim 1, wherein the synthesizing the initial speech signal into a target speech signal according to the target transfer function specifically comprises:
acquiring a high-frequency voice signal and a middle-low frequency voice signal in the initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through the target transfer function;
and taking the medium-low frequency voice signal and the target high-frequency voice signal as the target voice signal.
4. The method of speech processing for a bone conduction device of claim 2, wherein the training model is a deep learning model.
5. The speech processing method of a bone conduction device according to claim 2, wherein the characteristic parameters include frequency, amplitude, and sound pressure.
6. The method of speech processing of a bone conduction device according to any one of claims 1-5, wherein the first test speech signal and the second test speech signal comprise a high frequency speech signal and a mid-low frequency speech signal.
7. The speech processing method of a bone conduction device according to claim 1, further comprising:
presetting a judgment rule;
judging whether the target voice signal meets the judgment rule or not after the target voice signal is obtained;
if yes, outputting the target voice signal;
if not, adjusting the target transfer function, and returning to the step of synthesizing the initial voice signal into the target voice signal according to the target transfer function.
8. A speech processing apparatus of a bone conduction device, comprising:
the determining module is used for determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone respectively in the same voice test environment;
the acquisition module is used for acquiring an initial voice signal acquired by a current bone conduction microphone;
and the synthesis module is used for synthesizing the initial voice signal into a target voice signal according to the target transfer function.
9. A speech processing apparatus of a bone conduction device, comprising a memory for storing a computer program;
a processor for implementing the steps of the speech processing method of the bone conduction device according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of speech processing of a bone conduction device according to any one of claims 1 to 7.
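The procedure of claim 2 (split the feature pairs into a test group and a verification group, fit a current transfer function on the first, and accept it only if verification succeeds) might be sketched as follows. This is a minimal sketch under stated assumptions: each pair is reduced to one scalar feature per microphone, the "training model" is replaced by a least-squares scalar gain, and the names `split_pairs`, `fit_gain`, `verify`, `select_target`, `train_fraction`, and `tol` are hypothetical. The claims do not specify the model, the split ratio, or the acceptance criterion.

```python
import random
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[float, float]  # (bone conduction feature, air conduction feature)


def split_pairs(pairs: Sequence[Pair], train_fraction: float = 0.8,
                seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle the pairs and split them into a test group and a verification group."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_gain(train: Sequence[Pair]) -> float:
    """Least-squares scalar gain g minimizing sum((g * bone - air)^2)."""
    numerator = sum(bone * air for bone, air in train)
    denominator = sum(bone * bone for bone, _ in train)
    return numerator / denominator


def verify(gain: float, held_out: Sequence[Pair], tol: float) -> bool:
    """Accept the gain if the mean squared error on the verification group is within tol."""
    mse = sum((gain * bone - air) ** 2 for bone, air in held_out) / len(held_out)
    return mse <= tol


def select_target(pairs: Sequence[Pair], tol: float) -> Optional[float]:
    """Return the fitted gain as the target if verification passes, else None."""
    train, held_out = split_pairs(pairs)
    gain = fit_gain(train)
    return gain if verify(gain, held_out, tol) else None
```

Returning `None` on failed verification leaves the next step (for example, retraining with more pairs) to the caller, since the claim only states what happens when the verification result meets the requirement.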
CN202010954775.8A 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment Active CN112017687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010954775.8A CN112017687B (en) 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010954775.8A CN112017687B (en) 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment

Publications (2)

Publication Number Publication Date
CN112017687A true CN112017687A (en) 2020-12-01
CN112017687B CN112017687B (en) 2024-03-29

Family

ID=73521746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010954775.8A Active CN112017687B (en) 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment

Country Status (1)

Country Link
CN (1) CN112017687B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767963A (en) * 2021-01-28 2021-05-07 歌尔科技有限公司 Voice enhancement method, device and system and computer readable storage medium
CN113314134A (en) * 2021-05-11 2021-08-27 紫光展锐(重庆)科技有限公司 Bone conduction signal compensation method and device
CN114697814A (en) * 2022-02-24 2022-07-01 深圳市佳骏兴科技有限公司 Bone conduction communication assembly, bone conduction earphone and control method and control device thereof
WO2023104122A1 (en) * 2020-12-08 2023-06-15 Shanghai Pedawise Intelligent Technology Co., Ltd Methods for clear call under noisy conditions
WO2024000854A1 (en) * 2022-06-30 2024-01-04 歌尔科技有限公司 Speech denoising method and apparatus, and device and computer-readable storage medium
WO2024050802A1 (en) * 2022-09-09 2024-03-14 华为技术有限公司 Speech signal processing method, neural network training method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007003702A (en) * 2005-06-22 2007-01-11 Ntt Docomo Inc Noise eliminator, communication terminal, and noise eliminating method
CN102368384A (en) * 2011-10-19 2012-03-07 福建联迪商用设备有限公司 Voice module test method and voice module test device
CN106686494A (en) * 2016-12-27 2017-05-17 广东小天才科技有限公司 Voice input control method of wearable equipment and the wearable equipment
US20180268798A1 (en) * 2017-03-15 2018-09-20 Synaptics Incorporated Two channel headset-based own voice enhancement
US10535364B1 (en) * 2016-09-08 2020-01-14 Amazon Technologies, Inc. Voice activity detection using air conduction and bone conduction microphones
CN110931031A (en) * 2019-10-09 2020-03-27 大象声科(深圳)科技有限公司 Deep learning voice extraction and noise reduction method fusing bone vibration sensor and microphone signals

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007003702A (en) * 2005-06-22 2007-01-11 Ntt Docomo Inc Noise eliminator, communication terminal, and noise eliminating method
CN102368384A (en) * 2011-10-19 2012-03-07 福建联迪商用设备有限公司 Voice module test method and voice module test device
US10535364B1 (en) * 2016-09-08 2020-01-14 Amazon Technologies, Inc. Voice activity detection using air conduction and bone conduction microphones
CN106686494A (en) * 2016-12-27 2017-05-17 广东小天才科技有限公司 Voice input control method of wearable equipment and the wearable equipment
US20180268798A1 (en) * 2017-03-15 2018-09-20 Synaptics Incorporated Two channel headset-based own voice enhancement
CN110931031A (en) * 2019-10-09 2020-03-27 大象声科(深圳)科技有限公司 Deep learning voice extraction and noise reduction method fusing bone vibration sensor and microphone signals

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023104122A1 (en) * 2020-12-08 2023-06-15 Shanghai Pedawise Intelligent Technology Co., Ltd Methods for clear call under noisy conditions
CN112767963A (en) * 2021-01-28 2021-05-07 歌尔科技有限公司 Voice enhancement method, device and system and computer readable storage medium
CN113314134A (en) * 2021-05-11 2021-08-27 紫光展锐(重庆)科技有限公司 Bone conduction signal compensation method and device
CN113314134B (en) * 2021-05-11 2022-11-11 紫光展锐(重庆)科技有限公司 Bone conduction signal compensation method and device
CN114697814A (en) * 2022-02-24 2022-07-01 深圳市佳骏兴科技有限公司 Bone conduction communication assembly, bone conduction earphone and control method and control device thereof
WO2024000854A1 (en) * 2022-06-30 2024-01-04 歌尔科技有限公司 Speech denoising method and apparatus, and device and computer-readable storage medium
WO2024050802A1 (en) * 2022-09-09 2024-03-14 华为技术有限公司 Speech signal processing method, neural network training method and device

Also Published As

Publication number Publication date
CN112017687B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN112017687B (en) Voice processing method, device and medium of bone conduction equipment
CN107360530B (en) Echo cancellation testing method and device
CN110505332A (en) A kind of noise-reduction method, device, mobile terminal and storage medium
CN105530565A (en) Automatic sound equalization device
CN105723459B (en) For improving the device and method of the perception of sound signal
CN108922558B (en) Voice processing method, voice processing device and mobile terminal
CN112954563B (en) Signal processing method, electronic device, apparatus, and storage medium
WO2022174727A1 (en) Howling suppression method and apparatus, hearing aid, and storage medium
CN110956973A (en) Echo cancellation method and device and intelligent terminal
US20240163623A1 (en) In-sync digital waveform comparison to determine pass/fail results of a device under test (dut)
CN110931019B (en) Public security voice data acquisition method, device, equipment and computer storage medium
WO2021081333A1 (en) Measuring and evaluating a test signal generated by a device under test (dut)
CN115604628A (en) Filter calibration method and device based on earphone loudspeaker frequency response
US20230276165A1 (en) Audio signal processing method, terminal device and storage medium
CN111800699B (en) Volume adjustment prompting method and device, earphone equipment and storage medium
CN103581934A (en) Terminal voice quality evaluation method and terminal
CN115604630A (en) Sound field expansion method, audio apparatus, and computer-readable storage medium
WO2020073564A1 (en) Method and apparatus for detecting loudness of audio signal
US11445324B2 (en) Audio rendering method and apparatus
JP2003500701A (en) Real-time quality analyzer for voice and audio signals
CN112741622B (en) Audiometric system, audiometric method, audiometric device, earphone and terminal equipment
CN113517000A (en) Echo cancellation test method, terminal and storage device
CN108932953B (en) Audio equalization function determination method, audio equalization method and equipment
CN115376501B (en) Voice enhancement method and device, storage medium and electronic equipment
CN111048107B (en) Audio processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant