CN112017687B - Voice processing method, device and medium of bone conduction equipment - Google Patents

Publication number
CN112017687B
Authority
CN
China
Prior art keywords
voice signal
test
bone conduction
transfer function
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010954775.8A
Other languages
Chinese (zh)
Other versions
CN112017687A (en
Inventor
朱宗霞
安康
吴劼
舒开发
韩菲菲
杨征
李钉云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Goertek Techology Co Ltd
Original Assignee
Goertek Techology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Goertek Techology Co Ltd
Priority to CN202010954775.8A
Publication of CN112017687A
Application granted
Publication of CN112017687B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Details Of Audible-Bandwidth Transducers (AREA)

Abstract

The application discloses a voice processing method, apparatus, and medium for a bone conduction device. In the method, a bone conduction microphone and an air conduction microphone acquire a first test voice signal and a second test voice signal, respectively, in the same voice test environment, and a corresponding target transfer function is determined; after an initial voice signal is acquired, it is synthesized into a target voice signal according to the target transfer function. With this technical solution, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of call privacy and noise reduction.

Description

Voice processing method, device and medium of bone conduction equipment
Technical Field
The present disclosure relates to the field of bone conduction technologies, and in particular, to a method, an apparatus, and a medium for processing speech of bone conduction devices.
Background
A conventional air conduction microphone collects acoustic wave signals propagated through air and converts them into electrical signals. Because the air conduction microphone relies on air as the propagation medium, noise in the air is easily picked up together with the wanted sound, causing noise pollution and a poor listening experience for the receiving party. Against this background, with the development of bone conduction technology, more and more bone conduction devices that include a bone conduction microphone, such as bone conduction headphones, have emerged. The bone conduction microphone picks up sound during a call and thus plays the same role as a conventional air conduction microphone, but it works on a different principle: speaking causes vibration, and the microphone directly collects the vibration signals transmitted through the bones and converts them into electrical signals. Compared with a conventional air conduction microphone, the bone conduction microphone therefore has a better noise reduction effect.
However, during sound pickup the cut-off frequency of the bone conduction microphone is 2 kHz, so a bone conduction microphone used alone as the voice acquisition device suffers from attenuation in the high-frequency part. In the prior art, to overcome this problem, the bone conduction microphone is usually used together with a conventional air conduction microphone, so two voice acquisition devices are needed in the device, which increases cost, occupies more space, and hinders product miniaturization.
It can be seen that how to improve the speech collection effect of bone conduction devices without increasing the cost and space is a difficult problem for those skilled in the art.
Disclosure of Invention
The purpose of the present application is to provide a voice processing method, apparatus, and medium for a bone conduction device, in which only a bone conduction microphone serves as the voice acquisition device and a corresponding processing algorithm improves the voice quality of the bone conduction device, without additional hardware or additional space.
In order to solve the above technical problems, the present application provides a method for processing speech of a bone conduction device, including:
determining a target transfer function corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;
acquiring an initial voice signal acquired by a current bone conduction microphone;
and synthesizing the initial voice signal into a target voice signal according to the target transfer function.
Preferably, the determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:
pre-acquiring a plurality of pairs of the first test voice signal and the second test voice signal;
extracting a plurality of pairs of characteristic parameters corresponding to the first test voice signals and the second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;
inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;
verifying a plurality of pairs of the characteristic parameters of the verification group by using a current transfer function;
and under the condition that the obtained verification result meets the requirement, taking the current transfer function as the target transfer function.
Preferably, the synthesizing the initial speech signal into the target speech signal according to the target transfer function specifically includes:
acquiring a high-frequency voice signal and a middle-low frequency voice signal in the initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through the target transfer function;
and taking the middle-low frequency voice signal and the target high frequency voice signal as the target voice signal.
Preferably, the training model is a deep learning model.
Preferably, the characteristic parameters include frequency, amplitude and sound pressure.
Preferably, the first test voice signal and the second test voice signal include a high frequency voice signal and a medium and low frequency voice signal.
Preferably, the method further comprises:
presetting a judgment rule;
judging whether the target voice signal meets the judging rule after the target voice signal is obtained;
if yes, outputting the target voice signal;
if not, adjusting the target transfer function and returning to the step of synthesizing the initial voice signal into a target voice signal according to the target transfer function.
In order to solve the above technical problem, the present application further provides a speech processing device of a bone conduction apparatus, including:
the determining module is used for determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;
the acquisition module is used for acquiring an initial voice signal acquired by the current bone conduction microphone;
and the synthesis module is used for synthesizing the initial voice signal into a target voice signal according to the target transfer function.
In order to solve the technical problem, the application also provides a voice processing device of the bone conduction equipment, which comprises a memory for storing a computer program;
a processor for implementing the steps of the speech processing method of the bone conduction apparatus as described when executing the computer program.
To solve the above technical problem, the present application further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the speech processing method of the bone conduction apparatus as described.
According to the voice processing method of the bone conduction device provided by the present application, in the test process the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined; in the application process, the current bone conduction microphone collects an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of call privacy and noise reduction.
The voice processing device and the medium of the bone conduction equipment provided by the application correspond to the method, and the effects are the same as the above.
Drawings
For a clearer description of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a voice processing method of a bone conduction device according to an embodiment of the present application;
Fig. 2 is a schematic diagram of obtaining a target transfer function according to an embodiment of the present application;
Fig. 3 is a flowchart of another voice processing method of a bone conduction device according to an embodiment of the present application;
Fig. 4 is a block diagram of a voice processing apparatus of a bone conduction device according to an embodiment of the present application;
Fig. 5 is a block diagram of another voice processing apparatus of a bone conduction device according to an embodiment of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments herein without making any inventive effort are intended to fall within the scope of the present application.
The core of the application is to provide a voice processing method, a device and a medium of bone conduction equipment.
In order that those skilled in the art may better understand the present application, it is described in further detail below with reference to the drawings and specific embodiments.
It should be noted that the bone conduction device in the present application may be a bone conduction earphone, for example a True Wireless Stereo (TWS) earphone, or another device that includes a bone conduction microphone, in which the only component used for voice capture is the bone conduction microphone and no air conduction microphone is added. It will be appreciated that, in addition to the bone conduction microphone, the device includes a processor for processing the voice signal and, depending on the particular type of device, may also include a loudspeaker, a Bluetooth module, and so on.
The voice processing method of the bone conduction device in the present application involves two main parts: the first is the test process and the second is the application process. In the test process, a tester performs voice tests with a bone conduction microphone fixture and an air conduction microphone fixture to obtain the first test voice signal and the second test voice signal. Thousands of tests are needed to keep the test samples uniform, and the testers usually include men and women, old and young, so that the test samples are more comprehensive.
Fig. 1 is a flowchart of a method for processing speech of a bone conduction device according to an embodiment of the present application. As shown in fig. 1, the method includes:
S10: determining a target transfer function corresponding to the first test voice signal and the second test voice signal.
The first test voice signal and the second test voice signal are acquired by the bone conduction microphone and the air conduction microphone, respectively, in the same voice test environment. Only voice signals collected in the same voice test environment can be meaningfully compared. Specifically, the tester wears the bone conduction microphone fixture, the test is carried out with the air conduction microphone fixture in a quiet test room, and the tester speaks, so that the bone conduction microphone and the air conduction microphone capture the tester's voice simultaneously. The bone conduction microphone converts the bone vibration signal into an electrical signal, which corresponds to the first test voice signal in the above step, and the air conduction microphone converts the sound wave signal into an electrical signal, which corresponds to the second test voice signal in the above step. It will be appreciated that if both microphones were ideal, the first test voice signal and the second test voice signal would be identical. However, because the bone conduction microphone suffers severe attenuation in the high-frequency band, it is necessary to determine how the signal of the air conduction microphone can be used to compensate for that attenuation, that is, to determine the target transfer function corresponding to the first test voice signal and the second test voice signal.
It will be appreciated that the meaning of the target transfer function is the correspondence between the first test signal and the second test signal. In practice, the target transfer function is usually predetermined, i.e. it is used directly during the application process, without the need for in-situ testing.
S11: an initial speech signal acquired by a current bone conduction microphone is acquired.
S12: the initial speech signal is synthesized into a target speech signal according to a target transfer function.
It will be appreciated that step S10 belongs to the test process, while S11 and S12 belong to the application process. The target transfer function, that is, the correspondence between the voice signal collected by the bone conduction microphone and the voice signal collected by the air conduction microphone, has already been obtained in S10, so when the initial voice signal collected by the current bone conduction microphone is obtained in actual use, it can be synthesized into the target voice signal through the target transfer function. The initial voice signal is attenuated in the high-frequency part, so synthesis with the target transfer function compensates for this attenuation, which is equivalent to extending the bandwidth of the bone conduction microphone; the resulting target voice signal is more accurate, while the low-frequency part still retains the advantages of call privacy and noise reduction.
For example, suppose the sentence spoken by the user is "the capital of China is Beijing city". The initial voice signal collected by the bone conduction microphone is distorted in the high-frequency part, so the output sentence is "the capital of middle is Beijing city"; after synthesis with the target transfer function, the output sentence is "the capital of China is Beijing city". In this way, a full-band voice signal can be accurately collected using only the bone conduction microphone.
In the voice processing method of the bone conduction device provided by this embodiment, in the test process the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined; in the application process, the current bone conduction microphone collects an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of call privacy and noise reduction.
Fig. 2 is a schematic diagram of obtaining a target transfer function according to an embodiment of the present application. As shown in Fig. 2, the horizontal axis represents the signal corresponding to pure air conduction voice, namely the second test voice signal $y(t)$, and the vertical axis represents the signal corresponding to bone conduction voice, namely the first test voice signal $x(t)$. The path transfer functions corresponding to the two signals are defined as $h_{AC}(t)$ and $h_{BC}(t)$, respectively, and the speech signal is $e(t)$. The first test voice signal can be regarded as obtained from the second test voice signal through a transfer function $h(t)$, with the following relation:
$x(t) = e(t) * h_{BC}(t) = y(t) / h_{AC}(t) * h_{BC}(t) = y(t) * h(t)$;
thus, $h(t) = x(t) / y(t)$, which by the relation above equals $h_{BC}(t) / h_{AC}(t)$;
when the voice signal collected by the bone conduction microphone is obtained, the relationship between this signal and the voice signal corresponding to the air conduction microphone can be expressed as $y(t) = x(t) / h(t)$, thereby realizing speech synthesis.
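The relation between the two signals can be illustrated with a minimal sketch. It takes x = y * h as the defining relation and assumes that the product and quotient are applied per frequency bin, so that H(f) = X(f) / Y(f) in the test phase and Y(f) = X(f) / H(f) in the application phase. The frame length `N`, the helper names, the two-tone test signals, and the fallback of H = 1 for bins without air conduction energy are all hypothetical choices made for illustration; the patent itself obtains the transfer function by training a model.

```python
import cmath
import math

N = 64  # samples per frame (hypothetical)

def dft(signal):
    """Naive O(N^2) discrete Fourier transform; adequate for a short demo."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def estimate_transfer(bone, air, eps=1e-9):
    """Per-bin H = X / Y from one test pair; bins with no air energy get H = 1."""
    return [x / y if abs(y) > eps else 1.0 for x, y in zip(dft(bone), dft(air))]

def reconstruct_air(bone, transfer, eps=1e-9):
    """Apply Y = X / H per bin and return the time-domain estimate."""
    return idft([x / h if abs(h) > eps else x for x, h in zip(dft(bone), transfer)])

def tone(k, amp):
    """Sine wave that falls exactly on DFT bin k."""
    return [amp * math.sin(2 * math.pi * k * t / N) for t in range(N)]

def mix(a, b):
    return [p + q for p, q in zip(a, b)]

# Test phase: the bone conduction path attenuates the high tone (bin 10) to 20%.
air_test = mix(tone(2, 1.0), tone(10, 1.0))
bone_test = mix(tone(2, 1.0), tone(10, 0.2))
H = estimate_transfer(bone_test, air_test)

# Application phase: only the bone conduction signal is available.
bone_new = mix(tone(2, 0.5), tone(10, 0.16))
air_expected = mix(tone(2, 0.5), tone(10, 0.8))
air_estimate = reconstruct_air(bone_new, H)
max_err = max(abs(a - b) for a, b in zip(air_estimate, air_expected))
print(f"max reconstruction error: {max_err:.2e}")
```

A single clean test pair is enough here only because the synthetic attenuation is the same in both phases; with real recordings the estimate would have to be built from many pairs.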
From the above theoretical analysis, the key point is how to obtain the transfer function $h(t)$. In this embodiment, the transfer function is obtained through training, that is, from training samples and a training model, and an artificial neural network algorithm may be used. Further, the training model is a deep learning model. For the deep learning model, reference may be made to the prior art, and details are not repeated in this embodiment.
As a preferred embodiment, determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:
a plurality of pairs of first test voice signals and second test voice signals are obtained in advance;
extracting characteristic parameters corresponding to a plurality of pairs of first test voice signals and second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;
inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;
verifying the multiple pairs of characteristic parameters of the verification group by using the current transfer function;
and taking the current transfer function as a target transfer function under the condition that the obtained verification result meets the requirement.
To ensure the accuracy of the transfer function, a plurality of pairs of first and second test voice signals are required. A pair of first and second test voice signals refers to the signals acquired by a bone conduction microphone and an air conduction microphone, respectively, in the same test environment. To verify the accuracy of the transfer function, in this embodiment some of the pairs are used as test samples and the rest as verification samples. Preferably, the characteristic parameters include frequency, amplitude, and sound pressure. It will be appreciated that the test samples and the verification samples do not overlap, and that both should be as balanced as possible in terms of frequency, amplitude, sound pressure, and so on.
The above embodiment does not limit whether, once the target transfer function has been obtained, all of the initial voice signal or only part of it takes part in the synthesis. In this embodiment, considering that the bone conduction microphone's signal is accurate in the mid-low frequency part and attenuated only in the high-frequency part, synthesizing the initial voice signal into the target voice signal according to the target transfer function specifically includes, on the basis of the above embodiment:
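The split into a test group and a verification group, followed by training and verification, can be sketched as follows. This is only an illustration under stated assumptions: the patent uses a deep learning model, whereas the sketch swaps in a single least-squares gain as a stand-in, uses amplitude as the only characteristic parameter, and the function names, the 25% verification fraction, and the 5% tolerance are hypothetical.

```python
import random

def split_pairs(pairs, verify_fraction=0.25, seed=0):
    """Shuffle the pairs and split them into non-overlapping groups."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_verify = max(1, int(len(shuffled) * verify_fraction))
    return shuffled[n_verify:], shuffled[:n_verify]

def fit_gain(test_group):
    """Least-squares gain g minimizing sum((air - g * bone)^2).

    Stand-in for the preset training model; not the patent's deep model.
    """
    numerator = sum(bone * air for bone, air in test_group)
    denominator = sum(bone * bone for bone, _ in test_group)
    return numerator / denominator

def verify_gain(gain, verification_group, tolerance=0.05):
    """Accept the gain if the mean relative error stays within tolerance."""
    errors = [abs(gain * bone - air) / abs(air) for bone, air in verification_group]
    return sum(errors) / len(errors) <= tolerance

# Synthetic (bone amplitude, air amplitude) pairs: air is 5x the bone amplitude.
pairs = [(0.1 * i, 0.5 * i) for i in range(1, 21)]
test_group, verification_group = split_pairs(pairs)
current_gain = fit_gain(test_group)
accepted = verify_gain(current_gain, verification_group)
print(len(test_group), len(verification_group), round(current_gain, 6), accepted)
```

If verification fails, the current transfer function would not be taken as the target transfer function; how training then continues is not specified in the text and is left out of the sketch.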
acquiring a high-frequency voice signal and a middle-low frequency voice signal in an initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through a target transfer function;
the middle-low frequency voice signal and the target high frequency voice signal are used as target voice signals.
In a specific implementation, the high-frequency voice signal may be the signal above 2 kHz, and all signals below 2 kHz are mid-low frequency voice signals.
In this embodiment, only the high-frequency speech signal in the initial speech signal is synthesized, and the middle-low frequency speech signal still maintains the original state, thereby reducing the operation amount and improving the synthesis speed.
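A minimal sketch of this band-split synthesis is given below. It assumes, purely for illustration, that the signal is represented as a mapping from frequency in Hz to amplitude, that a component at exactly 2 kHz counts as mid-low frequency, and that the target transfer function is available as a callable; the names `split_bands`, `synthesize`, and `example_transfer` and the gain of 4 are hypothetical.

```python
CUTOFF_HZ = 2000.0  # boundary between mid-low and high frequency

def split_bands(spectrum, cutoff_hz=CUTOFF_HZ):
    """Split {frequency_hz: amplitude} into mid-low and high frequency parts.

    Assumption: a component exactly at the cutoff is treated as mid-low.
    """
    low = {f: a for f, a in spectrum.items() if f <= cutoff_hz}
    high = {f: a for f, a in spectrum.items() if f > cutoff_hz}
    return low, high

def synthesize(spectrum, transfer, cutoff_hz=CUTOFF_HZ):
    """Convert only the high band; pass the mid-low band through unchanged."""
    low, high = split_bands(spectrum, cutoff_hz)
    target_high = {f: transfer(f, a) for f, a in high.items()}
    return {**low, **target_high}

def example_transfer(frequency_hz, amplitude):
    """Hypothetical target transfer function: a flat gain of 4 above 2 kHz."""
    return amplitude * 4.0

initial = {500.0: 1.0, 1500.0: 0.8, 3000.0: 0.1, 4000.0: 0.05}
target = synthesize(initial, example_transfer)
print(target)
```

Because the transfer function is called only for components above the cutoff, the amount of computation scales with the size of the high band rather than with the full signal.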
As a preferred embodiment, the first test speech signal and the second test speech signal comprise a high frequency speech signal and a medium and low frequency speech signal.
In order to ensure the equalization of the test sample and further ensure the accuracy of the target transfer function, in this embodiment, there is a certain requirement for the sound source signal, that is, the first test voice signal and the second test voice signal include a high-frequency voice signal and a middle-low frequency voice signal. It will be appreciated that if only high frequency speech signals or mid-low frequency speech signals are included, this may result in the target transfer function being accurate only in the high frequency signal portion or mid-low frequency signal portion, which ultimately results in the target transfer function being inaccurate.
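One way to check that a test signal contains both bands is to compare the energy on each side of the 2 kHz boundary. The sketch below is an assumption-laden illustration: the {frequency: amplitude} representation, the function names, and the 5% minimum energy fraction are hypothetical and are not taken from the patent.

```python
def band_energy_fractions(spectrum, cutoff_hz=2000.0):
    """Return (mid-low fraction, high fraction) of the total energy."""
    total = sum(a * a for a in spectrum.values())
    if total == 0:
        return 0.0, 0.0
    high = sum(a * a for f, a in spectrum.items() if f > cutoff_hz)
    return (total - high) / total, high / total

def covers_both_bands(spectrum, min_fraction=0.05, cutoff_hz=2000.0):
    """True if each band carries at least min_fraction of the energy."""
    low_fraction, high_fraction = band_energy_fractions(spectrum, cutoff_hz)
    return low_fraction >= min_fraction and high_fraction >= min_fraction

full_band = {500.0: 1.0, 3000.0: 0.5}      # energy in both bands
low_only = {500.0: 1.0, 1500.0: 0.7}       # no energy above 2 kHz
print(covers_both_bands(full_band), covers_both_bands(low_only))
```

A pair that fails such a check could be excluded from the test samples before training.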
Fig. 3 is a flowchart of another voice processing method of a bone conduction device according to an embodiment of the present application. As shown in Fig. 3, on the basis of the above embodiment, the method further includes:
S20: presetting a judgment rule;
S21: judging whether the target voice signal meets the judgment rule; if so, proceeding to S22, otherwise proceeding to S23;
S22: outputting the target voice signal;
S23: adjusting the target transfer function and returning to S12.
In this embodiment, to prevent the output voice signal from being distorted by an inaccurate target transfer function, after the target voice signal is obtained it is judged whether the voice conforms to conventional logic, that is, to the judgment rule mentioned in this embodiment. If it does not, the current target transfer function is considered inaccurate and is adjusted further until the resulting target voice signal meets the judgment rule. For example, if the rule base contains only "Beijing city" and the voice text corresponding to the target voice signal does not match that entry, the target voice signal does not meet the judgment rule and the target transfer function needs to be adjusted further.
To make the voice processing procedure of the bone conduction device provided by the present application clearer to those skilled in the art, a specific application scenario is described below, taking a TWS earphone as an example. First, after a large number of tests in the test stage, the target transfer function is obtained; the function is then written into the memory of the TWS earphone together with the corresponding algorithm. When the user wears the TWS earphone for a voice call, the MCU of the TWS earphone obtains the initial voice signal collected by the bone conduction microphone, calls the target transfer function, and uses it to synthesize the high-frequency voice signal; it then outputs the mid-low frequency voice signal and the synthesized signal together as the final target voice signal to the Bluetooth module. The Bluetooth module transmits the signal to the Bluetooth module of the other party's terminal, whose processor parses the signal and finally outputs the corresponding signal to the speaker so that the other user can hear it.
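The judge-and-adjust loop of S21 to S23 can be sketched as follows. The sketch is illustrative only: the patent does not say how the transfer function is adjusted or how many rounds are allowed, so the doubling adjustment, the amplitude-based rule, the `max_rounds` limit, and all function names are hypothetical.

```python
def synthesize_with_check(initial, gain, synthesize, meets_rule, adjust, max_rounds=10):
    """Repeat S12 -> S21 until the judgment rule is met (S22) or rounds run out.

    max_rounds is an added safeguard; the source describes an open-ended loop.
    """
    for _ in range(max_rounds):
        target = synthesize(initial, gain)   # S12: synthesize the target signal
        if meets_rule(target):               # S21: check the judgment rule
            return target, gain              # S22: output the target signal
        gain = adjust(gain)                  # S23: adjust the transfer function
    raise RuntimeError("judgment rule not met within max_rounds")

def synthesize(signal, gain):
    """Hypothetical transfer function: scale only the high-frequency part."""
    return {"low": signal["low"], "high": signal["high"] * gain}

def meets_rule(target):
    """Hypothetical rule: the high band must reach at least 0.75."""
    return target["high"] >= 0.75

def adjust(gain):
    """Hypothetical adjustment: double the gain each round."""
    return gain * 2.0

initial = {"low": 1.0, "high": 0.1}
target, final_gain = synthesize_with_check(initial, 2.0, synthesize, meets_rule, adjust)
print(target, final_gain)
```

In the patent's example the rule is checked against recognized text in a rule base; the numeric rule here merely keeps the sketch self-contained.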
In the above embodiments, the method for processing the voice of the bone conduction device is described in detail, and the application also provides corresponding embodiments of the voice processing device of the bone conduction device. It should be noted that the present application describes an embodiment of the device portion from two angles, one based on the angle of the functional module and the other based on the angle of the hardware.
Fig. 4 is a block diagram of a voice processing apparatus of a bone conduction device according to an embodiment of the present application. As shown in fig. 4, the apparatus includes:
a determining module 10, configured to determine a target transfer function corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by the bone conduction microphone and the air conduction microphone under the same voice test environment respectively;
an acquisition module 11, configured to acquire an initial voice signal acquired by a current bone conduction microphone;
the synthesizing module 12 is configured to synthesize the initial speech signal into a target speech signal according to a target transfer function.
Preferably, the apparatus further comprises:
the setting module is used for presetting a judging rule;
the judging module is used for judging whether the target voice signal meets the judging rule after the target voice signal is obtained, if so, triggering the output module, and if not, triggering the adjusting module;
the output module is used for outputting the target voice signal;
the adjusting module is used for adjusting the target transfer function and triggering the synthesizing module 12.
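The division into functional modules can be sketched as a few small classes. This is a hypothetical wiring of the determining module 10, the acquisition module 11, and the synthesis module 12 only; the class and method names, the list-of-floats signal type, and the doubling transfer function are assumptions for illustration, and the setting, judging, output, and adjusting modules are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List

Signal = List[float]

@dataclass
class DeterminingModule:
    """Module 10: holds the target transfer function found in the test process."""
    transfer: Callable[[Signal], Signal]

    def target_transfer_function(self) -> Callable[[Signal], Signal]:
        return self.transfer

@dataclass
class AcquisitionModule:
    """Module 11: obtains the initial voice signal from the bone conduction microphone."""
    read_microphone: Callable[[], Signal]

    def acquire(self) -> Signal:
        return self.read_microphone()

class SynthesisModule:
    """Module 12: synthesizes the target voice signal from the initial one."""

    def synthesize(self, initial: Signal, transfer: Callable[[Signal], Signal]) -> Signal:
        return transfer(initial)

def process(determining: DeterminingModule,
            acquisition: AcquisitionModule,
            synthesis: SynthesisModule) -> Signal:
    """Run acquisition followed by synthesis with the stored transfer function."""
    initial = acquisition.acquire()
    return synthesis.synthesize(initial, determining.target_transfer_function())

# Stand-ins for a real microphone driver and a real transfer function.
determining = DeterminingModule(transfer=lambda s: [v * 2.0 for v in s])
acquisition = AcquisitionModule(read_microphone=lambda: [0.1, 0.2])
output = process(determining, acquisition, SynthesisModule())
print(output)
```

Keeping the transfer function behind the determining module means the synthesis code does not change when a newly trained function is written to the device.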
Since the embodiments of the apparatus portion and the embodiments of the method portion correspond to each other, the embodiments of the apparatus portion are referred to the description of the embodiments of the method portion, and are not repeated herein.
In the voice processing apparatus of the bone conduction device provided by this embodiment, in the test process the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined; in the application process, the current bone conduction microphone collects an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the internal space occupied in the bone conduction device, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of call privacy and noise reduction.
Fig. 5 is a block diagram of another voice processing apparatus of a bone conduction device according to an embodiment of the present application. As shown in Fig. 5, the apparatus comprises a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the speech processing method of the bone conduction apparatus described in the above embodiments when executing the computer program.
The voice processing device of the bone conduction device provided in this embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.
The processor 21 may include one or more processing cores, for example a 4-core or an 8-core processor. The processor 21 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 21 may also comprise a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 20 may include one or more computer-readable storage media, which may be non-transitory. The memory 20 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201 which, when loaded and executed by the processor 21, implements the relevant steps of the speech processing method of the bone conduction apparatus disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, and the storage may be transient or permanent. The operating system 202 may include Windows, Unix, Linux, among others. The data 203 may include, but is not limited to, the data required for the target transfer function.
In some embodiments, the voice processing device of the bone conduction apparatus may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the structure shown in Fig. 5 does not limit the voice processing device of the bone conduction apparatus, which may include more or fewer components than those illustrated.
The voice processing device of the bone conduction equipment provided by the embodiment of the application comprises a memory and a processor, and the processor implements the following method when executing a program stored in the memory: during the test process, a bone conduction microphone and an air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined; during the application process, the current bone conduction microphone collects an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical solution, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, takes up less of the internal space of the bone conduction equipment, and facilitates product miniaturization. In addition, because only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private calls and noise reduction.
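The determination of the target transfer function during the test process can be sketched as follows, using the test group / verification group split recited in claim 1. This is only an illustration under stated assumptions: each characteristic-parameter pair is reduced to a single (bone, air) number, a least-squares scalar gain stands in for the preset training model, and the names `fit_gain`, `determine_transfer`, `holdout`, `tolerance` and `seed` are hypothetical. The application does not specify the verification requirement; a mean absolute error threshold is assumed here.

```python
import random


def fit_gain(pairs):
    """Least-squares scalar gain g minimizing sum((air - g * bone) ** 2)."""
    num = sum(b * a for b, a in pairs)
    den = sum(b * b for b, _ in pairs)
    return num / den


def determine_transfer(pairs, holdout=0.25, tolerance=0.05, seed=0):
    """Fit on the test group, check on the verification group, accept or reject.

    Returns the fitted gain if the verification error is within tolerance,
    otherwise None (the caller would then retrain or collect more data).
    """
    if len(pairs) < 2:
        raise ValueError("need at least two feature pairs")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    split = max(1, int(len(shuffled) * holdout))
    verification, test_group = shuffled[:split], shuffled[split:]

    gain = fit_gain(test_group)  # "current transfer function" in this sketch
    error = sum(abs(a - gain * b) for b, a in verification) / len(verification)
    return gain if error <= tolerance else None
```

For perfectly consistent pairs such as `[(b, 3.0 * b) for b in ...]`, the function returns a gain close to 3.0; for pairs with no consistent relationship it returns `None`.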
Finally, the present application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps as described in the method embodiments above.
It will be appreciated that the methods of the above embodiments, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored on a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and performs all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The voice processing method, device and medium of the bone conduction equipment provided by the present application have been described in detail above. The embodiments in the description are described in a progressive manner, each focusing on its differences from the others, so that identical or similar parts of the embodiments can be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art can make various improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of the claims of the present application.
It should also be noted that in this specification, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

Claims (9)

1. A method of speech processing by a bone conduction apparatus, comprising:
determining a target transfer function corresponding to a first test voice signal and a second test voice signal, wherein the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone, respectively, in the same voice test environment;
acquiring an initial voice signal acquired by a current bone conduction microphone;
synthesizing the initial voice signal into a target voice signal according to the target transfer function;
wherein determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:
pre-acquiring a plurality of pairs of the first test voice signal and the second test voice signal;
extracting a plurality of pairs of characteristic parameters corresponding to the plurality of pairs of first test voice signals and second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;
inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;
verifying the plurality of pairs of characteristic parameters of the verification group by using the current transfer function;
and, if the obtained verification result meets the requirement, taking the current transfer function as the target transfer function.
2. The method according to claim 1, wherein synthesizing the initial speech signal into a target speech signal according to the target transfer function comprises:
acquiring a high-frequency voice signal and a middle-low frequency voice signal in the initial voice signal;
converting the high-frequency voice signal into a target high-frequency voice signal through the target transfer function;
and taking the middle-low frequency voice signal and the target high-frequency voice signal as the target voice signal.
3. The method for processing speech of a bone conduction apparatus according to claim 1, wherein the training model is a deep learning model.
4. The method of claim 1, wherein the characteristic parameters include frequency, amplitude, and sound pressure.
5. The method according to any one of claims 1 to 4, wherein the first test voice signal and the second test voice signal include a high-frequency voice signal and a middle-low-frequency voice signal.
6. The method for processing speech of a bone conduction apparatus according to claim 1, further comprising:
presetting a judgment rule;
judging whether the target voice signal meets the judging rule after the target voice signal is obtained;
if yes, outputting the target voice signal;
if not, adjusting the target transfer function and returning to the step of synthesizing the initial voice signal into a target voice signal according to the target transfer function.
7. A speech processing apparatus of a bone conduction device, comprising:
the determining module is used for acquiring a plurality of pairs of first test voice signals and second test voice signals in advance, extracting a plurality of pairs of characteristic parameters corresponding to the first test voice signals and the second test voice signals, dividing the plurality of pairs of characteristic parameters into a test group and a verification group, inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function, verifying the plurality of pairs of characteristic parameters of the verification group by using the current transfer function, and taking the current transfer function as a target transfer function under the condition that the obtained verification result meets the requirement; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;
the acquisition module is used for acquiring an initial voice signal acquired by the current bone conduction microphone;
and the synthesis module is used for synthesizing the initial voice signal into a target voice signal according to the target transfer function.
8. A speech processing apparatus of a bone conduction device, comprising a memory for storing a computer program;
a processor for implementing the steps of the speech processing method of the bone conduction apparatus according to any one of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the speech processing method of the bone conduction apparatus according to any one of claims 1 to 6.
CN202010954775.8A 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment Active CN112017687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010954775.8A CN112017687B (en) 2020-09-11 2020-09-11 Voice processing method, device and medium of bone conduction equipment

Publications (2)

Publication Number Publication Date
CN112017687A (en) 2020-12-01
CN112017687B (en) 2024-03-29

Family

ID=73521746

Country Status (1)

Country Link
CN (1) CN112017687B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220180886A1 (en) * 2020-12-08 2022-06-09 Fuliang Weng Methods for clear call under noisy conditions
CN112767963B (en) * 2021-01-28 2022-11-25 歌尔科技有限公司 Voice enhancement method, device and system and computer readable storage medium
CN113314134B (en) * 2021-05-11 2022-11-11 紫光展锐(重庆)科技有限公司 Bone conduction signal compensation method and device
CN114697814A (en) * 2022-02-24 2022-07-01 深圳市佳骏兴科技有限公司 Bone conduction communication assembly, bone conduction earphone and control method and control device thereof
CN115171713A (en) * 2022-06-30 2022-10-11 歌尔科技有限公司 Voice noise reduction method, device and equipment and computer readable storage medium
WO2024050802A1 (en) * 2022-09-09 2024-03-14 华为技术有限公司 Speech signal processing method, neural network training method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007003702A (en) * 2005-06-22 2007-01-11 Ntt Docomo Inc Noise eliminator, communication terminal, and noise eliminating method
CN102368384A (en) * 2011-10-19 2012-03-07 福建联迪商用设备有限公司 Voice module test method and voice module test device
CN106686494A (en) * 2016-12-27 2017-05-17 广东小天才科技有限公司 Voice input control method of wearable equipment and the wearable equipment
US10535364B1 (en) * 2016-09-08 2020-01-14 Amazon Technologies, Inc. Voice activity detection using air conduction and bone conduction microphones
CN110931031A (en) * 2019-10-09 2020-03-27 大象声科(深圳)科技有限公司 Deep learning voice extraction and noise reduction method fusing bone vibration sensor and microphone signals

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10614788B2 (en) * 2017-03-15 2020-04-07 Synaptics Incorporated Two channel headset-based own voice enhancement

Also Published As

Publication number Publication date
CN112017687A (en) 2020-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant