CN112017687B

CN112017687B - Voice processing method, device and medium of bone conduction equipment

Info

Publication number: CN112017687B
Application number: CN202010954775.8A
Authority: CN
Inventors: 朱宗霞; 安康; 吴劼; 舒开发; 韩菲菲; 杨征; 李钉云
Original assignee: Goertek Techology Co Ltd
Current assignee: Goertek Techology Co Ltd
Priority date: 2020-09-11
Filing date: 2020-09-11
Publication date: 2024-03-29
Anticipated expiration: 2040-09-11
Also published as: CN112017687A

Abstract

The application discloses a voice processing method, device and medium for bone conduction equipment. The method comprises: determining a target transfer function corresponding to a first test voice signal and a second test voice signal, which are acquired by a bone conduction microphone and an air conduction microphone, respectively, in the same voice test environment; acquiring an initial voice signal; and synthesizing the initial voice signal into a target voice signal according to the target transfer function. With this technical scheme, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the space occupied inside the bone conduction equipment, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private conversation and noise reduction.

Description

Voice processing method, device and medium of bone conduction equipment

Technical Field

The present disclosure relates to the field of bone conduction technologies, and in particular, to a method, an apparatus, and a medium for processing speech of bone conduction devices.

Background

A conventional air conduction microphone collects an acoustic wave signal that propagates through air and converts it into an electrical signal. Because the air conduction microphone relies on air as the propagation medium, noise in the air is easily picked up together with the effective acoustic signal, causing noise pollution and a poor listening experience for the receiving party. With the development of bone conduction technology, more and more bone conduction devices that include bone conduction microphones, such as bone conduction headphones, have emerged. A bone conduction microphone serves the same sound pickup purpose as a conventional air conduction microphone but works on a different principle: when the user speaks, the bones vibrate, and the microphone directly collects the vibration signal transmitted through the bones and converts it into an electrical signal. Compared with a conventional air conduction microphone, the bone conduction microphone therefore has a better noise reduction effect.

However, during sound pickup the cut-off frequency of the bone conduction microphone is 2 kHz, so a bone conduction microphone used alone as the voice acquisition device suffers from attenuation in the high-frequency part. In the prior art, to overcome this problem, the bone conduction microphone is usually paired with a conventional air conduction microphone. The equipment then needs two voice acquisition devices, which increases cost, occupies more space, and hinders product miniaturization.

It can be seen that how to improve the speech collection effect of bone conduction devices without increasing the cost and space is a difficult problem for those skilled in the art.

Disclosure of Invention

The purpose of the application is to provide a voice processing method, device and medium for bone conduction equipment, in which only a bone conduction microphone is used as the voice acquisition device and a corresponding processing algorithm improves the voice quality of the equipment, without additional hardware or additional space.

In order to solve the above technical problems, the present application provides a method for processing speech of a bone conduction device, including:

determining a target transfer function corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;

acquiring an initial voice signal acquired by a current bone conduction microphone;

and synthesizing the initial voice signal into a target voice signal according to the target transfer function.

Preferably, the determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:

pre-acquiring a plurality of pairs of the first test voice signal and the second test voice signal;

extracting a plurality of pairs of characteristic parameters corresponding to the first test voice signals and the second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;

inputting a plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;

verifying a plurality of pairs of the characteristic parameters of the verification group by using a current transfer function;

and under the condition that the obtained verification result meets the requirement, taking the current transfer function as the target transfer function.

Preferably, the synthesizing the initial speech signal into the target speech signal according to the target transfer function specifically includes:

acquiring a high-frequency voice signal and a middle-low frequency voice signal in the initial voice signal;

converting the high-frequency voice signal into a target high-frequency voice signal through the target transfer function;

and taking the middle-low frequency voice signal and the target high frequency voice signal as the target voice signal.

Preferably, the training model is a deep learning model.

Preferably, the characteristic parameters include frequency, amplitude and sound pressure.

Preferably, the first test voice signal and the second test voice signal include a high frequency voice signal and a medium and low frequency voice signal.

Preferably, the method further comprises:

presetting a judgment rule;

judging whether the target voice signal meets the judging rule after the target voice signal is obtained;

if yes, outputting the target voice signal;

if not, the target transfer function is adjusted, and the step of synthesizing the initial voice signal into a target voice signal according to the target transfer function is returned.

In order to solve the above technical problem, the present application further provides a speech processing device of a bone conduction apparatus, including:

the determining module is used for determining target transfer functions corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;

the acquisition module is used for acquiring an initial voice signal acquired by the current bone conduction microphone;

and the synthesis module is used for synthesizing the initial voice signal into a target voice signal according to the target transfer function.

In order to solve the technical problem, the application also provides a voice processing device of the bone conduction equipment, which comprises a memory for storing a computer program;

a processor for implementing the steps of the speech processing method of the bone conduction apparatus as described when executing the computer program.

To solve the above technical problem, the present application further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the speech processing method of the bone conduction apparatus as described.

According to the voice processing method of the bone conduction equipment provided by the application, in the test process, signals are collected by the bone conduction microphone and the air conduction microphone in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. In the application process, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical scheme, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the space occupied inside the bone conduction equipment, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private conversation and noise reduction.

The voice processing device and the medium of the bone conduction equipment provided by the application correspond to the method, and the effects are the same as the above.

Drawings

To describe the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart of a voice processing method of a bone conduction device according to an embodiment of the present application;

Fig. 2 is a schematic diagram of obtaining a target transfer function according to an embodiment of the present application;

Fig. 3 is a flowchart of another voice processing method of a bone conduction device according to an embodiment of the present application;

Fig. 4 is a block diagram of a voice processing apparatus of a bone conduction device according to an embodiment of the present application;

Fig. 5 is a block diagram of another voice processing apparatus of a bone conduction device according to an embodiment of the present application.

Detailed Description

The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments herein without making any inventive effort are intended to fall within the scope of the present application.

The core of the application is to provide a voice processing method, a device and a medium of bone conduction equipment.

To help those skilled in the art better understand the present application, it is described in further detail below with reference to the drawings and the detailed description.

It should be noted that the bone conduction device in the present application may be a bone conduction earphone, for example a True Wireless Stereo (TWS) earphone, or another device that includes a bone conduction microphone. The only component used for voice capture in the device is the bone conduction microphone; no air conduction microphone is added. It will be appreciated that, in addition to the bone conduction microphone, the device may comprise a processor for processing the voice signal and, depending on the particular type of device, a loudspeaker, a Bluetooth module, and so on.

The voice processing method of the bone conduction device described in the present application mainly involves two parts: the first part is a test process and the second part is an application process. In the test process, a tester performs voice tests with a bone conduction microphone fixture and an air conduction microphone fixture to obtain a first test voice signal and a second test voice signal. To keep the test samples balanced, thousands of tests are required, and the testers generally include men and women, old and young, so that the test samples are more comprehensive.

Fig. 1 is a flowchart of a method for processing speech of a bone conduction device according to an embodiment of the present application. As shown in fig. 1, the method includes:

S10: determining the target transfer function corresponding to the first test voice signal and the second test voice signal.

The first test voice signal and the second test voice signal are acquired by the bone conduction microphone and the air conduction microphone, respectively, in the same voice test environment. Only when they are collected in the same voice test environment can the voice signals of the two microphones be meaningfully compared. Specifically, the tester may wear the bone conduction microphone fixture and speak in a silent test room in which the air conduction microphone fixture is also set up, so that the bone conduction microphone and the air conduction microphone pick up the tester's voice at the same time. The bone conduction microphone converts the vibration signal of the bones into an electrical signal, which is the first test voice signal in the above step, and the air conduction microphone converts the sound wave signal into an electrical signal, which is the second test voice signal in the above step. If both microphones were ideal, the first test voice signal and the second test voice signal would be identical. However, the bone conduction microphone suffers from serious attenuation in the high-frequency band, so it is necessary to determine how the signal of the air conduction microphone can be used to compensate for that attenuation, that is, to determine the target transfer function corresponding to the first test voice signal and the second test voice signal.

It will be appreciated that the meaning of the target transfer function is the correspondence between the first test signal and the second test signal. In practice, the target transfer function is usually predetermined, i.e. it is used directly during the application process, without the need for in-situ testing.

S11: acquiring an initial voice signal collected by the current bone conduction microphone.

S12: synthesizing the initial voice signal into a target voice signal according to the target transfer function.

It will be appreciated that step S10 belongs to the test process, while S11 and S12 belong to the application process. The target transfer function, that is, the correspondence between the voice signal collected by the bone conduction microphone and the voice signal collected by the air conduction microphone, is already obtained in S10. Therefore, when the initial voice signal collected by the current bone conduction microphone is acquired in actual use, it can be synthesized into the target voice signal through the target transfer function. The initial voice signal is attenuated in the high-frequency part; synthesis with the target transfer function compensates for this attenuation, which is equivalent to expanding the bandwidth of the bone conduction microphone. The resulting target voice signal is therefore more accurate, while the low-frequency part still retains the advantages of private conversation and noise reduction.

For example, suppose the sentence spoken by the user is "capital of China is Beijing city". Because the initial voice signal collected by the bone conduction microphone is distorted in the high-frequency part, the output sentence becomes "capital of middle is Beijing city". After synthesis with the target transfer function, the output sentence is "capital of China is Beijing city", so that a full-band voice signal can be accurately collected using only the bone conduction microphone.

In the voice processing method of the bone conduction device provided by this embodiment, in the test process, signals are collected by the bone conduction microphone and the air conduction microphone in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. In the application process, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical scheme, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the space occupied inside the bone conduction equipment, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private conversation and noise reduction.

Fig. 2 is a schematic diagram of acquiring a target transfer function according to an embodiment of the present application. As shown in fig. 2, the horizontal axis represents the signal corresponding to pure air conduction voice, namely the second test voice signal $y(t)$, and the vertical axis represents the signal corresponding to bone conduction voice, namely the first test voice signal $x(t)$. The path transfer functions corresponding to the two signals are defined as $h_{AC}(t)$ and $h_{BC}(t)$, respectively, and the speech signal is $e(t)$. The first test voice signal can be regarded as being obtained from the second test voice signal through a transfer function $h(t)$, which gives the following relation:

$$x(t) = e(t) \cdot h_{BC}(t) = \frac{y(t)}{h_{AC}(t)} \cdot h_{BC}(t) = y(t) \cdot h(t);$$

thus,

$$h(t) = \frac{h_{BC}(t)}{h_{AC}(t)} = \frac{x(t)}{y(t)};$$

when the voice signal collected by the bone conduction microphone is obtained, its relationship to the voice signal corresponding to the air conduction microphone can be expressed as

$$y(t) = \frac{x(t)}{h(t)},$$

thereby realizing speech synthesis.
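The relation between the two signals can be illustrated with a minimal sketch. It assumes that the products and quotients above are applied per frequency bin of a discrete Fourier transform, which is an interpretation and not something the application states; the function names `dft`, `idft`, `estimate_h` and `recover_air`, the `eps` threshold, and the two-tone test pair are all hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform; adequate for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def estimate_h(x_bone, y_air, eps=1e-9):
    """Per-bin estimate h = x / y; bins with no air-conducted energy default to 1."""
    return [
        xk / yk if abs(yk) > eps else 1.0
        for xk, yk in zip(dft(x_bone), dft(y_air))
    ]


def recover_air(x_bone, h, eps=1e-9):
    """Apply y = x / h per bin to approximate the air-conducted signal."""
    spectrum = dft(x_bone)
    return idft([xk / hk if abs(hk) > eps else xk for xk, hk in zip(spectrum, h)])


# Hypothetical test pair: the bone signal keeps the low tone and loses 90% of the high tone.
N = 32
low = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
high = [math.sin(2 * math.pi * 10 * t / N) for t in range(N)]
y_air = [a + b for a, b in zip(low, high)]
x_bone = [a + 0.1 * b for a, b in zip(low, high)]

h = estimate_h(x_bone, y_air)    # test process: h = x / y
y_hat = recover_air(x_bone, h)   # application process: y = x / h
```

A single pair is used here only to keep the arithmetic visible; the application itself obtains the transfer function by training on many pairs.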

The above theoretical analysis shows that the key point is how to obtain the transfer function h(t). In this embodiment, the transfer function is obtained by training, that is, by training a training model with training samples, and an artificial neural network algorithm may be used. Further, the training model is a deep learning model. For the deep learning model, reference may be made to the prior art, and details are not repeated in this embodiment.

As a preferred embodiment, determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:

a plurality of pairs of first test voice signals and second test voice signals are obtained in advance;

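extracting characteristic parameters corresponding to a plurality of pairs of first test voice signals and second test voice signals, and dividing the plurality of pairs of characteristic parameters into a test group and a verification group;

inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function;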

verifying the multiple pairs of characteristic parameters of the verification group by using the current transfer function;

and taking the current transfer function as a target transfer function under the condition that the obtained verification result meets the requirement.

In order to ensure the accuracy of the transfer function, a plurality of pairs of first test voice signals and second test voice signals are required, and it is understood that a pair of first test voice signals and second test voice signals refer to signals acquired by a bone conduction microphone and an air conduction microphone respectively in the same test environment. In order to verify the accuracy of the transfer function, in the present embodiment, a part of the pairs of the first test voice signal and the second test voice signal is used as a test sample, and a part is used as a verification sample. Preferably, the characteristic parameters include frequency, amplitude and sound pressure. It will be appreciated that the test sample and the validation sample are not repeated and that the test sample and the validation sample should be as balanced as possible in terms of frequency, amplitude, sound pressure, etc.
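The split, train and verify procedure can be sketched as follows. This is only an illustration under stated assumptions: a least-squares gain per frequency band stands in for the deep learning model, each characteristic-parameter pair is reduced to a list of band amplitudes, and the names `split_pairs`, `fit_gains`, `verification_error`, `determine_target_transfer_function`, the 25% verification fraction, and the 5% tolerance are all hypothetical.

```python
import random


def split_pairs(pairs, verify_fraction=0.25, seed=0):
    """Shuffle the feature pairs and split them into a test group and a verification group."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_verify = max(1, int(len(shuffled) * verify_fraction))
    return shuffled[n_verify:], shuffled[:n_verify]


def fit_gains(test_group):
    """Stand-in for the training model: least-squares gain per band so that bone ~ gain * air."""
    n_bands = len(test_group[0][0])
    gains = []
    for b in range(n_bands):
        num = sum(bone[b] * air[b] for bone, air in test_group)
        den = sum(air[b] ** 2 for bone, air in test_group)
        gains.append(num / den)
    return gains


def verification_error(gains, verify_group):
    """Mean relative error when air amplitudes are recovered as bone / gain."""
    errors = [
        abs(bone[b] / g - air[b]) / air[b]
        for bone, air in verify_group
        for b, g in enumerate(gains)
    ]
    return sum(errors) / len(errors)


def determine_target_transfer_function(pairs, tolerance=0.05):
    """Return the current transfer function as the target if verification passes, else None."""
    test_group, verify_group = split_pairs(pairs)
    current = fit_gains(test_group)
    if verification_error(current, verify_group) <= tolerance:
        return current
    return None  # the caller would collect more pairs or retrain
```

Each element of `pairs` is a tuple `(bone_amplitudes, air_amplitudes)` with one amplitude per band. Keeping the verification group disjoint from the test group matches the requirement that the two sample sets do not repeat.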

In the above embodiment, it is not limited whether, after the target transfer function is obtained, all of the initial voice signal or only part of it participates in the synthesis. In this embodiment, considering that the signal of the bone conduction microphone is accurate in the middle-low frequency part and attenuated only in the high-frequency part, on the basis of the above embodiment, synthesizing the initial voice signal into the target voice signal according to the target transfer function specifically includes:

acquiring a high-frequency voice signal and a middle-low frequency voice signal in an initial voice signal;

converting the high-frequency voice signal into a target high-frequency voice signal through a target transfer function;

the middle-low frequency voice signal and the target high frequency voice signal are used as target voice signals.

In a specific implementation, the high-frequency voice signal can be a signal above 2 kHz, and signals below 2 kHz are all middle-low frequency voice signals.

In this embodiment, only the high-frequency speech signal in the initial speech signal is synthesized, and the middle-low frequency speech signal still maintains the original state, thereby reducing the operation amount and improving the synthesis speed.
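A minimal sketch of this band-split synthesis is given below. It assumes the initial voice signal is already available as a one-sided spectrum with evenly spaced bins and that the target transfer function is a list of per-bin values; the names `split_bands` and `synthesize`, the `eps` guard, and the choice to treat a bin at exactly 2 kHz as middle-low frequency are hypothetical.

```python
def split_bands(spectrum, bin_hz, cutoff_hz=2000.0):
    """Split a one-sided spectrum into a middle-low part (<= cutoff) and a high part (> cutoff)."""
    low = [c if k * bin_hz <= cutoff_hz else 0j for k, c in enumerate(spectrum)]
    high = [c if k * bin_hz > cutoff_hz else 0j for k, c in enumerate(spectrum)]
    return low, high


def synthesize(spectrum, h, bin_hz, cutoff_hz=2000.0, eps=1e-9):
    """Convert only the high band through the transfer function and keep the rest unchanged."""
    low, high = split_bands(spectrum, bin_hz, cutoff_hz)
    # y = x / h, applied to the high-frequency bins only
    target_high = [c / hk if abs(hk) > eps else c for c, hk in zip(high, h)]
    return [l + t for l, t in zip(low, target_high)]
```

For example, `synthesize(spectrum, h, bin_hz=1000.0)` leaves the bins at 0, 1 and 2 kHz untouched and divides only the bins above 2 kHz by the corresponding entries of `h`, which is where the saving in computation comes from.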

As a preferred embodiment, the first test speech signal and the second test speech signal comprise a high frequency speech signal and a medium and low frequency speech signal.

To keep the test samples balanced and thereby ensure the accuracy of the target transfer function, this embodiment places a requirement on the sound source signal: the first test voice signal and the second test voice signal each include a high-frequency voice signal and a middle-low frequency voice signal. If only high-frequency voice signals or only middle-low frequency voice signals were included, the target transfer function would be accurate only in the corresponding frequency range, and the target transfer function as a whole would be inaccurate.

Fig. 3 is a flowchart of another voice processing method of a bone conduction device according to an embodiment of the present application. As shown in fig. 3, on the basis of the above embodiment, the method further includes:

S20: presetting a judgment rule;

S21: judging whether the target voice signal meets the judgment rule; if so, proceeding to S22; otherwise, proceeding to S23;

S22: outputting the target voice signal;

S23: adjusting the target transfer function and returning to S12.

In this embodiment, to prevent the output voice signal from being distorted by an inaccurate target transfer function, after the target voice signal is obtained it is judged whether the voice conforms to conventional logic, that is, to the judgment rule mentioned in this embodiment. If the judgment rule is not met, the current target transfer function is considered inaccurate and is further adjusted until the obtained target voice signal meets the judgment rule. For example, if the voice text corresponding to the target voice signal does not match "Beijing city", the only corresponding entry in the rule base, the target voice signal does not meet the judgment rule, and the target transfer function needs to be further adjusted.
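The loop formed by S12 and S21 to S23 can be sketched as follows. The application does not specify how the rule is evaluated or how the transfer function is adjusted, so `synthesize`, `meets_rule` and `adjust` are passed in as hypothetical callables, and the `max_rounds` limit is an added assumption to keep the sketch from looping forever. The demo reduces the signal and the transfer function to single numbers purely for illustration.

```python
def process_with_rule(initial, h, synthesize, meets_rule, adjust, max_rounds=10):
    """Synthesize, check the judgment rule, and adjust the transfer function until it passes."""
    for _ in range(max_rounds):
        target = synthesize(initial, h)  # S12: synthesize with the current transfer function
        if meets_rule(target):           # S21: does the target meet the judgment rule?
            return target, h             # S22: output the target voice signal
        h = adjust(h)                    # S23: adjust the transfer function and retry
    raise RuntimeError("judgment rule not met within max_rounds")


# Hypothetical scalar demo: one high-band amplitude and one gain value.
def demo_synthesize(amplitude, gain):
    return amplitude / gain


def demo_rule(target):
    return 0.9 <= target <= 1.1


def demo_adjust(gain):
    return gain * 0.8


target, h_final = process_with_rule(0.1, 0.5, demo_synthesize, demo_rule, demo_adjust)
```

Passing the three steps in as callables keeps the control flow independent of whichever rule base and adjustment strategy a real device would use.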

To make the voice processing procedure of the bone conduction device provided by the present application clearer to those skilled in the art, a specific application scenario is described below, taking a TWS earphone as an example. First, the target transfer function is obtained through a large number of tests in the test stage and is written into the memory of the TWS earphone together with the corresponding algorithm. When the user wears the TWS earphone for a voice call, the MCU of the TWS earphone obtains the initial voice signal collected by the bone conduction microphone, calls the target transfer function, and uses it to synthesize the high-frequency voice signal. The MCU then outputs the middle-low frequency voice signal together with the synthesized signal to the Bluetooth module as the final target voice signal. The Bluetooth module transmits the signal to the Bluetooth module of the other party's terminal, whose processor parses the signal and outputs the corresponding signal to the loudspeaker so that the other user can hear it.

In the above embodiments, the method for processing the voice of the bone conduction device is described in detail, and the application also provides corresponding embodiments of the voice processing device of the bone conduction device. It should be noted that the present application describes an embodiment of the device portion from two angles, one based on the angle of the functional module and the other based on the angle of the hardware.

Fig. 4 is a block diagram of a voice processing apparatus of a bone conduction device according to an embodiment of the present application. As shown in fig. 4, the apparatus includes:

a determining module 10, configured to determine a target transfer function corresponding to the first test voice signal and the second test voice signal; the first test voice signal and the second test voice signal are acquired by the bone conduction microphone and the air conduction microphone under the same voice test environment respectively;

an acquisition module 11, configured to acquire an initial voice signal acquired by a current bone conduction microphone;

the synthesizing module 12 is configured to synthesize the initial speech signal into a target speech signal according to a target transfer function.

Preferably, the apparatus further comprises:

the setting module is used for presetting a judging rule;

the judging module is used for judging whether the target voice signal meets the judging rule after the target voice signal is obtained, if so, triggering the output module, and if not, triggering the adjusting module;

the output module is used for outputting the target voice signal;

the adjusting module is used for adjusting the target transfer function and triggering the synthesizing module 12.

Since the embodiments of the apparatus portion and the embodiments of the method portion correspond to each other, the embodiments of the apparatus portion are referred to the description of the embodiments of the method portion, and are not repeated herein.

In the voice processing device of the bone conduction equipment provided by this embodiment, in the test process, signals are collected by the bone conduction microphone and the air conduction microphone in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. In the application process, the current bone conduction microphone collects a signal to obtain an initial voice signal, which is then synthesized into a target voice signal according to the target transfer function. With this technical scheme, a full-band voice signal can be obtained using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation, which reduces hardware cost, reduces the space occupied inside the bone conduction equipment, and facilitates product miniaturization. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private conversation and noise reduction.

Fig. 5 is a block diagram of a voice processing apparatus of another bone conduction device according to an embodiment of the present application. As shown in fig. 5, the apparatus comprises a memory 20 for storing a computer program;

a processor 21 for implementing the steps of the speech processing method of the bone conduction apparatus as in the above-described embodiment when executing a computer program.

The voice processing device of the bone conduction device provided in this embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.

The processor 21 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 21 may be implemented in at least one hardware form among a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 21 may also comprise a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.

The memory 20 may include one or more computer-readable storage media, which may be non-transitory. The memory 20 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201 which, when loaded and executed by the processor 21, implements the relevant steps of the voice processing method of the bone conduction device disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, and the storage may be transient or permanent. The operating system 202 may include Windows, Unix, Linux, among others. The data 203 may include, but is not limited to, the data required for the target transfer function.

In some embodiments, the voice processing device of the bone conduction apparatus may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.

It will be appreciated by those skilled in the art that the structure shown in fig. 5 does not constitute a limitation of the voice processing device of the bone conduction device, which may include more or fewer components than those illustrated.

The voice processing device of the bone conduction device provided by this embodiment of the application comprises a memory and a processor, and the processor, when executing the program stored in the memory, implements the following method. During the test process, the bone conduction microphone and the air conduction microphone collect signals in the same voice test environment to obtain a first test voice signal and a second test voice signal, and the target transfer function corresponding to the two signals is then determined. During the application process, the bone conduction microphone currently in use collects a signal to obtain an initial voice signal, and the initial voice signal is then synthesized into a target voice signal according to the target transfer function. With this technical scheme, a full-band voice signal can be acquired using only one bone conduction microphone, and no additional microphone is needed to compensate for high-frequency attenuation. This reduces the hardware cost and the internal space occupied in the bone conduction device, which favors miniaturization of the product. In addition, since only the target transfer function is used in the synthesis process, the function can be conveniently written into the algorithm logic of the bone conduction device, giving good universality and portability. Finally, the low-frequency portion still retains the advantages of private conversation and noise reduction.
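The two phases described above can be illustrated with a minimal sketch. It is not the implementation of this application: the "preset training model" is replaced here by a plain per-band least-squares gain fit, and the representation of the characteristic parameters as a list of per-band amplitudes, the function names, the split ratio, and the tolerance are all assumptions made only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical feature representation: one amplitude value per frequency band.
Bands = List[float]
Pair = Tuple[Bands, Bands]  # (bone conduction features, air conduction features)


def fit_band_gains(pairs: Sequence[Pair]) -> Bands:
    """Stand-in for the training model: least-squares gain per band,
    chosen so that gain * bone amplitude approximates the air amplitude."""
    n_bands = len(pairs[0][0])
    gains = []
    for k in range(n_bands):
        num = sum(bone[k] * air[k] for bone, air in pairs)
        den = sum(bone[k] * bone[k] for bone, _ in pairs)
        gains.append(num / den if den > 0 else 1.0)
    return gains


def synthesize(transfer: Bands, initial: Bands) -> Bands:
    """Application phase: map bone conduction features to target features."""
    return [g * b for g, b in zip(transfer, initial)]


def mean_relative_error(transfer: Bands, pairs: Sequence[Pair]) -> float:
    """Verification metric (an assumption; the source does not name one)."""
    total, count = 0.0, 0
    for bone, air in pairs:
        for est, ref in zip(synthesize(transfer, bone), air):
            total += abs(est - ref) / max(abs(ref), 1e-12)
            count += 1
    return total / count


def determine_target_transfer(
    pairs: Sequence[Pair], split: float = 0.75, tolerance: float = 0.05
) -> Optional[Bands]:
    """Test phase: fit on the test group, check on the verification group,
    and accept the current transfer function only if the check passes."""
    cut = max(1, int(len(pairs) * split))
    test_group, verification_group = pairs[:cut], pairs[cut:]
    current = fit_band_gains(test_group)
    if verification_group and mean_relative_error(current, verification_group) > tolerance:
        return None  # verification result does not meet the requirement
    return current


# Synthetic paired features in which higher bands are attenuated in the
# bone conduction signal (made-up numbers, not measured data).
pairs = [
    ([1.0, 0.5, 0.25], [1.0, 1.0, 1.0]),
    ([2.0, 1.0, 0.5], [2.0, 2.0, 2.0]),
    ([4.0, 2.0, 1.0], [4.0, 4.0, 4.0]),
    ([0.5, 0.25, 0.125], [0.5, 0.5, 0.5]),
]
target_transfer = determine_target_transfer(pairs)
target_features = synthesize(target_transfer, [8.0, 1.0, 0.25])
```

In this sketch only the list of gains needs to be stored on the device after the test phase, which mirrors the point that the synthesis step depends on nothing but the target transfer function.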

Finally, the present application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps as described in the method embodiments above.

It will be appreciated that the methods of the above embodiments, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored on a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and performs all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

The voice processing method, device and medium of the bone conduction device provided by the present application have been described in detail above. The embodiments in this description are presented in a progressive manner: each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced. Since the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art can make various improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of the claims of the present application.

It should also be noted that in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

Claims

1. A method of speech processing by a bone conduction apparatus, comprising:

synthesizing the initial voice signal into a target voice signal according to the target transfer function;

the determining the target transfer function corresponding to the first test voice signal and the second test voice signal specifically includes:

2. The method according to claim 1, wherein synthesizing the initial speech signal into a target speech signal according to the target transfer function comprises:

3. The method for processing speech of a bone conduction apparatus according to claim 1, wherein the training model is a deep learning model.

4. The method of claim 1, wherein the characteristic parameters include frequency, amplitude, and sound pressure.

5. The method according to any one of claims 1 to 4, wherein the first test voice signal and the second test voice signal include a high-frequency voice signal and a middle-low-frequency voice signal.

6. The method for processing speech of a bone conduction apparatus according to claim 1, further comprising:

presetting a judgment rule;

if yes, outputting the target voice signal;

7. A speech processing apparatus of a bone conduction device, comprising:

the determining module is used for acquiring a plurality of pairs of first test voice signals and second test voice signals in advance, extracting a plurality of pairs of characteristic parameters corresponding to the first test voice signals and the second test voice signals, dividing the plurality of pairs of characteristic parameters into a test group and a verification group, inputting the plurality of pairs of characteristic parameters of the test group as training samples into a preset training model to obtain a current transfer function, verifying the plurality of pairs of characteristic parameters of the verification group by using the current transfer function, and taking the current transfer function as a target transfer function under the condition that the obtained verification result meets the requirement; the first test voice signal and the second test voice signal are acquired by a bone conduction microphone and an air conduction microphone in the same voice test environment respectively;

8. A speech processing apparatus of a bone conduction device, comprising a memory for storing a computer program;

a processor for implementing the steps of the speech processing method of the bone conduction apparatus according to any one of claims 1 to 6 when executing the computer program.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the speech processing method of the bone conduction apparatus according to any one of claims 1 to 6.