CN118430566A - Voice communication method and system - Google Patents
- Publication number: CN118430566A
- Application number: CN202410881084.8A
- Authority: CN (China)
- Prior art keywords: band, data, point, extreme, value
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention relates to the technical field of data processing, and in particular to a voice communication method and system. The method comprises: collecting voice audio; windowing the audio; searching for similar extreme points on the waveform curve of the audio within a window, and taking the waveform curve between two similar extreme points as a band; obtaining the periodic instability of a band from the similarity difference between the band and its adjacent bands; and calculating the sum of the similarities between each data point in a band and the left and right adjacent bands. The method derives the noise degree of each data point from the variation characteristics of the audio data and uses that noise degree to correct the data point. When the frequency spectrum elimination method is then applied, noise in the voice audio is handled more effectively, the influence of noise on the audio is reduced, and the voice signal distortion that the frequency spectrum elimination method can cause during noise reduction is lessened.
Description
Technical Field
The invention relates to the technical field of data processing. More particularly, the invention relates to a voice communication method and system.
Background
Voice communication refers to a communication scheme in which information is transferred by voice. It is one of the earliest means of communication in human society and an integral part of modern communication technology.
Voice communication optimization improves the efficiency, quality, and user experience of a voice communication system. Optimizing the voice encoding and decoding algorithm, the network transmission protocol, and the like reduces the packet loss rate, delay, and jitter during a call, which improves call quality and reduces distortion and interruption of the sound. Spectral cancellation is a common technique in signal processing and analysis; its main role is to remove or suppress interference or noise in a specific frequency range of a signal so that the signal components of interest can be extracted. Optimizing the speech coding algorithm and the network transmission protocol can also reduce the bandwidth and network resources occupied by voice communication, thereby lowering communication costs, especially on a mobile network or a link with limited bandwidth.
In the related art, for example, Chinese patent application publication No. CN101859568A discloses a method and apparatus for eliminating voice background noise. The effective value of a received audio signal is detected to obtain an output signal reflecting the average power of the audio signal; this output signal is compared with a first threshold value to generate a noise cancellation control signal; and, under the control of the noise cancellation control signal, the noise signal in the audio signal is eliminated and the voice signal is amplified, so as to eliminate the voice background noise.
The prior art also uses the frequency spectrum elimination method to process voice. However, voice audio currently contains noise, and this noise can cause the voice signal to be distorted when the frequency spectrum elimination method is used for noise reduction, which degrades the quality of voice communication.
Disclosure of Invention
The invention provides a voice communication method and a voice communication system, which aim to solve the following problem in the related art: voice audio contains noise, and this noise can cause the voice signal to be distorted when the frequency spectrum elimination method is used for noise reduction, which degrades the quality of voice communication.
In a first aspect, the present invention provides a voice communication method, comprising: collecting voice audio, windowing the audio, searching for similar extreme points on the waveform curve of the audio within a window, and taking the waveform curve between two similar extreme points as a band; obtaining the periodic instability of a band from the similarity difference between the band and its adjacent bands; calculating the sum of the similarities between each data point in a band and the left and right adjacent bands, and taking the product of this sum and the periodic instability of the band as the data noise parameter of the data point, from which the noise degree of each data point is obtained. The noise degree of the q-th data point of band j is calculated from the DTW matching values between the frequency-domain map of band j and the frequency-domain maps of its left and right adjacent bands, and from the data noise parameters of the q-th data point of band j and of the corresponding data points in the left and right adjacent bands (the formula itself is not reproduced in this text). The data point values are then weighted according to the noise degree of the data points to obtain corrected data point values, and the corrected data point values are processed by the frequency spectrum elimination method to obtain noise-reduced voice audio. In other words, the noise degree of each data point is calculated from the variation characteristics of the audio data, the data point values are weighted by that noise degree, and the corrected values are finally denoised by the frequency spectrum elimination method, which improves the noise reduction effect of the frequency spectrum elimination method on the audio.
In an embodiment, searching for similar extreme points on the waveform curve of the audio within the window and taking the waveform curve between two similar extreme points as a band includes: obtaining the maximum points and minimum points of the waveform curve within the window, and calculating a waveform feature quantity for each extreme point, wherein the waveform feature quantity is positively correlated with both the angle between the lines connecting the extreme point to its left and right adjacent extreme points and the amplitude of the extreme point; and determining the band according to the difference between the waveform feature quantity of one extreme point and the waveform feature quantities of other extreme points. Determining the band from the waveform feature quantities of two extreme points keeps the calculation simple, requires few data operations, and improves the data processing speed.
In one embodiment, obtaining the maximum points and minimum points of the waveform curve within the window and calculating the waveform feature quantity of each extreme point includes: calculating the waveform feature quantity of an extreme point from the angle between the lines connecting the extreme point to its two adjacent extreme points and from the amplitude of the extreme point, using a standard normalization function (the formula itself is not reproduced in this text).
In an embodiment, searching for similar extreme points on the waveform curve of the audio within the window and taking the waveform curve between two similar extreme points as a band further includes: obtaining the maximum points and minimum points of the waveform curve within the window, and calculating a waveform feature quantity for each extreme point, wherein the waveform feature quantity is positively correlated with both the angle between the lines connecting the extreme point to its left and right adjacent extreme points and the amplitude of the extreme point; normalizing the waveform feature quantity of each extreme point, and calculating the difference between the waveform feature quantity of each extreme point and those of its left and right adjacent extreme points to obtain a left waveform feature quantity difference and a right waveform feature quantity difference for each extreme point; and, for two extreme points, calculating the difference between their waveform feature quantities, the difference between their left waveform feature quantity differences, and the difference between their right waveform feature quantity differences, and taking the waveform curve between the two extreme points as a band if all three differences meet a preset condition and the two extreme points are the nearest such pair. Because the similarity of two extreme points is calculated from both the waveform feature quantity of each extreme point and the waveform feature quantities of its adjacent extreme points, the variation near each extreme point is taken into account and the result is more accurate.
In an embodiment, the periodic instability of each band is calculated from the difference between the band and its adjacent bands. The periodic instability of band j is calculated from the waveform feature quantity of the p-th extreme point in band j, the waveform feature quantities of the p-th extreme points in the left and right adjacent bands of band j, the DTW value between band j and its left adjacent band, and the DTW value between band j and its right adjacent band (the formula itself is not reproduced in this text).
In one embodiment, the data noise parameter of a data point is obtained as the product of the periodic instability of the band and the sum of the similarities between the data point and the left and right adjacent bands. That is, the data noise parameter of the q-th data point of band j equals the periodic instability of band j multiplied by the sum of the two matching distances obtained for the q-th data point of band j after DTW matching against the left adjacent band and the right adjacent band, respectively.
In an embodiment, in taking the waveform curve between two extreme points as a band if all three differences meet the preset condition and the two extreme points are the nearest such pair, all three differences meeting the preset condition means that all three differences are smaller than 0.3.
In one embodiment, correcting the data point value according to the noise degree of the data point to obtain a corrected data point value includes: normalizing the noise degree of the data points, and multiplying the normalized noise degree of each data point by the data point value to obtain the corrected data point value. This keeps the calculation simple and improves the data processing speed.
In an embodiment, correcting the data point value according to the noise degree of the data point to obtain a corrected data point value further includes: normalizing the noise degree of the data points; calculating the absolute value of the difference between the data point value and the mean value of its adjacent data points; taking the product of this absolute value and the normalized noise degree as a correction value; and subtracting the correction value from the data point value to obtain the corrected data point value. The data point value is thereby adjusted toward the mean, which reduces the influence of noise on the audio and improves the voice quality.
In a second aspect, the present invention provides a voice communication system comprising a processor and a memory, wherein the memory stores a computer program and the processor executes the computer program to implement the voice communication method described in any one of the above embodiments.
The beneficial effects are as follows: the noise degree of each data point is obtained from the variation characteristics of the audio data and is used to correct the audio data point. When the frequency spectrum elimination method is then applied, noise in the voice audio is handled more effectively, the influence of noise on the audio is reduced, and the voice signal distortion that the frequency spectrum elimination method can cause during noise reduction is lessened.
Drawings
Several embodiments of the present invention are illustrated by way of example, and not by way of limitation, in the accompanying drawings, which should be read together with the following detailed description and in which like or corresponding reference numerals indicate like or corresponding parts:
FIG. 1 is a graph schematically illustrating amplitude versus time according to an embodiment of the invention;
FIG. 2 is a flow chart schematically illustrating correction of audio data points according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a system configuration according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described clearly and fully below with reference to the accompanying drawings. The embodiments described are some, but not all, of the embodiments of the invention. All other embodiments that a person skilled in the art can obtain from these embodiments without inventive effort fall within the scope of the invention.
Embodiments of the present invention are described in detail below with reference to FIG. 1 and FIG. 2.
Step S101: voice audio is collected.
In one embodiment, the voice audio is captured by a voice capture device, where the voice capture device includes, but is not limited to, a smartphone, a microphone, or a voice recorder.
Step S102: segmenting the voice audio to obtain a plurality of wave bands.
It should be noted that the voice audio is preprocessed before segmentation in order to improve its quality. For example, echo cancellation is applied to the collected voice audio to remove the influence of echo and improve audibility, yielding high-quality voice audio. A waveform diagram of the high-quality voice audio, namely its time-domain diagram, is then obtained with Adobe Audition to facilitate subsequent analysis of the voice waveform; the horizontal axis of the waveform diagram is time and the vertical axis is amplitude.
The segmentation process is as follows. To analyze the voice waveform, the time-domain diagram of the high-quality voice audio is divided into frames. Because a voice signal is stationary only over short intervals, the voice is framed and windowed; similar extreme points are then searched for on the waveform curve of the audio within each window, and the waveform curve between two similar extreme points is taken as a band.
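The framing and extreme-point search described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `frame_signal` and `local_extrema`, the rectangular (untapered) window, and the hop size are all assumptions made for the example.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames (hypothetical helper).

    A trailing partial frame is dropped. No tapering window is applied,
    so amplitudes inside each frame are left unchanged.
    """
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def local_extrema(frame):
    """Return the indices of strict local maxima and minima in one frame.

    Flat plateaus are ignored in this sketch.
    """
    indices = []
    for i in range(1, len(frame) - 1):
        left, mid, right = frame[i - 1], frame[i], frame[i + 1]
        is_max = mid > left and mid > right
        is_min = mid < left and mid < right
        if is_max or is_min:
            indices.append(i)
    return indices


if __name__ == "__main__":
    audio = [0, 1, 0, -1, 0, 1, 0, -1]
    for frame in frame_signal(audio, frame_len=4, hop=2):
        print(frame, local_extrema(frame))
```

A real pipeline would typically use frames of a few tens of milliseconds; the tiny frame length here only keeps the example readable.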
In one embodiment, finding similar extreme points on the waveform curve of the audio within the window includes: obtaining the maximum points and minimum points of the waveform curve within the window, and calculating a waveform feature quantity for each extreme point. The waveform feature quantity is positively correlated with both the angle between the lines connecting the extreme point to its left and right adjacent extreme points and the amplitude of the extreme point. It is calculated from that angle and that amplitude using a standard normalization function (the formula itself is not reproduced in this text).
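Since the exact formula for the waveform feature quantity is not available in this text, the sketch below only illustrates one form that satisfies the stated property (positive correlation with both the connecting-line angle and the amplitude). The choice of "angle times absolute amplitude, then min-max normalization" is an assumption, as are the names `connecting_angle`, `min_max`, and `waveform_features`.

```python
import math


def connecting_angle(prev_pt, pt, next_pt):
    """Angle in radians at pt between the lines to prev_pt and next_pt.

    Each point is a (time, amplitude) pair.
    """
    ux, uy = prev_pt[0] - pt[0], prev_pt[1] - pt[1]
    vx, vy = next_pt[0] - pt[0], next_pt[1] - pt[1]
    cos_a = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    return math.acos(max(-1.0, min(1.0, cos_a)))  # clamp rounding error


def min_max(values):
    """Min-max normalization to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def waveform_features(extrema):
    """Assumed feature: normalized (angle * |amplitude|) per interior extremum.

    extrema is a time-ordered list of (time, amplitude) pairs; the first
    and last points have only one neighbor and are skipped.
    """
    raw = []
    for i in range(1, len(extrema) - 1):
        angle = connecting_angle(extrema[i - 1], extrema[i], extrema[i + 1])
        raw.append(angle * abs(extrema[i][1]))
    return min_max(raw)


if __name__ == "__main__":
    points = [(0, 0), (1, 1), (2, 0), (3, 2), (4, 0)]
    print(waveform_features(points))
```

Note that the angle depends on how the time and amplitude axes are scaled, so any concrete implementation would need to fix those units first.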
In one embodiment, a method for obtaining two similar extreme points includes: obtaining the waveform feature quantity of each extreme point, and determining the band according to the difference between the waveform feature quantity of one extreme point and the waveform feature quantities of other extreme points. Specifically, if the difference between the waveform feature quantity of one extreme point and that of another extreme point meets a preset threshold, the two extreme points are taken as similar extreme points, and the waveform curve between them is taken as a band. The preset threshold can be adjusted manually.
For example, suppose the preset threshold is Z1 and the window contains four extreme points, from left to right the first, second, third, and fourth extreme points, each with its own waveform feature quantity. If the difference between the waveform feature quantities of the first and second extreme points does not meet the threshold Z1, the first and second extreme points are not similar extreme points. If the difference between the waveform feature quantities of the first and third extreme points meets the threshold Z1, the first and third extreme points are similar extreme points, and the waveform curve between them is taken as a band.
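The threshold comparison in this example can be sketched as a left-to-right scan. The sketch assumes that "meeting the threshold" means the absolute difference is below Z1 and that each new band starts where the previous one ended; the function name `segment_bands` is hypothetical.

```python
def segment_bands(features, threshold):
    """Pair each extreme point with the next similar one to form bands.

    features holds one waveform feature quantity per extreme point, in
    time order. Returns (start_index, end_index) pairs into that list.
    Assumption: two points are similar when |difference| < threshold.
    """
    bands = []
    start = 0
    while start < len(features) - 1:
        match = None
        for candidate in range(start + 1, len(features)):
            if abs(features[start] - features[candidate]) < threshold:
                match = candidate
                break
        if match is None:
            break  # no similar point to the right; stop in this sketch
        bands.append((start, match))
        start = match
    return bands


if __name__ == "__main__":
    # The first and third points are similar, as in the example above.
    print(segment_bands([0.9, 0.2, 0.88, 0.25, 0.91], threshold=0.1))
```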
In another embodiment, the method for obtaining two similar extreme points further includes: after the waveform feature quantity of each extreme point is obtained, normalizing the waveform feature quantity of each extreme point, and calculating the difference between the waveform feature quantity of each extreme point and those of its left and right adjacent extreme points to obtain a left waveform feature quantity difference and a right waveform feature quantity difference for each extreme point; and, for two extreme points, calculating the difference between their waveform feature quantities, the difference between their left waveform feature quantity differences, and the difference between their right waveform feature quantity differences, and taking the waveform curve between the two extreme points as a band if all three differences meet a preset condition and the two extreme points are the nearest such pair, thereby realizing the period division. The preset condition may be set manually; for example, each difference must be less than 0.3, or less than 0.4. Because this method takes the characteristics of the adjacent extreme points into account, it identifies similar extreme points more accurately.
For example, suppose the waveform curve in the window contains two extreme points, A and B. For extreme point A, the waveform feature quantity is X1, the left waveform feature quantity difference is X2, and the right waveform feature quantity difference is X3. For extreme point B, the waveform feature quantity is X5, the left waveform feature quantity difference is X6, and the right waveform feature quantity difference is X7. The three differences (X1-X5), (X2-X6), and (X3-X7) are then calculated. If all three are smaller than 0.3, extreme points A and B are regarded as similar extreme points, and the waveform curve between them is confirmed as a band. If several extreme points in the window satisfy the similarity condition, the two that are closest to each other are selected, and the waveform curve between them is taken as the band.
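The three-difference test in this example can be sketched as below. The 0.3 limit comes from the text; the use of absolute differences, the zero value assigned to a missing neighbor at the window edge, and the names `neighbor_diffs` and `nearest_similar` are assumptions for illustration.

```python
def neighbor_diffs(features):
    """Left and right feature differences for each extreme point.

    features must already be normalized. A missing neighbor at either
    end of the window contributes 0.0 (an assumption of this sketch).
    """
    n = len(features)
    left = [abs(features[i] - features[i - 1]) if i > 0 else 0.0
            for i in range(n)]
    right = [abs(features[i] - features[i + 1]) if i < n - 1 else 0.0
             for i in range(n)]
    return left, right


def nearest_similar(features, positions, a, limit=0.3):
    """Index of the nearest extreme point similar to point a, or None.

    Two points are similar when the differences of their own features,
    their left differences, and their right differences are all < limit.
    positions gives the time position of each extreme point.
    """
    left, right = neighbor_diffs(features)
    best = None
    for b in range(len(features)):
        if b == a:
            continue
        similar = (abs(features[a] - features[b]) < limit
                   and abs(left[a] - left[b]) < limit
                   and abs(right[a] - right[b]) < limit)
        if not similar:
            continue
        if best is None or (abs(positions[b] - positions[a])
                            < abs(positions[best] - positions[a])):
            best = b
    return best


if __name__ == "__main__":
    feats = [0.1, 0.9, 0.1, 0.85, 0.15, 0.9, 0.1]
    times = [0, 10, 20, 30, 40, 50, 60]
    print(nearest_similar(feats, times, a=1))  # both 3 and 5 qualify
```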
Step S103: Calculate the periodic instability of each band according to the difference between the band and its adjacent bands.
In one embodiment, the periodic instability of band j is calculated from the waveform feature quantity of the p-th extreme point in band j, the waveform feature quantities of the p-th extreme points in the left and right adjacent bands of band j, the DTW value between band j and its left adjacent band, and the DTW value between band j and its right adjacent band (the formula itself is not reproduced in this text).
Voice audio is approximately periodic over short intervals, and adjacent segments have the most similar periodic characteristics. The formula therefore compares each extreme point of band j with the corresponding extreme points of the left and right adjacent bands to capture the period difference; the larger this value, the larger the difference and the more likely the band is affected by noise. The DTW values between band j and its left and right adjacent bands reflect the difference between the data of band j and the data of the adjacent bands; the larger these values, the more likely the audio of the band is affected by noise.
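The DTW value between two bands can be computed with the classic dynamic-programming recurrence. The sketch below shows only that building block, using the absolute sample difference as the local cost (an assumption); it does not attempt to reproduce the periodic instability formula itself, and the name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] is the minimal accumulated cost of aligning the first i
    samples of a with the first j samples of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    band = [0.0, 1.0, 0.0, -1.0, 0.0]
    left_band = [0.0, 1.0, 1.0, 0.0, -1.0, 0.0]   # same shape, stretched
    right_band = [0.0, 2.0, 0.0, -2.0, 0.0]       # larger amplitude
    print(dtw_distance(band, left_band), dtw_distance(band, right_band))
```

Because DTW tolerates stretching along the time axis, two bands with the same shape but slightly different lengths still yield a small value, which suits bands whose period drifts a little.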
Step S104: Obtain the data noise parameter of each data point as the product of the periodic instability of the band and the sum of the similarities between the data point and the left and right adjacent bands.
In one embodiment, when noise is present in a band, the periodicity of the data is disturbed. Within a short-time waveform, noise does not appear or disappear abruptly: its influence is persistent as well as random. Each band is therefore converted to a frequency-domain map by Fourier transform and analyzed together with the characteristics of its time-domain waveform. When a band is affected by noise, consecutive data in the band change abruptly and the similarity between the band and its adjacent bands is low, where low similarity means a similarity below a set threshold that can be adjusted manually. The noise degree of each data point is therefore obtained from the data noise parameters of the data in the band together with the time-domain and frequency-domain similarity characteristics. For the q-th data point of band j, the noise degree is calculated from the DTW matching values between the frequency-domain map of band j and the frequency-domain maps of its left and right adjacent bands, and from the data noise parameters of the q-th data point of band j and of the corresponding data points in the left and right adjacent bands (the formula itself is not reproduced in this text).
Because noise disturbs the periodicity of a band and makes its consecutive data change abruptly, the formula reflects the difference between the q-th data point, its adjacent data, and the corresponding data of the adjacent bands; the larger this difference, the greater the influence of noise on the data point. In this way, the noise degree of every audio data point is obtained.
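The data noise parameter of step S104 (periodic instability multiplied by the sum of the per-point DTW matching distances to the left and right adjacent bands) can be sketched as follows. The periodic instability is passed in as a precomputed number because its formula is not available here. Averaging the local cost when a point is matched to several neighbors is an assumption, as are the names `dtw_path`, `matching_distances`, and `data_noise_parameters`.

```python
def dtw_path(a, b):
    """Return the DTW warping path between a and b as (i, j) index pairs."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def matching_distances(band, neighbor):
    """Per-point matching distance of band against one adjacent band.

    Assumption: a point matched to several neighbor samples gets the
    mean of the absolute differences along the warping path.
    """
    sums = [0.0] * len(band)
    counts = [0] * len(band)
    for q, k in dtw_path(band, neighbor):
        sums[q] += abs(band[q] - neighbor[k])
        counts[q] += 1
    return [s / c for s, c in zip(sums, counts)]


def data_noise_parameters(band, left, right, instability):
    """instability * (left matching distance + right matching distance)."""
    d_left = matching_distances(band, left)
    d_right = matching_distances(band, right)
    return [instability * (dl + dr) for dl, dr in zip(d_left, d_right)]


if __name__ == "__main__":
    print(data_noise_parameters([0, 1, 0], [0, 1, 0], [0, 3, 0], 2.0))
```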
Step S105: Weight each data point according to its noise degree to obtain a corrected data point value.
Specifically, the data point values are weighted according to the noise degree of the data points to obtain corrected data point values, and then the corrected data point values are processed by adopting a frequency spectrum elimination method to obtain noise-reduced voice audio.
In one embodiment, obtaining a corrected data point value includes: normalizing the noise degree of the data points; calculating the absolute value of the difference between the data point value and the mean value of its adjacent data points; taking the product of this absolute value and the normalized noise degree as a correction value; and subtracting the correction value from the data point value to obtain the corrected data point value.
Illustratively, if the data point value is Y1, the correction value is H, and the adjustment is made toward the mean, then the corrected data point value is Y1-H. Data points that are strongly affected by noise are thus moved toward the mean, which reduces the influence of noise on the audio data.
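The mean-directed correction can be sketched as below. For a point above the mean of its neighbors the result is exactly Y1-H as in the example; moving a point that lies below the mean upward by H is an assumption made so that the adjustment is always toward the mean. Using the two immediate neighbors as "adjacent data points", min-max normalization, and the names `min_max` and `correct_toward_mean` are also assumptions.

```python
def min_max(values):
    """Min-max normalization to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def correct_toward_mean(values, noise_degrees):
    """Move each data point toward the mean of its immediate neighbors.

    Correction H = normalized noise degree * |value - neighbor mean|.
    Requires at least two data points.
    """
    weights = min_max(noise_degrees)
    corrected = []
    for i, y in enumerate(values):
        neighbors = [values[k] for k in (i - 1, i + 1)
                     if 0 <= k < len(values)]
        mean = sum(neighbors) / len(neighbors)
        h = weights[i] * abs(y - mean)
        # Y - H above the mean (as in the text), Y + H below it (assumed).
        corrected.append(y - h if y > mean else y + h)
    return corrected


if __name__ == "__main__":
    samples = [0.0, 4.0, 0.0]
    noise = [0.0, 0.5, 1.0]
    print(correct_toward_mean(samples, noise))
```

With a normalized noise degree of 1 the point lands exactly on the neighbor mean, and with 0 it is left unchanged, so the noise degree acts as an interpolation weight.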
In another embodiment, obtaining a corrected data point value further includes: normalizing the noise degree of the data points, and multiplying the normalized noise degree of each data point by the data point value to obtain the corrected data point value.
Step S106: Process the corrected data points to obtain noise-reduced voice audio.
Specifically, a spectrum elimination method is adopted to process the corrected audio data points, and noise-reduced voice audio is obtained.
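The text does not detail the spectrum elimination method. Assuming it corresponds to classic magnitude spectral subtraction (an assumption, not something the source confirms), one frame could be processed as in the sketch below: estimate an average noise magnitude spectrum, subtract it from the frame's magnitude spectrum, keep the original phase, and transform back. A naive DFT keeps the example dependency-free; the names `dft`, `idft`, `noise_magnitude`, and `spectral_subtract` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames assumed to hold only noise."""
    spectra = [[abs(v) for v in dft(frame)] for frame in noise_frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate from one frame, keeping phase.

    Magnitudes are clipped at floor so they never become negative.
    """
    cleaned = []
    for value, n_mag in zip(dft(frame), noise_mag):
        magnitude = max(abs(value) - n_mag, floor)
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)


if __name__ == "__main__":
    frame = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    hiss = [[0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]]
    print([round(v, 3) for v in spectral_subtract(frame, noise_magnitude(hiss))])
```

In this scheme the corrected data point values from step S105 would be the input frames, so that less noise-induced variation reaches the subtraction stage.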
Through the above steps, the noise degree of each data point is obtained from the time-domain periodic characteristics and the frequency-domain characteristics of the audio data, and this noise degree is taken into account when the voice audio is processed by the frequency spectrum elimination method. This improves voice communication quality, reduces distortion and interruption of the sound, and improves the stability and reliability of the voice communication system.
The invention also provides a voice communication system. As shown in fig. 3, the system comprises a processor and a memory storing computer program instructions which, when executed by the processor, implement a voice communication method according to the first aspect of the invention.
In one embodiment, the present invention provides a computer device whose internal structure may be as shown in FIG. 3. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor provides computing and control capability and may be, for example, a CPU, a microcontroller, a DSP, or an FPGA. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; when the computer program is executed, the steps described in the above method embodiments, for example S101-S106, are carried out. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface is used for wired or wireless communication with an external terminal; the wireless communication can be realized through Wi-Fi, an operator network, NFC (near field communication), or other technologies. The computer program is executed by the processor to implement the voice communication method. The display screen may be a liquid crystal display or an electronic ink display. The input device may be a touch layer covering the display screen; keys, a trackball, or a touchpad arranged on the housing of the computer device; or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in FIG. 3 is a block diagram of only some of the structures associated with the aspects of the present invention and is not limiting of the computer device of the present invention, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
The system further comprises other components known to those skilled in the art, such as a communication bus and a communication interface, the arrangement and function of which are known in the art and are therefore not described in detail herein.
In the context of this patent, the foregoing memory may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, the computer-readable storage medium may be any suitable magnetic or magneto-optical storage medium, such as resistive random-access memory (RRAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), enhanced dynamic random-access memory (EDRAM), high-bandwidth memory (HBM), or hybrid memory cube (HMC), or any other medium that can be used to store the desired information and that can be accessed by an application, a module, or both. Any such computer storage medium may be part of the device, accessible by the device, or connectable to the device. Any of the applications or modules described herein may be implemented using computer-readable/executable instructions that may be stored or otherwise maintained by such computer-readable media.
In the description of the present specification, "a plurality" or "a number" means at least two, for example two, three, or more, unless explicitly defined otherwise.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above examples illustrate only a few embodiments of the invention. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the claims. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention.
Claims (10)
1. A method of voice communication, comprising:
collecting voice audio, windowing the audio, searching for similar extreme points on the waveform curve of the audio within a window, and taking the waveform curve between two similar extreme points as a band;
obtaining the periodic instability of each band according to the difference in similarity between the band and its adjacent bands;
calculating the sum of the similarities between a data point in a band and the left and right adjacent bands, and taking the product of this sum and the periodic instability of the band as the data noise parameter of the data point, thereby obtaining the noise degree of each data point according to the following formula:
[formula image not reproduced in this text];
wherein the terms of the formula denote, respectively: the noise degree of the q-th data point of band j; the DTW matching values between the frequency-domain map of band j and the frequency-domain maps of its left and right adjacent bands; and the data noise parameters of the q-th data point of band j and of the corresponding data points in the left and right adjacent bands;
weighting the data point values according to the noise degree of the data points to obtain corrected data point values, and processing the corrected data point values by the spectrum elimination method to obtain noise-reduced voice audio.
2. The voice communication method according to claim 1, wherein searching for similar extreme points on the waveform curve of the audio within the window, taking the waveform curve between two similar extreme points as a band, comprises:
obtaining the maximum points and minimum points of the waveform curve within the window, and calculating a waveform feature quantity for each extreme point, wherein the waveform feature quantity is positively correlated with the included angle between the lines connecting the extreme point to its left and right adjacent extreme points and with the amplitude of the extreme point; and determining the band according to the differences between the waveform feature quantity of an extreme point and the waveform feature quantities of other extreme points.
3. The voice communication method according to claim 2, wherein obtaining a maximum value point and a minimum value point of a waveform curve in a window, and calculating waveform feature amounts of the respective extreme points, comprises:
calculating the waveform feature quantity of each extreme point according to the following formula:
[formula image not reproduced in this text];
wherein the terms of the formula denote, respectively: the waveform feature quantity of the extreme point; the included angle between the extreme point and its two adjacent extreme points; the amplitude of the extreme point; and a standard normalization function.
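The exact formula of claim 3 is not recoverable from this text, so the following is only a minimal sketch under stated assumptions: the feature is taken to be the min-max normalized product of the included angle and the absolute amplitude, and the horizontal axis is measured in sample indices. The function names and the choice of min-max normalization are hypothetical.

```python
import math


def local_extrema(samples):
    """Indices of strict local maxima and minima (endpoints and plateaus excluded)."""
    return [
        i for i in range(1, len(samples) - 1)
        if (samples[i] - samples[i - 1]) * (samples[i + 1] - samples[i]) < 0
    ]


def included_angle(p_left, p, p_right):
    """Angle in radians at p between the segments p->p_left and p->p_right."""
    ax, ay = p_left[0] - p[0], p_left[1] - p[1]
    bx, by = p_right[0] - p[0], p_right[1] - p[1]
    cos_angle = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def min_max(values):
    """Min-max normalization to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def waveform_features(samples):
    """Return (indices, features) for extrema that have a neighbor on both sides.

    ASSUMPTION: feature = normalized(angle * |amplitude|); the source only
    states that the feature grows with both the angle and the amplitude.
    """
    ext = local_extrema(samples)
    raw = []
    for left, mid, right in zip(ext, ext[1:], ext[2:]):
        angle = included_angle(
            (left, samples[left]), (mid, samples[mid]), (right, samples[right])
        )
        raw.append(angle * abs(samples[mid]))
    return ext[1:-1], (min_max(raw) if raw else [])
```

The first and last extreme points are skipped because they lack one of the two neighbors needed to form the angle.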
4. The voice communication method according to claim 1, wherein a waveform curve between two similar extreme points is taken as a band, further comprising:
obtaining the maximum points and minimum points of the waveform curve within the window, and calculating a waveform feature quantity for each extreme point, wherein the waveform feature quantity is positively correlated with the included angle between the lines connecting the extreme point to its left and right adjacent extreme points and with the amplitude of the extreme point;
normalizing the waveform feature quantities of the extreme points, and calculating the difference in waveform feature quantity between each extreme point and its left and right adjacent extreme points to obtain a left waveform feature quantity difference and a right waveform feature quantity difference for each extreme point;
for two extreme points, calculating the difference between their waveform feature quantities, the difference between their left waveform feature quantity differences, and the difference between their right waveform feature quantity differences; and, if all three differences satisfy a preset condition and the two extreme points are the nearest such pair, taking the waveform curve between the two extreme points as a band.
5. The voice communication method according to claim 1, wherein calculating the periodic instability corresponding to each band based on the difference between the band and the adjacent band comprises:
[formula image not reproduced in this text];
wherein the terms of the formula denote, respectively: the periodic instability of band j; the waveform feature quantity of the p-th extremum data point within band j; the waveform feature quantity of the p-th extremum data point in the left adjacent band of band j; the waveform feature quantity of the p-th extremum data point in the right adjacent band of band j; the DTW value between band j and its left adjacent band; and the DTW value between band j and its right adjacent band.
6. The voice communication method according to claim 1, wherein the data noise parameter of each data point is obtained based on the product of the period instability of the band and the sum of the similarity between the data point in the band and the left and right adjacent bands, and the calculation formula is:
[formula image not reproduced in this text];
wherein the terms of the formula denote, respectively: the data noise parameter of the q-th data point of band j; the periodic instability of band j; and the matching distances of the q-th data point of band j after DTW matching against the left and right adjacent bands, respectively.
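The relationship stated in claim 6 (periodic instability multiplied by the sum of the left and right DTW matching distances) can be sketched as follows. This is an illustrative reading only: the absolute-difference local cost, the averaging of distances when one data point is aligned with several partners, and all function names are assumptions not found in the source.

```python
def dtw_path(a, b):
    """Classic dynamic time warping; returns the optimal alignment as (i, j) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + step
    # Backtrack from the end of both sequences to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    return path[::-1]


def matching_distances(band, neighbor):
    """Per-point matching distance: mean |difference| over the point's DTW partners."""
    totals = [0.0] * len(band)
    counts = [0] * len(band)
    for i, j in dtw_path(band, neighbor):
        totals[i] += abs(band[i] - neighbor[j])
        counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def data_noise_parameters(band, left, right, instability):
    """Instability of the band times the summed left and right matching distances."""
    d_left = matching_distances(band, left)
    d_right = matching_distances(band, right)
    return [instability * (dl + dr) for dl, dr in zip(d_left, d_right)]
```

A data point that aligns closely with both neighboring bands receives a small parameter, and the whole band is scaled up when its periodic instability is high.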
7. The voice communication method according to claim 4, wherein the three differences satisfying the preset condition comprises:
all three differences being smaller than 0.3.
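The band search of claims 4 and 7 can be sketched as below, assuming the normalized waveform feature quantity of each extreme point has already been computed. The greedy left-to-right scan, the restriction to extreme points that have both neighbors, and the names `find_bands`, `positions`, and `features` are assumptions of this sketch; only the three differences and the 0.3 threshold come from the claims.

```python
def find_bands(positions, features, threshold=0.3):
    """Pair each extreme point with the nearest later similar extreme point.

    positions: sample indices of the extreme points, in ascending order.
    features: normalized waveform feature quantity of each extreme point.
    Returns (start, end) sample-index pairs delimiting non-overlapping bands.
    """
    n = len(positions)
    # Left/right feature differences exist only for points with both neighbors.
    left = {i: abs(features[i] - features[i - 1]) for i in range(1, n - 1)}
    right = {i: abs(features[i] - features[i + 1]) for i in range(1, n - 1)}

    def similar(i, k):
        # All three differences must be below the preset threshold.
        return (
            abs(features[i] - features[k]) < threshold
            and abs(left[i] - left[k]) < threshold
            and abs(right[i] - right[k]) < threshold
        )

    bands, i = [], 1
    while i < n - 2:
        # Candidates are scanned in order of distance, so the first hit is nearest.
        match = next((k for k in range(i + 1, n - 1) if similar(i, k)), None)
        if match is None:
            i += 1
        else:
            bands.append((positions[i], positions[match]))
            i = match  # the next band starts where this one ends
    return bands
```

For example, `find_bands([10, 20, 30, 40, 50], [0.9, 0.1, 0.9, 0.1, 0.9])` pairs the extreme points at samples 20 and 40, since they share the same feature value and the same neighbor differences.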
8. The voice communication method according to claim 1, wherein correcting the data point value according to the noise level of the data point to obtain a corrected data point value comprises:
normalizing the noise degree of the data points, and multiplying the normalized noise degree of each data point by the data point value to obtain the corrected data point value.
9. The voice communication method according to claim 1, wherein the data point value is corrected according to the noise level of the data point, and the corrected data point value is obtained, further comprising:
normalizing the noise degree of the data points;
calculating the absolute value of the difference between the data point value and the mean of its adjacent data point values, taking the product of the normalized noise degree and this absolute value as a correction value, and subtracting the correction value from the data point value to obtain the corrected data point value.
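The two correction variants of claims 8 and 9 can be sketched as follows. Min-max normalization, the use of only the immediately adjacent data points as "adjacent data points", and the function names are assumptions; the claims do not specify them.

```python
def normalize(values):
    """Min-max normalization to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def correct_by_scaling(samples, noise_degree):
    """Claim 8, read literally: corrected value = normalized noise degree * value."""
    return [w * x for w, x in zip(normalize(noise_degree), samples)]


def correct_by_neighbor_mean(samples, noise_degree):
    """Claim 9: subtract normalized noise degree * |value - mean of neighbors|.

    samples must be a list; the neighbors are the immediately preceding
    and following data points (an assumption of this sketch).
    """
    weights = normalize(noise_degree)
    corrected = []
    for i, x in enumerate(samples):
        neighbors = samples[max(i - 1, 0):i] + samples[i + 1:i + 2]
        mean = sum(neighbors) / len(neighbors) if neighbors else x
        corrected.append(x - weights[i] * abs(x - mean))
    return corrected
```

With `samples = [1.0, 5.0, 1.0]` and `noise_degree = [0.0, 2.0, 1.0]`, the second variant pulls the middle point down to the level of its neighbors, while the point with the lowest noise degree is left unchanged.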
10. A voice communication system comprising a processor and a memory, the memory storing a computer program, wherein the processor executes the computer program to implement the voice communication method of any of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410881084.8A CN118430566B (en) | 2024-07-03 | 2024-07-03 | Voice communication method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118430566A (en) | 2024-08-02 |
CN118430566B (en) | 2024-10-11 |
Family
ID=92316244
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410881084.8A (Active; CN118430566B (en)) | 2024-07-03 | 2024-07-03 | Voice communication method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118430566B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1511011A2 (en) * | 2003-08-25 | 2005-03-02 | Microsoft Corporation | Method und apparatus for robust speech recognition |
US20090248411A1 (en) * | 2008-03-28 | 2009-10-01 | Alon Konchitsky | Front-End Noise Reduction for Speech Recognition Engine |
KR20190125064A (en) * | 2018-04-27 | 2019-11-06 | (주)투미유 | Apparatus for judging the similiarity between voices and the method for judging the similiarity between voices |
CN115985273A (en) * | 2023-03-21 | 2023-04-18 | 北京卓颜翰景科技有限公司 | Notation method and system based on multi-sensor data fusion |
CN116935880A (en) * | 2023-09-19 | 2023-10-24 | 深圳市一合文化数字科技有限公司 | Integrated machine man-machine interaction system and method based on artificial intelligence |
CN117037834A (en) * | 2023-10-08 | 2023-11-10 | 广州市艾索技术有限公司 | Conference voice data intelligent acquisition method and system |
CN117059120A (en) * | 2023-09-13 | 2023-11-14 | 深圳市匠心原创科技有限公司 | Signal enhancement processing method of bone conduction earphone |
CN117373471A (en) * | 2023-12-05 | 2024-01-09 | 鸿福泰电子科技(深圳)有限公司 | Audio data optimization noise reduction method and system |
CN117711419A (en) * | 2024-02-05 | 2024-03-15 | 卓世智星(成都)科技有限公司 | Intelligent data cleaning method for data center |
CN117995178A (en) * | 2024-04-07 | 2024-05-07 | 深圳市西昊智能家具有限公司 | Intelligent office voice control method and system based on voice recognition |
Also Published As
Publication number | Publication date |
---|---|
CN118430566B (en) | 2024-10-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |