CN111354365A - Pure voice data sampling rate identification method, device and system - Google Patents
Pure voice data sampling rate identification method, device and system
- Publication number
- CN111354365A
- Authority
- CN
- China
- Prior art keywords
- frequency
- sampling rate
- voice data
- assumed
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
Abstract
Embodiments of the application disclose a method, a device and a system for identifying the sampling rate of pure voice data. The method comprises: performing a Fourier transform on the pure voice data to obtain frequency domain data; processing the frequency domain data according to received prior threshold data to obtain frequency band information; acquiring the high-frequency cut-off frequency point of the frequency band information, and calculating the assumed frequency corresponding to that point under each of several preset sampling rates; and comparing the assumed frequencies with the prior frequency, and taking the sampling rate whose assumed frequency is most similar to the prior frequency as the actual sampling rate. The method relies on the prior knowledge that human speech occupies a bandwidth of 200 Hz to 4000 Hz in the frequency domain: the assumed frequencies of the pure voice data are compared against this band, and the actual sampling rate is determined from the similarity of the comparison result. The sampling rate of pure voice data can thus be estimated automatically in advance, which avoids the degradation of voice processing that occurs when the sampling rate is unknown.
Description
Technical Field
The invention belongs to the technical field of voice processing, and particularly relates to a pure voice data sampling rate identification method, device and system.
Background
The sampling rate is the number of samples taken per second from a continuous signal to form a discrete signal. It characterizes the quality and timbre of a sound file and serves as a quality measure for sound cards and sound files.
During speech processing, pure voice data is sometimes encountered, that is, voice data that carries no sampling rate information. If such data is processed at the wrong sampling rate, the processing result often deviates considerably, the output speech quality is poor, and the user experience of the voice product suffers.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a pure voice data sampling rate identification method, a pure voice data sampling rate identification device and a pure voice data sampling rate identification system.
Embodiments of the invention provide the following technical solutions:
In a first aspect, the present invention provides a pure voice data sampling rate identification method, including:
performing a Fourier transform on the pure voice data to obtain frequency domain data;
processing the frequency domain data according to received prior threshold data to obtain frequency band information;
acquiring the high-frequency cut-off frequency point of the frequency band information, and calculating the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates; and
comparing the different assumed frequencies with the prior frequency, and determining, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
Preferably, the prior frequency ranges from 200 Hz to 4000 Hz.
Preferably, comparing the different assumed frequencies with the prior frequency and determining, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency specifically includes:
calculating the Euclidean distance between each assumed frequency and the prior frequency, and determining the sampling rate corresponding to the assumed frequency with the smallest Euclidean distance as the actual sampling rate.
Preferably, after performing the Fourier transform on the pure voice data to obtain the frequency domain data, the method further comprises:
normalizing the frequency domain data.
Preferably, the pure voice data is pure voice data containing voice segments;
the pure voice data containing voice segments is acquired as follows:
receiving voice data and analyzing the voice data;
when the voice data does not include sampling rate information, performing a Fourier transform on the voice data to obtain the energy of the voice data; and
according to a preset energy threshold, acquiring the voice data whose energy is greater than the energy threshold, to obtain the pure voice data containing voice segments.
Preferably, the method further comprises:
decoding the pure voice data according to the actual sampling rate.
In a second aspect, the present invention provides a pure voice data sampling rate recognition apparatus, including:
a conversion module, configured to perform a Fourier transform on the pure voice data to obtain frequency domain data;
an acquisition module, configured to process the frequency domain data according to received prior threshold data to obtain frequency band information, and to acquire the high-frequency cut-off frequency point of the frequency band information;
a calculation module, configured to calculate the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates; and
a processing module, configured to compare the different assumed frequencies with the prior frequency, and to determine, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
Preferably, the prior frequency ranges from 200 Hz to 4000 Hz.
Preferably, the processing module is specifically configured to:
calculate the Euclidean distance between each assumed frequency and the prior frequency, and determine the sampling rate corresponding to the assumed frequency with the smallest Euclidean distance as the actual sampling rate.
Preferably, the conversion module is further configured to normalize the frequency domain data after performing the Fourier transform on the pure voice data to obtain the frequency domain data.
Preferably, the pure voice data is pure voice data containing voice segments;
the apparatus further includes:
a receiving module, configured to receive voice data;
an analysis module, configured to analyze the voice data;
the conversion module is further configured to perform a Fourier transform on the voice data to obtain the energy of the voice data when the voice data does not include sampling rate information; and
the processing module is further configured to acquire, according to a preset energy threshold, the voice data whose energy is greater than the energy threshold, to obtain the pure voice data containing voice segments.
Preferably, the apparatus further includes:
a decoding module, configured to decode the pure voice data according to the actual sampling rate.
In a third aspect, the present invention provides a computer system comprising:
one or more processors; and
a memory associated with the one or more processors and configured to store program instructions that, when read and executed by the one or more processors, perform operations comprising:
performing a Fourier transform on the pure voice data to obtain frequency domain data;
processing the frequency domain data according to received prior threshold data to obtain frequency band information;
acquiring the high-frequency cut-off frequency point of the frequency band information, and calculating the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates; and
comparing the different assumed frequencies with the prior frequency, and determining, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
The embodiment of the invention has the following beneficial effects:
The invention relies on the prior knowledge that human speech occupies a bandwidth of 200 Hz to 4000 Hz in the frequency domain. The assumed frequencies of the pure voice data are compared against this band, and the actual sampling rate is determined from the similarity of the comparison result. The sampling rate of pure voice data can therefore be estimated automatically in advance, which avoids the large impact that an unknown sampling rate has on voice processing, mitigates the bad packet phenomenon in voice data packets transmitted over a network, adds sampling rate checking and reminder functions to voice communication engines and common audio processing software, and improves the robustness of voice processing equipment.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a pure speech data sampling rate recognition method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a pure speech data sampling rate recognition apparatus according to a second embodiment of the present application;
FIG. 3 is a schematic structural diagram of a computer system according to a third embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example One
When network switching takes place between two fixed-line telephones and a PCM voice packet received by one party has lost its sampling rate information, the voice cannot be processed accurately. Playback quality is affected and the played voice may sound abnormal.
If the sampling rate of the voice can be identified in advance, this situation can be avoided, which mitigates the bad packet phenomenon in voice data packets transmitted over the network.
Accordingly, as shown in FIG. 1, the present application provides a pure voice data sampling rate recognition method that can be applied to an audio device, where the audio device performs the following processing:
S11: Receive the voice data and preprocess it to obtain pure voice data.
In this embodiment, the pure voice data is pure voice data containing voice segments, and the step specifically includes:
1. Analyzing the voice data.
Specifically, the related information of the voice data, such as the data packet and the sampling rate, is obtained and analyzed.
2. When the voice data does not include sampling rate information, performing a Fourier transform on the voice data to obtain the energy of the voice data.
3. According to a preset energy threshold, acquiring the voice data whose energy is greater than the energy threshold, to obtain pure voice data containing voice segments.
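The energy-based selection in steps 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes NumPy is available, and the function name `select_voiced_frames`, the frame length of 256 samples and the threshold value are hypothetical choices. Frames are measured in samples rather than milliseconds because the sampling rate is not yet known at this stage.

```python
import numpy as np

def select_voiced_frames(samples, frame_len=256, energy_threshold=1.0):
    """Keep only the frames whose spectral energy exceeds energy_threshold.

    frame_len and energy_threshold are illustrative values, not taken
    from the patent.
    """
    samples = np.asarray(samples, dtype=np.float64)
    n_frames = len(samples) // frame_len
    # Drop the incomplete tail and split the signal into fixed-size frames.
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Fourier transform of each frame, then energy per frame.
    spectra = np.fft.rfft(frames, axis=1)
    energy = np.sum(np.abs(spectra) ** 2, axis=1) / frame_len
    # Concatenate the frames above the threshold back into one signal.
    return frames[energy > energy_threshold].reshape(-1)

# Usage: silence, a tone, then silence; only the tone frame is kept.
tone = np.sin(2 * np.pi * 8 * np.arange(256) / 256)
signal = np.concatenate([np.zeros(256), tone, np.zeros(256)])
voiced = select_voiced_frames(signal)
```

A fixed threshold is the simplest choice; a real system would likely have to scale it to the sample format, since 16-bit PCM values are far larger than the unit-amplitude tone used here.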
Receiving the voice data and preprocessing it to obtain pure voice data can also be realized by the following steps:
1. Analyzing the voice data.
2. When the voice data does not include sampling rate information, filtering the voice data to obtain pure voice data containing voice segments.
The filtering enhances the voice data by removing noise, silence and similar content from the effective voice data.
S12: Perform a Fourier transform on the pure voice data to obtain frequency domain data.
The Fourier transform converts pure voice data in the time domain into voice data in the frequency domain.
S13: Normalize the frequency domain data.
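Steps S12 and S13 can be sketched as below. The patent does not state which normalization is used, so dividing the magnitude spectrum by its peak is an assumption made for illustration, as is the function name `normalized_spectrum`. The sketch assumes NumPy.

```python
import numpy as np

def normalized_spectrum(samples):
    """Return the magnitude spectrum scaled so that its peak equals 1.

    Peak normalization is an assumed choice; the source only says that
    the frequency domain data is normalized.
    """
    samples = np.asarray(samples, dtype=np.float64)
    # Real-input FFT: bins 0 .. N/2 cover 0 Hz up to the Nyquist frequency.
    magnitude = np.abs(np.fft.rfft(samples))
    peak = magnitude.max()
    # Avoid dividing by zero for an all-silent input.
    return magnitude / peak if peak > 0 else magnitude

# Usage: a tone that completes 8 cycles in 256 samples peaks at bin 8.
tone = np.sin(2 * np.pi * 8 * np.arange(256) / 256)
spectrum = normalized_spectrum(tone)
```

Note that the bin index is all that is known at this point; converting a bin to a frequency in Hz requires a sampling rate, which is exactly what the later steps hypothesize.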
S14: Process the frequency domain data according to the received prior threshold data to obtain frequency band information.
The prior threshold data is obtained by processing the prior information.
Human speech occupies a bandwidth of 200 Hz to 4000 Hz in the frequency domain, so this range is introduced as the prior information.
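One plausible reading of step S14 is a threshold on the normalized spectrum: the bins whose magnitude reaches the threshold form the occupied band. The sketch below follows that reading. The patent does not say how the prior threshold data is derived, so the threshold value 0.1 and the function name `band_bins` are hypothetical. The sketch assumes NumPy.

```python
import numpy as np

def band_bins(norm_spectrum, threshold=0.1):
    """Return (lowest, highest) bin index whose normalized magnitude is
    at least `threshold`, or None if no bin qualifies.

    The threshold value is illustrative, not taken from the patent.
    """
    active = np.flatnonzero(np.asarray(norm_spectrum) >= threshold)
    if active.size == 0:
        return None
    return int(active[0]), int(active[-1])

# Usage: bins 2 to 4 carry significant energy in this toy spectrum.
low_bin, high_bin = band_bins([0.0, 0.05, 0.5, 1.0, 0.3, 0.02])
```

The highest returned bin plays the role of the high-frequency cut-off frequency point used in the next step.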
S15: Acquire the high-frequency cut-off frequency point of the frequency band information, and calculate the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates.
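For a discrete Fourier transform of length N, bin k corresponds to the frequency k × fs / N when the signal was sampled at fs Hz. Applying this relation to the cut-off bin for each candidate rate gives the assumed frequencies of step S15. The sketch below is an illustration only: the function name `assumed_frequencies` is hypothetical, and the candidate rates of 8 kHz, 16 kHz and 32 kHz are taken from the rates tested in Table 1 rather than from a stated list of preset rates.

```python
def assumed_frequencies(cutoff_bin, n_fft, candidate_rates=(8000, 16000, 32000)):
    """Map the cut-off bin to a frequency in Hz under each candidate rate.

    cutoff_bin: index of the high-frequency cut-off bin
    n_fft: number of samples that went into the Fourier transform
    """
    return {rate: cutoff_bin * rate / n_fft for rate in candidate_rates}

# Usage: a cut-off at bin 128 of a 512-point transform.
print(assumed_frequencies(128, 512))
# {8000: 2000.0, 16000: 4000.0, 32000: 8000.0}
```

In this example only the 16 kHz hypothesis places the cut-off at 4000 Hz, the upper edge of the prior speech band.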
S16: Compare the different assumed frequencies with the prior frequency, and determine, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
Specifically, this step may include:
calculating the Euclidean distance between each assumed frequency and the prior frequency, and determining the sampling rate corresponding to the assumed frequency with the smallest Euclidean distance as the actual sampling rate.
The smaller the Euclidean distance, the more similar the assumed frequency is to the prior frequency. The sampling rate whose assumed frequency has the smallest distance is therefore closest to the true value and is taken as the actual sampling rate of the pure voice data.
When calculating the Euclidean distances between the assumed frequencies and the prior frequency, several frequency points within the prior frequency range can be selected for the calculation, for example 200 Hz and 4000 Hz.
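The selection in step S16 can be sketched with the two frequency points mentioned above, 200 Hz and 4000 Hz. This is a hedged illustration: pairing the low and high band edges with those two points is one possible interpretation of the text, and the function name `pick_sampling_rate` and the candidate rates are assumptions. Only the Python standard library is used (`math.dist` requires Python 3.8 or later).

```python
import math

# Prior speech band edges from the text, in Hz.
PRIOR_BAND_HZ = (200.0, 4000.0)

def pick_sampling_rate(low_bin, high_bin, n_fft,
                       candidate_rates=(8000, 16000, 32000)):
    """Return the candidate rate whose assumed band edges lie closest
    (in Euclidean distance) to the prior band edges."""
    best_rate, best_dist = None, math.inf
    for rate in candidate_rates:
        # Assumed frequencies of the band edges under this hypothesis.
        assumed = (low_bin * rate / n_fft, high_bin * rate / n_fft)
        dist = math.dist(assumed, PRIOR_BAND_HZ)
        if dist < best_dist:
            best_rate, best_dist = rate, dist
    return best_rate

# Usage: band edges at bins 13 and 256 of a 1024-point transform.
print(pick_sampling_rate(13, 256, 1024))  # 16000
```

With a single comparison point the Euclidean distance reduces to an absolute difference; using both band edges makes the decision less dependent on the cut-off estimate alone.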
S17: Decode and play the pure voice data according to the actual sampling rate.
In this way, bad packets that have lost their sampling rate information in the telephone switching network can be decoded and played through the normal process at the identified sampling rate, which improves the quality and experience of voice calls.
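Once a sampling rate has been identified, the headerless PCM payload can be given a header so that ordinary players can decode it. The sketch below uses Python's standard `wave` module to do this. It assumes 16-bit mono PCM, which the source does not specify, and the function name `wrap_pcm_as_wav` is hypothetical.

```python
import io
import wave

def wrap_pcm_as_wav(pcm_bytes, sample_rate, sample_width=2, channels=1):
    """Wrap raw PCM bytes in a WAV container using the identified rate.

    sample_width (bytes per sample) and channels are assumed values.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
    return buffer.getvalue()

# Usage: 100 silent 16-bit samples tagged with an identified rate of 16 kHz.
wav_bytes = wrap_pcm_as_wav(b"\x00\x00" * 100, 16000)
```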
To verify the effect of the method, audio files with several sampling rates were tested. The identification results are as follows:
Table 1. Identification accuracy for audio files with different sampling rates
Test condition | Identification accuracy | Number of files judged |
---|---|---|
PCM file, 8 kHz sampling | 85% | 100 |
PCM file, 16 kHz sampling | 81% | 100 |
PCM file, 32 kHz sampling | 88% | 100 |
According to the experimental results, the identification accuracy of the scheme exceeds 80% at each tested sampling rate.
In addition, for communication engines such as IP telephony services, if a wrong sampling rate is set through operator error, the method can estimate the sampling rate in advance and remind the operator in time, reducing unnecessary risk and loss. For common audio processing software, the user can be reminded promptly of a sampling rate setting error, which saves the user time and redundant operations.
Example Two
As shown in FIG. 2, the present application further provides a pure voice data sampling rate recognition apparatus, including:
a conversion module 21, configured to perform a Fourier transform on the pure voice data to obtain frequency domain data;
an obtaining module 22, configured to process the frequency domain data according to received prior threshold data to obtain frequency band information, and to acquire the high-frequency cut-off frequency point of the frequency band information;
a calculating module 23, configured to calculate the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates; and
a processing module 24, configured to compare the different assumed frequencies with the prior frequency, and to determine, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
Preferably, the prior frequency ranges from 200 Hz to 4000 Hz.
Preferably, the processing module 24 is specifically configured to calculate the Euclidean distance between each assumed frequency and the prior frequency, and to determine the sampling rate corresponding to the assumed frequency with the smallest Euclidean distance as the actual sampling rate.
Preferably, the conversion module 21 is further configured to normalize the frequency domain data after performing the Fourier transform on the pure voice data to obtain the frequency domain data.
Preferably, the pure voice data is pure voice data containing voice segments;
the apparatus further includes:
a receiving module 25, configured to receive voice data;
an analysis module 26, configured to analyze the voice data;
the conversion module 21 is further configured to perform a Fourier transform on the voice data to obtain the energy of the voice data when the voice data does not include sampling rate information; and
the processing module 24 is further configured to acquire, according to a preset energy threshold, the voice data whose energy is greater than the energy threshold, to obtain pure voice data containing voice segments.
Preferably, the apparatus further includes:
a decoding module 27, configured to decode the pure voice data according to the actual sampling rate.
Example Three
The present application further provides a computer system comprising:
one or more processors; and
a memory associated with the one or more processors and configured to store program instructions that, when read and executed by the one or more processors, perform operations comprising:
performing a Fourier transform on the pure voice data to obtain frequency domain data;
processing the frequency domain data according to received prior threshold data to obtain frequency band information;
acquiring the high-frequency cut-off frequency point of the frequency band information, and calculating the assumed frequency corresponding to the high-frequency cut-off frequency point under each of the different preset sampling rates; and
comparing the different assumed frequencies with the prior frequency, and determining, as the actual sampling rate, the sampling rate whose assumed frequency is most similar to the prior frequency.
FIG. 3 illustrates an architecture of a computer system that may include, in particular, a processor 32, a video display adapter 34, a disk drive 36, an input/output interface 38, a network interface 310, and a memory 312. The processor 32, video display adapter 34, disk drive 36, input/output interface 38, network interface 310, and memory 312 may be communicatively coupled via a communication bus 314.
The processor 32 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided in the present application.
The memory 312 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 312 may store an operating system 316 for controlling the operation of the computer system 30 and a Basic Input Output System (BIOS) 318 for controlling low-level operations of the computer system. In addition, a web browser 320, a data storage management system 322, and the like may also be stored. When the technical solution provided by the present application is implemented by software or firmware, the relevant program code is stored in the memory 312 and invoked by the processor 32 for execution.
The input/output interface 38 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The network interface 310 is used for connecting a communication module (not shown in the figure) to realize communication between the device and other devices. The communication module can communicate in a wired mode (such as USB or a network cable) or in a wireless mode (such as a mobile network, Wi-Fi or Bluetooth).
Communication bus 314 includes a path to transfer information between the various components of the device, such as processor 32, video display adapter 34, disk drive 36, input/output interface 38, network interface 310, and memory 312.
In addition, the computer system can also obtain the information of specific receiving conditions from the virtual resource object receiving condition information database for condition judgment and the like.
It should be noted that although the above-described device only shows the processor 32, the video display adapter 34, the disk drive 36, the input/output interface 38, the network interface 310, the memory 312, the communication bus 314, etc., in a specific implementation, the device may also include other components necessary for normal operation.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a cloud server, or a network device) to execute the method according to the embodiments or some parts of the embodiments of the present application.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention. In addition, the computer system, the pure voice data sampling rate recognition apparatus and the pure voice data sampling rate recognition method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments and are not described herein again.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. A method for pure speech data sample rate recognition, the method comprising:
carrying out Fourier transform on the pure voice data to obtain frequency domain data;
processing the frequency domain data according to the received prior threshold data to obtain frequency band information;
acquiring a high-frequency cut-off frequency point of the frequency band information, and calculating a corresponding assumed frequency at the high-frequency cut-off frequency point according to different preset sampling rates;
and comparing different assumed frequencies with the prior frequencies, and determining the sampling rate corresponding to the assumed frequency when the comparison result is most similar as the actual sampling rate.
2. The method of claim 1, wherein the a priori frequencies are in a range of 200Hz to 4000 Hz.
3. The method of claim 1, wherein comparing different assumed frequencies with the prior frequencies, and determining the sampling rate corresponding to the assumed frequency with the most similar comparison result as the actual sampling rate specifically comprises:
and calculating Euclidean distances between different assumed frequencies and the prior frequency, and determining the sampling rate corresponding to the assumed frequency with the minimum Euclidean distance as the actual sampling rate.
4. The method of any one of claims 1 to 3, wherein after performing a Fourier transform on the pure speech data to obtain frequency domain data, the method further comprises:
and carrying out normalization processing on the frequency domain data.
5. The method according to any one of claims 1 to 3, wherein the pure voice data is pure voice data containing voice segments;
the method for acquiring the pure voice data containing the voice segments comprises the following steps:
receiving voice data and analyzing the voice data;
when the voice data does not include the sampling rate information, performing Fourier transform on the voice data to obtain the energy of the voice data;
and acquiring voice data corresponding to the energy larger than the energy threshold value according to a preset energy threshold value to obtain the pure voice data containing the voice segments.
6. The method according to any one of claims 1 to 3, further comprising:
and decoding the pure voice data according to the actual sampling rate.
7. A pure speech data sample rate recognition apparatus, comprising:
the conversion module is used for carrying out Fourier transform on the pure voice data to obtain frequency domain data;
the acquisition module is used for processing the frequency domain data according to the received prior threshold data to obtain frequency band information, and for acquiring a high-frequency cut-off frequency point of the frequency band information;
the calculation module is used for calculating the corresponding assumed frequency at the high-frequency cut-off frequency point according to different preset sampling rates;
and the processing module is used for comparing different assumed frequencies with the prior frequency and determining the sampling rate corresponding to the assumed frequency when the comparison result is most similar as the actual sampling rate.
8. The apparatus of claim 7, wherein the a priori frequency is in a range of 200 Hz to 4000 Hz.
9. The apparatus of claim 7, wherein the processing module is specifically configured to:
and calculating Euclidean distances between different assumed frequencies and the prior frequency, and determining the sampling rate corresponding to the assumed frequency with the minimum Euclidean distance as the actual sampling rate.
10. A computer system, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
carrying out Fourier transform on the pure voice data to obtain frequency domain data;
processing the frequency domain data according to the received prior threshold data to obtain frequency band information;
acquiring a high-frequency cut-off frequency point of the frequency band information, and calculating a corresponding assumed frequency at the high-frequency cut-off frequency point according to different preset sampling rates;
and comparing the different assumed frequencies with the prior frequency, and determining, as the actual sampling rate, the sampling rate corresponding to the assumed frequency that is most similar to the prior frequency.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010160577.4A CN111354365B (en) | 2020-03-10 | 2020-03-10 | Pure voice data sampling rate identification method, device and system |
PCT/CN2020/097008 WO2021179470A1 (en) | 2020-03-10 | 2020-06-19 | Method, device and system for recognizing sampling rate of pure voice data |
CA3175103A CA3175103A1 (en) | 2020-03-10 | 2020-06-19 | Method for sampling-rate recognition of pure voice data, apparatus, and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010160577.4A CN111354365B (en) | 2020-03-10 | 2020-03-10 | Pure voice data sampling rate identification method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111354365A (en) | 2020-06-30 |
CN111354365B (en) | 2023-10-31 |
Family
ID=71196071
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010160577.4A Active CN111354365B (en) | 2020-03-10 | 2020-03-10 | Pure voice data sampling rate identification method, device and system |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN111354365B (en) |
CA (1) | CA3175103A1 (en) |
WO (1) | WO2021179470A1 (en) |
2020
- 2020-03-10 CN CN202010160577.4A patent/CN111354365B/en active Active
- 2020-06-19 WO PCT/CN2020/097008 patent/WO2021179470A1/en active Application Filing
- 2020-06-19 CA CA3175103A patent/CA3175103A1/en active Pending
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020150302A1 (en) * | 1997-07-31 | 2002-10-17 | The Regents Of The University Of California | Apparatus and methods for image and signal processing |
CN101320560A (en) * | 2008-07-01 | 2008-12-10 | 上海大学 | Method for speech recognition system improving discrimination by using sampling velocity conversion |
CN101582264A (en) * | 2009-06-12 | 2009-11-18 | 瑞声声学科技(深圳)有限公司 | Method and voice collecting system for speech enhancement |
JP2012002858A (en) * | 2010-06-14 | 2012-01-05 | Pioneer Electronic Corp | Time scaling method, pitch shift method, audio data processing apparatus and program |
US20130117031A1 (en) * | 2010-07-13 | 2013-05-09 | Actions Semiconductor Co., Ltd. | Audio data encoding method and device |
CN109509478A (en) * | 2013-04-05 | 2019-03-22 | 杜比国际公司 | Apparatus for processing audio |
CN103745726A (en) * | 2013-11-07 | 2014-04-23 | 中国电子科技集团公司第四十一研究所 | Self-adaptive variable-sampling rate audio frequency sampling method |
CN105513590A (en) * | 2015-11-23 | 2016-04-20 | 百度在线网络技术(北京)有限公司 | Voice recognition method and device |
CN109328383A (en) * | 2016-06-27 | 2019-02-12 | 高通股份有限公司 | Use the audio decoder of intermediate samples rate |
CN107833581A (en) * | 2017-10-20 | 2018-03-23 | 广州酷狗计算机科技有限公司 | A kind of method, apparatus and readable storage medium storing program for executing of the fundamental frequency for extracting sound |
Non-Patent Citations (1)
Title |
---|
Yang Jiajun (杨佳俊): "No-reference objective evaluation of network audio quality" (网络音频质量无参考客观评估) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113447713A (en) * | 2021-06-25 | 2021-09-28 | 南京丰道电力科技有限公司 | Fourier-based fast high-precision power system frequency measurement method and device |
CN113447713B (en) * | 2021-06-25 | 2023-03-07 | 南京丰道电力科技有限公司 | Fourier-based fast high-precision power system frequency measurement method and device |
Also Published As
Publication number | Publication date |
---|---|
WO2021179470A1 (en) | 2021-09-16 |
CN111354365B (en) | 2023-10-31 |
CA3175103A1 (en) | 2021-09-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10339956B2 (en) | Method and apparatus for detecting audio signal according to frequency domain energy | |
WO2019227580A1 (en) | Voice recognition method, apparatus, computer device, and storage medium | |
CN109410986B (en) | Emotion recognition method and device and storage medium | |
CN107580155B (en) | Network telephone quality determination method, network telephone quality determination device, computer equipment and storage medium | |
CN111916109B (en) | Audio classification method and device based on characteristics and computing equipment | |
CN111739542A (en) | Method, device and equipment for detecting characteristic sound | |
CN111341333B (en) | Noise detection method, noise detection device, medium, and electronic apparatus | |
CN111354365B (en) | Pure voice data sampling rate identification method, device and system | |
CN115394318A (en) | Audio detection method and device | |
WO2020186695A1 (en) | Voice information batch processing method and apparatus, computer device, and storage medium | |
CN112087726A (en) | Method and system for identifying polyphonic ringtone, electronic equipment and storage medium | |
CN111696529A (en) | Audio processing method, audio processing device and readable storage medium | |
CN117037840A (en) | Abnormal sound source identification method, device, equipment and readable storage medium | |
CN105791602B (en) | Sound quality testing method and system | |
CN111640450A (en) | Multi-person audio processing method, device, equipment and readable storage medium | |
CN110600056A (en) | Voice quality inspection method and device | |
CN114155845A (en) | Service determination method and device, electronic equipment and storage medium | |
CN115273880A (en) | Voice noise reduction method, model training method, device, equipment, medium and product | |
CN113782036A (en) | Audio quality evaluation method and device, electronic equipment and storage medium | |
CN111028860B (en) | Audio data processing method and device, computer equipment and storage medium | |
WO2021143095A1 (en) | Dialing test method and apparatus, and computer device and storage medium | |
CN113851114B (en) | Method and device for determining fundamental frequency of voice signal | |
CN116567145A (en) | Customer service call operation quality inspection method and device, electronic equipment and storage medium | |
CN113316074A (en) | Howling detection method and device and electronic equipment | |
CN116168681A (en) | TTS audio anomaly detection method and device, computer equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |