CN111048103A - Method for processing plosive of audio data of player - Google Patents


Info

Publication number
CN111048103A
Authority
CN
China
Prior art keywords
audio data
pcm audio
data segment
player
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911157687.9A
Other languages
Chinese (zh)
Inventor
赵俊淞
肖戈
张万忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Bowan Technology Co Ltd
Original Assignee
Hunan Bowan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Bowan Technology Co Ltd filed Critical Hunan Bowan Technology Co Ltd
Priority to CN201911157687.9A priority Critical patent/CN111048103A/en
Publication of CN111048103A publication Critical patent/CN111048103A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/167Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/24Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003Changing voice quality, e.g. pitch or formants
    • G10L21/007Changing voice quality, e.g. pitch or formants characterised by the process used

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

The invention relates to the field of audio data processing, and in particular to a method for handling pops in a player's audio data. When playback of PCM audio data is interrupted, the player goes on to play an interruption buffer PCM audio data segment, so that the PCM waveform does not jump from a high value to zero at the interruption but transitions smoothly from the break point to zero, which avoids a pop on interruption. Before playback of PCM audio data starts, a resume buffer PCM audio data segment is inserted into the player, so that the PCM waveform does not jump from zero to a high value when playback starts but rises smoothly from zero to the playback start point, which avoids a pop at the start of playback.

Description

Method for processing plosive of audio data of player
Technical Field
The invention relates to the technical field of audio data processing, in particular to a method for processing audio data popping of a player.
Background
Pulse code modulation (PCM) converts a continuous analog signal into a discrete digital signal for transmission over a channel. It consists of sampling the analog signal, quantizing the amplitude of each sample, and coding the result. Audio data is usually stored in PCM digital format, and the sound card produces sound when PCM data is sent to it. If the PCM waveform sent to the sound card changes abruptly, the loudspeaker emits a pop. This typically happens when the player pauses or starts playing, because such operations abruptly change the data sent to the sound card.
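The three PCM stages named above (sampling, quantizing, coding) can be illustrated with a minimal sketch. The function name `sample_and_quantize`, the 1 kHz test tone, the 8000 Hz rate, and the signed 16-bit code range are all assumptions chosen for illustration; the patent does not specify a bit depth.

```python
import math

def sample_and_quantize(freq_hz, fs, duration_s, bits=16):
    """Sample a sine tone at rate fs and quantize each sample to a signed integer code."""
    full_scale = 2 ** (bits - 1) - 1            # 32767 for the assumed 16-bit format
    n_samples = round(fs * duration_s)
    codes = []
    for n in range(n_samples):
        amplitude = math.sin(2 * math.pi * freq_hz * n / fs)   # sampling the analog signal
        codes.append(round(amplitude * full_scale))            # quantizing and coding
    return codes

# One period of a 1 kHz tone sampled at 8000 Hz gives 8 code values.
pcm = sample_and_quantize(freq_hz=1000, fs=8000, duration_s=0.001)
print(pcm)
```

If playback stopped right after the third value, the signal sent to the sound card would drop from full scale to zero in a single sample, which is the abrupt change the background section describes.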
Disclosure of Invention
In view of the above, the present invention provides a method for processing pop sound of audio data of a player, so that the PCM data waveform sent to the player during operations such as pause or play of the player does not suddenly change, thereby avoiding the pop sound phenomenon during operations such as pause or play of the player.
In order to achieve the above object, a method for processing pop sound of audio data of a player of the present invention comprises the steps of:
(1) decoding the PCM audio data segment in the player to obtain the sampling rate fs of the PCM audio data segment and a corresponding coded data string;
(2a) when the player generates interruption in playing the PCM audio data segment, recording an interruption coding value X corresponding to the PCM audio data of the last moment of interruption, and generating an interruption buffering PCM audio data segment with duration t, wherein the coding value of the interruption buffering PCM audio data segment gradually decreases from the interruption coding value X to 0 within the duration t;
(3a) causing said player to play said interrupt-buffered PCM audio data segment at an interrupt generation;
(2b) when the player starts to play the PCM audio data, recording a continuous playing coded value Y corresponding to the first PCM audio data when the player starts to play, and generating a continuous playing buffered PCM audio data segment with the same duration t, wherein the corresponding coded value of the continuous playing buffered PCM audio data segment gradually rises from 0 to the continuous playing coded value Y within the duration t;
(3b) the player is caused to insert the resume buffered PCM audio data segment before the PCM audio data segment to be played at the beginning of playback.
Further in the present invention, the duration t has a value of 50ms.
Further to the present invention, the step of generating the interrupt-buffered PCM audio data segment comprises:
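taking the interruption code value X as the initial value, generating a decreasing arithmetic progression A with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image BDA0002285244670000021, not reproduced in this text; the worked example in the Detailed Description, where X = 10 and N = 10 give steps of 1, is consistent with a common difference of X/N]
and taking the arithmetic progression A as PCM code values to obtain the interruption buffer PCM audio data segment.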
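A minimal sketch of how progression A might be generated follows. The function name `make_interrupt_ramp` is hypothetical, and the step size X/N is an assumption inferred from the worked example later in the description, since the equation image for the common difference is not available in this text.

```python
def make_interrupt_ramp(x, fs, t):
    """Return N = fs * t code values stepping linearly from x down to 0.

    Assumption: the common difference has magnitude x / N, so the ramp starts
    one step away from x (x itself was the last sample played) and ends at 0.
    """
    n = round(fs * t)
    if n <= 0:
        return []
    # Values are rounded because PCM code values are integers.
    return [round(x * (n - k) / n) for k in range(1, n + 1)]

print(make_interrupt_ramp(10, 200, 0.05))   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

The same function handles a negative X, in which case the values rise toward 0 instead of falling.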
Further in accordance with the present invention, the step of generating the resume buffered PCM audio data segment comprises:
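taking 0 as the initial value and Y as the end value, generating an increasing arithmetic progression B with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image BDA0002285244670000022, not reproduced in this text; the worked example in the Detailed Description, where Y = 10 and N = 10 give steps of 1, is consistent with a common difference of Y/N]
and taking the arithmetic progression B as PCM code values to obtain the resume buffer PCM audio data segment.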
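Progression B and its insertion ahead of the audio can be sketched in the same way. The helper name `prepend_resume_ramp` is hypothetical, and the step size Y/N is again an assumption taken from the worked example rather than from the missing equation image.

```python
def prepend_resume_ramp(segment, fs, t):
    """Return the segment with an N-sample ramp from 0 toward Y placed in front.

    Y is the first code value of the segment. Assumption: the ramp holds the
    values 0, Y/N, 2Y/N, ..., (N-1)Y/N, so the next value in line is Y itself.
    """
    n = round(fs * t)
    if not segment or n <= 0:
        return list(segment)
    y = segment[0]
    ramp = [round(y * k / n) for k in range(n)]
    return ramp + list(segment)

print(prepend_resume_ramp([10, 12, 8], 200, 0.05))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 8]
```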
Further to the present invention, the playing interruption condition of the player includes but is not limited to pause, data interruption and end of playing; the start play situation of the player includes, but is not limited to, continue play and start play.
Further, the encoded data string, the interrupted encoded value X, the interrupted buffered PCM audio data segment, the continued playing encoded value Y, the continued playing buffered PCM audio data segment, and the present PCM encoded value are all binary codes or binary encoded strings.
The invention has the beneficial effects that: the method for processing the popping sound of the audio data of the player ensures that the waveform of the PCM data can not suddenly change from a higher value to a zero point but smoothly transits from the middle breakpoint to the zero point when the player continuously plays a segment of interruption buffer PCM audio data when the playing of the PCM audio data is interrupted, thereby avoiding the phenomenon of popping sound when the playing is interrupted; by inserting a segment of continuous play buffered PCM audio data into the player before the start of playing the PCM audio data, the PCM data waveform does not suddenly change from the zero point to a higher point when the playing starts, but smoothly changes from the zero point to the playing start point, thereby avoiding the phenomenon of popping sound when the playing starts.
Drawings
The invention is further described below with reference to the following figures and examples:
FIG. 1 is a flow chart of the operation of the present invention;
FIG. 2 is a graph of a segment of analog audio in an embodiment of the present invention;
FIG. 3 is a PCM audio curve and an interrupted PCM buffer segment curve according to an embodiment of the present invention.
Detailed Description
As shown in fig. 1: the method for processing the crackle of the audio data of the player comprises the following steps:
(1) decoding the PCM audio data segment in the player to obtain the sampling rate fs of the PCM audio data segment and a corresponding coded data string;
(2a) when the player generates interruption in playing the PCM audio data segment, recording an interruption code value X corresponding to the PCM audio data of the last moment of interruption generation, and generating an interruption buffer PCM audio data segment with duration t, wherein the corresponding code value of the interruption buffer PCM audio data segment is gradually reduced to 0 from the interruption code value X within the duration t;
(3a) causing the player to play the interruption buffer PCM audio data segment when the interruption occurs;
(2b) when the player starts to play the PCM audio data, recording a continuous playing coded value Y corresponding to the first PCM audio data when the player starts to play, and generating a continuous playing buffered PCM audio data segment with the same duration as t, wherein the corresponding coded value of the continuous playing buffered PCM audio data segment is gradually increased from 0 to the continuous playing coded value Y within the duration t;
(3b) the player is caused to insert a resume buffered segment of PCM audio data prior to the segment of PCM audio data to be played at the start of playback.
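The steps above can be sketched as an event-driven player. The class `BufferedPlayer`, its `sink` list standing in for the sound card, and the method names are all hypothetical; this is one possible arrangement of steps (2a) to (3b) under the assumption of linear steps of X/N and Y/N, not the patent's own implementation.

```python
class BufferedPlayer:
    """Hypothetical player that sends code values to a sound-card sink (a list here)."""

    def __init__(self, fs, t=0.05):
        self.n = max(1, round(fs * t))   # samples per buffer segment, N = fs * t
        self.sink = []                   # stands in for the sound card input
        self.last_value = 0              # last code value sent to the sink

    def _send(self, values):
        self.sink.extend(values)
        if values:
            self.last_value = values[-1]

    def start(self, segment):
        """Steps (2b) and (3b): ramp from 0 toward Y, then play the segment."""
        if not segment:
            return
        y = segment[0]
        self._send([round(y * k / self.n) for k in range(self.n)])
        self._send(list(segment))

    def interrupt(self):
        """Steps (2a) and (3a): ramp from the last played value X down to 0."""
        x = self.last_value
        self._send([round(x * (self.n - k) / self.n) for k in range(1, self.n + 1)])


player = BufferedPlayer(fs=200)
player.start([10, 12, 8, 10])
player.interrupt()
print(player.sink)
```

Tracking `last_value` in the sender means the interruption ramp always begins from whatever was actually played last, whether the interruption is a pause, a data dropout, or the end of playback.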
In this embodiment, the duration t is 50 ms. The effect of a sound on the human ear does not vanish the instant the sound stops but persists for a short time. The interruption and resume buffer PCM audio data segments are therefore kept to 50 ms, short enough that the listener perceives neither an obvious lingering of sound when playback is interrupted nor an obvious delay when playback starts.
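To give a sense of scale, the number of samples N = fs × t in a 50 ms buffer segment can be computed for a few sampling rates. The helper `samples_in_buffer` and the choice of rates other than 44100 Hz are assumptions for illustration.

```python
def samples_in_buffer(fs_hz, t_ms=50):
    """Number of samples N = fs * t, using integer arithmetic to avoid rounding error."""
    return fs_hz * t_ms // 1000

for fs in (8000, 44100, 48000):
    print(fs, samples_in_buffer(fs))
# 8000 400
# 44100 2205
# 48000 2400
```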
In this embodiment, the step of generating the interrupt buffer PCM audio data segment includes:
taking the interruption code value X as the initial value, generating a decreasing arithmetic progression A with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image BDA0002285244670000041, not reproduced in this text; the worked example later in this section, where X = 10 and N = 10 give steps of 1, is consistent with a common difference of X/N]
The generated arithmetic progression A is used as the code values of the interruption buffer PCM audio data segment and converted into machine-readable binary code. After an interruption, the output therefore does not jump straight to zero but changes linearly: the amplitude of the sound waveform falls linearly to 0 within 50 ms, which avoids the pop.
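The benefit of the linear change can be quantified by comparing the largest sample-to-sample jump with and without the buffer segment. The code value 20000 and N = 2205 (44100 Hz, 50 ms) are assumed figures, and the step size X/N is the assumption used throughout these sketches.

```python
def max_step(samples):
    """Largest absolute difference between consecutive samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

x, n = 20000, 2205   # assumed interruption code value and buffer length

abrupt = [x, 0]                                                    # cut straight to zero
ramped = [x] + [round(x * (n - k) / n) for k in range(1, n + 1)]   # linear ramp to zero

print(max_step(abrupt))   # 20000
print(max_step(ramped))   # roughly x / n, i.e. about 9 or 10 after rounding
```

The single jump of 20000 is replaced by 2205 steps of at most 10 each, which is the sense in which the waveform no longer changes suddenly.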
In this embodiment, the generating step of the continuous play buffered PCM audio data segment includes:
taking 0 as the initial value and Y as the end value, generating an increasing arithmetic progression B with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image BDA0002285244670000042, not reproduced in this text; the worked example later in this section, where Y = 10 and N = 10 give steps of 1, is consistent with a common difference of Y/N]
The generated arithmetic progression B is used as the code values of the resume buffer PCM audio data segment and converted into machine-readable binary code. Before playback starts, the output therefore does not jump from zero to a specific value (the starting value of the PCM audio data segment) but changes linearly: the amplitude of the sound waveform rises linearly from 0 to that value within 50 ms, which avoids the pop.
In this embodiment, the interruption situation of the player includes, but is not limited to, pause, data interruption, and end of play; the player's start play situation includes, but is not limited to, continue play and play start.
In this embodiment, the encoded data string, the interrupted encoded value X, the interrupted buffered PCM audio data segment, the resume encoded value Y, the resume buffered PCM audio data segment, and the occurring PCM encoded value are all binary codes or binary encoded strings.
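As an illustration of converting code values into binary, the sketch below packs a ramp into signed 16-bit little-endian bytes. The 16-bit little-endian layout and the helper name `to_pcm16_bytes` are assumptions; the patent only states that the values are binary codes.

```python
import struct

def to_pcm16_bytes(values):
    """Pack integer code values as signed 16-bit little-endian PCM bytes."""
    # Clip to the representable range so struct.pack never raises on overflow.
    clipped = [max(-32768, min(32767, v)) for v in values]
    return struct.pack("<%dh" % len(clipped), *clipped)

ramp = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
data = to_pcm16_bytes(ramp)
print(data[:4].hex())   # 09000800
```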
The specific implementation mode is as follows:
Take as an example a segment of audio data played by a computer: when PCM audio data is input to the sound card, the corresponding loudspeaker produces sound. As shown in FIG. 2, analog audio data is generally continuous, whereas PCM audio data is discrete after sampling. Taking the MP3 format as an example, the sampling rate is 44100 Hz, that is, the audio data is sampled 44100 times per second, or once every 22.7 μs.
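The 22.7 μs figure is simply the reciprocal of the sampling rate, which a short check confirms. The variable names are illustrative only.

```python
fs = 44100                       # sampling rate in Hz
period_us = 1_000_000 / fs       # time between samples in microseconds
print(round(period_us, 1))       # 22.7
```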
For convenience of illustration, in the PCM audio curve shown in FIG. 3 the abscissa is the time axis and the ordinate is the PCM code value (decimal). Assume a sampling rate fs of 200 Hz, i.e. one sample every 5 ms. Referring to FIG. 2 and FIG. 3, the break point of the PCM audio data is at (150 ms, 10), so the generated interruption buffer PCM data segment consists of the points (155 ms, 9), (160 ms, 8), (165 ms, 7), (170 ms, 6), (175 ms, 5), (180 ms, 4), (185 ms, 3), (190 ms, 2), (195 ms, 1), (200 ms, 0), and the value of the audio signal decreases linearly to zero. When the interruption occurs and the original PCM data segment ends, the computer goes on to input this interruption buffer PCM data segment to the sound card, which avoids the pop. For ease of display, the values in this embodiment are the real values multiplied by a specific factor.
Similarly, if the first sample point of the PCM audio is at (150 ms, 10), the computer inputs the resume buffer PCM audio data segment to the sound card before the PCM audio is played, with the points (100 ms, 0), (105 ms, 1), (110 ms, 2), (115 ms, 3), (120 ms, 4), (125 ms, 5), (130 ms, 6), (135 ms, 7), (140 ms, 8), (145 ms, 9). The sound card plays this segment before the original PCM data segment, and the code value rises linearly from 0 to 10, which avoids the pop.
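The coordinates in this worked example can be reproduced with a short sketch using fs = 200 Hz and t = 50 ms, so that N = 10 and samples are 5 ms apart. The function names `interrupt_points` and `resume_points` are hypothetical, and the steps of X/N and Y/N are inferred from the listed points.

```python
FS_HZ = 200
T_MS = 50
N = FS_HZ * T_MS // 1000       # 10 samples per buffer segment
STEP_MS = 1000 // FS_HZ        # 5 ms between samples

def interrupt_points(break_ms, x):
    """(time_ms, value) points that follow a break point at (break_ms, x)."""
    return [(break_ms + k * STEP_MS, round(x * (N - k) / N)) for k in range(1, N + 1)]

def resume_points(first_ms, y):
    """(time_ms, value) points that precede a first sample at (first_ms, y)."""
    return [(first_ms - (N - k) * STEP_MS, round(y * k / N)) for k in range(N)]

print(interrupt_points(150, 10))   # (155, 9), (160, 8), ..., (200, 0)
print(resume_points(150, 10))      # (100, 0), (105, 1), ..., (145, 9)
```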
Note that the ordinate values in this embodiment are shown in decimal for ease of display, whereas binary values are used in the computer or sound card. The sampling interval on the abscissa is likewise enlarged: with the actual interval of 22.7 μs on the abscissa, a 50 ms interruption buffer or resume buffer PCM data segment would be inconvenient to show in the figure.
The invention ensures that the player continuously plays a section of interruption buffer PCM audio data section when playing PCM audio data is interrupted, so that the PCM data waveform does not suddenly change from a higher value to a zero point when playing is interrupted, but smoothly transits from the break point to the zero point, and the phenomenon of popping sound when interrupted is avoided; by inserting a segment of continuous play buffered PCM audio data into the player before the start of playing the PCM audio data, the PCM data waveform does not suddenly change from the zero point to a higher point when the playing starts, but smoothly changes from the zero point to the playing start point, thereby avoiding the phenomenon of popping sound when the playing starts.
Finally, the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all of them should be covered in the claims of the present invention.

Claims (6)

1. A method for processing pops in audio data of a player, wherein the method comprises the following steps:
decoding the PCM audio data segment in the player to obtain the sampling rate fs of the PCM audio data segment and a corresponding coded data string;
when the player generates interruption in playing the PCM audio data segment, recording an interruption coding value X corresponding to the PCM audio data of the last moment of interruption, and generating an interruption buffering PCM audio data segment with duration t, wherein the coding value of the interruption buffering PCM audio data segment gradually decreases from the interruption coding value X to 0 within the duration t;
causing said player to play said interrupt-buffered PCM audio data segment at an interrupt generation;
when the player starts to play the PCM audio data, recording a continuous playing coded value Y corresponding to the first PCM audio data when the player starts to play, and generating a continuous playing buffered PCM audio data segment with the same duration t, wherein the corresponding coded value of the continuous playing buffered PCM audio data segment gradually rises from 0 to the continuous playing coded value Y within the duration t;
the player is caused to insert the resume buffered PCM audio data segment before the PCM audio data segment to be played at the beginning of playback.
2. The method of claim 1, wherein the method further comprises: the duration t has a value of 50ms.
3. The method of claim 1, wherein the method further comprises: the step of generating the interrupt-buffered PCM audio data segment comprises:
taking the interruption code value X as the initial value, generating a decreasing arithmetic progression A with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image FDA0002285244660000011, not reproduced in this text; the worked example in the description is consistent with a common difference of X/N]
and taking the arithmetic progression A as PCM code values to obtain the interruption buffer PCM audio data segment.
4. The method of claim 1, wherein the method further comprises: the generating of the resume buffered PCM audio data segment comprises:
taking 0 as the initial value and Y as the end value, generating an increasing arithmetic progression B with N terms, where N = fs × t (the sampling rate fs multiplied by the duration t) and the common difference is given by
[Equation image FDA0002285244660000012, not reproduced in this text; the worked example in the description is consistent with a common difference of Y/N]
and taking the arithmetic progression B as PCM code values to obtain the resume buffer PCM audio data segment.
5. The method of claim 1, wherein the method further comprises: the player's playback interruption situations include, but are not limited to, pause, data interruption, and end of playback; the start play situation of the player includes, but is not limited to, continue play and start play.
6. A method of processing pop audio data for a player according to any of claims 1-5, wherein: the encoded data string, the interrupted encoded value X, the interrupted buffered PCM audio data segment, the resume encoded value Y, the resume buffered PCM audio data segment, and the occurring PCM encoded value are all binary codes or binary encoded strings.
CN201911157687.9A 2019-11-22 2019-11-22 Method for processing plosive of audio data of player Pending CN111048103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911157687.9A CN111048103A (en) 2019-11-22 2019-11-22 Method for processing plosive of audio data of player

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911157687.9A CN111048103A (en) 2019-11-22 2019-11-22 Method for processing plosive of audio data of player

Publications (1)

Publication Number Publication Date
CN111048103A true CN111048103A (en) 2020-04-21

Family

ID=70233150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911157687.9A Pending CN111048103A (en) 2019-11-22 2019-11-22 Method for processing plosive of audio data of player

Country Status (1)

Country Link
CN (1) CN111048103A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116092507A (en) * 2023-03-22 2023-05-09 广州感音科技有限公司 Audio mixing method, equipment and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070229717A1 (en) * 2006-03-29 2007-10-04 Tatung Company Circuit for eliminating abnormal sound
CN101409094A (en) * 2008-10-30 2009-04-15 炬力集成电路设计有限公司 Method for eliminating sound effect switching noise and audio play equipment
US20130294615A1 (en) * 2012-05-03 2013-11-07 Hyundai Mobis Co., Ltd. Pop-noise removing method
CN104240716A (en) * 2014-06-11 2014-12-24 杭州联汇数字科技有限公司 Audio data quality optimization method
CN104683920A (en) * 2015-01-30 2015-06-03 惠州市德赛西威汽车电子有限公司 Method and device for realizing smooth rise and fall of sound volume
CN105828255A (en) * 2016-05-12 2016-08-03 深圳市金立通信设备有限公司 Method for optimizing pops and clicks of audio device and terminal
CN106170113A (en) * 2016-09-29 2016-11-30 北京奇艺世纪科技有限公司 A kind of method and apparatus eliminating noise and electronic equipment
CN106228993A (en) * 2016-09-29 2016-12-14 北京奇艺世纪科技有限公司 A kind of method and apparatus eliminating noise and electronic equipment
CN106775551A (en) * 2016-10-31 2017-05-31 乐视控股(北京)有限公司 Audio frequency playing method and system
CN108922551A (en) * 2017-05-16 2018-11-30 博通集成电路(上海)股份有限公司 For compensating the circuit and method of lost frames

Similar Documents

Publication Publication Date Title
US9978395B2 (en) Method and system for mitigating delay in receiving audio stream during production of sound from audio stream
US20190079586A1 (en) Systems and methods for enhanced haptic effects
US20050222843A1 (en) System for permanent alignment of text utterances to their associated audio utterances
US5943648A (en) Speech signal distribution system providing supplemental parameter associated data
TW401671B (en) Silence compression for recorded voice messages
US20140372117A1 (en) Transcription support device, method, and computer program product
US20130144626A1 (en) Rap music generation
US20180166073A1 (en) Speech Recognition Without Interrupting The Playback Audio
JP2013025299A (en) Transcription support system and transcription support method
CN111048103A (en) Method for processing plosive of audio data of player
CN111105776A (en) Audio playing device and playing method thereof
US11594113B2 (en) Decoding device, decoding method, and program
JP2007041302A (en) Voice reproducing apparatus and voice reproduction processing program
US7092884B2 (en) Method of nonvisual enrollment for speech recognition
JP3620787B2 (en) Audio data encoding method
KR100330779B1 (en) Play method for variable speed of digital vocal
WO2006030860A1 (en) Electronic device, digital signal generating method, digital signal recording medium, signal processing device
US12026199B1 (en) Generating description pages for media entities
JP2007256815A (en) Voice-reproducing apparatus, voice-reproducing method, and voice reproduction program
JPH0713596A (en) Speech speed converting method
JP6387044B2 (en) Text processing apparatus, text processing method, and text processing program
CN117708492A (en) Vibration control method, vibration control device, electronic apparatus, and computer-readable storage medium
US9264818B2 (en) Digital signal processor with search function
US20120226372A1 (en) Audio-signal correction apparatus, audio-signal correction method and audio-signal correction program
JPH06337696A (en) Device and method for controlling speed conversion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200421