CN111787268B - Audio signal processing method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN111787268B (CN202010622129.1A)
- Authority
- CN
- China
- Prior art keywords
- audio signal
- audio
- frame
- amplitude
- energy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
- H04N7/155—Conference systems involving storage of or access to video conference sessions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The invention provides a method and a device for processing audio signals, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring an audio signal sent by an opposite terminal communication node received in an anti-jitter buffer of terminal equipment; and adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal. Compared with the prior art, when the playing speed of the audio frame in the audio signal is adjusted, the energy and/or the amplitude of the audio frame in the audio signal are referred to, the audio frame with lower energy or lower amplitude in the audio signal is adjusted greatly, and the audio frame with higher energy or larger amplitude in the audio signal is adjusted slightly, so that the stability of the audio signal during playing can be improved, the perception of a user on the speed change of the audio signal is reduced, and the experience of the user is improved.
Description
Technical Field
The present invention relates to the field of voice communication, and in particular, to a method and an apparatus for processing an audio signal, an electronic device, and a storage medium.
Background
In a video conference, when the network is unstable, audio signals are briefly lost and accumulated. In this case, the anti-jitter buffer provided in the terminal device is usually increased to make the sound playing smooth. But too large an anti-jitter buffer may result in too much sound delay and thus degrade the user experience.
In the prior art, in order to avoid the excessive sound delay caused by the anti-jitter buffer, there is a method for automatically adjusting the playing speed. When the network is abnormal, the playing speed of the audio signal is slowed down, and when the network is recovered, the previous audio signal is obtained again and the playing speed of the audio signal is accelerated.
However, although the conventional method for adjusting the playing speed of the audio signal avoids excessive sound delay when the network is abnormal, the audio signal may be played alternately fast and slow. This makes playback unstable and further degrades the user experience.
Disclosure of Invention
The invention provides a method and a device for processing an audio signal, electronic equipment and a storage medium, which aim to solve the problem of poor stability of the audio signal during playing in the prior art.
A first aspect of the present invention provides a method of processing an audio signal, the method comprising:
acquiring an audio signal sent by an opposite terminal communication node received in an anti-jitter buffer of terminal equipment;
and adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal.
In an optional embodiment, the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal includes:
determining a speed variation corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal;
and adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
In an optional embodiment, the determining, according to the energy and/or amplitude of an audio frame in the audio signal, a speed variation corresponding to each audio frame in the audio signal includes:
determining an average amplitude of audio in the audio signal from the amplitude of each audio frame in the audio signal;
and determining the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
In an optional embodiment, the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal includes:
determining a mute frame in the audio signal according to the energy and/or amplitude of an audio frame in the audio signal;
and adjusting the playing speed of the mute frame in the audio signal.
In an optional embodiment, the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal includes:
and if the network abnormality of the terminal equipment is detected, reducing the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
In an optional embodiment, the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal includes:
and if the terminal equipment is detected to recover from the network abnormality, accelerating the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
A second aspect of the present invention provides an apparatus for processing an audio signal, the apparatus comprising:
an acquisition module, configured to acquire an audio signal sent by an opposite terminal communication node and received in an anti-jitter buffer of terminal equipment;
and the adjusting module is used for adjusting the playing speed of the audio frame in the audio signal according to the energy and/or the amplitude of the audio frame in the audio signal.
In an optional implementation manner, the adjusting module is specifically configured to determine, according to energy and/or amplitude of an audio frame in the audio signal, a speed variation corresponding to each audio frame in the audio signal; and adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
In an optional embodiment, the adjusting module is specifically configured to determine an average amplitude of audio in the audio signal according to an amplitude of each audio frame in the audio signal; and determining the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
In an optional embodiment, the adjusting module is specifically configured to determine a mute frame in the audio signal according to an energy and/or an amplitude of an audio frame in the audio signal; and adjusting the playing speed of the mute frame in the audio signal.
In an optional implementation manner, the adjusting module is specifically configured to, if a network anomaly of the terminal device is detected, reduce a playing speed of an audio frame in the audio signal according to energy and/or amplitude of the audio frame in the audio signal.
In an optional implementation manner, the adjusting module is specifically configured to, if it is detected that the terminal device recovers from a network abnormality, increase a playing speed of an audio frame in the audio signal according to energy and/or amplitude of the audio frame in the audio signal.
A third aspect of the present invention provides an electronic device, including: a memory, a processor and a computer program, the computer program being stored in the memory, and the processor running the computer program to perform the audio signal processing method of the first aspect and its various optional embodiments.
A fourth aspect of the present invention provides a storage medium having stored thereon a computer program for executing the method of processing the audio signal according to the first aspect and its various alternatives.
According to the audio signal processing method and device, the electronic device and the storage medium, the audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer of the terminal device is obtained, and the playing speed of the audio frames in the audio signal is then adjusted according to the energy and/or amplitude of those audio frames. Compared with the prior art, the adjustment of the playing speed takes the energy and/or amplitude of each audio frame into account: audio frames with lower energy or lower amplitude are adjusted by a larger amount, and audio frames with higher energy or higher amplitude are adjusted by a smaller amount. This improves the stability of the audio signal during playing, reduces the user's perception of the speed change, and improves the user experience.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the following briefly introduces the drawings needed to be used in the description of the embodiments or the prior art, and obviously, the drawings in the following description are some embodiments of the present invention, and those skilled in the art can obtain other drawings according to the drawings without inventive labor.
Fig. 1 is a schematic view of an application scenario of a method for processing an audio signal according to an embodiment of the present application;
Fig. 2 is a flowchart illustrating a method for processing an audio signal according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of another audio signal processing method according to an embodiment of the present application;
Fig. 4 is a flowchart illustrating a further audio signal processing method according to an embodiment of the present application;
Fig. 5 is a flowchart illustrating a further audio signal processing method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an apparatus for processing an audio signal according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a video conference, when the network is unstable, audio signals are briefly lost and accumulated. In this case, the anti-jitter buffer provided in the terminal device is usually increased to make the sound playing smooth. But too large an anti-jitter buffer may result in too much sound delay and thus degrade the user experience. In the prior art, in order to avoid the excessive sound delay caused by the anti-jitter buffer, there is a method for automatically adjusting the playing speed. When the network is abnormal, the playing speed of the audio signal is slowed down, and when the network is recovered, the previous audio signal is obtained again and the playing speed of the audio signal is accelerated.
However, although the conventional method for adjusting the playing speed of the audio signal avoids excessive sound delay when the network is abnormal, the audio signal may be played alternately fast and slow. This makes playback unstable and degrades the user experience.
In order to solve the above problem, embodiments of the present application provide a method and an apparatus for processing an audio signal, an electronic device, and a storage medium, so as to solve the problem of poor stability when the audio signal is played. The invention conception of the application is as follows: when the playing speed of the audio signal is adjusted, the audio frame with lower energy or lower amplitude in the audio signal is adjusted greatly, and the audio frame with higher energy or larger amplitude in the audio signal is adjusted slightly, so that the stability of the audio signal during playing is improved, the perception of a user on the speed change of the audio signal is reduced, and the experience is improved.
Fig. 1 is a schematic view of an application scenario of a method for processing an audio signal according to an embodiment of the present application. As shown in fig. 1, a user performs a video conference through a first terminal apparatus 101 and a second terminal apparatus 102, and the first terminal apparatus 101 transmits an audio signal to the second terminal apparatus 102 through a server 103. At this time, if the network of the first terminal device 101 is abnormal, the first terminal device 101 may decrease the playing speed of the audio signal that has been received in the anti-jitter buffer. Subsequently, after waiting for the network of the first terminal device 101 to recover from the abnormality, the subsequent audio signal may be received again, and the playing speed of the subsequently received audio signal may be increased, so as to reduce the delay of both parties of the video conference.
The first terminal device 101 and the second terminal device 102 may each be a mobile phone, a tablet (pad), a computer with a wireless transceiving function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in remote surgery, a wireless terminal in a smart grid, a wireless terminal in a smart home, and the like. In the embodiment of the present application, the apparatus for implementing the function of the terminal may be a terminal device, or may be an apparatus capable of supporting the terminal in implementing the function, for example, a chip system, and the apparatus may be installed in the terminal device. In the embodiment of the present application, the chip system may consist of a chip, or may include a chip and other discrete devices.
The server 103 may be a standalone server or a server in a cloud service platform. The server is configured to receive the video signal and the audio signal sent by the first terminal device 101 and the second terminal device 102, and to forward them to the terminal device at the opposite end of the call.
It should be noted that the application scenario in the technical solution of the present application may be the application scenario in fig. 1, but is not limited to this, and may also be applied to other scenarios requiring voice call.
It can be understood that the processing method of the audio signal can be implemented by the processing apparatus of the audio signal provided in the embodiment of the present application, and the processing apparatus of the audio signal can be a part or all of a certain device, and for example, can be a terminal device or a processor of the terminal device.
The following describes in detail the technical solutions of the embodiments of the present application with specific embodiments, taking a terminal device integrated or installed with a relevant execution code as an example. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of a processing method of an audio signal according to an embodiment of the present application, where an execution main body of the embodiment is a terminal device, and the embodiment relates to a specific process of how to adjust a playing speed of an audio frame in an audio signal. As shown in fig. 2, the method includes:
S201, obtaining an audio signal sent by an opposite terminal communication node and received in an anti-jitter buffer of the terminal equipment.
In this application, the terminal device may be provided with an anti-jitter buffer. When the terminal device and the opposite terminal communication node are in a video conference or voice call, the opposite terminal communication node continuously sends audio signals to the terminal device, and the terminal device temporarily stores them in the anti-jitter buffer. When the terminal device needs to play an audio signal or adjust the playing speed of audio frames in the audio signal, it obtains the audio signal sent by the opposite terminal communication node from the anti-jitter buffer.
The anti-jitter buffer is a shared data area in which audio signals are collected, stored, and sent to the processor at regular intervals. When receiving the audio signal, the anti-jitter buffer intentionally delays the arriving audio signal, thereby making voice playback during the call more stable.
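The buffering behavior described above can be illustrated with a minimal sketch. It is not the patent's implementation: the class name `JitterBuffer` and the `min_depth` parameter (the number of frames held back before playback starts) are hypothetical, introduced only to show how an intentional delay absorbs uneven arrival times.

```python
from collections import deque


class JitterBuffer:
    """Toy anti-jitter buffer (hypothetical): delays playback until
    a minimum number of frames has been received."""

    def __init__(self, min_depth=3):
        self.min_depth = min_depth  # assumed pre-buffering depth, in frames
        self.frames = deque()
        self.started = False

    def push(self, frame):
        """Store a frame received from the opposite end."""
        self.frames.append(frame)
        if len(self.frames) >= self.min_depth:
            self.started = True

    def pop(self):
        """Return the next frame to play, or None while pre-buffering
        or when the buffer has run empty."""
        if not self.started or not self.frames:
            return None
        return self.frames.popleft()
```

A larger `min_depth` tolerates more network jitter at the cost of more delay, which is the trade-off the background section describes.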
S202, adjusting the playing speed of the audio frame in the audio signal according to the energy and/or the amplitude of the audio frame in the audio signal.
In this step, after the terminal device obtains the audio signal sent by the peer end communication node and received in the anti-jitter buffer, the playing speed of the audio frame in the audio signal may be adjusted according to the energy and/or amplitude of the audio frame in the audio signal.
The adjusting the playing speed of the audio frame in the audio signal includes increasing the playing speed of the audio frame in the audio signal and decreasing the playing speed of the audio frame in the audio signal.
In some embodiments, if the network abnormality of the terminal device is detected, the terminal device reduces the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal. In other embodiments, if it is detected that the terminal device recovers from the network anomaly, the playing speed of the audio frame in the audio signal is increased according to the energy and/or amplitude of the audio frame in the audio signal.
In some embodiments, because speakers naturally pause during normal conversation, the audio signal contains mute frames. If the audio signal is slowed down or sped up only in these silent sections, only the silences are lengthened or shortened, and the user hardly notices the change. For example, the terminal device may first determine the mute frames in the anti-jitter buffer according to the energy and/or amplitude of the audio frames in the audio signal. The terminal device then adjusts the playing speed of the mute frames in the audio signal.
In other embodiments, if there are no silence frames in some audio signals, the speed variation of the audio signals may be assigned according to the energy and/or amplitude of the sound. When the energy or amplitude of the audio signal is low, a large rate of speed change is assigned. When the energy or amplitude of the audio signal is high, a smaller rate of speed change is assigned. For example, the terminal device may first determine a speed variation corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal. And then, the terminal equipment adjusts the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
According to the audio signal processing method provided by the embodiment of the application, the audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer of the terminal device is acquired, and the playing speed of the audio frames in the audio signal is adjusted according to the energy and/or amplitude of those audio frames. Compared with the prior art, the adjustment of the playing speed takes the energy and/or amplitude of each audio frame into account: audio frames with lower energy or lower amplitude are adjusted by a larger amount, and audio frames with higher energy or higher amplitude are adjusted by a smaller amount. This improves the stability of the audio signal during playing, reduces the user's perception of the speed change, and improves the user experience.
On the basis of the above embodiments, the following provides two ways to adjust the playing speed of the audio frames in the audio signal based on the energy and/or amplitude of the audio frames in the audio signal. Fig. 3 is a flowchart illustrating another audio signal processing method according to an embodiment of the present application, and fig. 3 is a first manner of adjusting a playing speed of an audio frame in an audio signal based on an energy and/or an amplitude of the audio frame in the audio signal. As shown in fig. 3, the audio signal processing method includes:
S301, obtaining an audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer of the terminal device.
Technical terms, technical effects, technical features and optional embodiments of step S301 can be understood with reference to step S201 shown in fig. 2, and repeated contents will not be described herein.
S302, determining the speed variation corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
In this step, after acquiring the audio signal sent by the peer end communication node and received in the anti-jitter buffer, the terminal device may determine, according to the energy and/or amplitude of the audio frame in the audio signal, a speed variation corresponding to each audio frame in the audio signal.
The embodiment of the application does not limit how to determine the speed variation corresponding to each audio frame, and only needs to ensure that a larger speed variation proportion is allocated when the energy or amplitude of the audio signal is lower, and a smaller speed variation proportion is allocated when the energy or amplitude of the audio signal is higher.
In some embodiments, the terminal device may determine an average amplitude of audio in the audio signal from the amplitude of each audio frame in the audio signal. Then, the terminal device determines the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
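As a minimal sketch of this step, the code below splits a sample sequence into fixed-length frames and computes a per-frame amplitude and the overall average. The source does not say how a frame's amplitude is measured; using the mean absolute sample value is an assumption made here for illustration, and the function names are hypothetical.

```python
def frame_amplitudes(samples, frame_len):
    """Split samples into frames of frame_len samples and return the
    mean absolute amplitude of each frame (assumed amplitude measure).
    A trailing partial frame is dropped."""
    amps = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        amps.append(sum(abs(x) for x in frame) / frame_len)
    return amps


def average_amplitude(amps):
    """Average amplitude A over all frames (0.0 for an empty signal)."""
    return sum(amps) / len(amps) if amps else 0.0
```

Because all frames have the same length, averaging the per-frame values gives the same result as averaging over the whole signal.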
In the first mode, the terminal device may allocate the speed variation corresponding to each audio frame so that the speed variation is inversely proportional to the amplitude.
For example, if the audio signal in the anti-jitter buffer has a duration of X seconds and must be stretched or compressed to Y seconds, and the frame length is T seconds, the audio signal can be divided into N = X/T frames. The terminal device computes the average amplitude A of the whole X-second audio signal and, for the n-th audio frame of duration T, its amplitude A_n. The terminal device then substitutes A and A_n into equations (1) and (2) to determine the speed variation corresponding to each audio frame in the audio signal. Equations (1) and (2) are as follows:
$$C_n = s \cdot \frac{A}{A_n} \tag{1}$$
$$s = \frac{Y - X}{\sum_{k=1}^{N} A / A_k} \tag{2}$$
where C_n is the speed variation corresponding to the n-th audio frame, s is a coefficient to be determined, A is the average amplitude of the audio signal, A_n is the amplitude of the n-th audio frame, N is the number of frames, X is the original duration of the audio signal, and Y is the target duration of the audio signal after stretching or compression. Equation (2) fixes s so that the C_n of all N frames sum to Y - X.
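Equations (1) and (2) can be sketched directly in code. The function name and argument names below are hypothetical; the sketch assumes every frame amplitude is strictly positive, since equation (1) is undefined for a zero amplitude.

```python
def allocate_speed_variation(amps, x_seconds, y_seconds):
    """Return C_n for each frame per equations (1) and (2).

    amps      -- per-frame amplitudes A_n (all assumed > 0)
    x_seconds -- original duration X
    y_seconds -- target duration Y
    """
    avg = sum(amps) / len(amps)               # average amplitude A
    weights = [avg / a for a in amps]         # A / A_n for each frame
    s = (y_seconds - x_seconds) / sum(weights)  # equation (2)
    return [s * w for w in weights]           # equation (1)
```

For amplitudes 1, 2 and 4 with X = 3 and Y = 3.7, the quietest frame receives 0.4 of the 0.7 total, the next 0.2, and the loudest 0.1.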
In addition, when A_n is equal to 0 or very small, equations (1) and (2) either have no solution or produce an excessively uneven allocation. To prevent this, the terminal device may add a small positive number M to A_n. In this case, the terminal device substitutes the average amplitude A of the audio signal and the amplitude A_n of the n-th audio frame into equations (3) and (4) to determine the speed variation corresponding to each audio frame in the audio signal. Equations (3) and (4) are as follows:
$$C_n = s \cdot \frac{A}{A_n + M} \tag{3}$$
$$s = \frac{Y - X}{\sum_{k=1}^{N} A / (A_k + M)} \tag{4}$$
It should be noted that, in the embodiment of the present application, the value of M is not limited and may be set according to the actual situation; for example, M may be 5% of the average amplitude.
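A minimal sketch of equations (3) and (4) follows, taking M as 5% of the average amplitude as in the example above. The function name and the `m_ratio` parameter are hypothetical. The fallback for an entirely silent signal (even allocation) is an assumption added here, because M would also be zero in that case.

```python
def allocate_with_offset(amps, x_seconds, y_seconds, m_ratio=0.05):
    """Return C_n for each frame per equations (3) and (4),
    with M = m_ratio * average amplitude."""
    n = len(amps)
    delta = y_seconds - x_seconds
    avg = sum(amps) / n                       # average amplitude A
    if avg == 0:
        # Assumption: all frames are silent, so split the change evenly.
        return [delta / n] * n
    m = m_ratio * avg                         # small positive offset M
    weights = [avg / (a + m) for a in amps]   # A / (A_n + M)
    s = delta / sum(weights)                  # equation (4)
    return [s * w for w in weights]           # equation (3)
```

With the offset, a frame of zero amplitude still receives a finite share, and that share grows as `m_ratio` shrinks.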
In the second mode, the terminal device may instead allocate the speed variation corresponding to each audio frame using a nonlinear relationship between amplitude and speed variation.
For example, the terminal device may substitute the average amplitude A of the audio signal and the amplitude A_n of the n-th audio frame into equations (5) and (6) to determine the speed variation corresponding to each audio frame in the audio signal. Equations (5) and (6) are as follows:
$$C_n = s \cdot \left(\frac{A}{A_n + M}\right)^{z} \tag{5}$$
$$s = \frac{Y - X}{\sum_{k=1}^{N} \left(\frac{A}{A_k + M}\right)^{z}} \tag{6}$$
where z is a constant that can be set according to the actual situation. If more of the speed variation should be assigned to audio frames with small amplitudes, z may take a value greater than 1. If the speed variation should be distributed more evenly across audio frames of different amplitudes, z may take a value less than 1.
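The effect of the exponent z in equations (5) and (6) can be seen in a short sketch. The function name and the `m_ratio` parameter are hypothetical, and M is again assumed to be a fixed fraction of the average amplitude.

```python
def allocate_nonlinear(amps, x_seconds, y_seconds, z=1.0, m_ratio=0.05):
    """Return C_n for each frame per equations (5) and (6)."""
    avg = sum(amps) / len(amps)               # average amplitude A
    if avg <= 0:
        raise ValueError("average amplitude must be positive")
    m = m_ratio * avg                         # small positive offset M
    weights = [(avg / (a + m)) ** z for a in amps]
    s = (y_seconds - x_seconds) / sum(weights)  # equation (6)
    return [s * w for w in weights]           # equation (5)


if __name__ == "__main__":
    # Share of a 1-second stretch given to the quieter of two frames.
    for z in (0.5, 1.0, 2.0):
        share = allocate_nonlinear([1.0, 4.0], 0.0, 1.0, z=z)[0]
        print(f"z={z}: quiet frame gets {share:.3f} s")
```

Setting z = 1 reduces the sketch to equations (3) and (4); raising z concentrates the change in quiet frames, while lowering it spreads the change out.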
It should be noted that the embodiment of the present application describes the adjustment of the playing speed of the audio frames using amplitude as an example, but the present application is not limited thereto. The amplitude A in equations (1) to (6) may be replaced by energy E or power P.
S303, adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
In this step, after the terminal device determines the speed variation corresponding to each audio frame in the audio signal, the playing speed of each audio frame in the audio signal may be adjusted based on the speed variation corresponding to each audio frame in the audio signal.
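The source does not specify how a speed variation is turned into an actual playing speed. One possible reading, offered only as an assumption, is that C_n is the change in playing duration of frame n in seconds (which is consistent with the C_n summing to Y - X). Under that reading, a frame of length T seconds is played over T + C_n seconds, so its playback rate is T / (T + C_n). The function name below is hypothetical, and the time-stretching algorithm that would apply the rate is outside the scope of this sketch.

```python
def playback_rates(variations, frame_seconds):
    """Convert per-frame duration changes C_n (seconds, assumed
    interpretation) into playback rate factors.

    A rate below 1.0 slows the frame down; above 1.0 speeds it up.
    """
    rates = []
    for c in variations:
        new_duration = frame_seconds + c
        if new_duration <= 0:
            raise ValueError("frame cannot be compressed to zero length")
        rates.append(frame_seconds / new_duration)
    return rates
```

For a 20 ms frame, a variation of +10 ms gives a rate of about 0.67, and a variation of -10 ms gives a rate of 2.0.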
Fig. 4 is a flowchart illustrating a further audio signal processing method according to an embodiment of the present application, and fig. 4 is a second manner of adjusting a playing speed of an audio frame in an audio signal based on an energy and/or an amplitude of the audio frame in the audio signal. As shown in fig. 4, the audio signal processing method includes:
S401, obtaining an audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer of the terminal device.
The technical terms, technical effects, technical features and optional embodiments of step S401 can be understood by referring to step S201 shown in fig. 2, and repeated contents will not be described herein.
S402, determining a mute frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
In this step, after the terminal device obtains the audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer, the terminal device may determine a mute frame in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal.
The embodiments of the present application do not limit how the mute frame in the audio signal is determined. In some embodiments, when the energy and/or amplitude of an audio frame in the audio signal is zero, the terminal device may determine that the audio frame is a mute frame. In other embodiments, when the energy of an audio frame in the audio signal is below an energy threshold and/or its amplitude is below an amplitude threshold, the terminal device may determine that the audio frame is a mute frame.
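The threshold-based variant can be sketched as follows. The threshold values, the use of mean squared value for energy and peak absolute value for amplitude, and the names `is_mute_frame`, `find_mute_frames`, and `require_both` are all assumptions for illustration; the text leaves these choices open.

```python
def is_mute_frame(samples, energy_threshold=1e-4, amplitude_threshold=1e-2,
                  require_both=True):
    """Decide whether one frame counts as a mute frame.

    Energy is taken as the mean squared sample value and amplitude as the
    peak absolute value (both assumed definitions). require_both selects
    the "and" or the "or" reading of the "and/or" condition.
    """
    energy = sum(s * s for s in samples) / len(samples)
    peak = max(abs(s) for s in samples)
    low_energy = energy < energy_threshold
    low_amplitude = peak < amplitude_threshold
    if require_both:
        return low_energy and low_amplitude
    return low_energy or low_amplitude


def find_mute_frames(frames, **kwargs):
    """Return the indices of frames classified as mute."""
    return [i for i, frame in enumerate(frames) if is_mute_frame(frame, **kwargs)]
```

The zero-energy case described above is covered as well, since any positive threshold classifies an all-zero frame as mute.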
S403, adjusting the playing speed of the mute frames in the audio signal.
In this step, after the terminal device determines the mute frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal, the playing speed of the mute frame in the audio signal may be adjusted.
The embodiments of the present application do not limit how the playing speed of the mute frames in the audio signal is adjusted. In some embodiments, the terminal device may allocate the speed variation equally among the mute frames. In other embodiments, the terminal device may assign more of the speed variation to consecutive mute frames.
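Both allocation options can be sketched in a few lines. Weighting each mute frame by the length of the consecutive run it belongs to is only one possible reading of "more speed variation to consecutive mute frames" and is an assumption, as are the names `allocate_to_mute_frames` and `favor_runs`.

```python
def allocate_to_mute_frames(mute_flags, total_variation, favor_runs=False):
    """Distribute total_variation over mute frames; other frames get 0.

    favor_runs=False: every mute frame receives an equal share.
    favor_runs=True:  each mute frame is weighted by the length of the
                      consecutive mute run it belongs to (assumed rule).
    """
    n = len(mute_flags)
    weights = [0.0] * n
    i = 0
    while i < n:
        if not mute_flags[i]:
            i += 1
            continue
        # Find the end of the current run of consecutive mute frames.
        j = i
        while j < n and mute_flags[j]:
            j += 1
        weight = float(j - i) if favor_runs else 1.0
        for k in range(i, j):
            weights[k] = weight
        i = j
    total_weight = sum(weights)
    if total_weight == 0:
        return weights  # no mute frames, so nothing is adjusted
    return [total_variation * w / total_weight for w in weights]
```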
On the basis of the above embodiment, the following provides a method for processing an audio signal after a network abnormality occurs during a video conference. Fig. 5 is a schematic flowchart of another audio signal processing method according to an embodiment of the present application, and as shown in fig. 5, the audio signal processing method includes:
S501, acquiring an audio signal sent by the opposite terminal communication node and received in an anti-jitter buffer of the terminal device.
S502, if the network abnormality of the terminal equipment is detected, reducing the playing speed of the audio frame in the audio signal according to the energy and/or the amplitude of the audio frame in the audio signal.
S503, if it is detected that the terminal device has recovered from the network abnormality, increasing the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
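Steps S502 and S503 differ only in the direction of the adjustment, which the following sketch makes explicit. The state labels `"abnormal"` and `"recovered"`, the inverse-amplitude weighting, and the names `speed_changes_for_state` and `magnitude` are assumptions for illustration; how the network state is detected is not covered here.

```python
def speed_changes_for_state(amplitudes, network_state, magnitude, eps=1e-9):
    """Return a signed speed variation per frame for the given network state.

    "abnormal"  -> negative values (slow playback down, step S502)
    "recovered" -> positive values (speed playback up, step S503)
    any other   -> zeros (no adjustment)
    Quieter frames take a larger share via an assumed 1/amplitude weighting.
    """
    if network_state == "abnormal":
        sign = -1.0
    elif network_state == "recovered":
        sign = 1.0
    else:
        return [0.0] * len(amplitudes)
    # eps avoids division by zero for completely silent frames.
    weights = [1.0 / (a + eps) for a in amplitudes]
    total_weight = sum(weights)
    return [sign * magnitude * w / total_weight for w in weights]
```

In this sketch the magnitudes of the per-frame changes sum to `magnitude`, so the overall amount of slow-down or speed-up stays fixed while its distribution follows the frame amplitudes.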
In the audio signal processing method provided by the embodiments of the present application, an audio signal sent by the opposite terminal communication node and received in the anti-jitter buffer of the terminal device is acquired, and the playing speed of the audio frames in the audio signal is adjusted according to the energy and/or amplitude of those audio frames. Compared with the prior art, the energy and/or amplitude of each audio frame is taken into account when the playing speed is adjusted: audio frames with lower energy or smaller amplitude receive a larger adjustment, and audio frames with higher energy or larger amplitude receive a smaller adjustment. This improves the stability of the audio signal during playback, reduces the user's perception of the speed change, and improves the user experience.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware executing program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Fig. 6 is a schematic structural diagram of an apparatus for processing an audio signal according to an embodiment of the present disclosure. The processing device of the audio signal may be implemented by software, hardware or a combination of the two, and may be, for example, the terminal device or the chip of the terminal device in the above embodiments, so as to execute the processing method of the audio signal in the above embodiments. As shown in fig. 6, the audio signal processing apparatus includes: an obtaining module 601 and an adjusting module 602.
the obtaining module 601 is configured to obtain an audio signal sent by an opposite terminal communication node and received in an anti-jitter buffer of a terminal device;
the adjusting module 602 is configured to adjust the playing speed of an audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
In an optional implementation manner, the adjusting module 602 is specifically configured to determine, according to energy and/or amplitude of an audio frame in an audio signal, a speed variation corresponding to each audio frame in the audio signal; and adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
In an optional implementation, the adjusting module 602 is specifically configured to determine an average amplitude of audio in the audio signal according to an amplitude of each audio frame in the audio signal; and determining the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
In an optional implementation, the adjusting module 602 is specifically configured to determine a mute frame in the audio signal according to the energy and/or amplitude of an audio frame in the audio signal, and to adjust the playing speed of the mute frame in the audio signal.
In an optional implementation manner, the adjusting module 602 is specifically configured to, if a network anomaly of the terminal device is detected, reduce a playing speed of an audio frame in the audio signal according to energy and/or amplitude of the audio frame in the audio signal.
In an optional implementation manner, the adjusting module 602 is specifically configured to, if it is detected that the terminal device recovers from the network anomaly, increase the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
The audio signal processing apparatus provided in the embodiment of the present application may perform the audio signal processing method in the foregoing method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device may include: at least one processor 701 and a memory 702. Fig. 7 shows an electronic device with one processor as an example.
The memory 702 is configured to store a program. Specifically, the program may include program code, and the program code includes computer operating instructions.
The memory 702 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The processor 701 is configured to execute the computer-executable instructions stored in the memory 702 to implement the audio signal processing method.
The processor 701 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Optionally, in a specific implementation, if the communication interface, the memory 702 and the processor 701 are implemented independently, they may be connected to each other through a bus and communicate with each other over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. Buses may be classified as address buses, data buses, control buses, and so on, but this does not mean that there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the communication interface, the memory 702 and the processor 701 are integrated into a chip, the communication interface, the memory 702 and the processor 701 may complete communication through an internal interface.
The embodiment of the application also provides a chip which comprises a processor and an interface. Wherein the interface is used for inputting and outputting data or instructions processed by the processor. The processor is configured to perform the methods provided in the above method embodiments. The chip can be applied to a processing device of audio signals.
The present application also provides a computer-readable storage medium, which may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. Specifically, the computer-readable storage medium stores program information, and the program information is used for the processing method of the audio signal.
Embodiments of the present application also provide a program, which when executed by a processor, is configured to perform the method for processing an audio signal provided by the above method embodiments.
Embodiments of the present application further provide a program product, such as a computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the method for processing an audio signal provided by the above method embodiments.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the invention are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (12)
1. A method of processing an audio signal, the method comprising:
acquiring an audio signal sent by an opposite terminal communication node received in an anti-jitter buffer of terminal equipment;
adjusting the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal;
the adjusting the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal includes:
determining a speed variation corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal;
and adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
2. The method of claim 1, wherein determining the speed change amount corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal comprises:
determining an average amplitude of audio in the audio signal from the amplitude of each audio frame in the audio signal;
and determining the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
3. The method of claim 1, wherein the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal comprises:
determining a mute frame in the audio signal according to the energy and/or amplitude of an audio frame in the audio signal;
and adjusting the playing speed of the mute frame in the audio signal.
4. The method according to any one of claims 1-3, wherein the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal comprises:
and if the network abnormality of the terminal equipment is detected, reducing the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
5. The method according to any one of claims 1-3, wherein the adjusting the playing speed of the audio frames in the audio signal according to the energy and/or amplitude of the audio frames in the audio signal comprises:
and if the terminal equipment is detected to recover from the network abnormality, accelerating the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal.
6. An apparatus for processing an audio signal, the apparatus comprising:
an acquisition module, configured to acquire an audio signal sent by an opposite terminal communication node and received in an anti-jitter buffer of a terminal device;
the adjusting module is used for adjusting the playing speed of the audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal;
the adjusting module is specifically configured to determine a speed variation corresponding to each audio frame in the audio signal according to the energy and/or amplitude of the audio frame in the audio signal; and adjusting the playing speed of each audio frame in the audio signal according to the speed variation corresponding to each audio frame in the audio signal.
7. The apparatus according to claim 6, wherein the adjusting module is specifically configured to determine an average amplitude of audio in the audio signal according to an amplitude of each audio frame in the audio signal; and determining the speed variation corresponding to each audio frame in the audio signal according to the average amplitude of the audio in the audio signal and the amplitude of each audio frame in the audio signal.
8. The apparatus according to claim 6, wherein the adjusting module is specifically configured to determine the silence frame in the audio signal according to an energy and/or an amplitude of an audio frame in the audio signal; and adjusting the playing speed of the mute frame in the audio signal.
9. The apparatus according to any one of claims 6 to 8, wherein the adjusting module is specifically configured to, if a network anomaly of the terminal device is detected, reduce a playing speed of an audio frame in the audio signal according to an energy and/or an amplitude of the audio frame in the audio signal.
10. The apparatus according to any one of claims 6 to 8, wherein the adjusting module is specifically configured to, if it is detected that the terminal device recovers from a network anomaly, increase a playing speed of an audio frame in the audio signal according to an energy and/or an amplitude of the audio frame in the audio signal.
11. An electronic device, comprising: a memory and a processor;
the memory for storing executable instructions of the processor;
the processor is configured to perform the method of any of claims 1-5 via execution of the executable instructions.
12. A storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010622129.1A CN111787268B (en) | 2020-07-01 | 2020-07-01 | Audio signal processing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111787268A CN111787268A (en) | 2020-10-16 |
CN111787268B true CN111787268B (en) | 2022-04-22 |
Family
ID=72760528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010622129.1A Active CN111787268B (en) | 2020-07-01 | 2020-07-01 | Audio signal processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111787268B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114979798B (en) * | 2022-04-21 | 2024-03-22 | 维沃移动通信有限公司 | Playing speed control method and electronic equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103747287A (en) * | 2014-01-13 | 2014-04-23 | 合一网络技术(北京)有限公司 | Video playing speed regulation method and system applied to flash |
CN110799936A (en) * | 2017-08-18 | 2020-02-14 | Oppo广东移动通信有限公司 | Volume adjusting method and device, terminal equipment and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070263672A1 (en) * | 2006-05-09 | 2007-11-15 | Nokia Corporation | Adaptive jitter management control in decoder |
CN101119323A (en) * | 2007-09-21 | 2008-02-06 | 腾讯科技(深圳)有限公司 | Method and device for solving network jitter |
US20100290454A1 (en) * | 2007-11-30 | 2010-11-18 | Telefonaktiebolaget Lm Ericsson (Publ) | Play-Out Delay Estimation |
KR101953613B1 (en) * | 2013-06-21 | 2019-03-04 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Jitter buffer control, audio decoder, method and computer program |
TWI559300B (en) * | 2015-01-21 | 2016-11-21 | 宇智網通股份有限公司 | Time domain based voice event detection method and related device |
CN105245496B (en) * | 2015-08-26 | 2019-03-12 | 广州市百果园网络科技有限公司 | A kind of method and apparatus of playing audio-fequency data |
CN106559635A (en) * | 2015-09-30 | 2017-04-05 | 杭州萤石网络有限公司 | A kind of player method and device of multimedia file |
WO2017095276A1 (en) * | 2015-11-30 | 2017-06-08 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and receiving device for adapting a play-out rate of a jitter buffer |
WO2018120627A1 (en) * | 2016-12-30 | 2018-07-05 | 华为技术有限公司 | Audio data processing method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||