CN108600936B

CN108600936B - Multi-channel audio processing method, device, computer-readable storage medium and terminal

Info

Publication number: CN108600936B
Application number: CN201810356544.XA
Authority: CN
Inventors: 黄传增
Original assignee: Beijing Microlive Vision Technology Co Ltd
Current assignee: Beijing Microlive Vision Technology Co Ltd
Priority date: 2018-04-19
Filing date: 2018-04-19
Publication date: 2020-01-03
Anticipated expiration: 2038-04-19
Also published as: CN108600936A

Abstract

The invention discloses a multi-channel audio processing method, a multi-channel audio processing device, a multi-channel audio processing hardware device, a computer readable storage medium and a multi-channel audio processing terminal. The multichannel audio processing method comprises the steps of obtaining multichannel audio to be processed; acquiring audio processing parameters; and processing the multi-channel audio to be processed based on the audio processing parameters. By adopting the technical scheme, the multi-channel audio to be processed can be correspondingly processed according to the audio processing parameters, so that the corresponding audio effect can be obtained according to different audio processing parameters, and the technical problem of how to improve the user experience effect is solved.

Description

Multi-channel audio processing method, device, computer-readable storage medium and terminal

Technical Field

The present invention relates to the field of audio technologies, and in particular, to a method and an apparatus for processing a multi-channel audio, a computer-readable storage medium, and a terminal.

Background

With the popularity of audio interaction, audio is increasingly used as a carrier of information dissemination for such interactions. In order to obtain a good interactive experience, users are increasingly paying attention to the experience of audio.

Currently, the prior art generally processes monaural audio. When multi-channel audio is handled with these mono-oriented methods, the characteristics of each individual channel in the multi-channel audio are not taken into account. As a result, the user cannot adjust multi-channel audio to obtain a better experience.

In view of the above, it is an urgent technical problem to provide a multi-channel audio processing method capable of obtaining a good user experience.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a multi-channel audio processing method to at least partially solve the technical problem of how to improve the user experience. In addition, a multi-channel audio processing apparatus, a multi-channel audio processing hardware apparatus, a computer readable storage medium, and a multi-channel audio processing terminal are also provided.

In order to achieve the above object, according to one aspect of the present invention, the following technical solutions are provided:

a multi-channel audio processing method, comprising:

acquiring multi-channel audio to be processed;

acquiring audio processing parameters;

and processing the multi-channel audio to be processed based on the audio processing parameters.

Further, the step of obtaining audio processing parameters includes:

and when an audio mode instruction is received, acquiring audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction.

Further, the audio mode instruction is an offset mode instruction;

the step of acquiring the audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction when the audio mode instruction is received comprises the following steps:

when the offset mode instruction is received, acquiring an audio envelope of the multi-channel audio to be processed, which corresponds to the offset mode instruction;

the step of processing the multi-channel audio to be processed based on the audio processing parameters comprises:

and adjusting the audio envelope of the multi-channel audio to be processed.

Further, the step of adjusting the audio envelope of the multi-channel audio to be processed specifically includes:

and adjusting the fundamental frequency and the formants of the multi-channel audio to be processed.

Further, the audio mode command is a hold mode command;

when the hold mode instruction is received, acquiring the audio envelope and the fundamental frequency of the multi-channel audio to be processed corresponding to the hold mode instruction;

adjusting the fundamental frequency while maintaining the audio envelope.

Further, the audio mode command is a crisp mode command;

when a crisp mode instruction is received, acquiring transient pulses and audio phases of the multi-channel audio to be processed, which correspond to the crisp mode instruction;

setting the phase of the transient pulse to zero over the full frequency domain of the multi-channel audio to be processed, or setting the phase of the transient pulse to zero over the full frequency domain of the multi-channel audio to be processed and additionally limiting the amplitude of the transient pulse;

setting the audio phase to a preset value.

Further, the audio mode instruction is a smooth mode instruction;

when the smooth mode instruction is received, acquiring transient pulses of the multi-channel audio to be processed, which correspond to the smooth mode instruction;

and based on the transient pulse, smoothing the amplitude of the frequency domain corresponding to the transient pulse in the preset frequency range in the frequency domain of the multi-channel audio to be processed.

Further, the audio mode command is a speed mode command;

when the speed mode instruction is received, acquiring a preset sampling rate and a formant or a fundamental frequency of the multi-channel audio to be processed, which correspond to the speed mode instruction;

down-sampling the multi-channel audio to be processed according to a preset sampling rate;

smoothing the formants of the multi-channel audio to be processed; or,

adjusting the fundamental frequency of the multi-channel audio to be processed.

Further, the audio mode instruction is a sound quality mode instruction;

when the sound quality mode instruction is received, acquiring transient pulses, fundamental frequencies and formants of the multi-channel audio to be processed, which correspond to the sound quality mode instruction;

smoothing the transient pulses and the formants of the multi-channel audio to be processed in real time;

and adjusting the fundamental frequency of the multi-channel audio to be processed in real time.

Further, the audio mode instruction is a cross-band smoothing mode instruction;

when the cross-frequency-band smoothing mode instruction is received, acquiring an audio phase of the multi-channel audio to be processed, which corresponds to the cross-frequency-band smoothing mode instruction;

and based on the audio phase, smoothing the phase in at least two frequency bands of the multi-channel audio to be processed.

Further, the audio mode instruction is an in-subband smoothing mode instruction;

when the in-subband smoothing mode instruction is received, acquiring an audio phase of the multi-channel audio to be processed, which corresponds to the in-subband smoothing mode instruction;

and smoothing the phase in a sub-band of any frequency band of the multi-channel audio to be processed based on the audio phase.

Further, the audio mode instruction is a variable-speed constant-pitch mode instruction;

when the variable-speed constant-pitch mode instruction is received, acquiring the audio duration of the multi-channel audio to be processed corresponding to the variable-speed constant-pitch mode instruction;

and shortening or prolonging the audio duration of the multi-channel audio to be processed.

Further, the audio mode instruction is a pitch-changing constant-speed mode instruction;

when the pitch-changing constant-speed mode instruction is received, obtaining the fundamental frequency of the multi-channel audio to be processed corresponding to the pitch-changing constant-speed mode instruction;

adjusting the fundamental frequency of the multi-channel audio to be processed.

Further, the audio mode instruction is a pitch-changing and speed-changing mode instruction;

when the pitch-changing and speed-changing mode instruction is received, acquiring the audio duration and the fundamental frequency of the multi-channel audio to be processed, which correspond to the pitch-changing and speed-changing mode instruction;

shortening or prolonging the audio time length;

the fundamental frequency is adjusted.

In order to achieve the above object, according to another aspect of the present invention, the following technical solutions are also provided:

a multi-channel audio processing apparatus comprising:

the first acquisition module is used for acquiring multi-channel audio to be processed;

the second acquisition module is used for acquiring audio processing parameters;

and the processing module is used for processing the multi-channel audio to be processed based on the audio processing parameters.

Further, the second obtaining module is further configured to, when an audio mode instruction is received, obtain an audio processing parameter of the multi-channel audio to be processed, which corresponds to the audio mode instruction.

Further, the audio mode instruction is an offset mode instruction;

the second obtaining module is specifically configured to, when the offset mode instruction is received, obtain an audio envelope of the multi-channel audio to be processed, which corresponds to the offset mode instruction;

the processing module is specifically configured to adjust an audio envelope of the multi-channel audio to be processed.

Further, the processing module is specifically configured to adjust a fundamental frequency and a formant of the multi-channel audio to be processed.

Further, the audio mode command is a hold mode command;

the second obtaining module is specifically configured to, when the hold mode instruction is received, obtain an audio envelope and a fundamental frequency of the multi-channel audio to be processed, which correspond to the hold mode instruction;

the processing module is specifically configured to adjust the fundamental frequency while maintaining the audio envelope.

Further, the audio mode command is a crisp mode command;

the second obtaining module is specifically configured to obtain, when a crisp mode instruction is received, a transient pulse and an audio phase of the multi-channel audio to be processed, which correspond to the crisp mode instruction;

the processing module is specifically configured to set a phase of the transient pulse to zero in a full frequency domain of the multi-channel audio to be processed, or set a phase of the transient pulse to zero in the full frequency domain of the multi-channel audio to be processed, limit an amplitude of the transient pulse, and set the audio phase to a preset value.

Further, the audio mode instruction is a smooth mode instruction;

the second obtaining module is specifically configured to, when the smooth mode instruction is received, obtain a transient pulse of the multi-channel audio to be processed, which corresponds to the smooth mode instruction;

the processing module is specifically configured to smooth, in the frequency domain of the multi-channel audio to be processed, an amplitude of a frequency domain corresponding to the transient pulse in a predetermined frequency range based on the transient pulse.

Further, the audio mode command is a speed mode command;

the second obtaining module is specifically configured to, when the speed mode instruction is received, obtain a predetermined sampling rate and a formant or a fundamental frequency of the multi-channel audio to be processed, which correspond to the speed mode instruction;

the processing module is specifically configured to down-sample the multi-channel audio to be processed according to a predetermined sampling rate; and, carrying out smoothing processing on the formants of the multi-channel audio to be processed; or, adjusting the fundamental frequency of the multi-channel audio to be processed.

Further, the audio mode instruction is a sound quality mode instruction;

the second obtaining module is specifically configured to, when the sound quality mode instruction is received, obtain a transient pulse, a fundamental frequency, and a formant of the multi-channel audio to be processed, which correspond to the sound quality mode instruction;

the processing module is specifically configured to smooth the transient impulse and the formant of the multi-channel audio to be processed in real time, and adjust a fundamental frequency of the multi-channel audio to be processed in real time.

Further, the audio mode instruction is a cross-band smoothing mode instruction;

the second obtaining module is specifically configured to, when the cross-band smoothing mode instruction is received, obtain an audio phase of the multi-channel audio to be processed, which corresponds to the cross-band smoothing mode instruction;

the processing module is specifically configured to smooth phases in at least two frequency bands of the multi-channel audio to be processed based on the audio phase.

the second obtaining module is specifically configured to, when the in-subband smooth mode instruction is received, obtain an audio phase of the multi-channel audio to be processed, which corresponds to the in-subband smooth mode instruction;

the processing module is specifically configured to smooth a phase within a subband of any frequency band of the multi-channel audio to be processed based on the audio phase.

the second obtaining module is specifically configured to obtain, when the variable-speed constant-pitch mode instruction is received, an audio duration of the multi-channel audio to be processed, which corresponds to the variable-speed constant-pitch mode instruction;

and the processing module is specifically configured to shorten or prolong the audio duration of the multi-channel audio to be processed.

the second obtaining module is specifically configured to obtain, when the pitch-shifted and non-speed-changed mode instruction is received, a fundamental frequency of the multi-channel audio to be processed, which corresponds to the pitch-shifted and non-speed-changed mode instruction;

the processing module is specifically configured to adjust the fundamental frequency of the multi-channel audio to be processed.

the second obtaining module is specifically configured to obtain, when the tonal modification and speed change mode instruction is received, an audio duration and a fundamental frequency of the multi-channel audio to be processed, which correspond to the tonal modification and speed change mode instruction;

the processing module is specifically configured to shorten or lengthen the audio duration and adjust the fundamental frequency.

In order to achieve the above object, according to another aspect of the present invention, the following technical solutions are further provided:

a multi-channel audio processing hardware apparatus, comprising:

a memory for storing non-transitory computer readable instructions; and

a processor for executing the computer readable instructions, so that the processor implements the steps of any of the above-mentioned multi-channel audio processing method technical solutions when executed.

a computer readable storage medium storing non-transitory computer readable instructions which, when executed by a computer, cause the computer to perform the steps of any of the above-described multi-channel audio processing method aspects.

a multi-channel audio processing terminal comprises any one of the multi-channel audio processing devices.

The embodiment of the invention provides a multi-channel audio processing method, a multi-channel audio processing device, a multi-channel audio processing hardware device, a computer readable storage medium and a multi-channel audio processing terminal. The multi-channel audio processing method comprises the steps of obtaining multi-channel audio to be processed; acquiring audio processing parameters; and processing the multi-channel audio to be processed based on the audio processing parameters. By adopting the technical scheme, the multi-channel audio to be processed can be correspondingly processed according to the audio processing parameters, so that the corresponding audio effect can be obtained according to different audio processing parameters, and the user experience effect is improved.

The foregoing is only an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the description, and so that the above and other objects, features, and advantages of the present invention become more readily apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.

Drawings

FIG. 1a is a flow chart illustrating a multi-channel audio processing method according to an embodiment of the present invention;

FIG. 1b is a flow chart illustrating a multi-channel audio processing method according to another embodiment of the present invention;

FIG. 2 is a schematic diagram for selecting a tapping mode and a smoothing mode according to one embodiment of the present invention;

FIG. 3 is a flow chart illustrating a multi-channel audio processing method according to another embodiment of the present invention;

FIG. 4 is a block diagram of a multi-channel audio processing apparatus according to an embodiment of the present invention;

FIG. 5 is a block diagram of a multi-channel audio processing hardware device according to an embodiment of the invention;

FIG. 6 is a schematic diagram of a structure of a computer-readable storage medium according to one embodiment of the invention;

FIG. 7 is a schematic structural diagram of a multi-channel audio processing terminal according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a multi-channel audio processing terminal according to another embodiment of the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.

It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than the number, shape and size of the components in practical implementation, and the type, quantity and proportion of the components in practical implementation can be changed freely, and the layout of the components can be more complicated.

In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.

In order to solve the technical problem of how to improve the user experience effect, an embodiment of the present invention provides a multi-channel audio processing method. As shown in FIG. 1a, the multi-channel audio processing method mainly includes the following steps S1 to S3, wherein:

Step S1: acquiring multi-channel audio to be processed.

The to-be-processed multi-channel audio may be off-line to-be-processed multi-channel audio or on-line to-be-processed multi-channel audio, which is not limited in the present invention. The multi-channel audio includes, but is not limited to, 3.1 channel audio, 5.1 channel audio, 7.1 channel audio, etc.

Step S2: acquiring audio processing parameters.

The audio processing parameters include, but are not limited to, fundamental frequency, transient impulse, audio phase, audio duration, etc.

Step S3: processing the multi-channel audio to be processed based on the audio processing parameters.

For ease of understanding, the following detailed description of the process of processing audio processing parameters is provided in a specific embodiment.

For the joint processing parameters, the audio processing parameters such as fundamental frequency, formants, transient impulse, and audio envelope of each channel audio may be mixed and then processed.

For the separation processing parameters, the audio processing parameters such as fundamental frequency, formants, transient pulses, audio envelopes and the like of the audio of each channel can be independently processed.
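The difference between joint and separate processing can be illustrated with a minimal Python sketch. It is not taken from the patent: the peak amplitude stands in for the audio processing parameters named above, averaging is assumed as the way of mixing the per-channel parameters, and the function names `process_joint` and `process_separate` are hypothetical.

```python
from typing import List

Channel = List[float]


def peak(channel: Channel) -> float:
    """Stand-in per-channel parameter: the peak absolute amplitude."""
    return max(abs(s) for s in channel)


def scale(channel: Channel, parameter: float) -> Channel:
    """Normalize a channel by the given parameter (no-op for silence)."""
    gain = 1.0 / parameter if parameter > 0 else 1.0
    return [s * gain for s in channel]


def process_joint(channels: List[Channel]) -> List[Channel]:
    """Mix (here: average) the per-channel parameters, then apply the
    single shared value to every channel."""
    shared = sum(peak(c) for c in channels) / len(channels)
    return [scale(c, shared) for c in channels]


def process_separate(channels: List[Channel]) -> List[Channel]:
    """Process every channel with its own, independent parameter."""
    return [scale(c, peak(c)) for c in channels]


left = [0.5, -0.25, 0.0]
right = [1.0, 0.5, -1.0]
joint = process_joint([left, right])        # both channels share one gain
separate = process_separate([left, right])  # each channel gets its own gain
```

In the joint case the level difference between the channels is preserved, whereas in the separate case each channel is normalized on its own.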

For the formants, smoothing processing (also referred to as clipping processing) may be performed; for example, the amplitude of the formants may be thresholded;

for transient pulses, smoothing may be performed.

For the audio envelope, a smoothing process may be performed.
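As a rough illustration of the thresholding and smoothing operations mentioned above, the following Python sketch clips a sequence of amplitudes to a threshold and smooths a sequence with a centered moving average. The moving average is only one possible smoothing method and is an assumption here; the patent does not specify the filter, and the names `clip_amplitude` and `moving_average` are hypothetical.

```python
from typing import List


def clip_amplitude(values: List[float], threshold: float) -> List[float]:
    """Limit the magnitude of every value to `threshold`, keeping its sign."""
    return [max(-threshold, min(threshold, v)) for v in values]


def moving_average(values: List[float], width: int = 3) -> List[float]:
    """Centered moving average; near the edges only the available
    samples are averaged."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


envelope = [0.0, 0.0, 3.0, 0.0, 0.0]   # a single sharp peak
clipped = clip_amplitude(envelope, 1.0)
smoothed = moving_average(envelope)
```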

For the sampling rate, up-sampling or down-sampling processing may be performed at a predetermined frequency.

For the fundamental frequency, the fundamental frequency may be raised or lowered by zero insertion or decimation, thereby raising or lowering the pitch;

for the audio phase, the audio phase may be increased or decreased;

for the audio time length, shortening or prolonging treatment can be carried out;

the above-mentioned manners for processing different audio processing parameters are not exhaustive, and those skilled in the art may make simple changes (e.g., permutation, combination) or equivalent substitutions based on the above-mentioned manners, which are also included in the scope of the present invention.
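The zero insertion and decimation mentioned for the fundamental frequency can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the anti-aliasing filter that normally precedes decimation and the interpolation filter that normally follows zero insertion are omitted, and the function names are hypothetical. When the result is played back at the original sampling rate, keeping every second sample halves the period of a tone (the pitch rises), and the duration changes as well.

```python
import math
from typing import List


def decimate(samples: List[float], factor: int) -> List[float]:
    """Keep every `factor`-th sample (no anti-aliasing filter here)."""
    return samples[::factor]


def zero_insert(samples: List[float], factor: int) -> List[float]:
    """Insert `factor - 1` zeros after every sample (no interpolation
    filter here)."""
    out: List[float] = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out


# A sine with a period of 16 samples.
tone = [math.sin(2 * math.pi * n / 16) for n in range(64)]

raised = decimate(tone, 2)        # period is now 8 samples
stretched = zero_insert(tone, 2)  # twice as many samples, zeros in between
```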

By adopting the technical scheme, the multi-channel audio to be processed can be correspondingly processed according to the audio processing parameters, so that the corresponding audio effect can be obtained according to different audio processing parameters, and the user experience effect is improved.

In an alternative embodiment, as shown in fig. 1b, step S2 may include:

and when an audio mode instruction is received, acquiring the audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction.

In this embodiment, the audio mode instruction may be issued by a user through the terminal by touch, mouse click, keyboard input, or the like.

The terminal includes, but is not limited to, a smart phone, a palm computer, a computer, and the like.

The audio modes include, but are not limited to: a joint processing mode, a split processing mode, a speed mode, a quality mode, a balance mode, an offset mode, a hold mode, a tap mode, a soft threshold mode, a compound mode, a crisp mode, a stable mode, a smooth mode, a cross-band smoothing mode, an in-subband smoothing mode, an overall smooth mode, and the like.

The audio mode command corresponds to the audio processing parameter, which includes but is not limited to the following modes:

the audio processing parameter corresponding to the joint processing mode instruction is a joint processing parameter; the joint processing parameter is a parameter obtained by mixing the audio processing parameters of all the sound channels;

the audio processing parameter corresponding to the separation processing mode instruction is a separation processing parameter; wherein the separation processing parameter is an audio processing parameter independent of each channel;

the audio processing parameter corresponding to the offset mode instruction is an audio envelope;

the audio processing parameters corresponding to the hold mode instruction are the audio envelope and the fundamental frequency; wherein the fundamental frequency refers to the frequency of the fundamental tone and determines the pitch of the audio;

audio processing parameters corresponding to the crisp mode instruction are transient pulse and audio phase;

the audio processing parameter corresponding to the smooth mode instruction is a transient pulse;

the stable mode instruction is treated as a combination of the crisp mode instruction and the smooth mode instruction;

the audio processing parameters corresponding to the speed mode instruction are sampling rate, formants or fundamental frequency and the like;

the audio processing parameters corresponding to the quality mode instruction are transient pulse, fundamental frequency and formant;

the audio processing parameters corresponding to the balanced mode instruction are audio consistency and audio consistency;

the audio processing parameter corresponding to the cross-frequency band smoothing mode instruction is the phase of the audio corresponding to each frequency band in the audio frequency domain;

the audio processing parameter corresponding to the in-subband smoothing mode instruction is the phase of the audio corresponding to a preset frequency band in the audio frequency domain;

the audio processing parameter corresponding to the variable-speed constant-pitch mode instruction is the audio duration;

the audio processing parameter corresponding to the pitch-changing constant-speed mode instruction is the fundamental frequency;

and the audio processing parameters corresponding to the tone-changing and speed-changing mode command are the duration of the audio and the fundamental frequency of the audio.

It should be clear to a person skilled in the art that the above mentioned corresponding ways are not exhaustive, and that a person skilled in the art may also make simple variations or equivalents on the basis of the above mentioned ways, which are also intended to be included within the scope of protection of the present invention.

In this step, a correspondence between audio mode instructions and the audio processing parameters of the multi-channel audio to be processed may be established in advance; then, when an audio mode instruction is received, the audio processing parameters of the multi-channel audio to be processed corresponding to that instruction are acquired according to the correspondence. The acquisition manner includes, but is not limited to, local acquisition and remote acquisition. In local acquisition, the audio processing parameters corresponding to the audio mode instruction are acquired from the memory of a terminal, such as a smart phone or a tablet computer, in which they are pre-stored; in remote acquisition, the audio processing parameters corresponding to the audio mode instruction are acquired from a remote device through a communication means such as 3G, 4G, Wi-Fi, or Bluetooth.
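A pre-established correspondence of this kind could be held in a simple lookup table. The Python sketch below covers only part of the correspondences listed above; the string keys, the parameter names, and the function `parameters_for` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Tuple

# Partial table: audio mode instruction -> audio processing parameters.
# Keys and parameter names are illustrative, not defined by the patent.
MODE_PARAMETERS: Dict[str, Tuple[str, ...]] = {
    "offset": ("audio_envelope",),
    "hold": ("audio_envelope", "fundamental_frequency"),
    "crisp": ("transient_pulse", "audio_phase"),
    "smooth": ("transient_pulse",),
    "quality": ("transient_pulse", "fundamental_frequency", "formant"),
    "cross_band_smoothing": ("audio_phase",),
    "in_subband_smoothing": ("audio_phase",),
    "variable_speed_constant_pitch": ("audio_duration",),
    "pitch_changing_constant_speed": ("fundamental_frequency",),
    "pitch_and_speed_changing": ("audio_duration", "fundamental_frequency"),
}


def parameters_for(mode: str) -> Tuple[str, ...]:
    """Look up which parameters must be acquired for a received instruction."""
    try:
        return MODE_PARAMETERS[mode]
    except KeyError:
        raise ValueError(f"unknown audio mode instruction: {mode!r}") from None


hold_parameters = parameters_for("hold")
```

Whether the table is stored locally on the terminal or fetched from a remote device, the lookup itself stays the same.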

Therefore, by adopting the technical scheme, the corresponding processing is carried out according to the audio mode instruction sent by the user, so that the user can obtain the corresponding audio effect according to the preference of the user, and the user experience effect is improved.

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is an offset mode command;

when the audio mode instruction is received, the step of acquiring the audio processing parameter corresponding to the audio mode instruction specifically includes:

when the offset mode instruction is received, acquiring an audio envelope;

based on the audio processing parameters, the step of processing the multi-channel audio to be processed specifically includes:

and adjusting the audio envelope of the multi-channel audio to be processed.

Further, the step of adjusting the audio envelope of the multi-channel audio to be processed may specifically include: and adjusting the fundamental frequency and the formants of the multi-channel audio to be processed.

The audio envelope shifts with pitch, and pitch is determined by the fundamental frequency; therefore, this embodiment can adjust the audio envelope of the multi-channel audio to be processed by adjusting its fundamental frequency, so that the pitch of the audio can be changed, thereby improving the user experience.

In addition, because the overtones of the audio are determined by its formants, this embodiment can also adjust the audio envelope by adjusting the formants of the audio, so that the overtones can be changed.

According to this embodiment of the invention, when the audio mode instruction is received, the audio envelope is changed by adjusting the fundamental frequency and the formants of the multi-channel audio to be processed, so that an audio effect with changed pitch and overtones is realized, and the user experience effect is improved.

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a hold mode command;

when an audio mode instruction is received, the step of obtaining the audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction specifically includes:

when a hold mode instruction is received, acquiring an audio envelope and a fundamental frequency of the multi-channel audio to be processed, which correspond to the hold mode instruction;

the fundamental frequency is adjusted while preserving the audio envelope.

The fundamental frequency can be obtained by an autocorrelation method, a waveform estimation method, a cepstrum method, a cyclic histogram method, and the like.

In the embodiment, when an audio mode instruction is received, an audio envelope and a fundamental frequency of a multi-channel audio to be processed are obtained, and the fundamental frequency is adjusted under the condition of keeping the audio envelope; for example, the fundamental frequency may be increased or decreased to achieve a transposition of the multi-channel audio to be processed, thereby improving the user experience.
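Of the methods listed above for obtaining the fundamental frequency, the autocorrelation method is the simplest to sketch. The following Python example is a minimal illustration and not the patent's implementation: the search range of 50 Hz to 500 Hz, the function name, and the test tone are assumptions. It covers only the estimation of the fundamental frequency for one channel, not the subsequent adjustment with the envelope preserved.

```python
import math
from typing import List


def estimate_f0_autocorrelation(
    samples: List[float],
    sample_rate: float,
    f_min: float = 50.0,
    f_max: float = 500.0,
) -> float:
    """Estimate the fundamental frequency in Hz as sample_rate / lag,
    where lag maximizes the (unnormalized) autocorrelation."""
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(samples) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # The sum runs over fewer terms for larger lags, which slightly
        # favors the shortest period over its multiples.
        score = sum(
            samples[n] * samples[n + lag] for n in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
f0 = estimate_f0_autocorrelation(tone, sample_rate)
```

For multi-channel audio, the estimate would be computed per channel or on mixed parameters, depending on whether separate or joint processing is selected.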

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a crisp mode command;

when an audio mode instruction is received, the step of obtaining the audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction specifically includes:

when a crisp mode instruction is received, acquiring transient pulses and audio phases of multi-channel audio to be processed, which correspond to the crisp mode instruction;

the audio phase is set to a preset value.

Wherein the phase of the transient pulse may be set to zero in the full frequency domain of the multi-channel audio to be processed based on the amplitude-frequency characteristics of the multi-channel audio to be processed.

In this embodiment, when the crisp mode instruction is received, the transient pulses and the audio phase of the multi-channel audio to be processed are obtained. Then, according to the amplitude-frequency characteristic and the phase-frequency characteristic of the multi-channel audio to be processed, either the phase of the transient pulse is set to zero over the full frequency domain of the multi-channel audio to be processed, or that phase is set to zero and the amplitude of the transient pulse is additionally limited. The audio phase is set to a preset value. The sharp sound caused by the transient pulse is thereby weakened, and a crisp audio effect is obtained perceptually, which improves the user experience.
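The zero-phase step of the crisp mode can be sketched as follows. This is a hedged illustration only: it assumes the transient pulse has already been isolated in a short frame, uses a plain O(N²) DFT to stay self-contained, and treats the optional `max_magnitude` limit as one possible reading of "limiting the amplitude". All names are hypothetical, and setting the audio phase to a preset value is not shown.

```python
import cmath

def dft(x):
    """Plain discrete Fourier transform (O(N^2), adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def zero_phase_transient(frame, max_magnitude=None):
    """Set the phase of every frequency bin of a transient frame to zero.

    If max_magnitude is given, bin magnitudes are also clipped to it.
    """
    magnitudes = [abs(c) for c in dft(frame)]
    if max_magnitude is not None:
        magnitudes = [min(m, max_magnitude) for m in magnitudes]
    # A zero-phase spectrum is real and non-negative in every bin.
    return idft([complex(m, 0.0) for m in magnitudes])

# A click at sample 3 becomes a click centred on sample 0.
click = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
zero_phased = zero_phase_transient(click)
```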

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a smooth mode command;

when a smooth mode instruction is received, acquiring transient pulses of multi-channel audio to be processed corresponding to the smooth mode instruction;

based on the transient pulse, in the frequency domain of the multi-channel audio to be processed, smoothing processing is carried out on the amplitude of the frequency domain corresponding to the transient pulse in the preset frequency range.

In the embodiment, when an audio mode instruction is received, transient pulses of multi-channel audio to be processed are obtained; and then, based on the amplitude-frequency characteristics of the multi-channel audio to be processed, smoothing the amplitude of the frequency domain corresponding to the transient pulse in the preset frequency range in the frequency domain of the multi-channel audio to be processed, so as to realize amplitude limiting of the transient pulse in the frequency domain, thereby obtaining a smooth audio effect in the sense of hearing and improving the user experience effect.
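A minimal sketch of smoothing the spectral amplitude within a frequency range is given below. It assumes a moving average over neighbouring bins as the smoothing operation and keeps each bin's phase unchanged; the text does not specify the smoothing function, so the function, the `width` parameter, and the bin-index interface are hypothetical.

```python
import cmath

def dft(x):
    """Plain discrete Fourier transform (O(N^2), adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def smooth_band_magnitude(frame, bin_lo, bin_hi, width=3):
    """Moving-average the magnitudes of bins bin_lo..bin_hi, keeping phases.

    Requires 0 < bin_lo <= bin_hi < len(frame) // 2.
    """
    spectrum = dft(frame)
    n = len(spectrum)
    mags = [abs(c) for c in spectrum]
    half = width // 2
    new_spec = list(spectrum)
    for k in range(bin_lo, bin_hi + 1):
        lo, hi = max(0, k - half), min(n // 2, k + half)
        smoothed = sum(mags[lo:hi + 1]) / (hi - lo + 1)
        new_spec[k] = cmath.rect(smoothed, cmath.phase(spectrum[k]))
        # Mirror to the negative-frequency bin so the output stays real.
        new_spec[n - k] = new_spec[k].conjugate()
    return idft(new_spec)
```

Averaging spreads a sharp spectral peak over its neighbours, which lowers the peak amplitude inside the chosen range; this corresponds to the amplitude-limiting effect described above.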

It will be understood by those skilled in the art that obvious modifications or equivalent substitutions may be made to the above embodiment for obtaining a crisp audio effect and the above embodiment for obtaining a smooth audio effect; for example, those skilled in the art may combine the crisp mode and the smooth mode to form a stable mode. When a stable mode instruction is received, the embodiment of the invention acquires the transient pulses and the audio phase of the multi-channel audio to be processed; then either sets the phase of the transient pulse to zero over the full frequency domain of the multi-channel audio to be processed, or sets that phase to zero and additionally limits the amplitude of the transient pulse; sets the audio phase to a preset value; and, in the frequency domain of the multi-channel audio to be processed, smooths the amplitude of the frequency-domain components corresponding to the transient pulse within the preset frequency range, so that the amplitude of the transient pulse is limited. An audio effect between crisp and smooth can thus be obtained perceptually, thereby improving the user experience.

In a specific implementation of the embodiment of the invention, corresponding steps can be added on the basis of the above embodiments. For example, a transient pulse detection mode may also be included. The transient pulse detection modes include, but are not limited to, a tapping mode, a soft threshold mode, and a composite mode. The audio processing parameters corresponding to the tapping mode are a plurality of transient pulses which are spaced from each other; the audio processing parameter corresponding to the soft threshold mode is a transient pulse within a predetermined period; the audio processing parameters corresponding to the composite mode are a plurality of transient pulses which are spaced from each other, together with a transient pulse within a predetermined period. Through the detection mode, the condition of the transient pulses in the multi-channel audio to be processed can be obtained, and the transient pulses can be processed in a targeted manner.

When the embodiment of the invention receives a tapping mode instruction, a soft threshold mode instruction or a composite mode instruction, the corresponding transient pulse processing is carried out for the corresponding detection mode.

For example, the user selects the modes by touching the terminal, and the transient pulse detection mode is the tapping mode. As shown in fig. 2, the user taps the transient pulse detection mode option on the terminal and selects the tapping mode, then taps the audio mode option and selects the smooth mode to process the multi-channel audio to be processed. The specific steps are as follows:

as shown in fig. 3, the present embodiment provides a multi-channel audio processing method, including:

step Sa 1: acquiring multi-channel audio to be processed;

step Sa 2: acquiring a transient pulse detection mode; wherein the transient pulse detection mode is a tapping mode;

step Sa 3: when a tapping mode instruction is received, determining the detection object to be a plurality of transient pulses which are spaced from each other;

step Sa 4: when a smooth mode instruction is received, acquiring a plurality of spaced transient pulses of the multi-channel audio to be processed, which correspond to the smooth mode instruction;

step Sa 5: and in the frequency domain of the multi-channel audio to be processed, smoothing the amplitude of the frequency domain corresponding to a plurality of spaced transient pulses in a preset frequency range.

Similarly, if a soft threshold mode command is received, a transient pulse in a predetermined period is smoothed, so that a smooth audio effect can be obtained, thereby improving user experience. Modifications and equivalents may be made by those skilled in the art in light of the above teachings and are not described in detail herein. Such modifications and equivalent alternative embodiments are intended to be included within the scope of the present invention.
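The text does not state how the spaced transient pulses of the tapping mode are detected. Purely as an assumed illustration, the sketch below flags a frame as a transient when its energy jumps by a factor `ratio` over the previous frame, and keeps only detections that are at least `min_gap_frames` apart. The detector, its thresholds, and all names are hypothetical.

```python
def detect_spaced_transients(samples, frame_len=64, ratio=4.0, min_gap_frames=2):
    """Return start indices of frames that look like spaced transient pulses.

    A frame counts as a transient when its energy exceeds `ratio` times the
    energy of the previous frame; detections closer together than
    `min_gap_frames` frames are dropped.
    """
    energies = [sum(s * s for s in samples[i:i + frame_len])
                for i in range(0, len(samples) - frame_len + 1, frame_len)]
    onsets = []
    last = None
    for idx in range(1, len(energies)):
        # The small constant avoids a zero threshold after a silent frame.
        jump = energies[idx] > ratio * (energies[idx - 1] + 1e-12)
        if jump and (last is None or idx - last >= min_gap_frames):
            onsets.append(idx * frame_len)
            last = idx
    return onsets

# Silence with two clicks, at samples 100 and 400.
signal = [0.0] * 640
signal[100] = 1.0
signal[400] = 1.0
onsets = detect_spaced_transients(signal)
```

The returned frame positions could then be passed to the smoothing step Sa 5, applied channel by channel.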

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a speed mode command;

when a speed mode instruction is received, acquiring a preset sampling rate and a formant or a fundamental frequency of the multi-channel audio to be processed, which correspond to the speed mode instruction;

down-sampling multi-channel audio to be processed according to a preset sampling rate;

smoothing the formants of the multi-channel audio to be processed; or adjusting the fundamental frequency of the multi-channel audio to be processed.

The down-sampling of the multi-channel audio to be processed may be, for example, down-sampling 48 kHz audio to 4 kHz, or may be implemented by decimating even sample points, which is not limited by the present invention. Down-sampling stretches the audio spectrum relative to the new sampling rate and therefore easily causes aliasing of the audio spectrum; preferably, anti-aliasing filtering is performed before the down-sampling step.
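The following sketch shows anti-aliasing filtering followed by decimation, using the 48 kHz to 4 kHz example (a factor of 12). The Hamming-windowed sinc filter, the tap count, and the function names are assumptions chosen for illustration; the text does not prescribe a particular filter.

```python
import math

def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass filter with unity DC gain.

    cutoff is a fraction of the input sample rate (0 < cutoff < 0.5);
    num_taps should be odd.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        x = i - mid
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def downsample(samples, factor, num_taps=97):
    """Low-pass filter below the new Nyquist frequency, then keep every
    `factor`-th sample. Samples outside the signal are treated as zero."""
    taps = lowpass_fir(num_taps, 0.5 / factor)
    half = num_taps // 2
    out = []
    for centre in range(0, len(samples), factor):
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = centre + j - half
            if 0 <= idx < len(samples):
                acc += tap * samples[idx]
        out.append(acc)
    return out

# 10 ms of a 10 kHz tone at 48 kHz, reduced to a 4 kHz sample rate.
tone_48k = [math.sin(2 * math.pi * 10000 * n / 48000) for n in range(480)]
tone_4k = downsample(tone_48k, 12)
```

Because the filter is evaluated only at the retained sample positions, the amount of computation shrinks with the down-sampling factor, which is consistent with the goal of the speed mode.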

In this embodiment, the speed mode may be regarded as a speed processing priority mode, which means that when the pitch is changed, the multi-channel audio to be processed is firstly down-sampled according to a predetermined sampling rate, and then is processed based on the short-time spectral characteristics. Since the amount of data processed is small, the processing time is shortened, thereby improving the processing speed.

In practical applications, when a user selects a speed mode, the embodiment of the present invention selects a predetermined portion of audio characteristics for processing, for example, a formant of a multi-channel audio to be processed may be selected for processing, or a formant and a transient pulse of the multi-channel audio to be processed may be processed, so as to increase a processing speed for processing the multi-channel audio to be processed, thereby improving a user experience effect.

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a quality mode command;

when a quality mode instruction is received, acquiring transient pulses, fundamental frequencies and formants of the multi-channel audio to be processed, which correspond to the quality mode instruction;

smoothing the transient pulses and formants of the multi-channel audio to be processed in real time, and adjusting the fundamental frequency of the multi-channel audio to be processed in real time.

In practical application, when a user selects the quality mode, the embodiment of the invention processes the transient pulses, fundamental frequency and formants of the multi-channel audio to be processed in real time, so as to change the pitch and timbre and weaken the harshness caused by the transient pulses, thereby improving the sound quality and the user experience.

Of course, one skilled in the art can also form a balanced mode on the basis of the speed mode embodiment and the quality mode embodiment described above. When a balanced mode instruction is received, the processing of the speed mode embodiment and of the quality mode embodiment is performed, which is not described herein again. Such obvious modifications or equivalent embodiments are also intended to be included within the scope of the present invention.

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a cross-band smoothing mode command;

when a cross-frequency-band smooth mode instruction is received, acquiring an audio phase of the multi-channel audio to be processed, which corresponds to the cross-frequency-band smooth mode instruction;

phases within at least two frequency bands of the multi-channel audio to be processed are smoothed based on the audio phase.

In the embodiment, based on the phase-frequency characteristics of the multi-channel audio to be processed, smoothing processing is performed on the phases corresponding to the transient pulses in a plurality of frequency bands of the multi-channel audio to be processed, for example, a first frequency band, a second frequency band and a third frequency band, so that the multi-channel audio is more smooth in hearing sense, and the user experience effect is improved.

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is an intra-sub-band smooth mode command;

when an intra-sub-band smooth mode instruction is received, acquiring an audio phase of the multi-channel audio to be processed corresponding to the intra-sub-band smooth mode instruction;

the phase within a subband of any frequency band of the multi-channel audio to be processed is smoothed based on the audio phase.

In the embodiment, when the instruction of the smoothing mode in the sub-band is received, the audio phase of the multi-channel audio to be processed is obtained, and the smoothing processing is performed on the phase in the sub-band of any frequency band of the multi-channel audio to be processed based on the phase-frequency characteristic of the multi-channel audio to be processed, so that the audio is smoothed in the sense of hearing, and the user experience effect is improved.

Of course, in practical applications, those skilled in the art can also combine the above cross-band smooth mode embodiment and intra-sub-band smooth mode embodiment to form an overall smooth mode. When an overall smooth mode instruction is received, the audio phase of the multi-channel audio to be processed is acquired; the phases within at least two frequency bands of the multi-channel audio to be processed, as well as the phase within a sub-band of any one frequency band, are then smoothed so that the audio sounds smoother, thereby improving the user experience. Such obviously modified embodiments are also intended to be included within the scope of protection of the present invention.
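One possible reading of phase smoothing within frequency bands is sketched below. It assumes that each band is given as a pair of DFT bin indices, that the phase is unwrapped inside the band, and that a moving average is used as the smoother; none of these choices comes from the text. Passing two or more bands corresponds to the cross-band case, and passing a single narrow range corresponds to the intra-sub-band case.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform (O(N^2), adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def unwrap(phases):
    """Remove 2*pi jumps between consecutive phase values."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(out[-1] + d)
    return out

def smooth_phase_in_bands(frame, bands, width=3):
    """Moving-average the phase inside each (lo_bin, hi_bin) band.

    Magnitudes are left unchanged. Each band must satisfy
    0 < lo_bin <= hi_bin < len(frame) // 2.
    """
    spectrum = dft(frame)
    n = len(spectrum)
    new_spec = list(spectrum)
    half = width // 2
    for lo, hi in bands:
        phases = unwrap([cmath.phase(spectrum[k]) for k in range(lo, hi + 1)])
        for i, k in enumerate(range(lo, hi + 1)):
            a, b = max(0, i - half), min(len(phases) - 1, i + half)
            avg = sum(phases[a:b + 1]) / (b - a + 1)
            new_spec[k] = cmath.rect(abs(spectrum[k]), avg)
            # Mirror to the negative-frequency bin so the output stays real.
            new_spec[n - k] = new_spec[k].conjugate()
    return idft(new_spec)
```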

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a variable-speed, constant-pitch mode command;

when a variable-speed, constant-pitch mode instruction is received, acquiring the audio duration of the multi-channel audio to be processed corresponding to the variable-speed, constant-pitch mode instruction;

and shortening or prolonging the audio time length of the multi-channel audio to be processed.

Those skilled in the art will appreciate that embodiments of the present invention may also process the multi-channel audio to be processed in the frequency domain to achieve a variable-speed, constant-pitch audio effect. For example, the multi-channel audio to be processed is divided into several segments; interpolation is then performed between each pair of adjacent segments in sequence; finally, resampling is performed. This is repeated until all segments are processed, which achieves the same technical effect as adjusting the audio duration in the time domain and thus realizes variable-speed, constant-pitch processing of the multi-channel audio to be processed.

In this embodiment, the audio duration of the multi-channel audio to be processed is shortened or prolonged in the time domain while the pitch is kept unchanged, so that the playback speed is changed, thereby improving the user experience.
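A simple way to change duration in the time domain without resampling the waveform is overlap-add (OLA): windowed frames are read from the input at one hop size and written to the output at another. The sketch below is an assumed illustration of that general technique, not the method of the invention; plain OLA can introduce audible artifacts at frame boundaries, and the names and frame length are hypothetical.

```python
import math

def time_stretch_ola(samples, rate, frame_len=256):
    """Change duration by overlap-add; rate > 1 shortens, rate < 1 lengthens.

    Frames are copied without resampling, so the local waveform, and hence
    the pitch, is approximately preserved.
    """
    if len(samples) < frame_len:
        raise ValueError("input must contain at least one full frame")
    hop_out = frame_len // 2
    hop_in = hop_out * rate
    # Periodic Hann window: overlapping copies at 50% overlap sum to 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    n_frames = int((len(samples) - frame_len) / hop_in) + 1
    out = [0.0] * ((n_frames - 1) * hop_out + frame_len)
    for f in range(n_frames):
        start_in = int(round(f * hop_in))
        start_out = f * hop_out
        for i in range(frame_len):
            out[start_out + i] += window[i] * samples[start_in + i]
    return out

def time_stretch_channels(channels, rate):
    """Apply the same stretch to every channel so they stay aligned."""
    return [time_stretch_ola(ch, rate) for ch in channels]
```

For example, `time_stretch_ola(x, 2.0)` yields roughly half the original duration and `time_stretch_ola(x, 0.5)` roughly double.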

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a variable-pitch, constant-speed mode command;

when a variable-pitch, constant-speed mode instruction is received, acquiring the fundamental frequency of the multi-channel audio to be processed corresponding to the variable-pitch, constant-speed mode instruction;

and adjusting the fundamental frequency of the multi-channel audio to be processed.

This embodiment adjusts the fundamental frequency of the multi-channel audio to be processed, for example by performing linear interpolation in the frequency domain, which contracts or expands the frequency spectrum of the multi-channel audio to be processed while leaving the relationship between the harmonic components unchanged. The pitch-shifting audio effect is thereby realized, and the user experience is improved.
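The spectrum contraction or expansion by linear interpolation can be sketched for a single frame as follows. This is a rough illustration under stated assumptions: an even frame length, a plain DFT, and complex linear interpolation between neighbouring bins. The function names and the `ratio` parameter are hypothetical. Because every frequency is scaled by the same ratio, the ratios between harmonic components are unchanged.

```python
import cmath

def dft(x):
    """Plain discrete Fourier transform (O(N^2), adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def shift_pitch_frame(frame, ratio):
    """Scale the frequency axis of one frame by `ratio` (len(frame) even).

    ratio > 1 expands the spectrum (higher pitch); ratio < 1 contracts it.
    The frame length, and therefore the duration, is unchanged.
    """
    n = len(frame)
    half = n // 2
    spectrum = dft(frame)
    new_spec = [0j] * n
    for k in range(half + 1):
        src = k / ratio          # position in the original spectrum
        lo = int(src)
        if lo >= half:
            continue             # beyond the original band: leave as zero
        frac = src - lo
        new_spec[k] = (1 - frac) * spectrum[lo] + frac * spectrum[lo + 1]
    # Rebuild the negative-frequency bins so the output is real.
    for k in range(1, half):
        new_spec[n - k] = new_spec[k].conjugate()
    return idft(new_spec)
```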

In an alternative embodiment, based on the embodiment shown in fig. 1b, the audio mode command is a variable-pitch, variable-speed mode command;

when a variable-pitch, variable-speed mode instruction is received, acquiring the audio duration and the fundamental frequency of the multi-channel audio to be processed, which correspond to the variable-pitch, variable-speed mode instruction;

shortening or prolonging the audio duration;

the fundamental frequency is adjusted.

In this embodiment, by combining the variable-speed, constant-pitch embodiment and the variable-pitch, constant-speed embodiment, an audio effect in which both the speed and the pitch are changed can be realized, so that the user experience is improved.
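As a point of comparison, plain resampling changes duration and pitch together: reading the waveform `ratio` times faster shortens it by that factor and, when played back at the original sample rate, raises every frequency by the same factor. The sketch below uses linear interpolation and is only an assumed, simplified illustration of a combined pitch and speed change; the function name and parameter are hypothetical.

```python
def resample_linear(samples, ratio):
    """Read the signal `ratio` times faster using linear interpolation.

    ratio > 1 gives a shorter, higher-pitched result at the original
    playback rate; ratio < 1 gives a longer, lower-pitched result.
    """
    last = len(samples) - 1
    n_out = int(last / ratio) + 1
    out = []
    for i in range(n_out):
        pos = i * ratio
        lo = min(int(pos), last)
        frac = pos - lo
        nxt = samples[min(lo + 1, last)]
        out.append((1 - frac) * samples[lo] + frac * nxt)
    return out

ramp = [float(v) for v in range(9)]
faster = resample_linear(ramp, 2.0)   # 5 samples
slower = resample_linear(ramp, 0.5)   # 17 samples
```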

It will be appreciated by those skilled in the art that obvious modifications (e.g., combinations of the enumerated modes) or equivalents may be made to the above embodiments, and for example, those skilled in the art may combine the joint processing mode and the separate processing mode to process the multi-channel audio to be processed based on the above embodiments.

In the above, although the steps in the embodiment of the multi-channel audio processing method are described in the above sequence, it should be clear to those skilled in the art that the steps in the embodiment of the present invention are not necessarily performed in the above sequence, and may also be performed in other sequences such as reverse, parallel, and cross, and further, on the basis of the above steps, those skilled in the art may also add other steps, and these obvious modifications or equivalents should also be included in the protection scope of the present invention, and are not described herein again.

For convenience of description, only the parts relevant to the apparatus embodiments of the present invention are shown below; for specific technical details that are not disclosed, please refer to the method embodiments of the present invention.

In order to solve the technical problem of how to improve the user experience effect, an embodiment of the present invention provides a multi-channel audio processing apparatus. The apparatus may perform the steps described in the above-described multi-channel audio processing method embodiments. As shown in fig. 4, the apparatus mainly includes: a first acquisition module 41, a second acquisition module 42 and a processing module 43. The first obtaining module 41 is configured to obtain multi-channel audio to be processed. The second obtaining module 42 is configured to obtain audio processing parameters. The processing module 43 is configured to process the multi-channel audio to be processed based on the audio processing parameters.

The embodiment obtains the multi-channel audio to be processed and the audio processing parameters through the first obtaining module 41 and the second obtaining module 42, respectively; then, the processing module 43 processes the multi-channel audio to be processed based on the audio processing parameters; corresponding audio effects can be obtained according to different audio processing parameters, and therefore user experience effects are improved.
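The division into two acquisition modules and one processing module can be pictured with the following sketch. The class name, the callable-based interface, and the dictionary used for the parameters are assumptions made only for illustration; the text defines the modules functionally and does not prescribe an API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Channels = List[List[float]]   # one list of samples per channel

@dataclass
class MultiChannelAudioProcessor:
    acquire_audio: Callable[[], Channels]                       # first acquisition module 41
    acquire_params: Callable[[], Dict[str, float]]              # second acquisition module 42
    process: Callable[[Channels, Dict[str, float]], Channels]   # processing module 43

    def run(self) -> Channels:
        """Obtain the audio and the parameters, then process the audio."""
        audio = self.acquire_audio()
        params = self.acquire_params()
        return self.process(audio, params)
```

Different audio effects then correspond to supplying different parameters and a different `process` callable, while the acquisition steps stay the same.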

In an alternative embodiment, as shown in fig. 4, the second obtaining module 42 is further configured to, when receiving the audio mode instruction, obtain an audio processing parameter of the multi-channel audio to be processed corresponding to the audio mode instruction.

Audio modes include, but are not limited to: a joint processing mode, a separate processing mode, a speed mode, a quality mode, a balance mode, an offset mode, a hold mode, a tapping mode, a soft threshold mode, a composite mode, a crisp mode, a stable mode, a smooth mode, a cross-band smooth mode, an intra-sub-band smooth mode, an overall smooth mode, and the like.

In this embodiment, the audio mode command may be implemented by a user through the terminal in a manner of touch, mouse click, keyboard click, and/or the like.

The audio processing parameters include, but are not limited to, fundamental frequency, formants, audio phase, transient impulse, audio duration, etc.

By adopting this technical scheme, the second obtaining module 42 obtains the audio mode instruction sent by the user, and the processing module 43 then performs the corresponding processing according to the audio mode instruction, so that the user can obtain an audio effect matching the user's own preference, which improves the user experience.
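The correspondence between audio mode instructions and the parameters to be acquired, as described in the embodiments, can be collected in a lookup table. The sketch below is illustrative: the string keys and parameter names are hypothetical identifiers chosen for this example, and only the pairings themselves are taken from the embodiments.

```python
# Parameters acquired by the second obtaining module for each mode instruction.
MODE_PARAMETERS = {
    "offset": ("audio_envelope",),
    "hold": ("audio_envelope", "fundamental_frequency"),
    "crisp": ("transient_pulse", "audio_phase"),
    "smooth": ("transient_pulse",),
    "speed": ("predetermined_sampling_rate", "formant_or_fundamental_frequency"),
    "quality": ("transient_pulse", "fundamental_frequency", "formant"),
    "cross_band_smooth": ("audio_phase",),
    "intra_sub_band_smooth": ("audio_phase",),
    "variable_speed_constant_pitch": ("audio_duration",),
    "variable_pitch_constant_speed": ("fundamental_frequency",),
    "variable_pitch_variable_speed": ("audio_duration", "fundamental_frequency"),
}

def parameters_for(mode):
    """Return the parameter names to acquire for a given mode instruction."""
    try:
        return MODE_PARAMETERS[mode]
    except KeyError:
        raise ValueError(f"unknown audio mode: {mode!r}") from None
```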

In an alternative embodiment, the audio mode command is an offset mode command;

the second obtaining module 42 is specifically configured to, when receiving the offset mode instruction, obtain an audio envelope of the multi-channel audio to be processed, which corresponds to the offset mode instruction;

the processing module 43 is specifically configured to adjust an audio envelope of the multi-channel audio to be processed.

In a preferred embodiment, the processing module 43 is specifically configured to adjust the fundamental frequency and formants of the multi-channel audio to be processed.

In the embodiment of the present invention, when the second obtaining module 42 receives the audio mode instruction, the processing module 43 changes the audio envelope by adjusting the fundamental frequency and the formants of the multi-channel audio to be processed, so as to achieve the audio effect of changing both pitch and timbre, thereby improving the user experience.

In an alternative embodiment, the audio mode command is a hold mode command;

the second obtaining module 42 is specifically configured to, when receiving the hold mode instruction, obtain an audio envelope and a fundamental frequency of the multi-channel audio to be processed, which correspond to the hold mode instruction;

the processing module 43 is specifically configured to adjust the fundamental frequency while preserving the audio envelope.

In this embodiment, when receiving the audio mode instruction, the second obtaining module 42 obtains the audio envelope and the fundamental frequency of the multi-channel audio to be processed, and then the processing module 43 adjusts the fundamental frequency while maintaining the audio envelope; for example, the fundamental frequency may be increased or decreased to achieve a transposition of the multi-channel audio to be processed, thereby improving the user experience.

In an alternative embodiment, the audio mode command is a crisp mode command;

the second obtaining module 42 is specifically configured to, when the crisp mode instruction is received, obtain a transient pulse and an audio phase of the multi-channel audio to be processed, which correspond to the crisp mode instruction;

the processing module 43 is specifically configured to either set the phase of the transient pulse to zero over the full frequency domain of the multi-channel audio to be processed, or set that phase to zero and additionally limit the amplitude of the transient pulse; and to set the audio phase to a preset value.

In this embodiment, when receiving the crisp mode instruction, the second obtaining module 42 obtains the transient pulses and the audio phase of the multi-channel audio to be processed. Then, according to the amplitude-frequency characteristic and the phase-frequency characteristic of the multi-channel audio to be processed, the processing module 43 either sets the phase of the transient pulse to zero over the full frequency domain of the multi-channel audio to be processed, or sets that phase to zero and additionally limits the amplitude of the transient pulse, and sets the audio phase to a preset value. The sharp sound caused by the transient pulse is thereby weakened, and a crisp audio effect is obtained perceptually, which improves the user experience.

In an alternative embodiment, the audio mode instruction is a smooth mode instruction;

the second obtaining module 42 is specifically configured to, when receiving the smooth mode instruction, obtain a transient pulse of the multi-channel audio to be processed, which corresponds to the smooth mode instruction;

the processing module 43 is specifically configured to perform smoothing processing on the amplitude of the frequency domain corresponding to the transient pulse in the predetermined frequency range in the frequency domain of the multi-channel audio to be processed based on the transient pulse.

In this embodiment, when receiving the audio mode instruction, the second obtaining module 42 obtains a transient pulse of the multi-channel audio to be processed; then, the processing module 43 performs smoothing processing on the amplitude of the frequency domain corresponding to the transient pulse in the predetermined frequency range in the frequency domain of the multi-channel audio to be processed based on the amplitude-frequency characteristic of the multi-channel audio to be processed, so as to realize amplitude limiting of the transient pulse in the frequency domain, thereby obtaining a smooth audio effect in terms of hearing, and improving the user experience effect.

In an alternative embodiment, the audio mode command is a speed mode command;

the second obtaining module 42 is specifically configured to, when receiving the speed mode instruction, obtain a predetermined sampling rate and a formant or a fundamental frequency of the multi-channel audio to be processed, which correspond to the speed mode instruction;

the processing module 43 is specifically configured to perform downsampling on the multi-channel audio to be processed according to a predetermined sampling rate; moreover, smoothing is carried out on the formants of the multi-channel audio to be processed; or adjusting the fundamental frequency of the multi-channel audio to be processed.

In an alternative embodiment, the audio mode command is a quality mode command;

the second obtaining module 42 is specifically configured to, when the quality mode instruction is received, obtain a transient pulse, a fundamental frequency, and a formant of the multi-channel audio to be processed, which correspond to the quality mode instruction;

the processing module 43 is specifically configured to smooth transient pulses and formants of the multi-channel audio to be processed in real time, and adjust fundamental frequencies of the multi-channel audio to be processed in real time.

In the implementation of this embodiment, when the user selects the quality mode, the second obtaining module 42 obtains the transient pulse, the fundamental frequency, and the formant of the multi-channel audio to be processed; then, the processing module 43 processes the transient pulse, the fundamental frequency, and the formant of the multi-channel audio to be processed in real time to change the pitch and the timbre and weaken the harshness caused by the transient pulse, thereby improving the sound quality and the user experience.

In an alternative embodiment, the audio mode instruction is a cross-band smoothing mode instruction;

the second obtaining module 42 is specifically configured to, when the cross-band smoothing mode instruction is received, obtain an audio phase of the multi-channel audio to be processed, which corresponds to the cross-band smoothing mode instruction;

the processing module 43 is specifically configured to smooth the phases within at least two frequency bands of the multi-channel audio to be processed based on the audio phase.

In an alternative embodiment, the audio mode instruction is an intra-sub-band smooth mode instruction;

the second obtaining module 42 is specifically configured to, when an intra-sub-band smooth mode instruction is received, obtain an audio phase of the multi-channel audio to be processed, which corresponds to the intra-sub-band smooth mode instruction;

the processing module 43 is specifically configured to smooth the phase within a subband of any frequency band of the multi-channel audio to be processed, based on the audio phase.

In this embodiment, when receiving the in-subband smooth mode instruction, the second obtaining module 42 obtains the audio phase of the multi-channel audio to be processed; then, the processing module 43 performs smoothing processing on the phase in the sub-band of any frequency band of the multi-channel audio to be processed based on the phase-frequency characteristics of the multi-channel audio to be processed, so as to make the audio smooth in auditory sense, thereby improving the user experience effect.

In an alternative embodiment, the audio mode command is a variable-speed, constant-pitch mode command;

the second obtaining module 42 is specifically configured to, when a variable-speed, constant-pitch mode instruction is received, obtain an audio duration of the multi-channel audio to be processed, which corresponds to the variable-speed, constant-pitch mode instruction;

the processing module 43 is specifically configured to shorten or extend the audio duration of the multi-channel audio to be processed.

By adopting the above technical solution, the second obtaining module 42 obtains the audio duration of the multi-channel audio to be processed; the processing module 43 then shortens or prolongs the audio duration of the multi-channel audio to be processed in the time domain, so that the playback speed is changed while the pitch is kept unchanged, thereby improving the user experience.

In an alternative embodiment, the audio mode command is a variable-pitch, constant-speed mode command;

the second obtaining module 42 is specifically configured to, when receiving the variable-pitch, constant-speed mode instruction, obtain a fundamental frequency of the multi-channel audio to be processed, which corresponds to the variable-pitch, constant-speed mode instruction;

the processing module 43 is specifically configured to adjust the fundamental frequency of the multi-channel audio to be processed.

In this embodiment, the processing module 43 adjusts the fundamental frequency of the multi-channel audio to be processed, for example by performing linear interpolation in the frequency domain, which contracts or expands the frequency spectrum of the multi-channel audio to be processed while leaving the relationship between the harmonic components unchanged. The pitch-shifting audio effect is thereby realized, and the user experience is improved.

In an alternative embodiment, the audio mode command is a variable-pitch, variable-speed mode command;

the second obtaining module 42 is specifically configured to, when receiving the variable-pitch, variable-speed mode instruction, obtain an audio duration and a fundamental frequency of the multi-channel audio to be processed, which correspond to the variable-pitch, variable-speed mode instruction;

the processing module 43 is specifically configured to shorten or lengthen the audio duration and adjust the fundamental frequency.

In the present embodiment, by combining the foregoing variable-speed, constant-pitch embodiment and the foregoing variable-pitch, constant-speed embodiment, the second obtaining module 42 and the processing module 43 can achieve an audio effect in which both the speed and the pitch are changed, thereby improving the user experience.

For detailed descriptions of the working principle, the technical effect of the implementation, and the like of the embodiment of the multi-channel audio processing apparatus, reference may be made to the related descriptions in the foregoing embodiment of the multi-channel audio processing method, and further description is omitted here.

Fig. 5 is a hardware block diagram illustrating a multi-channel audio processing hardware apparatus according to an embodiment of the present disclosure. As shown in fig. 5, a multi-channel audio processing hardware apparatus 50 according to an embodiment of the present disclosure includes a memory 51 and a processor 52.

The memory 51 is used to store non-transitory computer readable instructions. In particular, memory 51 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc.

The processor 52 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the multi-channel audio processing hardware device 50 to perform desired functions. In one embodiment of the present disclosure, the processor 52 is configured to execute the computer readable instructions stored in the memory 51, so that the multi-channel audio processing hardware device 50 performs all or part of the steps of the multi-channel audio processing method of the embodiments of the present disclosure.

Those skilled in the art should understand that, in order to solve the technical problem of how to obtain a good user experience, the present embodiment may also include well-known structures such as a communication bus, an interface, and the like, and these well-known structures should also be included in the protection scope of the present invention.

For the detailed description of the present embodiment, reference may be made to the corresponding descriptions in the foregoing embodiments, which are not repeated herein.

Fig. 6 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present disclosure. As shown in fig. 6, a computer-readable storage medium 60 according to an embodiment of the present disclosure has non-transitory computer-readable instructions 61 stored thereon. When the non-transitory computer-readable instructions 61 are executed by a processor, all or part of the steps of the multi-channel audio processing method according to the embodiments of the present disclosure are performed.

The computer-readable storage medium 60 includes, but is not limited to: optical storage media (e.g., CD-ROMs and DVDs), magneto-optical storage media (e.g., MOs), magnetic storage media (e.g., magnetic tapes or removable disks), media with built-in rewritable non-volatile memory (e.g., memory cards), and media with built-in ROMs (e.g., ROM cartridges).

Fig. 7 is a diagram illustrating a hardware structure of a terminal device according to an embodiment of the present disclosure. As shown in fig. 7, the multi-channel audio processing terminal 70 includes the above-described multi-channel audio processing apparatus embodiment 71.

The terminal device may be implemented in various forms, and the terminal device in the present disclosure may include, but is not limited to, mobile terminal devices such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation apparatus, a vehicle-mounted terminal device, a vehicle-mounted display terminal, a vehicle-mounted electronic rear view mirror, and the like, and fixed terminal devices such as a digital TV, a desktop computer, and the like.

The terminal may also include other components as equivalent alternative embodiments. As shown in fig. 8, the multi-channel audio processing terminal 80 may include a power supply unit 81, a wireless communication unit 82, an A/V (audio/video) input unit 83, a user input unit 84, a sensing unit 85, an interface unit 86, a controller 87, an output unit 88, a memory 89, and the like. Fig. 8 shows a terminal having various components, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may alternatively be implemented.

The wireless communication unit 82 allows, among other things, radio communication between the terminal 80 and a wireless communication system or network. The A/V input unit 83 is for receiving an audio or video signal. The user input unit 84 may generate key input data to control various operations of the terminal device according to a command input by a user. The sensing unit 85 detects a current state of the terminal 80, a position of the terminal 80, presence or absence of a touch input of the user to the terminal 80, an orientation of the terminal 80, acceleration or deceleration movement and direction of the terminal 80, and the like, and generates a command or signal for controlling an operation of the terminal 80. The interface unit 86 serves as an interface through which at least one external device is connected to the terminal 80. The output unit 88 is configured to provide output signals in a visual, audio, and/or tactile manner. The memory 89 may store software programs or the like for processing and controlling operations performed by the controller 87, or may temporarily store data that has been output or is to be output. The memory 89 may include at least one type of storage medium. Also, the terminal 80 may cooperate with a network storage device that performs a storage function of the memory 89 through a network connection. The controller 87 generally controls the overall operation of the terminal device. In addition, the controller 87 may include a multimedia module for reproducing or playing back multimedia data. The controller 87 may perform a pattern recognition process to recognize a handwriting input or a picture drawing input performed on the touch screen as a character or an image. The power supply unit 81 receives external power or internal power and supplies appropriate power required to operate the respective elements and components under the control of the controller 87.

Various embodiments of the multi-channel audio processing method presented in the present disclosure may be implemented using a computer-readable medium, for example with computer software, hardware, or any combination thereof. For a hardware implementation, various embodiments of the multi-channel audio processing method proposed by the present disclosure may be implemented by using at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a processor, a controller, a microcontroller, a microprocessor, and an electronic unit designed to perform the functions described herein. In some cases, various embodiments of the multi-channel audio processing method proposed by the present disclosure may be implemented in the controller 87. For a software implementation, various embodiments of the multi-channel audio processing method presented in the present disclosure may be implemented with a separate software module that allows at least one function or operation to be performed. The software code may be implemented by a software application (or program) written in any suitable programming language, which may be stored in the memory 89 and executed by the controller 87.

The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. However, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing specific details are disclosed for the purpose of illustration and description only and are not intended to be limiting; the disclosure is not limited to the specific details so described.

The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

Also, as used herein, "or" as used in a list of items beginning with "at least one of" indicates a disjunctive list, such that, for example, a list of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.

It is also noted that in the systems and methods of the present disclosure, components or steps may be decomposed and/or re-combined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.

Various changes, substitutions and alterations to the techniques described herein may be made without departing from the teachings as defined by the appended claims. Moreover, the scope of the claims of the present disclosure is not limited to the particular aspects of the process, machine, manufacture, composition of matter, means, methods and acts described above. Processes, machines, manufacture, compositions of matter, means, methods, or acts, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims

1. A multi-channel audio processing method, comprising:

acquiring multi-channel audio to be processed;

acquiring audio processing parameters;

processing the multi-channel audio to be processed based on the audio processing parameters;

the step of obtaining audio processing parameters comprises:

when an audio mode instruction is received, acquiring audio processing parameters of the multi-channel audio to be processed corresponding to the audio mode instruction;

the audio mode command is a hold mode command;

adjusting the fundamental frequency while maintaining the audio envelope;

in the case of maintaining the audio envelope, the step of adjusting the fundamental frequency specifically comprises:

increasing or decreasing the fundamental frequency while maintaining the audio envelope to achieve a transposition of the multi-channel audio to be processed.

2. The method of claim 1, wherein the audio mode command is an offset mode command;

and adjusting the audio envelope of the multi-channel audio to be processed.

3. The method according to claim 2, wherein the step of adjusting the audio envelope of the multi-channel audio to be processed comprises in particular:

4. The method of claim 1, wherein the audio mode command is a crisp mode command;

setting the audio phase to a preset value.

5. The method of claim 1, wherein the audio mode instruction is a smooth mode instruction;

6. The method of claim 1, wherein the audio mode command is a speed mode command;

adjusting the fundamental frequency of the multi-channel audio to be processed;

the step of obtaining audio processing parameters comprises:

and before the down-sampling step, performing anti-aliasing filtering processing on the multi-channel audio to be processed.

7. The method of claim 1, wherein the audio mode command is a timbre mode command;

8. The method of claim 1, wherein the audio mode instruction is a cross-band smoothing mode instruction;

9. The method of claim 1, wherein the audio mode instruction is an intra-subband flat-form instruction;

10. The method of claim 1, wherein the audio mode command is a variable speed, constant tone mode command;

11. The method of claim 1, wherein the audio mode command is a pitch invariant mode command;

adjusting the fundamental frequency of the multi-channel audio to be processed.

12. The method of claim 1, wherein the audio mode command is a pitch shift mode command;

shortening or prolonging the audio time length;

the fundamental frequency is adjusted.

13. A multi-channel audio processing apparatus, comprising:

the processing module is used for processing the multi-channel audio to be processed based on the audio processing parameters;

the second obtaining module is further configured to:

the audio mode command is a hold mode command;

and the processing module increases or decreases the fundamental frequency under the condition of keeping the audio envelope so as to realize the transposition of the multi-channel audio to be processed.

14. The apparatus of claim 13, wherein the audio mode instruction is an offset mode instruction;

15. The apparatus of claim 13, wherein the processing module is specifically configured to adjust a fundamental frequency and a formant of the multi-channel audio to be processed.

16. The apparatus of claim 13, wherein the audio mode command is a crisp mode command;

17. The apparatus of claim 13, wherein the audio mode instruction is a smooth mode instruction;

18. The apparatus of claim 13, wherein the audio mode command is a speed mode command;

the processing module is specifically configured to down-sample the multi-channel audio to be processed according to a predetermined sampling rate; and, carrying out smoothing processing on the formants of the multi-channel audio to be processed; or, adjusting the fundamental frequency of the multi-channel audio to be processed;

the second obtaining module is further configured to:

the processing module is further configured to perform anti-aliasing filtering processing on the multi-channel audio to be processed before performing the down-sampling step.

19. The apparatus of claim 13, wherein the audio mode command is a timbre mode command;

20. The apparatus of claim 13, wherein the audio mode instruction is a cross-band smoothing mode instruction;

21. The apparatus of claim 13, wherein the audio mode instruction is a sub-band in-band sliding mode instruction;

22. The apparatus of claim 13, wherein the audio mode command is a variable speed, constant tone mode command;

23. The apparatus of claim 13, wherein the audio mode command is a pitch invariant mode command;

24. The apparatus of claim 13, wherein the audio mode command is a pitch shift mode command;

25. A multi-channel audio processing hardware apparatus, comprising:

a memory for storing non-transitory computer readable instructions; and

a processor for executing the computer readable instructions such that the processor when executing performs the multi-channel audio processing method according to any of claims 1-12.

26. A computer-readable storage medium storing non-transitory computer-readable instructions that, when executed by a computer, cause the computer to perform the multi-channel audio processing method of any of claims 1-12.

27. A multi-channel audio processing terminal comprising a multi-channel audio processing apparatus as claimed in any of claims 13 to 24.
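
The anti-aliasing filtering before down-sampling recited in claims 6 and 18 can be illustrated with a minimal sketch. It assumes an integer decimation factor and a Hamming-windowed sinc low-pass filter; the claims specify neither, and the names `lowpass_taps`, `filter_channel`, `downsample`, `decimation`, and `num_taps` are hypothetical choices made only for this illustration.

```python
import math
from typing import List


def lowpass_taps(decimation: int, num_taps: int = 31) -> List[float]:
    """Hamming-windowed sinc low-pass; cutoff at the post-decimation Nyquist frequency."""
    cutoff = 0.5 / decimation          # in cycles per input sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]   # normalize to unity gain at DC


def filter_channel(samples: List[float], taps: List[float]) -> List[float]:
    """Same-length, centered FIR filtering with zero padding at both ends."""
    half = len(taps) // 2
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out


def downsample(channels: List[List[float]], decimation: int,
               num_taps: int = 31) -> List[List[float]]:
    """Low-pass every channel first, then keep every `decimation`-th sample."""
    taps = lowpass_taps(decimation, num_taps)
    return [filter_channel(ch, taps)[::decimation] for ch in channels]
```

Without the filter, a channel alternating between +1 and -1 would, after keeping every fourth sample, collapse to a constant +1, which is the aliasing the filtering step is meant to prevent. With the filter, that component is strongly attenuated, while slowly varying content passes through almost unchanged.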