CN116312636A

CN116312636A - Method, apparatus, computer device and storage medium for analyzing electronic music key

Info

Publication number: CN116312636A
Application number: CN202310288508.5A
Authority: CN
Inventors: 王佳乐
Original assignee: Guangzhou Ziyun Technology Co ltd
Current assignee: Guangzhou Ziyun Technology Co ltd
Priority date: 2023-03-21
Filing date: 2023-03-21
Publication date: 2023-06-23
Anticipated expiration: 2043-03-21
Also published as: CN116312636B

Abstract

The present application relates to an electronic music key analysis method, an apparatus, a computer device, a storage medium, and a computer program product. The method comprises: acquiring preprocessed audio of audio to be analyzed, where the preprocessed audio has, at a minimum, fewer channels than the native audio; performing spectrum conversion on the preprocessed audio according to its frequency domain information to obtain a spectrogram of the preprocessed audio; fusing the spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio, where the target spectrogram contains spectrum information of a plurality of frequency bands of the preprocessed audio; and determining the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram. The method can improve the efficiency of obtaining the electronic music key.

Description

Method, apparatus, computer device and storage medium for analyzing electronic music key

Technical Field

The present application relates to the field of audio technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for electronic music key analysis.

Background

The electronic music key is the basic key used in an electronic music work. It is the foundation on which the work is built, and its choice directly affects the quality of the work, so the key must be chosen carefully when producing an electronic music work.

Currently, the electronic music key of a song is usually obtained by manual searching. For example, an electronic music composer manually searches a website, by song name, for the key of the song. However, online resources are numerous and disorganized, so manually searching for the electronic music key is inefficient and prone to returning erroneous information.

Disclosure of Invention

In view of the foregoing, it is desirable to provide an electronic music key analysis method, apparatus, computer device, computer-readable storage medium, and computer program product that can improve the efficiency of obtaining the electronic music key.

In a first aspect, the present application provides an electronic music key analysis method. The method comprises the following steps:

acquiring preprocessed audio of audio to be analyzed; the preprocessed audio has, at a minimum, fewer channels than the native audio;

according to the frequency domain information of the preprocessed audio, performing spectrum conversion on the preprocessed audio to obtain a spectrogram of the preprocessed audio;

fusing the spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio; the target spectrogram comprises spectrum information of a plurality of frequency bands of the preprocessed audio;

and determining the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram.

In one embodiment, fusing the spectrum information in the spectrogram to obtain the target spectrogram of the preprocessed audio includes:

merging the spectrum information of a plurality of octaves in each spectrogram into spectrum information of one octave to obtain a scale-integrated spectrogram of each spectrogram; the octave of the scale-integrated spectrogram comprises a plurality of frequency bands;

and merging the spectrum data of each frequency band of the plurality of scale-integrated spectrograms to obtain the target spectrogram of the preprocessed audio.

In one embodiment, determining the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram includes:

performing key analysis on the spectrum information of each frequency band in the target spectrogram to obtain the importance of each frequency band;

and screening, according to the importance of each frequency band, a target key meeting a preset importance condition from the keys corresponding to the frequency bands, and taking the target key as the electronic music key of the audio to be analyzed.

In one embodiment, performing spectrum conversion on the preprocessed audio according to the frequency domain information of the preprocessed audio to obtain a spectrogram of the preprocessed audio includes:

acquiring target transformation information from the frequency domain information of the preprocessed audio according to the offset position in the spectrum converter; the frequency domain information is obtained by carrying out frequency spectrum analysis processing and Fourier transformation processing on the preprocessed audio;

fusing the frequency domain information and the target transformation information to obtain spectrum information of the preprocessed audio;

and converting the spectral information of the preprocessed audio into the spectrogram.

In one embodiment, acquiring the preprocessed audio of the audio to be analyzed includes:

performing compression restoration processing on the audio to be analyzed to obtain the native audio of the audio to be analyzed;

performing reduction processing on the native audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio; the reduction factor is used to control the reduction strength applied to the native audio.

In one embodiment, performing reduction processing on the native audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio includes:

performing channel fusion processing on the native audio to obtain processed audio; the number of channels of the processed audio is less than the number of channels of the native audio;

filtering the processed audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio; the reduction factor is set according to the amount of native audio.

In one embodiment, performing compression restoration processing on the audio to be analyzed to obtain the native audio of the audio to be analyzed includes:

obtaining format information and data stream information of the audio to be analyzed through an audio processing library;

decoding the audio to be analyzed according to the format information and the data stream information to obtain decoded audio of the audio to be analyzed;

and resampling the decoded audio to obtain the native audio of the audio to be analyzed.

In a second aspect, the present application also provides an electronic music key analysis apparatus. The apparatus comprises:

the audio reduction module is used for acquiring preprocessed audio of the audio to be analyzed; the preprocessed audio has, at a minimum, fewer channels than the native audio;

the spectrum conversion module is used for carrying out spectrum conversion on the preprocessed audio according to the frequency domain information of the preprocessed audio to obtain a spectrogram of the preprocessed audio;

the spectrum fusion module is used for fusing the spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio; the target spectrogram comprises spectrum information of a plurality of frequency bands of the preprocessed audio;

and the key determining module is used for determining the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:

In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:

The above electronic music key analysis method, apparatus, computer device, storage medium, and computer program product first acquire the preprocessed audio of the audio to be analyzed; then perform spectrum conversion on the preprocessed audio according to its frequency domain information to obtain a spectrogram of the preprocessed audio; then fuse the spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio; and finally determine the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram. With this approach, the user does not need to search for the key of a song manually: the preprocessed audio is converted into the target spectrogram, and the key is analyzed from that spectrogram. Because spectrogram analysis is fast, the efficiency of obtaining the electronic music key is greatly improved.

Drawings

FIG. 1 is a flow chart of an electronic music key analysis method in one embodiment;

FIG. 2 is a flow chart of the steps of acquiring the preprocessed audio of the audio to be analyzed in one embodiment;

FIG. 3 is a flow chart of an electronic music key analysis method in another embodiment;

FIG. 4 is a block diagram showing the structure of an electronic music key analysis apparatus in one embodiment;

FIG. 5 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

In one embodiment, as shown in FIG. 1, an electronic music key analysis method is provided. The method is described here as applied to a server; it is understood that the method may also be applied to a terminal, or to a system including a terminal and a server, in which case it is implemented through interaction between the terminal and the server. The terminal may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, internet of things device, or portable wearable device. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers. In this embodiment, the method includes the following steps:

Step S101, acquiring preprocessed audio of audio to be analyzed; the preprocessed audio has, at a minimum, fewer channels than the native audio.

The audio to be analyzed refers to audio data whose electronic music key needs to be analyzed. The preprocessed audio refers to audio data obtained after preprocessing the audio to be analyzed.

Specifically, the server receives the audio to be analyzed from the terminal; the terminal may obtain it by recording music that is being played. The server then performs sound quality restoration processing and reduction processing on the audio to be analyzed. The reduction processing may compress the channels of the audio to be analyzed or filter its waveform, so that the preprocessed audio contains less data than the audio to be analyzed.

Step S102, according to the frequency domain information of the preprocessed audio, performing spectrum conversion on the preprocessed audio to obtain a spectrogram of the preprocessed audio.

Here, the frequency domain information describes the characteristics of the preprocessed audio in terms of frequency. A spectrogram is a digital image of the audio waveform that displays the preprocessed audio octave by octave. For example, the horizontal axis of the spectrogram may represent frequency and the vertical axis relative intensity.

Specifically, the server performs frequency domain conversion processing on the preprocessed audio to obtain frequency domain information of the preprocessed audio; the server converts the preprocessed audio into spectrum information according to the frequency domain information of the preprocessed audio, and further draws a spectrogram according to the spectrum information.

Step S103, fusing the spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio; the target spectrogram contains spectral information of a plurality of frequency bands of the preprocessed audio.

Wherein the target spectrogram is obtained by fusing all spectrograms.

Specifically, the server may determine the number of octaves in the preprocessed audio and the number of frequency bands in each octave, and then fuse the spectrum information of the multiple octaves in the spectrogram. Because each octave includes multiple frequency bands, the server may also fuse the spectrum information of the multiple frequency bands. The server thereby obtains the target spectrogram of the preprocessed audio.

Step S104, determining the electric tone of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram.

The electronic music key is the basic key used in an electronic music work. The choice of key directly affects the quality of the resulting work.

Specifically, the server screens out a target key from the keys corresponding to the frequency bands of the target spectrogram, according to the spectrum information of each band, and takes the target key as the electronic music key of the audio to be analyzed. In practice, each frequency band can be further subdivided into a major key and a minor key, and the target key is then selected from the major and minor keys of all frequency bands. The server returns the electronic music key of the audio to be analyzed to the terminal.

In the above electronic music key analysis method, the preprocessed audio of the audio to be analyzed is acquired; spectrum conversion is then performed on the preprocessed audio according to its frequency domain information to obtain a spectrogram; the spectrum information in the spectrogram is fused to obtain a target spectrogram; and the electronic music key of the audio to be analyzed is determined according to the spectrum information of each frequency band in the target spectrogram. With this approach, the user does not need to search for the key of a song manually: the preprocessed audio is converted into the target spectrogram, and the key is analyzed from that spectrogram. Because spectrogram analysis is fast, the efficiency of obtaining the electronic music key is greatly improved.

In one embodiment, step S103, fusing the spectrum information in the spectrogram to obtain the target spectrogram of the preprocessed audio, specifically includes: merging the spectrum information of a plurality of octaves in each spectrogram into the spectrum information of one octave to obtain a scale-integrated spectrogram of each spectrogram, where the octave of the scale-integrated spectrogram comprises a plurality of frequency bands; and merging the spectrum data of each frequency band of the plurality of scale-integrated spectrograms to obtain the target spectrogram of the preprocessed audio.

Specifically, each spectrogram may cover a preset number of octaves, for example 6, and each octave may comprise 12 frequency bands, so each spectrogram contains the spectrum information of 72 frequency bands. The server integrates the data of the 6 octaves (72 frequency bands) in each spectrogram into 1 octave (12 frequency bands), obtaining the scale-integrated spectrogram corresponding to each spectrogram. The server then identifies the same frequency band across the scale-integrated spectrograms and superimposes the spectrum information of that band, obtaining 1 target spectrogram containing 12 frequency bands. The spectrum information of each frequency band of the target spectrogram may take the form of an array.
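The folding and superposition described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names `fold_octaves` and `merge_spectrograms` are hypothetical, each spectrogram is assumed to be a flat list of 72 band values ordered from the lowest band upward, and "integrating" is assumed to mean summation.

```python
def fold_octaves(bands, bands_per_octave=12):
    """Fold a multi-octave band list (e.g. 72 values) into one octave (12 values)."""
    folded = [0.0] * bands_per_octave
    for index, value in enumerate(bands):
        # Bands one octave apart share the same position within the octave.
        folded[index % bands_per_octave] += value
    return folded


def merge_spectrograms(spectrograms, bands_per_octave=12):
    """Superimpose the folded bands of several spectrograms into one target."""
    target = [0.0] * bands_per_octave
    for bands in spectrograms:
        folded = fold_octaves(bands, bands_per_octave)
        for position, value in enumerate(folded):
            target[position] += value
    return target


# Two toy spectrograms with energy in band position 9 of different octaves.
first = [0.0] * 72
first[9] = 1.0
first[21] = 2.0
second = [0.0] * 72
second[33] = 0.5
target = merge_spectrograms([first, second])
```

In this toy input all energy sits at position 9 within its octave, so the 12-band target carries the combined value at index 9 and zeros elsewhere.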

In this embodiment, the spectrum information of a plurality of octaves in each spectrogram is fused into the spectrum information of one octave, giving a scale-integrated spectrogram for each spectrogram; the spectrum data of each frequency band of the scale-integrated spectrograms is then fused into a single target spectrogram. This integrates the data of multiple spectrograms, so the server can quickly derive the electronic music key of the audio to be analyzed from the target spectrogram, without analyzing the key of each spectrogram one by one before deciding on a final key. The efficiency of obtaining the electronic music key is thus effectively improved.

In one embodiment, step S104, determining the electronic music key of the audio to be analyzed according to the spectrum information of each frequency band in the target spectrogram, specifically includes: performing key analysis on the spectrum information of each frequency band in the target spectrogram to obtain the importance of each frequency band; and screening, according to the importance of each frequency band, a target key meeting a preset importance condition from the keys corresponding to the frequency bands, and taking the target key as the electronic music key of the audio to be analyzed.

Here, the importance measures how significant the key corresponding to each frequency band is. The preset importance condition is a decision condition set on the importance of each frequency band.

Specifically, the server performs key analysis on the spectrum information of each frequency band in the target spectrogram; for example, it may use a key classifier to evaluate the importance of the major and minor keys of each frequency band (for example, 12 frequency bands), yielding a plurality of importance values (for example, 24). The server then screens out, from the keys corresponding to the frequency bands, the target key meeting the preset importance condition; for example, according to the importance of each frequency band, it selects the key with the highest importance in the target spectrogram as the target key and determines it to be the electronic music key of the audio to be analyzed.

For example, suppose the server computes the importance of the major and minor keys of 12 frequency bands, obtaining 24 importance values in total. If the highest importance belongs to the major key of the 3rd frequency band in the target spectrogram, that major key is taken as the electronic music key of the audio to be analyzed.
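The application does not specify how the key classifier computes importance. As one possible sketch, the code below scores each of the 24 candidates by the Pearson correlation between the 12-band target spectrogram and the commonly cited Krumhansl-Kessler major and minor key profiles, then picks the highest score. The choice of this technique, the profile values, and the names `key_importances` and `pick_key` are assumptions made for illustration only.

```python
import math

# Krumhansl-Kessler key profiles (an assumed choice of classifier template).
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def key_importances(target_bands):
    """Return 24 importance values keyed by (band index, 'major' or 'minor')."""
    scores = {}
    for tonic in range(12):
        # Rotate so that the candidate tonic band lines up with profile[0].
        rotated = target_bands[tonic:] + target_bands[:tonic]
        scores[(tonic, "major")] = pearson(rotated, MAJOR_PROFILE)
        scores[(tonic, "minor")] = pearson(rotated, MINOR_PROFILE)
    return scores


def pick_key(target_bands):
    """Select the candidate with the highest importance."""
    scores = key_importances(target_bands)
    return max(scores, key=scores.get)


# Toy target spectrogram shaped like the major profile starting at band index 2.
toy_target = MAJOR_PROFILE[-2:] + MAJOR_PROFILE[:-2]
best = pick_key(toy_target)
```

With this toy input the selected candidate is the major key of band index 2, which is the 3rd frequency band when counting from one, mirroring the example in the text.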

It can be appreciated that, compared with analyzing the electronic music key of each segment of the audio to be analyzed one by one, selecting the target key with the highest importance as the key of the whole audio shortens the time needed to obtain the key, which greatly improves acquisition efficiency while maintaining high accuracy.

In this embodiment, the importance of each frequency band is obtained by performing key analysis on the spectrum information of each frequency band in the target spectrogram; then, according to the importance of each frequency band, a target key meeting the preset importance condition is screened out from the keys corresponding to the frequency bands and taken as the electronic music key of the audio to be analyzed. The target key can thus be screened out of the target spectrogram quickly and effectively, so the electronic music key is obtained both quickly and accurately.

In one embodiment, step S102, performing spectrum conversion on the preprocessed audio according to the frequency domain information of the preprocessed audio to obtain a spectrogram of the preprocessed audio, specifically includes: acquiring target transformation information from the frequency domain information of the preprocessed audio according to the offset position in the spectrum converter, where the frequency domain information is obtained by performing spectrum analysis processing and Fourier transform processing on the preprocessed audio; fusing the frequency domain information and the target transformation information to obtain the spectrum information of the preprocessed audio; and converting the spectrum information of the preprocessed audio into a spectrogram.

Wherein the spectral converter is used for converting the preprocessed audio into spectral information.

Specifically, the server obtains a spectrum analyzer that includes at least a spectrum converter and a temporary window buffer. The spectrum converter contains a container covering 6 octaves of 12 frequency bands each, that is, 72 frequency bands in total, which stores the offset positions corresponding to the frequency domain information; the spectrum converter also contains a two-dimensional array for storing bass and treble.

Through the spectrum analyzer, the server processes the data in the temporary window buffer together with the preprocessed audio, for example by multiplying the two. The server then applies a Fourier transform to the result through a Fourier transform adapter (for example, an FFT adapter), obtaining the frequency domain information of the preprocessed audio. According to the offset positions in the spectrum converter, the server acquires the corresponding target transformation information from the frequency domain information; it then fuses, that is, adds, the frequency domain information and the target transformation information to obtain the spectrum information of the preprocessed audio, and draws the spectrum information as a spectrogram.
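A rough sketch of the window-multiply, transform, and offset-lookup steps follows. Several points are assumptions rather than details from the application: the temporary window buffer is taken to hold a Hann window, the offset positions are taken to be DFT bin indices of equal-tempered semitone frequencies starting at 65.406 Hz (C2), and the transform is evaluated directly at those bins instead of running a full FFT. The additive fusion step is not modeled, and the names `band_offsets` and `band_spectrum` are hypothetical.

```python
import cmath
import math


def band_offsets(sample_rate, frame_size, lowest_hz=65.406,
                 octaves=6, bands_per_octave=12):
    """Map each of the 72 semitone bands to the nearest DFT bin index."""
    offsets = []
    for band in range(octaves * bands_per_octave):
        freq = lowest_hz * 2.0 ** (band / bands_per_octave)
        offsets.append(round(freq * frame_size / sample_rate))
    return offsets


def band_spectrum(frame, offsets):
    """Window the frame, then evaluate the DFT magnitude at each offset bin."""
    n = len(frame)
    # Multiply the samples by a Hann window (assumed window buffer content).
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    magnitudes = []
    for k in offsets:
        acc = 0j
        for i, value in enumerate(windowed):
            acc += value * cmath.exp(-2j * math.pi * k * i / n)
        magnitudes.append(abs(acc))
    return magnitudes


# Toy input: a 440 Hz sine sampled at 8000 Hz, one frame of 4096 samples.
SAMPLE_RATE = 8000
FRAME_SIZE = 4096
frame = [math.sin(2.0 * math.pi * 440.0 * i / SAMPLE_RATE)
         for i in range(FRAME_SIZE)]
offsets = band_offsets(SAMPLE_RATE, FRAME_SIZE)
spectrum = band_spectrum(frame, offsets)
strongest_band = spectrum.index(max(spectrum))
```

Under these assumptions, 440 Hz lies 33 semitones above the lowest band, so the strongest of the 72 bands is index 33, which is position 9 within its octave.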

In practice, the server may also treat every 0x4000 units of preprocessed audio as one sound segment, process each segment to obtain its spectrum information, and then draw the spectrum information of each segment as a spectrogram.
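The segmentation can be sketched as below. The text does not say what the 0x4000 count refers to; the sketch assumes it is a number of samples (16384) and that a trailing partial segment is dropped, and the name `split_segments` is hypothetical.

```python
def split_segments(samples, segment_size=0x4000):
    """Split samples into consecutive full segments of segment_size each."""
    full_segments = len(samples) // segment_size
    return [samples[i * segment_size:(i + 1) * segment_size]
            for i in range(full_segments)]


# Two full segments plus five leftover samples, which are discarded here.
segments = split_segments(list(range(0x4000 * 2 + 5)))
```

Each resulting segment would then be converted to its own spectrogram before the fusion step.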

In this embodiment, target transformation information is first obtained from the frequency domain information of the preprocessed audio according to the offset position in the spectrum converter; the frequency domain information and the target transformation information are then fused to obtain the spectrum information of the preprocessed audio; finally, the spectrum information is converted into a spectrogram. This realizes the conversion from preprocessed audio to spectrogram, on which the subsequent electronic music key analysis steps can be based.

In one embodiment, as shown in FIG. 2, step S101, acquiring the preprocessed audio of the audio to be analyzed, specifically includes the following steps:

Step S201, performing compression restoration processing on the audio to be analyzed to obtain the native audio of the audio to be analyzed.

Step S202, performing reduction processing on the native audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio.

Here, native audio refers to audio data that has not been processed or compressed. The reduction factor controls the reduction strength applied to the native audio; for example, it may be set to 10 or 5.

Specifically, audio data comes in many formats. Audio in the mp3 format, for example, is compressed, so part of the data may be lost or incompletely processed. The server therefore first performs compression restoration processing on the audio to be analyzed, restoring it to native audio; this may be implemented through an audio processing library (such as the FFmpeg library). The server then sets the reduction factor according to the amount of native audio and uses it to reduce the native audio, for example by reducing its waveform and channels, obtaining the preprocessed audio of the native audio.

In this embodiment, the native audio of the audio to be analyzed is obtained by performing compression restoration processing on the audio to be analyzed; reduction processing is then performed on the native audio according to its reduction factor to obtain the preprocessed audio. This realizes the preprocessing of the audio to be analyzed: restoring the audio to native audio improves audio quality and yields richer audio information, while reducing the native audio decreases the amount of data to be computed, improving the efficiency of electronic music key analysis.

In one embodiment, step S202, performing reduction processing on the native audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio, specifically includes: performing channel fusion processing on the native audio to obtain processed audio, where the number of channels of the processed audio is less than that of the native audio; and filtering the processed audio according to the reduction factor corresponding to the native audio to obtain the preprocessed audio of the native audio, where the reduction factor is set according to the amount of native audio.

The processed audio refers to audio data after channel reduction.

Specifically, the server fuses the multi-channel data (for example, two-channel data) of the native audio into mono data, for instance by averaging the channels, to obtain the processed audio, which therefore has fewer channels than the native audio. The server derives the reduction factor from the amount of native audio: if the amount is large, the factor may be set larger, for example 10; if it is small, the factor may be set smaller, for example 5. The server then obtains a low-pass filter and low-pass filters the processed audio according to the reduction factor, obtaining the preprocessed audio of the native audio.
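A minimal sketch of the channel fusion and filtering follows. The application only states that channels may be averaged and that a low-pass filter is driven by the reduction factor; the specific filter here (averaging each block of `factor` samples and keeping one value per block, i.e. a boxcar low-pass followed by decimation) and the names `downmix_to_mono` and `reduce_audio` are assumptions for illustration.

```python
def downmix_to_mono(frames):
    """Average the channel values of each multi-channel frame into one value."""
    return [sum(frame) / len(frame) for frame in frames]


def reduce_audio(mono, factor):
    """Average every block of `factor` samples and keep one value per block."""
    reduced = []
    for start in range(0, len(mono) - factor + 1, factor):
        block = mono[start:start + factor]
        reduced.append(sum(block) / factor)
    return reduced


# Toy stereo input: 20 frames with left = 1.0 and right = 3.0.
stereo = [(1.0, 3.0)] * 20
mono = downmix_to_mono(stereo)
preprocessed = reduce_audio(mono, factor=5)
```

With a factor of 5, the 20 mono samples shrink to 4, which shows how a larger factor reduces the amount of data passed to the spectrum conversion step.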

In this embodiment, the processed audio is obtained by performing channel fusion processing on the native audio; the processed audio is then filtered according to the reduction factor corresponding to the native audio to obtain the preprocessed audio. This reduces the data size of the native audio, which effectively decreases the amount of data to be processed in the subsequent key analysis steps and further improves the efficiency of electronic music key analysis.

In one embodiment, step S201, performing compression restoration processing on the audio to be analyzed to obtain the native audio of the audio to be analyzed, specifically includes: obtaining the format information and data stream information of the audio to be analyzed through an audio processing library; decoding the audio to be analyzed according to the format information and the data stream information to obtain decoded audio of the audio to be analyzed; and resampling the decoded audio to obtain the native audio of the audio to be analyzed.

The audio processing library is a program library for recording and converting audio data; it comprises a plurality of instruction functions for processing audio data. The data stream information describes properties of the audio data such as its storage size and compression level.

Specifically, the server calls a format acquisition instruction (such as avformat_open_input) in the audio processing library (such as the FFmpeg library) to acquire the format_context information of the audio to be analyzed, thereby obtaining its format information. The server also calls a data stream acquisition instruction (such as avformat_find_stream_info) in the audio processing library to acquire the data stream information of the audio to be analyzed. The server looks up the decoder by calling a decoder lookup instruction (such as avcodec_find_decoder), and may also call a resampler setup instruction (such as swr_alloc_set_opts) to create a resampler. At this point, the server has the data, decoder, and resampler needed to restore the audio to be analyzed.

Further, the server extracts at least one audio frame from the native audio. For example, the server may call the av_init_packet function in the audio processing library to initialize the native audio, obtaining initialized audio, and then call the av_read_frame function to read at least one audio frame from the initialized audio. The server calls an audio frame sending instruction (such as avcodec_send_packet) to send the at least one audio frame to the decoder found by the decoder lookup instruction, which decodes it. The server then calls a data reading instruction (such as avcodec_receive_frame) to read the decoded audio data from the decoder, obtaining the decoded audio of the audio to be analyzed. Finally, the server resamples the decoded audio with the resampler created earlier, calling data stream format instructions (such as swr_get_out_samples and swr_convert) to convert the resampled data into the native data stream format, thereby obtaining the native audio of the audio to be analyzed.
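As a rough illustration of the decode-and-resample flow, the sketch below drives the `ffmpeg` command-line tool from Python instead of calling the library functions named above. That substitution, the function names `build_decode_command` and `decode_to_pcm`, the raw 16-bit PCM output format, and the default sample rate and channel count are all assumptions, not details from the application.

```python
import subprocess


def build_decode_command(path, sample_rate=44100, channels=2):
    """Build an ffmpeg command that decodes `path` to raw 16-bit PCM on stdout."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", path,                 # input file; container and codec are probed
        "-vn",                      # ignore any video stream
        "-acodec", "pcm_s16le",     # decode to signed 16-bit little-endian PCM
        "-f", "s16le",              # raw output, no container
        "-ar", str(sample_rate),    # resample to the requested rate
        "-ac", str(channels),       # requested channel count
        "pipe:1",                   # write to standard output
    ]


def decode_to_pcm(path, sample_rate=44100, channels=2):
    """Run ffmpeg and return the decoded PCM bytes (requires ffmpeg on PATH)."""
    command = build_decode_command(path, sample_rate, channels)
    result = subprocess.run(command, check=True, stdout=subprocess.PIPE)
    return result.stdout


# Only the command is built here; "song.mp3" is a placeholder file name.
command = build_decode_command("song.mp3", sample_rate=22050, channels=1)
```

Calling `decode_to_pcm` on a real file would return interleaved PCM bytes that play the role of the native audio in the later reduction steps.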

In this embodiment, the format information and the data stream information of the audio to be analyzed are obtained through the audio processing library; the audio to be analyzed is then decoded according to the format information and the data stream information to obtain the decoded audio; and the decoded audio is resampled to obtain the native audio of the audio to be analyzed. This restores the native data stream of the audio to be analyzed, so that the native audio can serve as the basis for the subsequent simplification step.
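The decode-and-resample flow described above can be approximated without writing against the library's C interface. The following is a minimal sketch, assuming the `ffmpeg` command-line tool is installed and on the PATH; it is not the call sequence of this embodiment, and the function names `build_decode_command` and `decode_to_pcm`, the default sample rate and the output sample format are illustrative assumptions.

```python
import subprocess
from typing import List


def build_decode_command(path: str, sample_rate: int = 44100, channels: int = 2) -> List[str]:
    """Build an ffmpeg command line that decodes `path` to raw float PCM on stdout.

    Hypothetical helper: the parameter defaults are assumptions, not values
    taken from the patent.
    """
    return [
        "ffmpeg",
        "-v", "error",             # print errors only
        "-i", path,                # input; ffmpeg probes the format and stream info
        "-vn",                     # ignore any video stream
        "-f", "f32le",             # raw little-endian 32-bit float, no container
        "-acodec", "pcm_f32le",    # decode to uncompressed PCM
        "-ar", str(sample_rate),   # resample to the requested rate
        "-ac", str(channels),      # requested number of output channels
        "pipe:1",                  # write to standard output
    ]


def decode_to_pcm(path: str, sample_rate: int = 44100, channels: int = 2) -> bytes:
    """Run ffmpeg and return the decoded, resampled PCM bytes."""
    result = subprocess.run(
        build_decode_command(path, sample_rate, channels),
        stdout=subprocess.PIPE,
        check=True,                # raise CalledProcessError if decoding fails
    )
    return result.stdout
```

The returned bytes hold interleaved samples (left, right, left, right for two channels), which is the layout a later channel fusion step would consume.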

In one embodiment, as shown in fig. 3, another electric tone key analysis method is provided. The method is described by taking its application to a server as an example, and includes the following steps:

Step S301, obtaining format information and data stream information of the audio to be analyzed through an audio processing library; and decoding the audio to be analyzed according to the format information and the data stream information to obtain decoded audio of the audio to be analyzed.

Step S302, resampling the decoded audio to obtain the native audio of the audio to be analyzed.

Step S303, performing channel fusion processing on the native audio to obtain processed audio; the number of channels of the processed audio is less than the number of channels of the native audio.

Step S304, filtering the processed audio according to the reduction factor corresponding to the native audio to obtain preprocessed audio of the native audio; the reduction factor is set according to the data volume of the native audio.

Step S305, acquiring target transformation information from the frequency domain information of the preprocessed audio according to the offset position in the spectrum converter; the frequency domain information is obtained by performing spectral analysis processing and fourier transform processing on the preprocessed audio.

Step S306, the frequency domain information and the target transformation information are fused to obtain the spectrum information of the preprocessed audio; and converting the spectral information of the preprocessed audio into a spectrogram.

Step S307, merging the spectrum information of a plurality of octaves in each spectrogram into spectrum information of one octave to obtain a musical scale integration spectrogram of each spectrogram; the octave of the scale integration spectrogram comprises a plurality of frequency bands.

Step S308, the spectrum data of each frequency band of the multiple musical scale integrated spectrograms are fused, and the target spectrogram of the preprocessed audio is obtained.

Step S309, performing a key analysis on the spectrum information of each frequency band in the target spectrogram to obtain the importance of each frequency band; and screening target tones meeting the preset importance condition from the tones corresponding to the frequency bands according to the importance of each frequency band, and taking the target tones as the electric tone base of the audio to be analyzed.
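Steps S305 and S306 turn the preprocessed audio into spectrum information. The sketch below is only an illustration, assuming a plain short-time Fourier transform with a Hann window; the spectrum converter and its offset positions are not specified in enough detail here to reproduce, so `frame_size`, `hop` and all function names are hypothetical. A naive DFT is used to keep the example dependency-free; a practical implementation would use an FFT.

```python
import cmath
import math
from typing import List, Sequence


def hann(n: int) -> List[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame: Sequence[float]) -> List[float]:
    """Magnitudes of the non-negative-frequency DFT bins of one frame (O(n^2))."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(samples: Sequence[float], frame_size: int = 64, hop: int = 32) -> List[List[float]]:
    """Slice the signal into overlapping windowed frames and transform each one."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        frames.append(dft_magnitudes(frame))
    return frames
```

Each inner list is one column of the spectrogram: the magnitude per frequency bin for one time frame.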

The above electric tone key analysis method can achieve the following beneficial effects: the user does not need to search for the electric tone key of a song manually; the preprocessed audio of the audio to be analyzed is converted into a target spectrogram, and the electric tone key of the audio to be analyzed is determined from that target spectrogram. Because spectrogram analysis is fast, the efficiency of acquiring the electric tone key is greatly improved.

To describe the electric tone key analysis method provided by the embodiments of the present disclosure more clearly, the method is described below with reference to a specific embodiment. The electric tone key analysis method can be applied to a server and specifically includes the following:

The terminal records the audio whose electric tone key needs to be analyzed and sends this audio to be analyzed to the server. The server restores the audio to be analyzed to native audio through the functions provided by the ffmpeg library. The server then averages the two-channel data of the native audio to obtain mono processed audio, sets a reduction factor according to the data volume of the mono processed audio, and uses the reduction factor to low-pass filter the processed audio, thereby obtaining the preprocessed audio.
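The channel averaging and low-pass filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions: the filter is a Hamming-windowed sinc with its cutoff at the Nyquist frequency of the reduced rate, the filtered signal is kept at every `factor`-th sample, and the rule that derives the reduction factor from the data volume is not reproduced because the text does not give it. All function names and the `taps_per_side` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def downmix_to_mono(interleaved: Sequence[float], channels: int = 2) -> List[float]:
    """Average the channels of each interleaved frame into one mono sample."""
    frames = len(interleaved) // channels
    return [
        sum(interleaved[i * channels:(i + 1) * channels]) / channels
        for i in range(frames)
    ]


def lowpass_kernel(factor: int, taps_per_side: int = 8) -> List[float]:
    """Hamming-windowed sinc low-pass, cutoff 0.5 / factor cycles per sample."""
    half = taps_per_side * factor
    n_taps = 2 * half + 1
    cutoff = 0.5 / factor
    kernel = []
    for n in range(n_taps):
        k = n - half
        sinc = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (n_taps - 1))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [c / total for c in kernel]  # normalize to unity gain at DC


def decimate(mono: Sequence[float], factor: int) -> List[float]:
    """Low-pass filter the signal and keep every `factor`-th output sample."""
    if factor <= 1:
        return list(mono)
    kernel = lowpass_kernel(factor)
    half = len(kernel) // 2
    out = []
    for start in range(0, len(mono), factor):
        acc = 0.0
        for j, c in enumerate(kernel):
            idx = start + j - half
            if 0 <= idx < len(mono):   # treat samples outside the signal as zero
                acc += c * mono[idx]
        out.append(acc)
    return out
```

Filtering before discarding samples matters: without the low-pass step, content above the new Nyquist frequency would alias into the bands that the later key analysis relies on.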

The server converts the preprocessed audio into spectrograms. The server determines the number of octaves in the preprocessed audio and the number of frequency bands in each octave. The server fuses the spectrum data of the multiple octaves in each spectrogram into one octave by averaging, and then fuses all spectrograms into one target spectrogram by superimposing identical frequency bands. Finally, the server uses the key classifier (keyClassification) to evaluate the importance of the major and minor keys for each frequency band in the target spectrogram, selects the key with the highest importance among the keys of the target spectrogram as the target key, and determines the target key as the electric tone key of the audio to be analyzed.
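The octave averaging, band-wise superposition and key scoring described above can be illustrated with a short sketch. It rests on several assumptions that are not in the text: each spectrogram frame is laid out as consecutive octaves of 12 semitone bands starting at C, and the importance score is a correlation against the published Krumhansl-Kessler major and minor key profiles, which stands in for the classifier of this embodiment rather than reproducing it. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Krumhansl-Kessler key profiles (index 0 is the tonic).
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def fold_octaves(frame: Sequence[float], octaves: int, bands: int = 12) -> List[float]:
    """Average the same band across all octaves into a single octave."""
    return [
        sum(frame[o * bands + b] for o in range(octaves)) / octaves
        for b in range(bands)
    ]


def fuse_frames(folded: Sequence[Sequence[float]]) -> List[float]:
    """Superimpose identical bands of all folded frames into one target vector."""
    return [sum(frame[b] for frame in folded) for b in range(len(folded[0]))]


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def classify_key(target: Sequence[float]) -> Tuple[str, float]:
    """Score all 24 major and minor keys and return the best (name, score)."""
    target = list(target)
    best = ("", -2.0)
    for tonic in range(12):
        rotated = target[tonic:] + target[:tonic]  # put the candidate tonic at index 0
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            score = correlation(rotated, profile)
            if score > best[1]:
                best = (NOTE_NAMES[tonic] + " " + mode, score)
    return best
```

Because correlation is insensitive to overall scale, summing frames instead of averaging them in `fuse_frames` does not change which key wins.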

Further, the server may return the electric tone key to the terminal, so that the audio host to which the terminal is connected can modify the key scale of the song based on the electric tone key.

In this embodiment, the user does not need to search for the electric tone key of the song manually, nor to consider whether the key found is accurate. The electric tone key of the audio to be analyzed is obtained from the target spectrogram of that audio, which offers both high analysis speed and high accuracy.

It should be understood that, although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with at least some of the other steps or with sub-steps or stages of other steps.

Based on the same inventive concept, the embodiments of the present application also provide an electric tone key analysis apparatus for implementing the electric tone key analysis method described above. The solution provided by the apparatus is implemented in a manner similar to that described for the method, so for the specific limitations in the one or more apparatus embodiments provided below, reference may be made to the limitations of the electric tone key analysis method above, which are not repeated here.

In one embodiment, as shown in fig. 4, an electric tone key analysis apparatus 400 is provided, including: an audio reduction module 401, a spectrum conversion module 402, a spectrum fusion module 403, and a key determining module 404, wherein:

The audio reduction module 401 is configured to obtain preprocessed audio of the audio to be analyzed; the preprocessed audio has at least fewer channels than the native audio.

The spectrum conversion module 402 is configured to perform spectrum conversion on the preprocessed audio according to the frequency domain information of the preprocessed audio, so as to obtain a spectrogram of the preprocessed audio.

The spectrum fusion module 403 is configured to fuse spectrum information in the spectrogram to obtain a target spectrogram of the preprocessed audio; the target spectrogram contains spectral information of a plurality of frequency bands of the preprocessed audio.

The key determining module 404 is configured to determine an electric key of the audio to be analyzed according to the spectral information of each frequency band in the target spectrogram.

In one embodiment, the spectrum fusion module 403 is further configured to fuse the spectrum information of multiple octaves in each spectrogram into spectrum information of one octave, so as to obtain a musical scale integrated spectrogram of each spectrogram; octaves of the scale integration spectrogram comprise a plurality of frequency bands; and merging the spectral data of each frequency band of the multiple musical scale integrated spectrograms to obtain a target spectrogram of the preprocessed audio.

In one embodiment, the key determining module 404 is further configured to perform a key analysis on the spectrum information of each frequency band in the target spectrogram to obtain importance of each frequency band; and screening target tones meeting the preset importance condition from the tones corresponding to the frequency bands according to the importance of each frequency band, and taking the target tones as the electric tone base of the audio to be analyzed.

In one embodiment, the spectrum conversion module 402 is further configured to obtain target transformation information from the frequency domain information of the preprocessed audio according to the offset position in the spectrum converter; the frequency domain information is obtained by performing frequency spectrum analysis processing and Fourier transform processing on the preprocessed audio; fusing the frequency domain information and the target transformation information to obtain the spectrum information of the preprocessed audio; and converting the spectral information of the preprocessed audio into a spectrogram.

In one embodiment, the audio reduction module 401 is further configured to perform compression restoration processing on the audio to be analyzed to obtain the native audio of the audio to be analyzed; and to perform reduction processing on the native audio according to the reduction factor corresponding to the native audio to obtain preprocessed audio of the native audio; the reduction factor is used to control the degree of reduction applied to the native audio.

In one embodiment, the electric tone key analysis apparatus 400 further includes a channel fusion module, configured to perform channel fusion processing on the native audio to obtain processed audio, the number of channels of the processed audio being less than that of the native audio; and to filter the processed audio according to the reduction factor corresponding to the native audio to obtain preprocessed audio of the native audio; the reduction factor is set according to the data volume of the native audio.

In one embodiment, the electric tone key analysis apparatus 400 further includes a compression restoration module, configured to obtain format information and data stream information of the audio to be analyzed through the audio processing library; to decode the audio to be analyzed according to the format information and the data stream information to obtain decoded audio of the audio to be analyzed; and to resample the decoded audio to obtain the native audio of the audio to be analyzed.

The modules in the above electric tone key analysis apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
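When the modules are implemented in software, one way to arrange them is as interchangeable functions chained by a small container object. The sketch below is only an illustration of that composition; the class name `KeyAnalysisApparatus`, its field names and the type aliases are hypothetical, and the stages are left as plug-in callables rather than concrete implementations.

```python
from dataclasses import dataclass
from typing import Callable, List

Samples = List[float]
Spectrum = List[float]


@dataclass
class KeyAnalysisApparatus:
    """Chains four stages that mirror modules 401 to 404 (hypothetical layout)."""

    reduce_audio: Callable[[bytes], Samples]               # audio reduction module 401
    convert_spectrum: Callable[[Samples], List[Spectrum]]  # spectrum conversion module 402
    fuse_spectrum: Callable[[List[Spectrum]], Spectrum]    # spectrum fusion module 403
    determine_key: Callable[[Spectrum], str]               # key determining module 404

    def analyze(self, audio: bytes) -> str:
        """Run the stages in order and return the detected key label."""
        preprocessed = self.reduce_audio(audio)
        spectrogram = self.convert_spectrum(preprocessed)
        target = self.fuse_spectrum(spectrogram)
        return self.determine_key(target)
```

Keeping each stage behind a callable makes it possible to replace one module, for example the key determining stage, without touching the others.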

In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 5. The computer device includes a processor, a memory, an input/output (I/O) interface and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing data such as the audio to be analyzed, the preprocessed audio and the spectrograms. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an electric tone key analysis method.

It will be appreciated by those skilled in the art that the structure shown in fig. 5 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to comply with the related laws and regulations and standards of the related countries and regions.

Those skilled in the art will appreciate that all or part of the above methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the computer program may include the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of this specification.

The above examples represent only a few embodiments of the present application. They are described in relative detail, but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and all of these fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims

1. A method of electrical tone analysis, the method comprising:

2. The method of claim 1, wherein the fusing the spectral information in the spectrogram to obtain the target spectrogram of the preprocessed audio comprises:

3. The method of claim 1, wherein said determining the electrical key of the audio to be analyzed based on the spectral information of each frequency band in the target spectral graph comprises:

4. The method according to claim 1, wherein the performing spectral conversion on the preprocessed audio according to the frequency domain information of the preprocessed audio to obtain the spectrogram of the preprocessed audio comprises:

5. The method of claim 1, wherein the obtaining pre-processed audio of the audio to be analyzed comprises:

6. The method of claim 5, wherein the compacting the native audio according to the compaction factor corresponding to the native audio to obtain the preprocessed audio of the native audio comprises:

7. The method of claim 5, wherein the compressing and restoring the audio to be analyzed to obtain the native audio of the audio to be analyzed comprises:

8. An electrical tone analysis apparatus, the apparatus comprising:

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.