CN116259292B - Method, device, computer equipment and storage medium for identifying the key and scale of a song - Google Patents

Method, device, computer equipment and storage medium for identifying the key and scale of a song

Info

Publication number: CN116259292B
Application number: CN202310298139.8A
Authority: CN (China)
Prior art keywords: song, target, audio, key, identified
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN116259292A
Inventor: 王佳乐
Current assignee: Guangzhou Ziyun Technology Co ltd
Original assignee: Guangzhou Ziyun Technology Co ltd
Application filed by Guangzhou Ziyun Technology Co ltd; priority to CN202310298139.8A; application granted and published as CN116259292B

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments
    • G10H1/0008: Associated control or indicating means
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Techniques of G10L25/00 specially adapted for particular use
    • G10L25/51: Techniques of G10L25/00 specially adapted for comparison or discrimination
    • G10H2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031: Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/066: Musical analysis for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; pitch recognition, e.g. in polyphonic sounds; estimation or use of missing fundamental
    • G10H2210/081: Musical analysis for automatic key or tonality recognition, e.g. using musical rules or a knowledge base
    • G10H2210/395: Special musical scales, i.e. other than the 12-interval equally tempered scale; special input devices therefor

Abstract

The application relates to a method and a device for identifying the key and scale of a song. The method comprises the following steps: identifying song identification information of a song to be identified in the client; if the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; once the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client; and once the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on it to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client. The method can improve the efficiency of key and scale recognition.

Description

Method, device, computer equipment and storage medium for identifying the key and scale of a song
Technical Field
The present application relates to the field of audio technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for identifying the key and scale of a song.
Background
Electric-tone music is also called electronic music. With the development of electronic music technology, more and more music lovers have begun to use electronic instruments and electronic music techniques to make music, and electric-tone music brings listeners new artistic experiences.
Currently, when making an electric-tone track, a music producer has to search manually for the key and scale of a song on Baidu or another search engine in a browser. However, the browser returns many versions of a song's key and scale, often inaccurate ones, so the producer spends a lot of time acquiring a key and scale that may still be wrong.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product for identifying the key and scale of a song that can improve the efficiency of this identification.
In a first aspect, the present application provides a method for identifying the key and scale of a song. The method comprises the following steps:
identifying song identification information of a song to be identified in the client;
under the condition that the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration;
under the condition that the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the first preset duration to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client;
and under the condition that the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the second preset duration to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client so that the client can update the candidate key and candidate scale according to the target key and target scale.
In one embodiment, after identifying the song identification information of the song to be identified in the client, the method further includes:
under the condition that the first database contains the key and scale corresponding to the song identification information, using the key and scale obtained by the query as the target key and target scale of the song to be identified, respectively;
and returning the target key and target scale to the client.
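The first-database hit path described above amounts to a simple cache lookup. A minimal sketch follows; the dictionary layout, song ids, and helper name are illustrative assumptions, not the patent's implementation.

```python
# Hypothetical stand-in for the first database: song id -> (key, scale).
FIRST_DATABASE = {
    "song-001": ("C", "major"),
    "song-002": ("A", "minor"),
}

def query_key_and_scale(song_id):
    """Return (target_key, target_scale) on a cache hit, or None on a miss,
    in which case the record-and-analyze path is taken instead."""
    return FIRST_DATABASE.get(song_id)
```

On a hit the server can return the stored pair to the client immediately, skipping both recordings entirely.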
In one embodiment, identifying song identification information of a song to be identified in a client includes:
playing and recording the song to be identified to obtain recorded audio of a third preset duration, which serves as the audio to be analyzed of the song to be identified;
under the condition that a target audio fingerprint matched with the audio fingerprint of the audio to be analyzed is obtained from the fingerprint information of the second database, using the song identification information corresponding to the target audio fingerprint as the song identification information of the song to be identified; the second database also stores the key group of each song.
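The text does not specify how fingerprints are matched; one common scheme, sketched below under that assumption, represents each fingerprint as a set of hashes and picks the stored song sharing the largest fraction of them. The threshold and data shapes are illustrative only.

```python
def match_fingerprint(query_hashes, fingerprint_db, min_overlap=0.6):
    """Return the song_id whose stored fingerprint shares the largest
    fraction of hashes with the query, or None if no candidate reaches
    min_overlap (the query-failure case in the text)."""
    best_id, best_score = None, 0.0
    q = set(query_hashes)
    for song_id, stored in fingerprint_db.items():
        score = len(q & set(stored)) / max(len(q), 1)
        if score > best_score:
            best_id, best_score = song_id, score
    return best_id if best_score >= min_overlap else None
```

A miss here triggers the fallback in the later embodiment: record the whole song, analyze it, and store its fingerprint and key group in the second database.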
In one embodiment, after the song identification information corresponding to the target audio fingerprint is used as the song identification information of the song to be identified, the method further includes:
determining a target climax segment of the song to be identified according to the audio to be analyzed;
acquiring the key group of the song to be identified from the second database;
determining the key to be updated of the climax segment according to the time range of the target climax segment of the song to be identified and the key group;
and sending the time range of the target climax segment and the key to be updated to the client, so that the client can update the target key of the song to be identified within the time range of the target climax segment based on the key to be updated.
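The text says the key to be updated is determined from the climax time range and the key group, without giving the rule. One plausible rule, assumed here rather than taken from the patent, is a majority vote over the per-frame keys whose timestamps fall inside the climax segment:

```python
from collections import Counter

def key_to_update(frame_keys, frame_times, climax_start, climax_end):
    """Majority-vote key over the frames inside the climax segment.
    frame_keys[i] is the key of the frame whose timestamp is frame_times[i]."""
    in_range = [k for k, t in zip(frame_keys, frame_times)
                if climax_start <= t <= climax_end]
    if not in_range:
        return None  # no frames fall inside the segment
    return Counter(in_range).most_common(1)[0][0]
```

The server would send this key together with the segment's time range to the client, which swaps it in for the duration of the climax.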
In one embodiment, after playing and recording the song to be identified to obtain the recorded audio with the third preset duration, the method further includes:
under the condition that a target audio fingerprint matched with the audio fingerprint of the audio to be analyzed is not obtained from the second database, playing and recording the song to be identified, and obtaining target recorded audio of the song to be identified; the duration of the target recorded audio is equal to the duration of the song to be identified;
performing key analysis processing on the target recorded audio to obtain a key group of the target recorded audio;
and storing the key group of the target recorded audio, the audio fingerprint of the target recorded audio and the song identification information of the target recorded audio in the second database.
In one embodiment, performing key analysis processing on the target recorded audio to obtain the key group of the target recorded audio includes:
performing audio frame extraction on the target recorded audio to obtain the audio frames of the target recorded audio;
and performing key analysis processing on the audio frames to obtain the key of each audio frame, and obtaining the key group of the target recorded audio from the keys of the audio frames.
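The patent does not name the per-frame key analysis algorithm. A classic choice, shown here purely as an assumed sketch, is the Krumhansl-Schmuckler method: correlate a frame's 12-bin chroma (pitch-class) histogram against all 24 rotations of the Krumhansl-Kessler major and minor key profiles and keep the best match.

```python
# Krumhansl-Kessler key profiles (index 0 = tonic pitch class).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def _corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((y - mb) ** 2 for y in b) ** 0.5
    return num / (da * db) if da and db else 0.0

def estimate_key(chroma):
    """Return (tonic, scale) for one frame's 12-bin chroma histogram by
    testing every rotation against both profiles."""
    best = (None, None, -2.0)
    for shift in range(12):
        rotated = chroma[shift:] + chroma[:shift]  # bring candidate tonic to index 0
        for scale, profile in (("major", MAJOR), ("minor", MINOR)):
            c = _corr(rotated, profile)
            if c > best[2]:
                best = (NOTES[shift], scale, c)
    return best[0], best[1]
```

Running this over every extracted frame and collecting the results yields the key group described above.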
In one embodiment, identifying song identification information of a song to be identified in a client includes:
reading the song identification information of the song to be identified from messages in the client through a program interface of the client.
In a second aspect, the application further provides a device for identifying the key and scale of a song. The device comprises:
the identification acquisition module, used for identifying song identification information of a song to be identified in the client;
the song recording module, used for playing and recording the song to be identified under the condition that the first database does not contain the key and scale corresponding to the song identification information, so as to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration;
the first processing module, used for performing key recognition and scale recognition on the recorded audio of the first preset duration once it is obtained, obtaining the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client;
and the second processing module, used for performing key recognition and scale recognition on the recorded audio of the second preset duration once it is obtained, obtaining the target key and target scale of the song to be identified, and sending the target key and target scale to the client so that the client can update the candidate key and candidate scale according to the target key and target scale.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
identifying song identification information of a song to be identified in the client;
under the condition that the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration;
under the condition that the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the first preset duration to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client;
and under the condition that the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the second preset duration to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client so that the client can update the candidate key and candidate scale according to the target key and target scale.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the following steps:
identifying song identification information of a song to be identified in the client;
under the condition that the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration;
under the condition that the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the first preset duration to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client;
and under the condition that the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the second preset duration to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client so that the client can update the candidate key and candidate scale according to the target key and target scale.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
identifying song identification information of a song to be identified in the client;
under the condition that the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration;
under the condition that the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the first preset duration to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client;
and under the condition that the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the second preset duration to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client so that the client can update the candidate key and candidate scale according to the target key and target scale.
The above method, device, computer equipment, storage medium and computer program product for identifying the key and scale of a song identify the song identification information of a song to be identified in the client; under the condition that the first database does not contain the key and scale corresponding to the song identification information, play and record the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; under the condition that the recorded audio of the first preset duration is obtained, perform key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and return them to the client; and under the condition that the recorded audio of the second preset duration is obtained, perform key recognition and scale recognition on it to obtain the target key and target scale of the song to be identified, and send them to the client, so that the client updates the candidate key and candidate scale accordingly. With this method, the candidate key and candidate scale can be obtained quickly from the shorter recording of the first preset duration, improving the efficiency of acquiring the key and scale; the more accurate target key and target scale are then obtained by analyzing the recording of the second preset duration, so that the accuracy of the identified key and scale is improved while the acquisition efficiency is preserved.
Drawings
FIG. 1 is an application environment diagram of a method of identifying the key and scale of a song in one embodiment;
FIG. 2 is a flow chart of a method for identifying the key and scale of a song in one embodiment;
FIG. 3 is a schematic diagram of a method for identifying the key and scale of a song in one embodiment;
FIG. 4 is a flow chart of the steps of determining and updating the key of the climax segment of a song in one embodiment;
FIG. 5 is a flowchart of a method for identifying the key and scale of a song according to another embodiment;
FIG. 6 is a schematic diagram of the client interface while waiting for the server to identify the key and scale according to another embodiment;
FIG. 7 is a schematic diagram of the client interface after receiving the target key and target scale according to yet another embodiment;
FIG. 8 is a block diagram of a device for identifying the key and scale of a song in one embodiment;
FIG. 9 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The method for identifying the key and scale of a song provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the client 101 communicates with the server 102 via a network. A data storage system may store the data that the server 102 needs to process; it may be integrated on the server 102 or located on a cloud or other network server. The server 102 identifies song identification information of a song to be identified in the client 101; under the condition that the first database does not contain the key and scale corresponding to the song identification information, the song to be identified is played and recorded to obtain recorded audio of a first preset duration and recorded audio of a second preset duration, the first preset duration being smaller than the second preset duration; under the condition that the recorded audio of the first preset duration is obtained, key recognition and scale recognition are performed on it to obtain the candidate key and candidate scale of the song to be identified, which are returned to the client 101; and under the condition that the recorded audio of the second preset duration is obtained, key recognition and scale recognition are performed on it to obtain the target key and target scale of the song to be identified, which are sent to the client 101 so that the client 101 updates the candidate key and candidate scale accordingly.
The client refers to a terminal with a music playing function and a recording function, and the client 101 can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices can be smart speakers, smart televisions, smart vehicle devices and the like. The portable wearable device may be a smart watch, smart bracelet, or the like. The server 102 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In one embodiment, as shown in FIG. 2, a method for identifying the key and scale of a song is provided. The method is described as applied to the server in FIG. 1 by way of illustration, and includes the following steps:
step S201, identifying song identification information of a song to be identified in the client.
Wherein the song to be identified refers to the song whose key and scale need to be identified; in this embodiment, it may be the song currently being played in the client. Song identification information refers to unique identification information that can distinguish or determine a song; for example, it may be the song title, the song's id number, or the song title together with the singer's name.
Specifically, when the client needs the key and scale of the song to be identified, it can send a request to the server, and the server then identifies the song identification information of the song to be identified in the client based on the received request. Alternatively, the client can read the song identification information of the song to be identified locally and send it to the server.
Step S202, playing and recording the song to be identified under the condition that the first database does not contain the key and scale corresponding to the song identification information, so as to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration is smaller than the second preset duration.
Wherein the first database refers to a database for storing the keys and scales of songs. Recorded audio refers to audio data obtained by the client by recording the song to be identified while it is being played. The first preset duration may be, for example, 5 s, and the second preset duration 60 s.
Specifically, the server queries the first database based on the song identification information of the song to be identified. FIG. 3 is a schematic diagram of the above identification method. As shown in FIG. 3, if the query finds that the first database does not contain the key and scale corresponding to the song identification information (for example, the song to be identified is newly released, so the first database contains no information about it, or the song is popular music whose key and scale have not yet been added to the first database), the server returns a query-failure message to the client. On receiving this message, the client checks the playing state of the song to be identified: if the song is currently being played, the client records it; if it is not, the client plays and records it. The client thereby obtains the recorded audio of the first preset duration and the recorded audio of the second preset duration.
It can be understood that, to avoid an overly long recording delaying the user's electric-tone creation, a relatively short recording can be made first, that is, the recorded audio of the first preset duration, so that an approximate key and scale of the song to be identified can be obtained from it quickly, letting the user start creating in time and improving the efficiency of acquiring the key and scale. Since a short recording may happen to cover an unrepresentative segment, a longer recording can be made at the same time, that is, the recorded audio of the second preset duration, from which the more accurate target key and target scale are obtained, improving the accuracy of the identified key and scale while preserving acquisition efficiency.
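Since both recordings start at the same moment, one way to picture the two clips (an implementation assumption, not stated in the text) is as two prefixes of the same capture buffer, using the 5 s and 60 s durations given in the embodiment:

```python
def split_preview_and_full(samples, sr, first_s=5, second_s=60):
    """Slice one capture buffer into the short clip (fast, rough key/scale)
    and the long clip (slower, accurate). `samples` is a flat sample list,
    `sr` the sample rate; the buffer layout is illustrative."""
    return samples[: first_s * sr], samples[: second_s * sr]
```

The client would ship the short clip to the server as soon as it exists and the long clip once recording completes, which is why the candidate result always arrives first.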
Step S203, under the condition that the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client.
The candidate key and candidate scale are the key and scale obtained from a preliminary identification based on the recorded audio of the first preset duration.
Specifically, the recordings of the first and second preset durations are started at the same time, and the client sends the recorded audio of the first preset duration to the server as soon as it is finished. On receiving it, the server performs key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and then returns them to the client.
Further, the client may also connect to an audio host: software loaded with a number of sound-effect (such as electric-tone) plug-ins that can process audio for music creation. After the client receives the candidate key and candidate scale returned by the server, it can display them and synchronize them to the audio host, so that the audio host updates the key and scale of the song to be identified accordingly for music creation.
Step S204, under the condition that the recorded audio with the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio with the second preset duration to obtain a target key and a target scale of the song to be recognized, and sending the target key and the target scale to the client, so that the client updates the candidate key and the candidate scale according to the target key and the target scale.
The target key and target scale are the more accurate key and scale obtained from the recorded audio of the second preset duration.
Specifically, since the second preset duration is longer than the first, the client finishes recording the audio of the second preset duration later and then sends it to the server. On receiving it, the server performs key recognition and scale recognition on it to obtain the target key and target scale of the song to be identified, and then sends them to the client and to the first database. After receiving the target key and target scale, the client compares them with the candidate key and candidate scale: if they differ, the client updates the candidate key and candidate scale of the song to be identified in the connected audio host; if they are the same, no update is needed. The first database stores the song identification information, target key and target scale of the song to be identified, so that next time the server can obtain them directly by querying the first database.
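The client-side decision in this step, replacing the candidate pair only when the final result actually differs, can be sketched as follows (function and variable names are illustrative):

```python
def update_if_changed(current, target):
    """Client-side step: replace the candidate (key, scale) shown in the
    audio host only when the target result differs from it.
    Returns the pair to display and whether an update occurred."""
    if target != current:
        return target, True   # push target key/scale to the audio host
    return current, False     # candidate was already correct; do nothing
```

Skipping the no-op update avoids disturbing the audio host mid-creation when the 5 s preliminary result already matched the 60 s result.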
In practical applications, experiments show that the accuracy of the candidate key and candidate scale can reach 70-80%, while the accuracy of the target key and target scale can reach 90-100%.
In the above key and scale identification method, the song identification information of the song to be identified in the client is first identified. When the first database does not contain the key corresponding to the song identification information, the song to be identified is played and recorded to obtain recorded audio of a first preset duration and recorded audio of a second preset duration. When the recorded audio of the first preset duration is obtained, key recognition and scale recognition are performed on it to obtain the candidate key and candidate scale of the song to be identified, which are returned to the client. When the recorded audio of the second preset duration is obtained, key recognition and scale recognition are performed on it to obtain the target key and target scale of the song to be identified, which are sent to the client so that the client updates the candidate key and candidate scale accordingly. With this method, the candidate key and candidate scale can be recognized from the shorter recording of the first preset duration, improving the efficiency of obtaining the key and scale; the more accurate target key and target scale are then obtained by analyzing the recording of the second preset duration, improving the accuracy of the recognized key and scale while preserving that efficiency.
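The coarse-to-fine flow summarized above can be sketched as follows. The analyser is a deliberately toy stand-in (it just picks the most frequent symbol in the recording) for the real key/scale recognition, used only to make the two-stage control flow concrete:

```python
# Illustrative two-stage flow: a quick candidate from the short recording,
# then a refined target from the longer recording.
def two_stage_identify(short_audio, long_audio, analyse):
    candidate = analyse(short_audio)   # fast; returned to the client first
    target = analyse(long_audio)       # slower but more accurate
    return candidate, target

# toy analyser: "recognises" the most frequent symbol in the recording
def analyse(audio):
    return max(set(audio), key=audio.count)

candidate, target = two_stage_identify(list("CCG"), list("CGGGG"), analyse)
```

The short recording yields a quick but possibly wrong answer ("C"), which the longer recording later corrects ("G"), matching the candidate/target distinction in the method.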
In one embodiment, after the song identification information of the song to be identified in the client is identified in step S201, the method further includes: when the first database contains the key and scale corresponding to the song identification information, taking the queried key and scale as the target key and target scale of the song to be identified, respectively; and returning the target key and target scale to the client.
Specifically, when the first database contains the key and scale corresponding to the song identification information, the server may directly take the key and scale queried from the first database as the target key and target scale of the song to be identified, and return the target key and target scale to the client.
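A minimal sketch of the first-database lookup, under the assumption that the song identification information (song name plus singer name, as in fig. 6) keys a stored (key, scale) pair; the database contents shown are hypothetical:

```python
# Hypothetical first-database contents: song identification info -> (key, scale)
first_db = {("Song A", "Singer X"): ("D", "minor")}

def query_key_and_scale(db, song_id):
    """Return the stored (key, scale), or None when steps S202-S204 must run."""
    return db.get(song_id)

hit = query_key_and_scale(first_db, ("Song A", "Singer X"))
miss = query_key_and_scale(first_db, ("New Song", "Singer Y"))
```

A miss (for example, a newly released song) is exactly the case where the method falls back to the recording-based recognition of steps S202 to S204.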
In practical applications, repeated experiments show that querying the database for the target key and target scale of the song to be identified is currently one of the fastest approaches, reaching almost millisecond latency; it is more efficient, more accurate, and gives a better experience than having the user look up the key and scale of the song through a browser. However, the keys and scales stored in the first database are not necessarily comprehensive; for example, the first database may not yet record the key and scale of a newly released or niche song. In that case, the server can obtain the key and scale of the song to be identified through steps S202 to S204 above and then return them to the client.
In this embodiment, when the first database contains the key and scale corresponding to the song identification information, the queried key and scale are taken as the target key and target scale of the song to be identified, respectively, and then returned to the client. The target key and target scale of the song to be identified are thus obtained efficiently, quickly, and accurately, without the user having to search for them manually, which improves the efficiency of obtaining the target key and target scale and enhances the user's experience.
In one embodiment, identifying the song identification information of the song to be identified in the client in step S201 specifically includes the following: playing and recording the song to be identified to obtain recorded audio of a third preset duration, which serves as the audio to be analyzed of the song to be identified; and, when a target audio fingerprint matching the audio fingerprint of the audio to be analyzed is obtained from the fingerprint information in the second database, taking the song identification information corresponding to the target audio fingerprint as the song identification information of the song to be identified. The second database also stores the key array of each song.
The third preset duration may be, for example, 5 s, 10 s, or 15 s. An audio fingerprint is a content-based compact digital signature that characterizes the important acoustic features of a song's audio stream.
As shown in fig. 3, the server may acquire the song identification information of the song to be identified in the client through song recognition. Specifically, the client detects the playing state of the song to be identified: if the song is currently playing, the client records it; if it is not currently playing, the client plays and records it. The client thereby obtains the recorded audio of the third preset duration. To distinguish it from the recorded audio of the first preset duration and that of the second preset duration, the recorded audio of the third preset duration may simply be called the audio to be analyzed of the song to be identified. The client sends the audio to be analyzed to the server; the server performs audio conversion on it to obtain its audio fingerprint, and then queries the second database with that fingerprint. If the server finds a target audio fingerprint matching the fingerprint of the audio to be analyzed in the fingerprint information of the second database, it takes the song identification information corresponding to the target audio fingerprint as the song identification information of the song to be identified.
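A hedged sketch of the second-database fingerprint lookup. Real audio fingerprints are compact signatures derived from spectral features; here a plain string stands in for the fingerprint so that only the matching logic is shown, and the database contents are hypothetical:

```python
# Hypothetical second-database fingerprint index:
# fingerprint -> song identification information
second_db = {
    "fp-3f9a": {"song": "Song A", "singer": "Singer X"},
    "fp-77c2": {"song": "Song B", "singer": "Singer Y"},
}

def identify_song(fingerprint: str):
    """Return song identification info for a matching fingerprint, else None."""
    return second_db.get(fingerprint)

info = identify_song("fp-3f9a")
```

A `None` result corresponds to the query-failure branch handled later, where the whole song is recorded and its fingerprint and key array are stored for future lookups.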
In this embodiment, the song to be identified is played and recorded through the client, and the server obtains the recorded audio of the third preset duration as the audio to be analyzed of the song to be identified. When the server obtains a target audio fingerprint matching the fingerprint of the audio to be analyzed from the fingerprint information of the second database, the song identification information corresponding to the target audio fingerprint is taken as the song identification information of the song to be identified. The server thus queries the song identification information via the audio fingerprint of the audio to be analyzed, so that even when the user does not know the song identification information, the server in this embodiment can acquire it automatically, improving the efficiency and accuracy of acquiring the song identification information.
In one embodiment, as shown in fig. 4, after the song identification information corresponding to the target audio fingerprint is used as the song identification information of the song to be identified, the method further includes:
step S401, determining a target climax piece of the song to be identified according to the audio to be analyzed.
Specifically, after receiving the audio to be analyzed, the server determines the current playing progress of the song to be identified from it. This may be done by audio-matching the audio to be analyzed against the song to be identified to locate its position within the song, thereby obtaining the current playing progress; alternatively, the current playing progress may be determined from the time information of the audio stream corresponding to the target audio fingerprint matched in the second database. The current playing progress is information that identifies the start and end times of the audio to be analyzed within the song to be identified; it may include the start and end times, lyric text, melody changes, audio waveform, and similar information.
Further, the server determines the target climax segment of the song to be identified according to the current playing progress, where the time range of the target climax segment is later than the current playing progress. The server may query the second database for the time ranges of at least one climax segment of the song to be identified, and then, according to those time ranges, select the first climax segment located after the current playing progress as the target climax segment. A climax segment is the most stirring passage of the music in a song, the part where the musical emotion of the whole song is most infectious; a song contains at least one climax segment.
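The selection rule just described can be sketched as follows, with climax segments represented as (start, end) pairs in seconds (a representational assumption for illustration):

```python
# Sketch of step S401's selection rule: pick the first climax segment whose
# time range begins after the current playing progress.
def pick_target_climax(climaxes, progress_end):
    """climaxes: list of (start, end) tuples sorted by start time, in seconds."""
    for start, end in climaxes:
        if start > progress_end:   # segment lies after the current progress
            return (start, end)
    return None                    # no climax segment remains in the song

segments = [(60, 80), (135, 160), (190, 200)]   # three climaxes of a song
target = pick_target_climax(segments, progress_end=100)
```

With the progress at 100 s, the 60-80 s climax has already passed, so the 135-160 s segment becomes the target climax segment.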
In step S402, the key array of the song to be identified is obtained from the second database.
The key array refers to array data composed of the key corresponding to each audio frame of the entire song.
Specifically, the second database stores the key arrays of a plurality of songs. The server may query the second database for the key array associated with the target audio fingerprint and use it as the key array of the song to be identified.
Step S403, determining the key to be updated of the target climax segment according to the time range of the target climax segment of the song to be identified and the key array.
Specifically, the server selects from the key array, according to the time range of the target climax segment, the keys corresponding to the target climax segment as the key to be updated.
For example, assume that song A to be identified contains 3 climax segments, located at 1 min 0 s to 1 min 20 s (segment 1), 2 min 15 s to 2 min 40 s (segment 2), and 3 min 10 s to 3 min 20 s (segment 3). If the end time of the current playing progress is 1 min 40 s, the server can take segment 2 as the target climax segment and then select from the key array the keys of the audio frames located between 2 min 15 s and 2 min 40 s as the key to be updated.
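For the example above, step S403 reduces to slicing the key array over the target segment's time range. The sketch assumes one key per one-second audio frame, which is an illustrative choice rather than anything the method prescribes:

```python
# Sketch of step S403: with one key per fixed-length frame, the key to be
# updated is the slice of the key array covered by the climax time range.
def keys_to_update(key_array, start_s, end_s, frame_s=1.0):
    first = int(start_s // frame_s)
    last = int(end_s // frame_s)
    return key_array[first:last]

# per-second keys of song A: the 135-160 s climax sits in G, the rest in C
key_array = ["C"] * 135 + ["G"] * 25 + ["C"] * 40
pending = keys_to_update(key_array, start_s=135, end_s=160)
```

The returned slice (here 25 frames of G) is what the server sends to the client as the key to be updated for segment 2.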
Step S404, sending the time range of the target climax segment and the key to be updated to the client, so that the client updates the target key of the song to be identified within the time range of the target climax segment based on the key to be updated.
The key to be updated refers to the key used to update the target climax segment of the song to be identified.
Specifically, after obtaining the time range of the target climax segment and the key to be updated, the server may send both to the client, so that the client updates, in the audio host, the target key of the song to be identified within the time range of the target climax segment to the key to be updated.
In practical applications, the climax of a song is an important part of the whole melody, and its key sometimes differs from the key of the song as a whole. The key of an upcoming target climax segment can therefore be changed dynamically based on the current playing progress, avoiding the situation where the audio host processes the target climax segment of the song to be identified with an incorrect key. For example, assume the target key of song B is C, but the actual key of its target climax segment is G. Without updating the target key of the climax segment, the audio host would apply electronic-sound (autotune-style) processing to the whole of song B in the C key, making the electronic-sound effect inaccurate during the climax. By dynamically adjusting the target key of song B's target climax segment in the audio host to G, the electronic-sound effect of song B is improved.
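An illustrative sketch of this dynamic adjustment from the audio host's point of view: look up which key applies at the current playback position, falling back to the whole-song target key outside any updated climax range (function and data shapes are hypothetical):

```python
# Hypothetical sketch: the audio host resolves the effective key for the
# current playback position, using climax-segment overrides when present.
def effective_key(position_s, global_key, updates):
    """updates: list of ((start, end), key) pairs for updated climax segments."""
    for (start, end), key in updates:
        if start <= position_s <= end:
            return key
    return global_key

updates = [((135, 160), "G")]            # song B's climax uses G, not C
in_climax = effective_key(150, "C", updates)
elsewhere = effective_key(30, "C", updates)
```

Inside the 135-160 s window the override key G applies; everywhere else the whole-song target key C is used, matching the song B example above.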
In this embodiment, the target climax segment and the key array of the song to be identified are determined first; then the key to be updated of the target climax segment is determined according to the segment's time range and the key array; finally, the time range of the target climax segment and the key to be updated are sent to the client, so that the client can update the target key of the song to be identified within that time range based on the key to be updated. Dynamic adjustment of the key used for electronic-sound processing of the song to be identified is thereby achieved, greatly improving the song's electronic-sound effect.
In one embodiment, after playing and recording the song to be identified to obtain the recorded audio of the third preset duration, the method further includes: when no target audio fingerprint matching the fingerprint of the audio to be analyzed is obtained from the second database, playing and recording the song to be identified to obtain the target recorded audio of the song, where the duration of the target recorded audio equals the duration of the song to be identified; performing key analysis on the target recorded audio to obtain its key array; and storing the key array of the target recorded audio, the audio fingerprint of the target recorded audio, and the song identification information of the target recorded audio in the second database.
The target recorded audio refers to the audio data of the entire song to be identified obtained by recording.
Specifically, as shown in fig. 3, if the server does not find a target audio fingerprint matching the fingerprint of the audio to be analyzed in the second database, it generates a query-failure message and sends it to the client. On receiving this message, the client checks the playing state of the song to be identified: if the song is currently playing, the client records the entire song; if it is not currently playing, the client plays and records the entire song. The client thus obtains the target recorded audio of the song to be identified and sends it to the server. The server analyzes the received target recorded audio to obtain its key array; it may also receive the song identification information of the song to be identified acquired by the client through the program interface; and it performs audio conversion on the target recorded audio to obtain its audio fingerprint. The server then stores the key array, the audio fingerprint, and the song identification information of the target recorded audio in the second database, so that it can later acquire the song identification information of the song via the audio fingerprint stored there.
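A minimal sketch of this storage/backup step, using an in-memory mapping as a hypothetical stand-in for the second database (record layout and names are illustrative assumptions):

```python
# Hypothetical second database: per song, the fingerprint keys a record
# holding the per-frame key array and the song identification information.
second_db = {}

def store_song(fingerprint, key_array, song_info):
    """Back up a fully recorded song so later lookups skip the recording flow."""
    second_db[fingerprint] = {"keys": key_array, "info": song_info}

store_song("fp-abcd", ["C", "C", "G"], {"song": "Song C", "singer": "Singer Z"})
record = second_db["fp-abcd"]
```

Once stored, the next identification of the same song hits the fingerprint index directly, which is the efficiency gain the embodiment claims.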
In this embodiment, when no target audio fingerprint matching the fingerprint of the audio to be analyzed is obtained from the second database, the song to be identified is played and recorded to obtain its target recorded audio; key analysis is then performed on the target recorded audio to obtain its key array; and the key array, audio fingerprint, and song identification information of the target recorded audio are stored in the second database. This backs up the song's audio fingerprint, key array, and song identification information, so that the server can query the required information directly from the second database the next time the song is identified, improving identification efficiency.
In one embodiment, performing key analysis on the target recorded audio to obtain its key array specifically includes the following: extracting the audio frames of the target recorded audio; performing key analysis on each audio frame to obtain its key; and assembling the key array of the target recorded audio from the keys of the individual audio frames.
Specifically, the server splits the target recorded audio into a plurality of audio frames according to a preset segmentation unit, then performs key analysis on each audio frame to obtain its key, and constructs the key array of the target recorded audio from the per-frame keys. For example, if the server finds that the key of the first audio frame is Bb and the key of the second audio frame is A, the key array [Bb, A] can be constructed from the keys of those two frames.
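The framing-and-analysis step can be sketched as follows. The per-frame analyser here is a deliberately toy stand-in (thresholding the mean sample value) for real key detection, which would typically work from chroma features; only the framing and array construction reflect the step described:

```python
# Sketch: split the recording into fixed-length frames, analyse each frame,
# and collect the per-frame keys into the key array.
def build_key_array(samples, frame_len, analyse_frame):
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples), frame_len)]
    return [analyse_frame(f) for f in frames]

# toy analyser: maps the mean sample value of a frame to a key label
def analyse_frame(frame):
    return "Bb" if sum(frame) / len(frame) < 0.5 else "A"

keys = build_key_array([0.1, 0.2, 0.9, 0.8], frame_len=2, analyse_frame=analyse_frame)
```

With two 2-sample frames, the result is the key array [Bb, A] of the worked example in the text.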
In this embodiment, the audio frames of the target recorded audio are obtained by frame extraction; key analysis is performed on each frame to obtain its key, and the key array of the target recorded audio is assembled from the per-frame keys. The key array of the target recorded audio is thus obtained in a principled way and can serve as the processing basis for the subsequent step of determining the key to be updated.
In one embodiment, identifying the song identification information of the song to be identified in the client in step S201 specifically includes the following: reading the song identification information of the song to be identified from a message in the client through the client's program interface.
The program interface may be an API (Application Programming Interface) of the client.
Specifically, in addition to identifying song identification information through the audio fingerprints in the second database, the client can obtain a message by calling a program interface, such as the client's API. The message may carry the song identification information of the song to be identified that is currently playing, and the client can read the song identification information of the song to be identified from the message.
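A hedged sketch of reading the song identification information from such a now-playing message; the field names (`song_name`, `singer`) are illustrative assumptions, since the actual message format depends on the client's API:

```python
# Hypothetical sketch: extract song identification info from a now-playing
# message returned by the client's program interface.
def read_song_info(message: dict):
    """Return (song name, singer name) from the message, or None if absent."""
    try:
        return message["song_name"], message["singer"]
    except KeyError:
        return None   # message carries no identification info

info = read_song_info({"song_name": "Song A", "singer": "Singer X", "state": "playing"})
```

When the message lacks the expected fields, the fingerprint-based path described earlier remains the fallback.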
In this embodiment, the song identification information of the song to be identified is read from a message in the client through the client's program interface, which is faster and more convenient than the fingerprint-based identification described above. However, the program interface cannot read the key array in the second database, so this approach has the drawback that the target key of the target climax segment of the song to be identified cannot be updated dynamically.
In one embodiment, as shown in fig. 5, another key and scale identification method is provided, illustrated as applied to the server in fig. 1, and includes the following steps:
Step S501, playing and recording the song to be identified to obtain recorded audio with a third preset duration, wherein the recorded audio is used as audio to be analyzed of the song to be identified.
In step S502, in the case that the target audio fingerprint matching the audio fingerprint of the audio to be analyzed is obtained from the fingerprint information of the second database, the song identification information corresponding to the target audio fingerprint is used as the song identification information of the song to be identified.
Step S503, determining the target climax segment of the song to be identified according to the audio to be analyzed, and acquiring the key array of the song to be identified from the second database.
Step S504, determining the key to be updated of the target climax segment according to the time range of the target climax segment of the song to be identified and the key array.
Step S505, sending the time range of the target climax segment and the key to be updated to the client, so that the client updates the target key of the song to be identified within the time range of the target climax segment based on the key to be updated.
Step S506, when no target audio fingerprint matching the fingerprint of the audio to be analyzed is obtained from the second database, playing and recording the song to be identified to obtain the target recorded audio of the song to be identified.
Step S507, performing key analysis on the target recorded audio to obtain the key array of the target recorded audio.
Step S508, storing the key array of the target recorded audio, the audio fingerprint of the target recorded audio, and the song identification information of the target recorded audio in the second database.
It can be understood that, before steps S502 and S506, it may be determined whether the second database contains a target audio fingerprint matching the fingerprint of the audio to be analyzed, so that step S502 or step S506 is executed according to the result of that determination.
Step S509, when the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration.
Step S510, when the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and returning the candidate key and candidate scale to the client.
Step S511, when the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on it to obtain the target key and target scale of the song to be identified, and sending the target key and target scale to the client, so that the client updates the candidate key and candidate scale accordingly.
Step S512, when the first database contains the key and scale corresponding to the song identification information, taking the queried key and scale as the target key and target scale of the song to be identified, respectively, and returning the target key and target scale to the client.
It can be understood that, before steps S509 and S512, it may be determined whether the first database contains the key and scale corresponding to the song identification information, so that step S509 or step S512 is executed according to the result of that determination.
The above key and scale identification method achieves the following beneficial effects: the candidate key and candidate scale are recognized from the shorter recording of the first preset duration, improving the efficiency of obtaining the key and scale; the more accurate target key and target scale are then obtained by analyzing the recording of the second preset duration, improving the accuracy of the recognized key and scale while preserving that efficiency.
To clarify the key and scale identification method provided by the embodiments of the present disclosure, a specific embodiment is described below. The key and scale identification method can be applied to the server in fig. 1 and specifically includes the following:
FIG. 6 is a schematic diagram of the client interface while waiting for the server to identify the key and scale. The client acquires the name of the song currently being played by the user and the name of the singer, takes them as the song identification information, and sends it to the server. If the server finds the key and scale corresponding to the song identification information in the first database, it returns the queried key and scale to the client as the target key and target scale corresponding to the song identification information. If the server does not find them in the first database, it returns a query-failure message to the client. On receiving this message, the client records the song currently playing and sends the recorded audio to the server, so that the server can perform key recognition and scale recognition on the received recorded audio and thereby obtain the target key and target scale corresponding to the song identification information.
Fig. 7 is an interface diagram of the client receiving the target key and target scale. The client displays the received target key and target scale on the interface and finally synchronizes them to the electronic-sound plug-in of the connected audio host.
In this embodiment, the target key and target scale of the song in the client are obtained automatically from the song identification information or the recorded audio, without the user having to search for them manually, which improves acquisition efficiency and enhances the user's experience. The target key and target scale can also be synchronized automatically into the audio host through the client to complete the modification of the song's key and scale, improving the convenience and operational efficiency of the whole flow.
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which need not be performed at the same time but may be performed at different times, and need not be performed sequentially but may be performed in turn or alternately with at least part of the other steps, sub-steps, or stages.
Based on the same inventive concept, an embodiment of the present application also provides a key and scale identification apparatus for implementing the above key and scale identification method. The implementation of the solution provided by the apparatus is similar to that described in the method above, so for specific limitations in the one or more embodiments of the key and scale identification apparatus provided below, reference may be made to the limitations of the key and scale identification method above, which are not repeated here.
In one embodiment, as shown in fig. 8, a key and scale identification apparatus 800 is provided, including: an identification acquisition module 801, a song recording module 802, a first processing module 803, and a second processing module 804, wherein:
the identification acquisition module 801 is configured to identify song identification information of a song to be identified in the client.
The song recording module 802 is configured to play and record the song to be identified when the first database does not contain the key and scale corresponding to the song identification information, obtaining recorded audio of a first preset duration and recorded audio of a second preset duration, where the first preset duration is shorter than the second preset duration.
The first processing module 803 is configured to, when the recorded audio of the first preset duration is obtained, perform key recognition and scale recognition on it to obtain the candidate key and candidate scale of the song to be identified, and return the candidate key and candidate scale to the client.
The second processing module 804 is configured to perform key recognition and scale recognition on the recorded audio with the second preset duration when the recorded audio with the second preset duration is obtained, obtain a target key and a target scale of the song to be recognized, and send the target key and the target scale to the client, so that the client updates the candidate key and the candidate scale according to the target key and the target scale.
In one embodiment, the key and scale identification apparatus 800 further includes a key acquisition module, configured to, when the first database contains the key and scale corresponding to the song identification information, take the queried key and scale as the target key and target scale of the song to be identified, respectively, and return the target key and target scale to the client.
In one embodiment, the identification acquisition module 801 is further configured to play and record the song to be identified to obtain recorded audio of a third preset duration as the audio to be analyzed of the song to be identified, and, when a target audio fingerprint matching the fingerprint of the audio to be analyzed is obtained from the fingerprint information of the second database, take the song identification information corresponding to the target audio fingerprint as the song identification information of the song to be identified. The second database also stores the key array of each song.
In one embodiment, the device 800 for identifying a key and scale further includes a key updating module, configured to determine a target climax segment of the song to be identified according to the audio to be analyzed; acquire the key array of the song to be identified from the second database; determine the key to be updated of the climax segment according to the time range of the target climax segment and the key array; and send the time range of the target climax segment and the key to be updated to the client, so that the client updates the target key of the song to be identified within that time range based on the key to be updated.
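A minimal sketch of the "key to be updated" step: given a per-time-unit key array and the climax segment's time range, pick the dominant key inside that range. The majority-vote rule here is an assumption for illustration; the patent does not specify how the key is derived from the array:

```python
from collections import Counter

def key_to_update(key_array, climax_range):
    """`key_array` holds one key per time unit (e.g. per second);
    `climax_range` is the (start, end) time range of the target climax
    segment. Returns the most common key within that range."""
    start, end = climax_range
    segment = key_array[start:end]
    if not segment:
        raise ValueError("climax range lies outside the key array")
    return Counter(segment).most_common(1)[0][0]
```

The client would then override the whole-song target key with this value, but only inside the climax segment's time range.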
In one embodiment, the device 800 for identifying a key and scale further includes a fingerprint storage module, configured to, when no target audio fingerprint matching the audio fingerprint of the audio to be analyzed is found in the second database, play and record the song to be identified to obtain target recorded audio of the song to be identified, the duration of which is equal to the duration of the song to be identified; perform key analysis on the target recorded audio to obtain its key array; and store the key array of the target recorded audio, the audio fingerprint of the target recorded audio, and the song identification information of the target recorded audio in the second database.
In one embodiment, the device 800 for identifying a key and scale further includes an array construction module, configured to perform audio frame extraction on the target recorded audio to obtain the audio frames of the target recorded audio; perform key analysis on each audio frame to obtain its key; and form the key array of the target recorded audio from the keys of the audio frames.
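The frame-wise construction of the key array can be sketched as follows; the note-sample input and the per-frame pitch-class vote are toy assumptions standing in for the real per-frame key analysis:

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def frame_key(frame):
    """Toy per-frame key analysis: the most frequent pitch class among the
    frame's MIDI-like note samples."""
    counts = [0] * 12
    for note in frame:
        counts[note % 12] += 1
    return PITCH_CLASSES[counts.index(max(counts))]

def build_key_array(notes, frame_len):
    """Split the recording into fixed-length frames and estimate one key
    per frame; the resulting list is the recording's key array."""
    return [frame_key(notes[i:i + frame_len])
            for i in range(0, len(notes), frame_len)]
```

The resulting array, indexed by frame position, is what the key updating module slices by the climax segment's time range.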
In one embodiment, the identifier obtaining module 801 is further configured to read the song identification information of the song to be identified from a message in the client through a program interface of the client.
Each of the above modules in the key and scale recognition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in hardware in, or independent of, a processor in the computer device, or may be stored in software in a memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in fig. 9. The computer device includes a processor, a memory, an input/output (I/O) interface, and a communication interface. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store song identification information, key and scale data, and the like. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used to communicate with an external client through a network connection. The computer program, when executed by the processor, implements a method of identifying a key and a scale.
Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of part of the architecture relevant to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that the user information (including but not limited to user equipment information and user personal information) and data (including but not limited to data for analysis, stored data, and presented data) involved in the present application are information and data authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the steps of the above method embodiments. Any reference to the memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processor referred to in the embodiments provided herein may be, but is not limited to, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, or a data processing logic unit based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered to be within the scope of this specification.
The foregoing examples illustrate only a few embodiments of the application and are described in relative detail, but they should not be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the application should be determined by the appended claims.

Claims (10)

1. A method of identifying a key and a scale, the method comprising:
identifying song identification information of a song to be identified in a client;
when the first database does not contain the key and scale corresponding to the song identification information, playing and recording the song to be identified to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration being shorter than the second preset duration;
when the recorded audio of the first preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the first preset duration to obtain a candidate key and a candidate scale of the song to be identified, and returning the candidate key and the candidate scale to the client;
and when the recorded audio of the second preset duration is obtained, performing key recognition and scale recognition on the recorded audio of the second preset duration to obtain a target key and a target scale of the song to be identified, and sending the target key and the target scale to the client, so that the client updates the candidate key and the candidate scale according to the target key and the target scale.
2. The method of claim 1, further comprising, after identifying song identification information of the song to be identified in the client:
when the first database contains the key and scale corresponding to the song identification information, using the queried key and scale as the target key and target scale of the song to be identified, respectively;
and returning the target key and the target scale to the client.
3. The method of claim 1, wherein the identifying song identification information of the song to be identified in the client comprises:
playing and recording the song to be identified to obtain recorded audio of a third preset duration as the audio to be analyzed of the song to be identified;
when a target audio fingerprint matching the audio fingerprint of the audio to be analyzed is obtained from the fingerprint information of the second database, using the song identification information corresponding to the target audio fingerprint as the song identification information of the song to be identified; the second database also storing a key array for each song.
4. The method of claim 3, further comprising, after taking song identification information corresponding to the target audio fingerprint as song identification information of the song to be identified:
determining a target climax segment of the song to be identified according to the audio to be analyzed;
acquiring the key array of the song to be identified from the second database;
determining the key to be updated of the climax segment according to the time range of the target climax segment of the song to be identified and the key array;
and sending the time range of the target climax segment and the key to be updated to the client, so that the client updates the target key of the song to be identified within the time range of the target climax segment based on the key to be updated.
5. The method of claim 3, wherein after playing and recording the song to be identified to obtain the recorded audio of the third preset duration as the audio to be analyzed of the song to be identified, the method further comprises:
when no target audio fingerprint matching the audio fingerprint of the audio to be analyzed is obtained from the second database, playing and recording the song to be identified to obtain target recorded audio of the song to be identified; the duration of the target recorded audio being equal to the duration of the song to be identified;
performing key analysis on the target recorded audio to obtain a key array of the target recorded audio;
and storing the key array of the target recorded audio, the audio fingerprint of the target recorded audio, and the song identification information of the target recorded audio in the second database.
6. The method of claim 5, wherein performing key analysis on the target recorded audio to obtain the key array of the target recorded audio comprises:
performing audio frame extraction on the target recorded audio to obtain the audio frames of the target recorded audio;
and performing key analysis on the audio frames to obtain the key of each audio frame, and obtaining the key array of the target recorded audio from the keys of the audio frames.
7. The method of claim 1, wherein the identifying song identification information of the song to be identified in the client comprises:
reading the song identification information of the song to be identified from a message in the client through a program interface of the client.
8. A key and scale recognition apparatus, the apparatus comprising:
the identification acquisition module is used for identifying song identification information of songs to be identified in the client;
the song recording module, configured to play and record the song to be identified when the first database does not contain the key and scale corresponding to the song identification information, to obtain recorded audio of a first preset duration and recorded audio of a second preset duration; the first preset duration being shorter than the second preset duration;
the first processing module, configured to, when the recorded audio of the first preset duration is obtained, perform key recognition and scale recognition on the recorded audio of the first preset duration to obtain a candidate key and a candidate scale of the song to be identified, and return the candidate key and the candidate scale to the client;
and the second processing module, configured to, when the recorded audio of the second preset duration is obtained, perform key recognition and scale recognition on the recorded audio of the second preset duration to obtain a target key and a target scale of the song to be identified, and send the target key and the target scale to the client, so that the client updates the candidate key and the candidate scale according to the target key and the target scale.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.