CN104064180A - Singing scoring method and device - Google Patents

Publication number
CN104064180A
Authority
CN
China
Prior art keywords
song
lyric
audio data
scoring
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410250107.1A
Other languages
Chinese (zh)
Inventor
郭怀印
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yinzhibang Culture Technology Co ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201410250107.1A
Publication of CN104064180A

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention provides a singing scoring method and device. Original audio data of a song sung by a singer is acquired, and scoring parameters of the song are obtained, the scoring parameters including lyric information of the song and audio feature parameters corresponding to the lyric information. The original audio data is then scored using the scoring parameters, so scoring no longer depends on a MIDI file of the song. This avoids the prior-art problem that songs without a corresponding MIDI file cannot be scored, and thereby improves the reliability of singing scoring.

Description

Singing scoring method and device
[ technical field ]
The invention relates to an audio data processing technology, in particular to a singing scoring method and device.
[ background of the invention ]
The traditional singing scoring method utilizes a Musical Instrument Digital Interface (MIDI) file containing a standard melody of a song to be scored to compare with original audio data of the song sung by a singer so as to obtain the score of the song sung by the singer.
However, not all songs have corresponding MIDI files. For a song without one, the prior-art singing scoring method cannot produce a singing score, which reduces the reliability of song singing scoring.
[ summary of the invention ]
Aspects of the present invention provide a singing scoring method and apparatus, which are used to improve the reliability of song singing scoring.
In one aspect of the present invention, a singing scoring method is provided, including:
acquiring original audio data of a song sung by a singer;
obtaining scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and scoring the original audio data by using the scoring parameters.
The above-described aspect and any possible implementation manner further provide an implementation manner, where acquiring original audio data of a song sung by a singer includes:
collecting the original audio data in real time; or
acquiring a pre-recorded audio file of the song sung by the singer, and decoding the audio file to obtain the original audio data.
The above-described aspects and any possible implementations further provide an implementation in which the lyric information includes lyric content and a duration corresponding to the lyric content.
The above aspect and any possible implementation manner further provide an implementation manner, before the obtaining the scoring parameter of the song, further including:
obtaining the lyric information according to the lyric file of the song;
and adjusting the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
The above aspect and any possible implementation manner further provide an implementation manner, before the obtaining the scoring parameter of the song, further including:
obtaining reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song;
and obtaining audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
In another aspect of the present invention, there is provided a singing scoring apparatus, including:
the data acquisition unit is used for acquiring original audio data of a song sung by a singer;
the parameter acquisition unit is used for acquiring scoring parameters of the song, and the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and the scoring unit is used for scoring the original audio data by using the scoring parameters.
The above-mentioned aspects and any possible implementation further provide an implementation in which the data acquisition unit is specifically configured for:
collecting the original audio data in real time; or
acquiring a pre-recorded audio file of the song sung by the singer, and decoding the audio file to obtain the original audio data.
The above-described aspects and any possible implementations further provide an implementation in which the lyric information includes lyric content and a duration corresponding to the lyric content.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner in which the parameter obtaining unit is further configured for:
obtaining the lyric information according to the lyric file of the song; and
adjusting the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner in which the parameter obtaining unit is further configured for:
obtaining reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and
obtaining audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
According to the technical scheme, the original audio data of the song sung by the singer are obtained, and the scoring parameters of the song are further obtained, wherein the scoring parameters comprise the lyric information of the song and the audio characteristic parameters corresponding to the lyric information, so that the original audio data can be scored by using the scoring parameters without depending on MIDI files of the song, the problem that the singing scores of the songs cannot be obtained due to the fact that some songs do not have corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of the song sung scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the embodiments or the prior art descriptions will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without inventive labor.
Fig. 1 is a schematic flow chart of a singing scoring method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a singing scoring device according to another embodiment of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terminal according to the embodiments of the present application may include, but is not limited to, a mobile phone, a Personal Digital Assistant (PDA), a wireless handheld device, a wireless netbook, a Personal computer, a portable computer, a tablet computer, an MP3 player, an MP4 player, a wearable device (e.g., smart glasses, a smart watch, a smart bracelet, etc.), and the like.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Fig. 1 is a schematic flow chart of a singing scoring method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps.
101. Acquire original audio data of a song sung by a singer.
102. Acquire scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information.
103. Score the original audio data by using the scoring parameters.
The execution subject of 101 to 103 may be a processing device, which may be located in a local application (App), for example, the Baidu Musician singing App, or in a server on the network side, or partly in the local application and partly in the server on the network side.
It should be understood that the application may be an application installed on the terminal (native app), or a web page in a browser on the terminal (web app), or any other form capable of processing the original audio data, which is not limited in this embodiment.
Therefore, by acquiring the original audio data of the song sung by the singer and further acquiring the scoring parameters of the song, wherein the scoring parameters comprise the lyric information of the song and the audio characteristic parameters corresponding to the lyric information, the scoring parameters can be utilized to score the original audio data without depending on MIDI files of the song, the problem that the singing scores of the songs cannot be acquired because some songs have no corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
Optionally, in a possible implementation manner of this embodiment, in 101, the processing device may specifically collect the original audio data in real time.
Specifically, the processing means may collect a sound signal of a song sung by a singer and then convert the sound signal into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
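The sampling and quantization step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the sine wave stands in for a microphone signal, and the function names, the 16 kHz sample rate, and the 16-bit sample width are all assumptions chosen for the example.

```python
import math

def sample_sine(freq_hz, sample_rate, n_samples):
    """Sample a sine wave; a stand-in for the singer's sound signal (assumption)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def quantize_to_pcm16(samples):
    """Quantize floats in [-1.0, 1.0] to signed 16-bit PCM integers."""
    pcm = []
    for x in samples:
        x = max(-1.0, min(1.0, x))         # clip out-of-range input
        pcm.append(int(round(x * 32767)))  # scale to the 16-bit range
    return pcm

# 10 ms of a 440 Hz tone sampled at 16 kHz (hypothetical parameters)
pcm = quantize_to_pcm16(sample_sine(440.0, 16000, 160))
```

In a real terminal the samples would come from the audio capture interface of the platform rather than from a synthesized tone.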
Optionally, in a possible implementation manner of this embodiment, in 101, the processing device may specifically obtain, from a storage device, a pre-recorded audio file of the song sung by the singer, and then decode the audio file to obtain the original audio data.
The audio file may be an audio file in any of the encoding formats in the prior art, such as a Moving Picture Experts Group (MPEG) Layer-3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.
It should be understood that the storage device may be a hard disk of a computer, or the non-running memory of a mobile phone, i.e., its physical storage, such as a Read-Only Memory (ROM) or a memory card, which is not limited in this embodiment.
Optionally, in a possible implementation manner of this embodiment, in 102, the lyric information may include, but is not limited to, lyric content and a duration corresponding to the lyric content, which is not particularly limited in this embodiment.
The lyric content may be a lyric fragment obtained by dividing the complete lyric of a song by taking one word as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking a plurality of words as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking one sentence as a unit, or may also be a lyric fragment obtained by adopting other dividing methods, which is not particularly limited in this embodiment.
Further optionally, before 102, the processing device may further obtain the lyric information according to a lyric file of the song, for example, a lyric file in LRC format with lrc as the extension, a lyric file in QRC format with qrc as the extension, and the like.
Wherein,
LRC is an abbreviation of the English word "lyric", and a lyric file in LRC format with lrc as the extension can be displayed synchronously in various digital players. The detailed description can refer to the related content in the prior art, and is not repeated herein.
A lyric file in QRC format with qrc as the extension can be displayed synchronously in the latest version of the QQ music player. The detailed description can refer to the related content in the prior art, and is not repeated herein.
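As an illustration of obtaining lyric information from a lyric file, the sketch below parses the common line-timed LRC layout `[mm:ss.xx]text` into lyric content with a start time and a duration. It is a hedged example: the function name and dictionary keys are hypothetical, the duration of a line is assumed to run until the next line starts, and the last line is left without a duration because the file alone does not provide one.

```python
import re

# Matches lines such as "[00:12.30]some lyric text"
LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")

def parse_lrc(text):
    """Return a list of lyric entries with start time, duration and content."""
    entries = []
    for line in text.splitlines():
        m = LRC_LINE.match(line.strip())
        if not m:
            continue  # skip blank lines and metadata tags such as [ti:...]
        seconds = int(m.group(1)) * 60 + float(m.group(2))
        entries.append((int(round(seconds * 1000)), m.group(3).strip()))
    entries.sort(key=lambda e: e[0])

    result = []
    for i, (start_ms, content) in enumerate(entries):
        # Assumption: a line lasts until the next line starts.
        if i + 1 < len(entries):
            duration_ms = entries[i + 1][0] - start_ms
        else:
            duration_ms = None  # unknown for the last line
        result.append({"start_ms": start_ms,
                       "duration_ms": duration_ms,
                       "content": content})
    return result

lyrics = parse_lrc("[ti:Demo]\n[00:01.50]first line\n[00:04.00]second line")
```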
Further optionally, after the processing device obtains the lyric information according to the lyric file of the song, the processing device may further adjust the lyric information according to reference original song audio data of the song and/or reference accompaniment audio data of the song.
Specifically, the processing device may perform fine adjustment on the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song by using an audio processing technology, such as a human voice endpoint detection technology, a beat detection technology, a voice recognition technology, and the like, so that error ranges of the lyric content and the duration corresponding to the lyric content can be accurately controlled within a preset range, and the accuracy of the song singing score can be improved.
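One simple way to picture such a fine adjustment is an energy-based endpoint check: the start time of a lyric fragment is moved to the nearest voice onset found in the reference audio, provided the shift stays within a preset limit. The sketch below is an assumption-laden illustration only; the function name, the threshold rule, and the per-frame energy input are hypothetical, and the patent does not prescribe this particular technique.

```python
def snap_start_to_onset(start_ms, energies, frame_ms, threshold, max_shift_ms):
    """Move a lyric start time to the nearest voice onset, if one is close enough.

    energies: per-frame energy of the reference vocal audio (assumed input).
    An onset is a frame at or above the threshold whose predecessor is below it.
    """
    best = None
    for idx, energy in enumerate(energies):
        is_onset = energy >= threshold and (idx == 0 or energies[idx - 1] < threshold)
        if not is_onset:
            continue
        t = idx * frame_ms
        if abs(t - start_ms) > max_shift_ms:
            continue  # too far from the annotated time; ignore
        if best is None or abs(t - start_ms) < abs(best - start_ms):
            best = t
    return start_ms if best is None else best

energies = [0.0, 0.0, 0.0, 5.0, 6.0, 7.0, 0.0]  # toy values, 20 ms frames
adjusted = snap_start_to_onset(50, energies, 20, 1.0, 40)
```

Limiting the shift with `max_shift_ms` keeps the adjusted timing within a preset error range, in the spirit of the paragraph above.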
Further optionally, the processing means may store the adjusted lyric information in a new format, i.e. a first format lyric file. The first format is not particularly limited in this embodiment, as long as the lyric content and the duration corresponding to the lyric content can be stored in a specified format. Therefore, the lyric information is independently stored into the lyric file, so that when the lyric content of the song changes, the lyric information in the lyric file can be timely adjusted, and the accuracy of the lyric information can be effectively improved.
Optionally, in a possible implementation manner of this embodiment, before 102, the processing device may further obtain reference standard audio data according to the reference original song audio data of the song and the reference accompaniment audio data of the song. Because the original song of the song contains the accompaniment information, the processing device can specifically obtain the audio data highlighting the voice by using an audio processing technology, such as a voice processing technology and the like, according to the reference original song audio data of the song and the reference accompaniment audio data of the song, so as to be used as the reference standard audio data. The detailed description can refer to the related content in the prior art, and is not repeated herein.
Then, the processing device may obtain an audio characteristic parameter corresponding to the lyric information according to the reference standard audio data and the lyric information.
Specifically, the processing device may obtain audio feature information, for example, energy features, pitch features, beat features, and melody features, according to the reference standard audio data. Then, the processing device may obtain an audio characteristic parameter corresponding to the lyric information according to the lyric information and the audio feature information. Therefore, the lyric information of the song and the audio characteristic parameter corresponding to the lyric information can be used as the scoring parameter of the song.
The audio feature information may include, but is not limited to, at least one of short-term audio feature information of each frame in reference standard audio data and long-term audio feature information of multiple frames in reference standard audio data.
Generally, a duration corresponding to one word in a complete lyric of a song is about several hundred milliseconds (ms), which may correspond to audio data of several frames to several tens of frames, and the processing device may use the audio feature information of the frames as the audio feature parameter corresponding to the word, or may use a statistical value of the audio feature information of the frames as the audio feature parameter corresponding to the word, which is not particularly limited in this embodiment.
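The mapping from frames to a per-word parameter can be sketched as below, using short-time energy as the frame-level feature and the mean as the statistical value. Both choices are assumptions made for illustration, as are the function names and the frame length; the text above leaves the feature type and the statistic open.

```python
def short_time_energy(pcm, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(s * s for s in pcm[i:i + frame_len]) / frame_len
            for i in range(0, len(pcm) - frame_len + 1, frame_len)]

def word_feature(frame_values, frame_ms, start_ms, duration_ms):
    """Average the frame-level values that fall within one word's time span."""
    first = start_ms // frame_ms
    last = (start_ms + duration_ms + frame_ms - 1) // frame_ms  # exclusive, rounded up
    span = frame_values[first:last]
    if not span:
        return None  # the word lies outside the analysed audio
    return sum(span) / len(span)

# Toy data: three frames of four samples each, 10 ms per frame (hypothetical)
energies = short_time_energy([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], 4)
param = word_feature(energies, 10, 10, 20)  # word spans 10 ms to 30 ms
```

Returning the raw `span` instead of its mean would correspond to the other option described above, where the frame-level information itself serves as the parameter.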
Further optionally, the processing device may store the lyric information and the audio characteristic parameter corresponding to the lyric information in a lyric file of a new format, i.e., a lyric file of a second format. The second format is not particularly limited in this embodiment, as long as the lyric content, the duration corresponding to the lyric content, and the audio characteristic parameter corresponding to the lyric content can be stored in a specified format.
For example, the data storage format of the lyrics file in the second format may be in the form of:
<time,duration,Control,Value>X;
wherein,
The X field indicates the lyric content.
The time field indicates the start timestamp of X.
The duration field indicates the duration of X.
The Control field may be represented by a 32-bit integer of 4 bytes. Starting from the lowest byte, each byte describes one specific audio feature parameter and the number of its values, so the field can describe up to 4 audio feature parameters. For example, in each byte, a most significant bit of 0 indicates that the corresponding audio characteristic parameter is absent, and a most significant bit of 1 indicates that it is present; the remaining 7 bits give the number of values, and 7 bits can encode 128 distinct counts (0 to 127).
For example, suppose the Control field is the integer 34180, which is 0x8584 in hexadecimal. Starting from the lowest byte, the lowest byte is 0x84, whose binary form is 10000100; the highest bit is 1, which indicates that the specified audio characteristic parameter is present, and the remaining 7 bits are 0000100, which indicates that it has 4 values. The next byte is 0x85, whose binary form is 10000101; the highest bit is 1, which indicates that the specified audio feature parameter is present, and the remaining 7 bits are 0000101, which indicates that it has 5 values.
It is understood that the length of the Control field may be specifically adjusted according to the number of the audio characteristic parameters, and is not limited to the above example.
The Value field indicates the Value of the audio feature parameter, and the length is not fixed, which may be determined by the number of values of the audio feature parameter indicated by the Control field.
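The Control field layout described above can be checked with a short decoder. This is a sketch under the stated layout only (one byte per audio feature parameter, lowest byte first, top bit as the presence flag, low 7 bits as the value count); the function names are hypothetical and the encoding of the Value field itself is not covered.

```python
def decode_control(control, n_bytes=4):
    """Split a Control integer into (present, value_count) pairs, lowest byte first."""
    slots = []
    for i in range(n_bytes):
        byte = (control >> (8 * i)) & 0xFF
        present = bool(byte & 0x80)  # most significant bit: parameter present?
        count = byte & 0x7F          # low 7 bits: number of values
        slots.append((present, count))
    return slots

def encode_control(slots):
    """Inverse of decode_control; raises if a count does not fit in 7 bits."""
    control = 0
    for i, (present, count) in enumerate(slots):
        if not 0 <= count <= 0x7F:
            raise ValueError("value count must fit in 7 bits")
        control |= ((0x80 if present else 0) | count) << (8 * i)
    return control

slots = decode_control(34180)  # 34180 == 0x8584
```

For the example value 34180, the decoder reports two present parameters with 4 and 5 values respectively, followed by two unused slots, so a reader of the Value field would expect 9 values in total.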
Therefore, the lyric information and the audio characteristic parameters corresponding to the lyric information are jointly stored as a lyric file, namely the lyric file in the second format, so that the lyric file can be utilized to score singing of the song in real time, and the real-time performance of the scoring of singing of the song can be effectively improved.
In addition, because the Control field included in the lyric file of the second format can effectively indicate the number of the audio characteristic parameters and the numerical value number thereof, the lyric file of the second format can be subjected to version identification by using the Control field without adding version information in the lyric file of the second format, and the flexibility of version maintenance of the lyric file can be effectively improved, for example, the number of the audio characteristic parameters can be adjusted more flexibly, or the numerical value number of the audio characteristic parameters can be adjusted more flexibly.
Optionally, in a possible implementation manner of this embodiment, in 103, the processing device may specifically obtain, according to the original audio data and the lyric information, an audio feature to be scored corresponding to the lyric information. Then, the processing device may calculate the similarity between each audio feature to be scored and each scoring parameter respectively. And finally, obtaining the scoring result of the song according to the similarity between each audio feature to be scored and each scoring parameter and the corresponding weight value.
Specifically, any method in the prior art may be used as the method for calculating the similarity, and this embodiment does not particularly limit this. The detailed description can refer to the related content in the prior art, and is not repeated herein.
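A weighted similarity score of this kind could look like the following sketch. Cosine similarity is used purely as one example of a prior-art similarity measure, and the function names, the 0 to 100 scale, and the clamping of negative similarities to zero are assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def score_song(sung_features, reference_features, weights):
    """Weighted average of per-fragment similarities, scaled to 0..100."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    weighted = sum(w * max(0.0, cosine_similarity(s, r))
                   for s, r, w in zip(sung_features, reference_features, weights))
    return 100.0 * weighted / total_weight

# Two lyric fragments: the first matches the reference, the second does not.
score = score_song([[1.0, 0.0], [0.0, 1.0]],
                   [[1.0, 0.0], [1.0, 0.0]],
                   [1.0, 1.0])
```

Because each lyric fragment is scored independently before weighting, the per-fragment similarities can be reported as soon as each fragment has been sung.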
It is understood that the processing device in this embodiment scores the original audio data according to the obtained scoring parameters to obtain a first scoring result of the song. In addition, if the song has a corresponding MIDI file, the processing device may further score the original audio data according to the MIDI file of the song to obtain a second scoring result of the song, and finally obtain a comprehensive scoring result according to the first scoring result and its weight value and the second scoring result and its weight value.
For details of how the processing device scores the original audio data according to the MIDI file of the song, reference may be made to related content in the prior art, which is not repeated herein.
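The combination of the two scoring results can be sketched as a weighted average. The normalisation by the sum of the weights and the fallback to the first result when no MIDI-based result exists are assumptions of this sketch, and the function name is hypothetical.

```python
def combined_score(first, first_weight, second=None, second_weight=0.0):
    """Combine the lyric-based score with an optional MIDI-based score."""
    if second is None:
        return first  # no MIDI file for this song: use the first result alone
    total = first_weight + second_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (first * first_weight + second * second_weight) / total

with_midi = combined_score(80.0, 0.6, 90.0, 0.4)
without_midi = combined_score(80.0, 0.6)
```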
In the embodiment, the scoring parameters of the songs are obtained by obtaining the original audio data of the songs sung by the singer, wherein the scoring parameters comprise the lyric information of the songs and the audio characteristic parameters corresponding to the lyric information, so that the original audio data can be scored by using the scoring parameters without depending on the MIDI files of the songs, the problem that the singing scores of the songs cannot be obtained due to the fact that some songs have no corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of the song sung scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Fig. 2 is a schematic structural diagram of a singing scoring apparatus according to another embodiment of the present invention. As shown in Fig. 2, the singing scoring apparatus of this embodiment may include a data acquisition unit 21, a parameter acquisition unit 22, and a scoring unit 23. The data acquisition unit 21 is configured to acquire original audio data of a song sung by a singer; the parameter acquisition unit 22 is configured to obtain scoring parameters of the song, where the scoring parameters include lyric information of the song and audio characteristic parameters corresponding to the lyric information; and the scoring unit 23 is configured to score the original audio data by using the scoring parameters.
It should be noted that the singing scoring device provided in this embodiment may be a processing device, which may be located in a local application (App), for example, the Baidu Musician singing App, or in a server on the network side, or partly in the local application and partly in the server on the network side.
It should be understood that the application may be an application installed on the terminal (native app), or a web page in a browser on the terminal (web app), or any other form capable of processing the original audio data, which is not limited in this embodiment.
Therefore, the original audio data of the song sung by the singer is obtained through the data obtaining unit, the scoring parameter of the song is obtained through the parameter obtaining unit, the scoring parameter comprises the lyric information of the song and the audio characteristic parameter corresponding to the lyric information, the scoring unit can score the original audio data by using the scoring parameter without depending on the MIDI file of the song, the problem that the singing score of the song cannot be obtained due to the fact that some songs have no corresponding MIDI file in the prior art can be solved, and the reliability of the singing score of the song is improved.
Optionally, in a possible implementation manner of this embodiment, the data obtaining unit 21 may specifically collect the original audio data in real time.
Specifically, the data acquisition unit 21 may collect a sound signal of a song sung by a singer, and then convert the sound signal into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
Optionally, in a possible implementation manner of this embodiment, the data obtaining unit 21 may specifically obtain, from a storage device, a pre-recorded audio file of the song sung by the singer, and then decode the audio file to obtain the original audio data.
The audio file may be an audio file in any of the encoding formats in the prior art, such as a Moving Picture Experts Group (MPEG) Layer-3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.
It should be understood that the storage device may be a hard disk of a computer, or may also be a non-operating Memory of a mobile phone, i.e., a physical Memory, such as a Read-Only Memory (ROM), a Memory card, and the like, which is not limited in this embodiment.
Optionally, in a possible implementation manner of this embodiment, the lyric information may include, but is not limited to, lyric content and a duration corresponding to the lyric content, which is not particularly limited in this embodiment.
The lyric content may be a lyric fragment obtained by dividing the complete lyric of a song by taking one word as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking a plurality of words as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking one sentence as a unit, or may also be a lyric fragment obtained by adopting other dividing methods, which is not particularly limited in this embodiment.
Further optionally, the parameter obtaining unit 22 may be further configured to obtain the lyric information according to a lyric file of the song, for example, a lyric file in LRC format with lrc as the extension, a lyric file in QRC format with qrc as the extension, and the like; and to adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
Wherein,
LRC is an abbreviation of the English word "lyric", and a lyric file in LRC format with lrc as the extension can be displayed synchronously in various digital players. The detailed description can refer to the related content in the prior art, and is not repeated herein.
A lyric file in QRC format with qrc as the extension can be displayed synchronously in the latest version of the QQ music player. The detailed description can refer to the related content in the prior art, and is not repeated herein.
Specifically, the parameter obtaining unit 22 may specifically perform fine adjustment on the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song by using an audio processing technology, such as a human voice endpoint detection technology, a beat detection technology, a voice recognition technology, and the like, so as to accurately control error ranges of lyric contents and durations corresponding to the lyric contents within a preset range, thereby improving accuracy of song singing scoring.
Further optionally, the parameter obtaining unit 22 may store the adjusted lyric information into a lyric file with a new format, i.e. a lyric file with a first format. The first format is not particularly limited in this embodiment, as long as the lyric content and the duration corresponding to the lyric content can be stored in a specified format. Therefore, the lyric information is independently stored into the lyric file, so that when the lyric content of the song changes, the lyric information in the lyric file can be timely adjusted, and the accuracy of the lyric information can be effectively improved.
Optionally, in a possible implementation manner of this embodiment, the parameter obtaining unit 22 may further obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and acquiring audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
Since the original song of the song includes the accompaniment information, the parameter obtaining unit 22 may specifically obtain the audio data highlighting the human voice by using an audio processing technique, such as a human voice processing technique, according to the reference original song audio data of the song and the reference accompaniment audio data of the song, so as to serve as the reference standard audio data. The detailed description can refer to the related content in the prior art, and is not repeated herein.
Specifically, the parameter obtaining unit 22 may obtain audio feature information from the reference standard audio data, that is, information on audio feature parameters of types such as the energy feature, pitch feature, beat feature, and melody feature; then, the audio characteristic parameter corresponding to the lyric information can be obtained according to the lyric information and the audio feature information. In this way, the lyric information of the song and the audio characteristic parameter corresponding to the lyric information can be used as the scoring parameters of the song.
The audio feature information may include, but is not limited to, at least one of short-term audio feature information of each frame in reference standard audio data and long-term audio feature information of multiple frames in reference standard audio data.
Generally, a duration corresponding to a word in a complete lyric of a song is about several hundred milliseconds (ms), which corresponds to audio data of several frames to several tens of frames, and the parameter obtaining unit 22 may use audio feature information of the frames as an audio feature parameter corresponding to the word, or may use a statistical value of the audio feature information of the frames as the audio feature parameter corresponding to the word, which is not particularly limited in this embodiment.
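The two options above (using the per-frame audio feature information directly, or using a statistical value of it) can be sketched as follows. This is a minimal illustration only: the function name word_feature, the choice of the arithmetic mean as the statistical value, and the fixed frame length are assumptions not specified in this embodiment.

```python
from statistics import mean
from typing import List, Tuple


def word_feature(
    frame_values: List[float],
    frame_ms: int,
    start_ms: int,
    duration_ms: int,
) -> Tuple[List[float], float]:
    """Collect the per-frame feature values that overlap one word.

    Returns both the raw per-frame values and their mean, matching the
    two alternatives described in the text (mean is an assumed statistic).
    """
    first = start_ms // frame_ms
    # Ceiling division so a partially covered last frame is included.
    last = -(-(start_ms + duration_ms) // frame_ms)
    frames = frame_values[first:last]
    if not frames:
        raise ValueError("word lies outside the analysed audio")
    return frames, mean(frames)
```

For example, with 100 ms frames, a word starting at 200 ms and lasting 300 ms covers frames 2 to 4.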
Further optionally, the parameter obtaining unit 22 may store the lyric information and the audio characteristic parameter corresponding to the lyric information into a lyric file of a new format, i.e. a lyric file of a second format. The second format is not particularly limited in this embodiment, as long as the lyric content, the duration corresponding to the lyric content, and the audio characteristic parameter corresponding to the lyric content can be stored in a specified format.
For example, the data storage format of the lyrics file in the second format may be in the form of:
<time,duration,Control,Value>X;
where:
The X field indicates the lyric content.
The time field indicates the start timestamp of X.
The duration field indicates the duration of X.
The Control field may be represented by a 32-bit (4-byte) integer. Starting from the lowest byte, each byte represents one specific audio feature parameter and the number of its values, so that up to 4 audio feature parameters can be represented. For example, the most significant bit of each byte is 0 if the corresponding audio characteristic parameter is absent and 1 if it is present; the remaining 7 bits represent the number of values, which can therefore be at most 127.
For example, suppose the Control field is the integer 34180, which corresponds to the hexadecimal number 0x8584. Starting from the lowest byte, the lowest byte is 0x84, whose binary representation is 10000100; the highest bit is 1, which indicates that the specified audio characteristic parameter is present, and the remaining 7 bits are 0000100, which indicates that it has 4 values. The next byte is 0x85, whose binary representation is 10000101; the highest bit is 1, which indicates that the specified audio feature parameter is present, and the remaining 7 bits are 0000101, which indicates that it has 5 values.
It is understood that the length of the Control field may be specifically adjusted according to the number of the audio characteristic parameters, and is not limited to the above example.
The Value field indicates the Value of the audio feature parameter, and the length is not fixed, which may be determined by the number of values of the audio feature parameter indicated by the Control field.
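The byte layout of the Control field described above can be decoded with a few bit operations. The following Python sketch is illustrative only: the function names decode_control and split_values are hypothetical, and the assumption that the Value field holds the values of the present parameters concatenated in byte order (lowest byte first) is not stated in this embodiment.

```python
from typing import List, Tuple


def decode_control(control: int, n_bytes: int = 4) -> List[Tuple[bool, int]]:
    """Decode each byte of Control, lowest byte first, into
    (parameter present?, number of values)."""
    decoded = []
    for i in range(n_bytes):
        byte = (control >> (8 * i)) & 0xFF
        present = bool(byte & 0x80)  # most significant bit of the byte
        count = byte & 0x7F          # remaining 7 bits
        decoded.append((present, count))
    return decoded


def split_values(control: int, values: List[float]) -> List[List[float]]:
    """Split a flat Value list into one group per audio feature parameter.

    Assumption: values are stored back to back in the same order as the
    Control bytes; absent parameters contribute no values.
    """
    groups = []
    pos = 0
    for present, count in decode_control(control):
        if not present:
            groups.append([])
            continue
        groups.append(values[pos:pos + count])
        pos += count
    if pos != len(values):
        raise ValueError("Value length does not match Control field")
    return groups
```

Applied to the example value 34180 (0x8584), decode_control reports a first parameter with 4 values, a second parameter with 5 values, and two absent parameters.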
Therefore, the lyric information and the audio characteristic parameters corresponding to the lyric information are jointly stored as a lyric file, namely the lyric file in the second format, so that the lyric file can be utilized to score singing of the song in real time, and the real-time performance of the scoring of singing of the song can be effectively improved.
In addition, because the Control field included in the lyric file of the second format indicates the number of audio characteristic parameters and the number of values of each, the Control field can itself be used to identify the version of the lyric file of the second format, without adding separate version information to the file. This can effectively improve the flexibility of version maintenance of the lyric file; for example, the number of audio characteristic parameters, or the number of values of each audio characteristic parameter, can be adjusted more flexibly.
Optionally, in a possible implementation manner of this embodiment, the scoring unit 23 may be specifically configured to obtain, according to the original audio data and the lyric information, an audio feature to be scored corresponding to the lyric information; respectively calculating the similarity of each audio feature to be scored and each scoring parameter; and obtaining the scoring result of the song according to the similarity between each audio feature to be scored and each scoring parameter and the corresponding weight value.
Specifically, any method in the prior art may be used as the method for calculating the similarity, and this embodiment does not particularly limit this. The detailed description can refer to the related content in the prior art, and is not repeated herein.
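As one possible instance of the scoring steps above, the following Python sketch computes a similarity for each audio feature and combines the similarities with their weight values. Cosine similarity, the clamping to the range 0 to 1, the 0 to 100 scale, the normalization by the total weight, and the names cosine_similarity and score_word are all assumptions for illustration; this embodiment allows any similarity method from the prior art.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_word(
    sung: Dict[str, List[float]],       # audio features to be scored
    reference: Dict[str, List[float]],  # scoring parameters for the same word
    weights: Dict[str, float],          # weight value of each feature
) -> float:
    """Weighted average of per-feature similarities, scaled to 0..100."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = 0.0
    for name, weight in weights.items():
        sim = cosine_similarity(sung[name], reference[name])
        weighted += weight * min(1.0, max(0.0, sim))  # clamp to [0, 1]
    return 100.0 * weighted / total_weight
```

Because each word is scored only from its own features and scoring parameters, such a function could be called as soon as the frames of a word have been collected.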
It is understood that, in this embodiment, the scoring unit 23 scores the original audio data according to the obtained scoring parameters to obtain a first scoring result of the song. In addition, if the song has a corresponding MIDI file, the scoring unit 23 may further score the original audio data according to the MIDI file of the song to obtain a second scoring result of the song, and finally obtain a comprehensive scoring result according to the first scoring result and its weight value and the second scoring result and its weight value.
The scoring unit 23 scores the original audio data according to the MIDI file of the song, and the detailed description can refer to the related contents in the prior art, which is not described herein again.
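The combination of the first and second scoring results can be illustrated with a short Python sketch. The weighted average, the normalization by the sum of the weight values, and the name combined_score are assumptions; this embodiment only states that the comprehensive scoring result depends on the two results and their weight values.

```python
from typing import Optional


def combined_score(
    first_score: float,
    first_weight: float,
    second_score: Optional[float] = None,  # None when the song has no MIDI file
    second_weight: float = 0.0,
) -> float:
    """Weighted average of the parameter-based and MIDI-based scores.

    Falls back to the first scoring result alone when no MIDI-based
    score is available.
    """
    if second_score is None:
        return first_score
    total = first_weight + second_weight
    if total <= 0:
        raise ValueError("weight values must sum to a positive number")
    return (first_weight * first_score + second_weight * second_score) / total
```

With this fallback, a song without a MIDI file is still scored, which is consistent with the stated aim of not depending on MIDI files.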
In this embodiment, the original audio data of the song sung by the singer is acquired by the data acquisition unit, and then the scoring parameter of the song is acquired by the parameter acquisition unit, where the scoring parameter includes the lyric information of the song and the audio characteristic parameter corresponding to the lyric information, so that the scoring unit can score the original audio data by using the scoring parameter without depending on the MIDI files of the song, and the problem that the singing score of some songs cannot be acquired because the songs have no corresponding MIDI files in the prior art can be avoided, thereby improving the reliability of the singing score of the song.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of the song sung scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an audio processing engine, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A singing scoring method, comprising:
acquiring original audio data of a song sung by a singer;
obtaining scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and scoring the original audio data by using the scoring parameters.
2. The method of claim 1, wherein acquiring original audio data of a song sung by a singer comprises:
collecting the original audio data in real time; or
acquiring a pre-recorded audio file of the song sung by the singer, and decoding the audio file to obtain the original audio data.
3. The method of claim 1, wherein the lyric information comprises a lyric content and a duration corresponding to the lyric content.
4. The method of claim 1, further comprising, before obtaining the scoring parameters of the song:
obtaining the lyric information according to the lyric file of the song;
and adjusting the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
5. The method according to any one of claims 1 to 4, wherein before obtaining the scoring parameters of the song, the method further comprises:
obtaining reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song;
and obtaining audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
6. A singing scoring device, comprising:
the data acquisition unit is used for acquiring original audio data of a song sung by a singer;
the parameter acquisition unit is used for acquiring scoring parameters of the song, and the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and the scoring unit is used for scoring the original audio data by using the scoring parameters.
7. The device according to claim 6, wherein the data acquisition unit is specifically configured to:
collect the original audio data in real time; or
acquire a pre-recorded audio file of the song sung by the singer, and decode the audio file to obtain the original audio data.
8. The device of claim 6, wherein the lyric information comprises a lyric content and a duration corresponding to the lyric content.
9. The device of claim 6, wherein the parameter acquisition unit is further configured to:
obtain the lyric information according to the lyric file of the song; and
adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
10. The device according to any one of claims 6 to 9, wherein the parameter acquisition unit is further configured to:
obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and
obtain audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410250107.1A CN104064180A (en) 2014-06-06 2014-06-06 Singing scoring method and device

Publications (1)

Publication Number Publication Date
CN104064180A true CN104064180A (en) 2014-09-24

Family

ID=51551859

Country Status (1)

Country Link
CN (1) CN104064180A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006115387A1 (en) * 2005-04-28 2006-11-02 Nayio Media, Inc. System and method for grading singing data
CN101894552A (en) * 2010-07-16 2010-11-24 安徽科大讯飞信息科技股份有限公司 Speech spectrum segmentation based singing evaluating system
CN102103857A (en) * 2009-12-21 2011-06-22 盛大计算机(上海)有限公司 Singing scoring system
CN102110435A (en) * 2009-12-23 2011-06-29 康佳集团股份有限公司 Method and system for karaoke scoring
CN102664016A (en) * 2012-04-23 2012-09-12 安徽科大讯飞信息科技股份有限公司 Singing evaluation method and system

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361883A (en) * 2014-10-10 2015-02-18 福建星网视易信息系统有限公司 Production method and device of singing evaluation standards files
CN105989853A (en) * 2015-02-28 2016-10-05 科大讯飞股份有限公司 Audio quality evaluation method and system
CN104810025A (en) * 2015-03-31 2015-07-29 天翼爱音乐文化科技有限公司 Audio similarity detecting method and device
CN104810025B (en) * 2015-03-31 2018-04-20 天翼爱音乐文化科技有限公司 Audio similarity detection method and device
CN104882147A (en) * 2015-06-05 2015-09-02 福建星网视易信息系统有限公司 Method, device and system for displaying singing score
CN105244041A (en) * 2015-09-22 2016-01-13 百度在线网络技术(北京)有限公司 Song audition evaluation method and device
CN107103915A (en) * 2016-02-18 2017-08-29 广州酷狗计算机科技有限公司 A kind of audio data processing method and device
CN106782600B (en) * 2016-12-29 2020-04-24 广州酷狗计算机科技有限公司 Scoring method and device for audio files
CN106782600A (en) * 2016-12-29 2017-05-31 广州酷狗计算机科技有限公司 The methods of marking and device of audio file
CN106878841B (en) * 2017-03-21 2020-01-07 北京小米移动软件有限公司 Microphone assembly
CN106878841A (en) * 2017-03-21 2017-06-20 北京小米移动软件有限公司 Microphone assembly
CN107978322A (en) * 2017-11-27 2018-05-01 北京酷我科技有限公司 A kind of K songs marking algorithm
CN108415942A (en) * 2018-01-30 2018-08-17 福建星网视易信息系统有限公司 Join in the chorus singing marking two-dimensional code generation method, device and system are taught in personalization
CN108875047A (en) * 2018-06-28 2018-11-23 清华大学 A kind of information processing method and system
CN110737381B (en) * 2019-09-17 2020-11-10 广州优谷信息技术有限公司 Subtitle rolling control method, system and device
CN110737381A (en) * 2019-09-17 2020-01-31 广州优谷信息技术有限公司 subtitle rolling control method, system and device
CN110808069A (en) * 2019-11-11 2020-02-18 上海瑞美锦鑫健康管理有限公司 Evaluation system and method for singing songs
CN111429949A (en) * 2020-04-16 2020-07-17 广州繁星互娱信息科技有限公司 Pitch line generation method, device, equipment and storage medium
CN111429949B (en) * 2020-04-16 2023-10-13 广州繁星互娱信息科技有限公司 Pitch line generation method, device, equipment and storage medium
CN111586430A (en) * 2020-05-14 2020-08-25 腾讯科技(深圳)有限公司 Online interaction method, client, server and storage medium
CN111770109A (en) * 2020-07-13 2020-10-13 兰州城市学院 Virtual reality music singing practice voice frequency and video frequency transmission method
CN112201100A (en) * 2020-10-27 2021-01-08 暨南大学 Music singing scoring system and method for evaluating artistic quality of primary and secondary schools
CN113345470A (en) * 2021-06-17 2021-09-03 青岛聚看云科技有限公司 Karaoke content auditing method, display device and server

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160316

Address after: 100027 Haidian District, Qinghe Qinghe East Road, No. 23, building two, floor 2108, No., No. 18

Applicant after: BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd.

Address before: 100085 Beijing, Haidian District, No. ten on the street Baidu building, No. 10

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20140924
