CN104064180A - Singing scoring method and device - Google Patents
- Publication number
- CN104064180A
- Authority
- CN
- China
- Prior art keywords
- song
- lyric
- audio data
- scoring
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention provides a singing scoring method and device. By obtaining original audio data of a song sung by a singer, scoring parameters of the song are obtained, wherein the scoring parameters include lyric information of the song and audio feature parameters corresponding to the lyric information. Accordingly, by means of the scoring parameters, the original audio data are scored, scoring does not depend on MIDI files of the song any longer, the problem that in the prior art, some songs do not have corresponding MIDI files and accordingly singing of the songs cannot be scored can be avoided, and accordingly scoring reliability of the singing of the song is improved.
Description
[ technical field ]
The invention relates to an audio data processing technology, in particular to a singing scoring method and device.
[ background of the invention ]
The traditional singing scoring method utilizes a Musical Instrument Digital Interface (MIDI) file containing a standard melody of a song to be scored to compare with original audio data of the song sung by a singer so as to obtain the score of the song sung by the singer.
However, not all songs have corresponding MIDI files. For songs without one, the singing scoring method in the prior art cannot produce a singing score, which reduces the reliability of song singing scoring.
[ summary of the invention ]
Aspects of the present invention provide a singing scoring method and apparatus, which are used to improve the reliability of song singing scoring.
In one aspect of the present invention, a singing scoring method is provided, including:
acquiring original audio data of a song sung by a singer;
obtaining scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and scoring the original audio data by using the scoring parameters.
The above-described aspect and any possible implementation manner further provide an implementation manner, where acquiring original audio data of a song sung by a singer includes:
collecting the original audio data in real time; or
acquiring a pre-recorded audio file of the song sung by the singer, and decoding the audio file to obtain the original audio data.
The above-described aspects and any possible implementations further provide an implementation in which the lyric information includes lyric content and a duration corresponding to the lyric content.
The above aspect and any possible implementation manner further provide an implementation manner, before the obtaining the scoring parameter of the song, further including:
obtaining the lyric information according to the lyric file of the song;
and adjusting the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
The above aspect and any possible implementation manner further provide an implementation manner, before the obtaining the scoring parameter of the song, further including:
obtaining reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song;
and obtaining audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
In another aspect of the present invention, there is provided a singing scoring apparatus, including:
the data acquisition unit is used for acquiring original audio data of a song sung by a singer;
the parameter acquisition unit is used for acquiring scoring parameters of the song, and the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and the scoring unit is used for scoring the original audio data by using the scoring parameters.
The above-mentioned aspects and any possible implementation further provide an implementation in which the data acquisition unit is specifically configured to:
collect the original audio data in real time; or
acquire a pre-recorded audio file of the song sung by the singer, and decode the audio file to obtain the original audio data.
The above-described aspects and any possible implementations further provide an implementation in which the lyric information includes lyric content and a duration corresponding to the lyric content.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where the parameter obtaining unit is further configured to:
obtain the lyric information according to the lyric file of the song; and
adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where the parameter obtaining unit is further configured to:
obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and
obtain audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
According to the technical scheme, the original audio data of the song sung by the singer are obtained, and the scoring parameters of the song are further obtained, wherein the scoring parameters comprise the lyric information of the song and the audio characteristic parameters corresponding to the lyric information, so that the original audio data can be scored by using the scoring parameters without depending on MIDI files of the song, the problem that the singing scores of the songs cannot be obtained due to the fact that some songs do not have corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of the song sung scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the embodiments or the prior art descriptions will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without inventive labor.
Fig. 1 is a schematic flow chart of a singing scoring method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a singing scoring device according to another embodiment of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terminal according to the embodiments of the present application may include, but is not limited to, a mobile phone, a Personal Digital Assistant (PDA), a wireless handheld device, a wireless netbook, a Personal computer, a portable computer, a tablet computer, an MP3 player, an MP4 player, a wearable device (e.g., smart glasses, a smart watch, a smart bracelet, etc.), and the like.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Fig. 1 is a schematic flow chart of a singing scoring method according to an embodiment of the present invention, as shown in fig. 1.
101. Acquire original audio data of a song sung by a singer.
102. Acquire scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information.
103. Score the original audio data by using the scoring parameters.
It should be noted that 101 to 103 may be executed by a processing device, which may be located in a local Application (App), such as hundredth musician song, or in a server on the network side, or partly in the local application and partly in the server on the network side.
It should be understood that the application may be an application installed on the terminal (native app), or a web page in a browser on the terminal (web app), or any other form capable of processing the original audio data, which is not limited in this embodiment.
Therefore, by acquiring the original audio data of the song sung by the singer and further acquiring the scoring parameters of the song, wherein the scoring parameters comprise the lyric information of the song and the audio characteristic parameters corresponding to the lyric information, the scoring parameters can be utilized to score the original audio data without depending on MIDI files of the song, the problem that the singing scores of the songs cannot be acquired because some songs have no corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
Optionally, in a possible implementation manner of this embodiment, in 101, the processing device may specifically collect the original audio data in real time.
Specifically, the processing means may collect a sound signal of a song sung by a singer and then convert the sound signal into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
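The quantize-and-encode part of this step can be sketched as follows. This is only an illustration, assuming the sampled signal is already available as floating-point values in [-1.0, 1.0] and that the target is signed 16-bit little-endian PCM; the function names `to_pcm16` and `sample_sine` are hypothetical and do not come from the patent.

```python
import math
import struct

def sample_sine(freq_hz, sample_rate, n):
    """Stand-in for a sampled microphone signal: n samples of a sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def to_pcm16(samples):
    """Quantize floats in [-1.0, 1.0] and encode them as 16-bit little-endian PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))       # clip out-of-range samples
        q = int(round(s * 32767))        # quantize to a signed 16-bit level
        pcm += struct.pack("<h", q)      # encode as two bytes, little-endian
    return bytes(pcm)

# 10 ms of a 440 Hz tone sampled at 16 kHz (values chosen for illustration).
pcm = to_pcm16(sample_sine(440.0, 16000, 160))
```

Each sample occupies two bytes, so 160 samples produce 320 bytes of PCM data.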
Optionally, in a possible implementation manner of this embodiment, in 101, the processing device may specifically obtain, from a storage device, a pre-recorded audio file of the song sung by the singer, and then decode the audio file to obtain the original audio data.
The audio file may be an audio file in any of the encoding formats in the prior art, such as a Moving Picture Experts Group (MPEG) Layer-3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.
It should be understood that the storage device may be a hard disk of a computer, or may also be the non-operating memory of a mobile phone, i.e., physical storage such as a Read-Only Memory (ROM) or a memory card, which is not limited in this embodiment.
Optionally, in a possible implementation manner of this embodiment, in 102, the lyric information may include, but is not limited to, lyric content and a duration corresponding to the lyric content, which is not particularly limited in this embodiment.
The lyric content may be a lyric fragment obtained by dividing the complete lyric of a song by taking one word as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking a plurality of words as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking one sentence as a unit, or may also be a lyric fragment obtained by adopting other dividing methods, which is not particularly limited in this embodiment.
Further optionally, before 102, the processing device may further obtain the lyric information according to a lyric file of the song, for example, a lyric file in the LRC format with LRC as the extension, a lyric file in the QRC format with QRC as the extension, and the like.
Wherein,
LRC is short for the English word "lyric"; a lyric file in the LRC format with LRC as the extension can be displayed synchronously in various digital players. For details, reference may be made to the related content in the prior art, which is not repeated herein.
A lyric file in the QRC format with QRC as the extension can be displayed synchronously in the latest version of the QQ Music player. For details, reference may be made to the related content in the prior art, which is not repeated herein.
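As a rough illustration of obtaining lyric content and durations from a lyric file, the sketch below parses the commonly used line-level LRC layout, `[mm:ss.xx]text`. It assumes that the duration of a line can be approximated by the gap to the next timestamp; the patent does not prescribe this, and the name `parse_lrc` is hypothetical.

```python
import re

# Matches time tags such as [00:12.50]; metadata tags like [ti:Title] do not match.
_TAG = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\]")

def parse_lrc(text):
    """Return (start_seconds, duration_seconds_or_None, lyric) tuples sorted by time."""
    entries = []
    for line in text.splitlines():
        tags = list(_TAG.finditer(line))
        if not tags:
            continue                                  # skip metadata and blank lines
        lyric = line[tags[-1].end():].strip()         # text after the last time tag
        for m in tags:
            start = int(m.group(1)) * 60 + float(m.group(2))
            entries.append((start, lyric))
    entries.sort(key=lambda e: e[0])
    result = []
    for i, (start, lyric) in enumerate(entries):
        # Assumption: a line lasts until the next line starts; the last one is unknown.
        if i + 1 < len(entries):
            duration = round(entries[i + 1][0] - start, 3)
        else:
            duration = None
        result.append((start, duration, lyric))
    return result

sample = "[ti:Demo]\n[00:01.00]first line\n[00:03.50]second line\n"
lines = parse_lrc(sample)
```

Durations derived this way are only approximate, which is one reason the lyric information may need the adjustment described next.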
Further optionally, after the processing device obtains the lyric information according to the lyric file of the song, the processing device may further adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
Specifically, the processing device may perform fine adjustment on the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song by using an audio processing technology, such as a human voice endpoint detection technology, a beat detection technology, a voice recognition technology, and the like, so that error ranges of the lyric content and the duration corresponding to the lyric content can be accurately controlled within a preset range, and the accuracy of the song singing score can be improved.
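One simple way to picture this fine adjustment is an energy-threshold endpoint detector that tightens a lyric segment around the region where the reference vocal is actually audible. The sketch below is a stand-in for the voice endpoint detection mentioned above, not the method of the patent; the function names, the 10 ms frame size, the threshold, and the search margin are all assumptions.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def refine_segment(samples, sample_rate, start, duration,
                   frame_ms=10, threshold=1e-3, margin=0.1):
    """Return an adjusted (start, duration) in seconds for one lyric segment."""
    frame_len = max(1, round(sample_rate * frame_ms / 1000))
    # Search a window slightly wider than the timing given by the lyric file.
    lo = max(0, round((start - margin) * sample_rate))
    hi = min(len(samples), round((start + duration + margin) * sample_rate))
    energies = frame_energies(samples[lo:hi], frame_len)
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    if not voiced:
        return start, duration            # nothing detected: keep the original timing
    new_start = (lo + voiced[0] * frame_len) / sample_rate
    new_end = (lo + (voiced[-1] + 1) * frame_len) / sample_rate
    return new_start, new_end - new_start
```

For example, if the lyric file places a word at 0.2 s for 0.5 s but the vocal is only audible from 0.3 s to 0.7 s, the function returns a start near 0.3 s and a duration near 0.4 s.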
Further optionally, the processing means may store the adjusted lyric information in a new format, i.e. a first format lyric file. The first format is not particularly limited in this embodiment, as long as the lyric content and the duration corresponding to the lyric content can be stored in a specified format. Therefore, the lyric information is independently stored into the lyric file, so that when the lyric content of the song changes, the lyric information in the lyric file can be timely adjusted, and the accuracy of the lyric information can be effectively improved.
Optionally, in a possible implementation manner of this embodiment, before 102, the processing device may further obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song. Because the original recording of the song contains the accompaniment, the processing device may use an audio processing technology, such as a human voice processing technology, to obtain audio data in which the human voice is emphasized from the reference original singing audio data and the reference accompaniment audio data of the song, and use it as the reference standard audio data. For details, reference may be made to the related content in the prior art, which is not repeated herein.
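The patent leaves the voice processing technique open. As one deliberately naive possibility, the sketch below subtracts the accompaniment from the original recording sample by sample. It assumes that both signals are time-aligned, share the same sample rate, and that the accompaniment appears in the original at a known gain; real recordings rarely satisfy these assumptions, and the name `emphasize_vocals` is hypothetical.

```python
def emphasize_vocals(original, accompaniment, gain=1.0):
    """Time-domain subtraction: original minus scaled accompaniment.

    Assumes aligned signals at the same sample rate; the output is truncated
    to the shorter of the two inputs.
    """
    n = min(len(original), len(accompaniment))
    return [original[i] - gain * accompaniment[i] for i in range(n)]
```

A practical system would more likely work in the frequency domain or use a dedicated source separation method; the subtraction only illustrates the idea of deriving voice-emphasized audio from the two reference signals.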
Then, the processing device may obtain an audio characteristic parameter corresponding to the lyric information according to the reference standard audio data and the lyric information.
Specifically, the processing device may obtain audio feature information, for example, energy features, pitch features, beat features, melody features, and the like, according to the reference standard audio data. Then, the processing device may obtain the audio characteristic parameter corresponding to the lyric information according to the lyric information and the audio feature information. Therefore, the lyric information of the song and the audio characteristic parameter corresponding to the lyric information can be used as the scoring parameter of the song.
The audio feature information may include, but is not limited to, at least one of short-term audio feature information of each frame in reference standard audio data and long-term audio feature information of multiple frames in reference standard audio data.
Generally, a duration corresponding to one word in a complete lyric of a song is about several hundred milliseconds (ms), which may correspond to audio data of several frames to several tens of frames, and the processing device may use the audio feature information of the frames as the audio feature parameter corresponding to the word, or may use a statistical value of the audio feature information of the frames as the audio feature parameter corresponding to the word, which is not particularly limited in this embodiment.
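The mapping from frames to a per-word parameter can be sketched as below, using short-time energy as the frame-level feature and a few summary statistics as the per-word value. The 20 ms frame length, the millisecond time unit, and the names `short_time_energy` and `word_energy_stats` are assumptions made for illustration.

```python
from statistics import mean

def short_time_energy(samples, sample_rate, frame_ms=20):
    """Mean squared amplitude of each non-overlapping frame."""
    frame_len = sample_rate * frame_ms // 1000
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def word_energy_stats(energies, start_ms, duration_ms, frame_ms=20):
    """Summarize the frames that fall inside one word's time span."""
    first = start_ms // frame_ms
    last = (start_ms + duration_ms) // frame_ms      # exclusive frame index
    frames = energies[first:last]
    if not frames:
        return None                                  # the word lies outside the audio
    return {"frames": len(frames), "mean": mean(frames), "max": max(frames)}
```

With 20 ms frames, a word lasting 300 ms covers 15 frames, consistent with the "several frames to several tens of frames" noted above.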
Further optionally, the processing device may store the lyric information and the audio characteristic parameter corresponding to the lyric information in a lyric file of a new format, i.e., a lyric file of a second format. The second format is not particularly limited in this embodiment, as long as the lyric content, the duration corresponding to the lyric content, and the audio characteristic parameter corresponding to the lyric content can be stored in a specified format.
For example, the data storage format of the lyrics file in the second format may be in the form of:
<time,duration,Control,Value>X;
wherein,
The X field indicates the lyric content.
The time field indicates the start timestamp of X.
The duration field indicates the duration of X.
The Control field may be represented as a 32-bit integer, i.e., 4 bytes. Starting from the lowest byte, each byte describes one specific audio feature parameter and the number of its values, so the field can describe up to 4 audio feature parameters. In each byte, the most significant bit indicates whether the corresponding audio feature parameter is present (0: absent; 1: present), and the remaining 7 bits give the number of values; 7 bits can encode 128 distinct counts (0 to 127).
For example, suppose the Control field is the integer 34180, which is 0x8584 in hexadecimal. Starting from the lowest byte: the lowest byte is 0x84, whose binary form is 10000100; the highest bit is 1, indicating that the specified audio feature parameter is present, and the remaining 7 bits are 0000100, indicating 4 values. The next byte is 0x85, whose binary form is 10000101; the highest bit is 1, indicating that the specified audio feature parameter is present, and the remaining 7 bits are 0000101, indicating 5 values.
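The byte layout of the Control field can be checked with a short sketch. It follows the description above (4 bytes, lowest byte first, top bit as the presence flag, low 7 bits as the value count); the function names `decode_control` and `encode_control` are hypothetical.

```python
def decode_control(control):
    """Split a 32-bit Control value into four (present, value_count) pairs,
    starting from the lowest byte."""
    slots = []
    for i in range(4):
        byte = (control >> (8 * i)) & 0xFF
        present = bool(byte & 0x80)      # most significant bit: parameter present?
        count = byte & 0x7F              # low 7 bits: number of values
        slots.append((present, count))
    return slots

def encode_control(slots):
    """Inverse of decode_control for up to four (present, value_count) pairs."""
    if len(slots) > 4:
        raise ValueError("a 32-bit Control field describes at most 4 parameters")
    control = 0
    for i, (present, count) in enumerate(slots):
        if not 0 <= count <= 0x7F:
            raise ValueError("value count must fit in 7 bits")
        control |= ((0x80 if present else 0) | count) << (8 * i)
    return control

slots = decode_control(34180)
# slots == [(True, 4), (True, 5), (False, 0), (False, 0)]
```

Decoding 34180 yields a first parameter with 4 values and a second with 5, and the two upper bytes are unused.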
It is understood that the length of the Control field may be specifically adjusted according to the number of the audio characteristic parameters, and is not limited to the above example.
The Value field indicates the Value of the audio feature parameter, and the length is not fixed, which may be determined by the number of values of the audio feature parameter indicated by the Control field.
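To show how the Control field determines the length of the Value field, the sketch below parses records written in a textual form of the layout above. The exact serialization is not specified in the patent; this example assumes that time and duration are integers in milliseconds, that Value is a comma-separated run of numbers following Control, and that the lyric content contains neither `>` nor `;`. The name `parse_records` is hypothetical.

```python
import re

# One record: <time,duration,Control,v1,v2,...>X;
_RECORD = re.compile(r"<([^>]*)>([^;]*);")

def parse_records(text):
    """Parse second-format records into dictionaries (assumed serialization)."""
    records = []
    for fields, lyric in _RECORD.findall(text):
        parts = fields.split(",")
        time_ms, duration_ms, control = (int(p) for p in parts[:3])
        values = [float(p) for p in parts[3:]]
        params, pos = [], 0
        for i in range(4):                       # lowest byte first
            byte = (control >> (8 * i)) & 0xFF
            if byte & 0x80:                      # parameter present
                count = byte & 0x7F              # how many values it carries
                params.append(values[pos:pos + count])
                pos += count
            else:
                params.append(None)
        if pos != len(values):
            raise ValueError("Value length does not match the Control field")
        records.append({"time": time_ms, "duration": duration_ms,
                        "params": params, "lyric": lyric})
    return records

recs = parse_records("<1200,350,34180,1,2,3,4,10,20,30,40,50>la;")
```

Here Control is 34180, so the nine numbers after it are split into a group of 4 and a group of 5.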
Therefore, the lyric information and the audio characteristic parameters corresponding to the lyric information are jointly stored as a lyric file, namely the lyric file in the second format, so that the lyric file can be utilized to score singing of the song in real time, and the real-time performance of the scoring of singing of the song can be effectively improved.
In addition, because the Control field included in the lyric file of the second format can effectively indicate the number of the audio characteristic parameters and the numerical value number thereof, the lyric file of the second format can be subjected to version identification by using the Control field without adding version information in the lyric file of the second format, and the flexibility of version maintenance of the lyric file can be effectively improved, for example, the number of the audio characteristic parameters can be adjusted more flexibly, or the numerical value number of the audio characteristic parameters can be adjusted more flexibly.
Optionally, in a possible implementation manner of this embodiment, in 103, the processing device may specifically obtain, according to the original audio data and the lyric information, the audio features to be scored corresponding to the lyric information. Then, the processing device may calculate the similarity between each audio feature to be scored and the corresponding scoring parameter. Finally, the scoring result of the song is obtained according to these similarities and their corresponding weight values.
Specifically, any method in the prior art may be used as the method for calculating the similarity, and this embodiment does not particularly limit this. The detailed description can refer to the related content in the prior art, and is not repeated herein.
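Since the similarity measure is left open, the sketch below picks cosine similarity as one possible choice and combines the per-feature similarities into a score from 0 to 100 using weights. The feature names, the weights, the 0 to 100 scale, and the function names are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0                        # treat an all-zero vector as no match
    return dot / (na * nb)

def score_song(reference, sung, weights):
    """Weighted average of per-feature similarities, scaled to 0..100.

    reference, sung: dict mapping a feature name to its vector.
    weights: dict mapping a feature name to a non-negative weight.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    total = sum(w * max(0.0, cosine_similarity(reference[name], sung[name]))
                for name, w in weights.items())
    return 100.0 * total / total_weight
```

For instance, with weights of 3 for pitch and 1 for energy, a performance whose pitch vector matches the reference exactly but whose energy vector is orthogonal to it scores 75.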
It is understood that, if the song does have a corresponding MIDI file, the processing device in this embodiment may, in addition to scoring the original audio data according to the obtained scoring parameters to obtain a first scoring result of the song, further score the original audio data according to the MIDI file of the song to obtain a second scoring result of the song, and finally obtain a comprehensive scoring result according to the first scoring result and its weight value and the second scoring result and its weight value.
The processing device scores the original audio data according to the MIDI file of the song, and the detailed description may refer to related contents in the prior art, which is not described herein again.
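The comprehensive scoring result can be pictured as a weighted average of the two results, falling back to the first result when the song has no MIDI file. The normalization by the sum of the weights and the name `combined_score` are assumptions; the patent only states that both results and their weight values are used.

```python
def combined_score(first, first_weight, second=None, second_weight=0.0):
    """Weighted average of the lyric-based score and the optional MIDI-based score."""
    if second is None:
        return first                      # no MIDI file: use the first result alone
    total = first_weight + second_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (first * first_weight + second * second_weight) / total
```

With a first result of 80 weighted 0.6 and a second result of 90 weighted 0.4, the comprehensive result is 84.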
In the embodiment, the scoring parameters of the songs are obtained by obtaining the original audio data of the songs sung by the singer, wherein the scoring parameters comprise the lyric information of the songs and the audio characteristic parameters corresponding to the lyric information, so that the original audio data can be scored by using the scoring parameters without depending on the MIDI files of the songs, the problem that the singing scores of the songs cannot be obtained due to the fact that some songs have no corresponding MIDI files in the prior art can be solved, and the reliability of the singing scores of the songs is improved.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of the song sung scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Fig. 2 is a schematic structural diagram of a singing scoring apparatus according to another embodiment of the present invention, as shown in fig. 2. The singing scoring device of the present embodiment may include a data acquisition unit 21, a parameter acquisition unit 22, and a scoring unit 23. The data acquisition unit 21 is configured to acquire original audio data of a song sung by a singer; a parameter obtaining unit 22, configured to obtain a scoring parameter of the song, where the scoring parameter includes lyric information of the song and an audio characteristic parameter corresponding to the lyric information; and the scoring unit 23 is configured to score the raw audio data by using the scoring parameter.
It should be noted that the singing scoring device provided in this embodiment may be a processing device, which may be located in a local Application (App), for example, a hundred musician song, or in a server on the network side, or partly in the local application and partly in the server on the network side.
It should be understood that the application may be an application installed on the terminal (native app), or a web page in a browser on the terminal (web app), or any other form capable of processing the original audio data, which is not limited in this embodiment.
Therefore, the original audio data of the song sung by the singer is obtained through the data obtaining unit, the scoring parameter of the song is obtained through the parameter obtaining unit, the scoring parameter comprises the lyric information of the song and the audio characteristic parameter corresponding to the lyric information, the scoring unit can score the original audio data by using the scoring parameter without depending on the MIDI file of the song, the problem that the singing score of the song cannot be obtained due to the fact that some songs have no corresponding MIDI file in the prior art can be solved, and the reliability of the singing score of the song is improved.
Optionally, in a possible implementation manner of this embodiment, the data obtaining unit 21 may specifically collect the original audio data in real time.
Specifically, the data acquisition unit 21 may collect a sound signal of a song sung by a singer, and then convert the sound signal into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
Optionally, in a possible implementation manner of this embodiment, the data obtaining unit 21 may specifically obtain, from a storage device, a pre-recorded audio file of the song sung by the singer, and then decode the audio file to obtain the original audio data.
The audio file may be an audio file in any of the encoding formats in the prior art, such as a Moving Picture Experts Group (MPEG) Layer-3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.
It should be understood that the storage device may be a hard disk of a computer, or may also be the non-operating memory of a mobile phone, i.e., physical storage such as a Read-Only Memory (ROM) or a memory card, which is not limited in this embodiment.
Optionally, in a possible implementation manner of this embodiment, the lyric information may include, but is not limited to, lyric content and a duration corresponding to the lyric content, which is not particularly limited in this embodiment.
The lyric content may be a lyric fragment obtained by dividing the complete lyric of a song by taking one word as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking a plurality of words as a unit, or may also be a lyric fragment obtained by dividing the complete lyric of the song by taking one sentence as a unit, or may also be a lyric fragment obtained by adopting other dividing methods, which is not particularly limited in this embodiment.
Further optionally, the parameter obtaining unit 22 may be further configured to obtain the lyric information according to a lyric file of the song, for example, a lyric file in the LRC format with LRC as the extension, a lyric file in the QRC format with QRC as the extension, and the like; and to adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
Wherein,
LRC is short for the English word "lyric"; a lyric file in the LRC format with LRC as the extension can be displayed synchronously in various digital players. For details, reference may be made to the related content in the prior art, which is not repeated herein.
A lyric file in the QRC format with QRC as the extension can be displayed synchronously in the latest version of the QQ Music player. For details, reference may be made to the related content in the prior art, which is not repeated herein.
Specifically, the parameter obtaining unit 22 may specifically perform fine adjustment on the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song by using an audio processing technology, such as a human voice endpoint detection technology, a beat detection technology, a voice recognition technology, and the like, so as to accurately control error ranges of lyric contents and durations corresponding to the lyric contents within a preset range, thereby improving accuracy of song singing scoring.
Further optionally, the parameter obtaining unit 22 may store the adjusted lyric information into a lyric file with a new format, i.e. a lyric file with a first format. The first format is not particularly limited in this embodiment, as long as the lyric content and the duration corresponding to the lyric content can be stored in a specified format. Therefore, the lyric information is independently stored into the lyric file, so that when the lyric content of the song changes, the lyric information in the lyric file can be timely adjusted, and the accuracy of the lyric information can be effectively improved.
Optionally, in a possible implementation manner of this embodiment, the parameter obtaining unit 22 may further obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and acquiring audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
Since the original recording of the song contains the accompaniment, the parameter obtaining unit 22 may use an audio processing technique, such as a human voice processing technique, on the reference original singing audio data of the song and the reference accompaniment audio data of the song to obtain audio data in which the human voice is emphasized, and use it as the reference standard audio data. For a detailed description, refer to the related content in the prior art, which is not repeated here.
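A deliberately naive sketch of this idea is time-domain subtraction of the accompaniment from the original recording. It assumes, beyond what the embodiment states, that both tracks are sample-aligned, share a sample rate, and differ only by a known gain; practical systems typically need alignment and spectral processing. The function name `highlight_vocals` is hypothetical.

```python
from typing import List, Sequence


def highlight_vocals(original: Sequence[float],
                     accompaniment: Sequence[float],
                     gain: float = 1.0) -> List[float]:
    """Subtract the (scaled) accompaniment from the original recording.

    Assumption: the two tracks are aligned sample by sample.
    """
    n = min(len(original), len(accompaniment))
    return [original[i] - gain * accompaniment[i] for i in range(n)]
```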
Specifically, the parameter obtaining unit 22 may obtain audio feature information from the reference standard audio data, that is, information on audio feature parameters of types such as energy features, pitch features, beat features, and melody features; an audio characteristic parameter corresponding to the lyric information can then be obtained from the lyric information and the audio feature information. Therefore, the lyric information of the song and the audio characteristic parameter corresponding to the lyric information can be used as the scoring parameter of the song.
The audio feature information may include, but is not limited to, at least one of short-term audio feature information of each frame in reference standard audio data and long-term audio feature information of multiple frames in reference standard audio data.
Generally, a duration corresponding to a word in a complete lyric of a song is about several hundred milliseconds (ms), which corresponds to audio data of several frames to several tens of frames, and the parameter obtaining unit 22 may use audio feature information of the frames as an audio feature parameter corresponding to the word, or may use a statistical value of the audio feature information of the frames as the audio feature parameter corresponding to the word, which is not particularly limited in this embodiment.
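The two options described here, keeping the per-frame values of a word or reducing them to a statistical value, can be sketched as follows. The sketch assumes a hypothetical list `frame_values` holding one feature value (for example, pitch) per 10 ms frame, and uses the arithmetic mean as the statistic; neither choice is fixed by this embodiment.

```python
from statistics import mean
from typing import List, Sequence


def word_feature(frame_values: Sequence[float], start_ms: int,
                 duration_ms: int, frame_ms: int = 10,
                 use_statistic: bool = True) -> List[float]:
    """Audio feature parameter for one word.

    Selects the frames covered by [start_ms, start_ms + duration_ms) and
    returns either their mean (one value) or the raw per-frame values.
    """
    first = start_ms // frame_ms
    last = (start_ms + duration_ms) // frame_ms
    frames = list(frame_values[first:last])
    if not frames:
        return []
    return [mean(frames)] if use_statistic else frames
```

For a word of a few hundred milliseconds, the raw option yields tens of values per feature, while the statistical option yields one.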
Further optionally, the parameter obtaining unit 22 may store the lyric information and the audio characteristic parameter corresponding to the lyric information into a lyric file of a new format, i.e. a lyric file of a second format. The second format is not particularly limited in this embodiment, as long as the lyric content, the duration corresponding to the lyric content, and the audio characteristic parameter corresponding to the lyric content can be stored in a specified format.
For example, the data storage format of the lyrics file in the second format may be in the form of:
<time,duration,Control,Value>X;
wherein,
The X field indicates a piece of lyric content.
The time field indicates the start timestamp of X.
The duration field indicates the duration of X.
The Control field may be represented by a 32-bit integer of 4 bytes. Each byte, starting from the lowest byte, describes one specific audio feature parameter and the number of its values, so up to 4 audio feature parameters can be represented. For example, within each byte the most significant bit indicates whether the corresponding audio characteristic parameter is present (0: absent, 1: present), and the remaining 7 bits give the number of values; a 7-bit field can encode 128 distinct counts (0 to 127).
For example, suppose the Control field is the integer 34180, which converts to the hexadecimal number 0x8584. Starting from the lowest byte, the lowest byte is 0x84, whose binary form is 10000100: the highest bit is 1, which indicates that the specified audio characteristic parameter is present, and the following 7 bits are 0000100, which indicates 4 values. The next byte is 0x85, whose binary form is 10000101: the highest bit is 1, which indicates that the specified audio feature parameter is present, and the following 7 bits are 0000101, which indicates 5 values.
It is understood that the length of the Control field may be specifically adjusted according to the number of the audio characteristic parameters, and is not limited to the above example.
The Value field holds the values of the audio feature parameters. Its length is not fixed and is determined by the number of values of each audio feature parameter indicated by the Control field.
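The bit layout of the Control field described above can be checked with a short sketch. The function names are hypothetical, and the assumption that the Value field is a flat sequence of values concatenated in parameter order (lowest byte first) is made here for illustration; this embodiment does not fix the serialization of Value.

```python
from typing import List, Sequence, Tuple


def decode_control(control: int, num_bytes: int = 4) -> List[Tuple[bool, int]]:
    """Split Control into (present, value_count) pairs, lowest byte first."""
    params = []
    for i in range(num_bytes):
        byte = (control >> (8 * i)) & 0xFF
        present = bool(byte & 0x80)  # most significant bit of the byte
        count = byte & 0x7F          # remaining 7 bits
        params.append((present, count))
    return params


def encode_control(params: Sequence[Tuple[bool, int]]) -> int:
    """Inverse of decode_control for up to 4 parameters."""
    control = 0
    for i, (present, count) in enumerate(params):
        if not 0 <= count <= 0x7F:
            raise ValueError("value count must fit in 7 bits")
        control |= ((0x80 if present else 0) | count) << (8 * i)
    return control


def split_values(control: int, values: Sequence[float]) -> List[List[float]]:
    """Group a flat Value sequence by parameter (assumed layout)."""
    groups, pos = [], 0
    for present, count in decode_control(control):
        if present:
            groups.append(list(values[pos:pos + count]))
            pos += count
        else:
            groups.append([])
    if pos != len(values):
        raise ValueError("Value length does not match Control")
    return groups
```

With the example from the text, `decode_control(34180)` yields a first parameter with 4 values, a second with 5 values, and two absent parameters, so the Value field would carry 9 values in total under the assumed layout.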
Therefore, the lyric information and the audio characteristic parameters corresponding to the lyric information are jointly stored as a lyric file, namely the lyric file in the second format, so that the lyric file can be utilized to score singing of the song in real time, and the real-time performance of the scoring of singing of the song can be effectively improved.
In addition, because the Control field included in the lyric file of the second format can effectively indicate the number of the audio characteristic parameters and the numerical value number thereof, the lyric file of the second format can be subjected to version identification by using the Control field without adding version information in the lyric file of the second format, and the flexibility of version maintenance of the lyric file can be effectively improved, for example, the number of the audio characteristic parameters can be adjusted more flexibly, or the numerical value number of the audio characteristic parameters can be adjusted more flexibly.
Optionally, in a possible implementation manner of this embodiment, the scoring unit 23 may be specifically configured to obtain, according to the original audio data and the lyric information, an audio feature to be scored corresponding to the lyric information; respectively calculating the similarity of each audio feature to be scored and each scoring parameter; and obtaining the scoring result of the song according to the similarity between each audio feature to be scored and each scoring parameter and the corresponding weight value.
Specifically, any method in the prior art may be used as the method for calculating the similarity, and this embodiment does not particularly limit this. The detailed description can refer to the related content in the prior art, and is not repeated herein.
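The weighted combination of similarities can be sketched as below. Cosine similarity is chosen purely as an example of a prior-art similarity measure, and the normalization to a 0 to 100 scale, the clamping of negative similarities to zero, and the function names are assumptions rather than part of this embodiment.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def singing_score(features: Sequence[Sequence[float]],
                  references: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> float:
    """Weighted average of per-fragment similarities, scaled to 0..100."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        w * max(cosine_similarity(f, r), 0.0)
        for f, r, w in zip(features, references, weights)
    )
    return 100.0 * weighted / total_weight
```

Here each entry of `features` is the audio feature to be scored for one lyric fragment, and the matching entry of `references` is the audio characteristic parameter stored for that fragment.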
It is understood that, in this embodiment, besides scoring the original audio data according to the obtained scoring parameters to obtain a first scoring result of the song, the scoring unit 23 may, if the song has a corresponding MIDI file, further score the original audio data according to the MIDI file of the song to obtain a second scoring result of the song, and finally obtain a comprehensive scoring result from the first scoring result and its weight value and the second scoring result and its weight value.
For how the scoring unit 23 scores the original audio data according to the MIDI file of the song, refer to the related content in the prior art, which is not repeated here.
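A comprehensive scoring result of this kind can be sketched as a weighted average that falls back to the first scoring result when no MIDI-based result exists. Dividing by the sum of the weights is an assumption made here so that the weights need not add up to 1; the function name is hypothetical.

```python
from typing import Optional


def comprehensive_score(first: float, first_weight: float,
                        second: Optional[float] = None,
                        second_weight: float = 0.0) -> float:
    """Combine the lyric-based score with an optional MIDI-based score."""
    if second is None:
        return first  # the song has no MIDI file
    total = first_weight + second_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (first * first_weight + second * second_weight) / total
```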
In this embodiment, the original audio data of the song sung by the singer is acquired by the data acquisition unit, and then the scoring parameter of the song is acquired by the parameter acquisition unit, where the scoring parameter includes the lyric information of the song and the audio characteristic parameter corresponding to the lyric information, so that the scoring unit can score the original audio data by using the scoring parameter without depending on the MIDI files of the song, and the problem that the singing score of some songs cannot be acquired because the songs have no corresponding MIDI files in the prior art can be avoided, thereby improving the reliability of the singing score of the song.
In addition, by adopting the technical scheme provided by the invention, the adopted scoring parameters comprise the lyric information of the song sung by the singer and the audio characteristic parameters corresponding to the lyric information, so that each word sung by the singer can be scored in real time, and the real-time performance of singing scoring can be effectively improved.
In addition, by adopting the technical scheme provided by the invention, a complex audio processing algorithm is not required, and the complexity of song singing scoring can be effectively reduced.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an audio processing engine, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A singing scoring method, comprising:
acquiring original audio data of a song sung by a singer;
obtaining scoring parameters of the song, wherein the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and scoring the original audio data by using the scoring parameters.
2. The method of claim 1, wherein acquiring original audio data of a song sung by a singer comprises:
collecting the original audio data in real time; or
acquiring a pre-recorded audio file of the song sung by the singer, and decoding the audio file to obtain the original audio data.
3. The method of claim 1, wherein the lyric information comprises a lyric content and a duration corresponding to the lyric content.
4. The method of claim 1, wherein, before obtaining the scoring parameters of the song, the method further comprises:
obtaining the lyric information according to the lyric file of the song;
and adjusting the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
5. The method according to any one of claims 1 to 4, wherein before obtaining the scoring parameters of the song, the method further comprises:
obtaining reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song;
and obtaining audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
6. A singing scoring device, comprising:
the data acquisition unit is used for acquiring original audio data of a song sung by a singer;
the parameter acquisition unit is used for acquiring scoring parameters of the song, and the scoring parameters comprise lyric information of the song and audio characteristic parameters corresponding to the lyric information;
and the scoring unit is used for scoring the original audio data by using the scoring parameters.
7. The device of claim 6, wherein the data acquisition unit is specifically configured to:
collect the original audio data in real time; or
acquire a pre-recorded audio file of the song sung by the singer, and decode the audio file to obtain the original audio data.
8. The device of claim 6, wherein the lyric information comprises a lyric content and a duration corresponding to the lyric content.
9. The device of claim 6, wherein the parameter acquisition unit is further configured to:
obtain the lyric information according to the lyric file of the song; and
adjust the lyric information according to the reference original singing audio data of the song and/or the reference accompaniment audio data of the song.
10. The device according to any one of claims 6 to 9, wherein the parameter acquisition unit is further configured to:
obtain reference standard audio data according to the reference original singing audio data of the song and the reference accompaniment audio data of the song; and
obtain audio characteristic parameters corresponding to the lyric information according to the reference standard audio data and the lyric information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410250107.1A CN104064180A (en) | 2014-06-06 | 2014-06-06 | Singing scoring method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104064180A (en) | 2014-09-24 |
Family
ID=51551859
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410250107.1A (Pending) CN104064180A (en) | 2014-06-06 | 2014-06-06 | Singing scoring method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104064180A (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104361883A (en) * | 2014-10-10 | 2015-02-18 | 福建星网视易信息系统有限公司 | Production method and device of singing evaluation standards files |
CN104810025A (en) * | 2015-03-31 | 2015-07-29 | 天翼爱音乐文化科技有限公司 | Audio similarity detecting method and device |
CN104882147A (en) * | 2015-06-05 | 2015-09-02 | 福建星网视易信息系统有限公司 | Method, device and system for displaying singing score |
CN105244041A (en) * | 2015-09-22 | 2016-01-13 | 百度在线网络技术(北京)有限公司 | Song audition evaluation method and device |
CN105989853A (en) * | 2015-02-28 | 2016-10-05 | 科大讯飞股份有限公司 | Audio quality evaluation method and system |
CN106782600A (en) * | 2016-12-29 | 2017-05-31 | 广州酷狗计算机科技有限公司 | The methods of marking and device of audio file |
CN106878841A (en) * | 2017-03-21 | 2017-06-20 | 北京小米移动软件有限公司 | Microphone assembly |
CN107103915A (en) * | 2016-02-18 | 2017-08-29 | 广州酷狗计算机科技有限公司 | A kind of audio data processing method and device |
CN107978322A (en) * | 2017-11-27 | 2018-05-01 | 北京酷我科技有限公司 | A kind of K songs marking algorithm |
CN108415942A (en) * | 2018-01-30 | 2018-08-17 | 福建星网视易信息系统有限公司 | Join in the chorus singing marking two-dimensional code generation method, device and system are taught in personalization |
CN108875047A (en) * | 2018-06-28 | 2018-11-23 | 清华大学 | A kind of information processing method and system |
CN110737381A (en) * | 2019-09-17 | 2020-01-31 | 广州优谷信息技术有限公司 | subtitle rolling control method, system and device |
CN110808069A (en) * | 2019-11-11 | 2020-02-18 | 上海瑞美锦鑫健康管理有限公司 | Evaluation system and method for singing songs |
CN111429949A (en) * | 2020-04-16 | 2020-07-17 | 广州繁星互娱信息科技有限公司 | Pitch line generation method, device, equipment and storage medium |
CN111586430A (en) * | 2020-05-14 | 2020-08-25 | 腾讯科技(深圳)有限公司 | Online interaction method, client, server and storage medium |
CN111770109A (en) * | 2020-07-13 | 2020-10-13 | 兰州城市学院 | Virtual reality music singing practice voice frequency and video frequency transmission method |
CN112201100A (en) * | 2020-10-27 | 2021-01-08 | 暨南大学 | Music singing scoring system and method for evaluating artistic quality of primary and secondary schools |
CN113345470A (en) * | 2021-06-17 | 2021-09-03 | 青岛聚看云科技有限公司 | Karaoke content auditing method, display device and server |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006115387A1 (en) * | 2005-04-28 | 2006-11-02 | Nayio Media, Inc. | System and method for grading singing data |
CN101894552A (en) * | 2010-07-16 | 2010-11-24 | 安徽科大讯飞信息科技股份有限公司 | Speech spectrum segmentation based singing evaluating system |
CN102103857A (en) * | 2009-12-21 | 2011-06-22 | 盛大计算机(上海)有限公司 | Singing scoring system |
CN102110435A (en) * | 2009-12-23 | 2011-06-29 | 康佳集团股份有限公司 | Method and system for karaoke scoring |
CN102664016A (en) * | 2012-04-23 | 2012-09-12 | 安徽科大讯飞信息科技股份有限公司 | Singing evaluation method and system |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104361883A (en) * | 2014-10-10 | 2015-02-18 | 福建星网视易信息系统有限公司 | Production method and device of singing evaluation standards files |
CN105989853A (en) * | 2015-02-28 | 2016-10-05 | 科大讯飞股份有限公司 | Audio quality evaluation method and system |
CN104810025A (en) * | 2015-03-31 | 2015-07-29 | 天翼爱音乐文化科技有限公司 | Audio similarity detecting method and device |
CN104810025B (en) * | 2015-03-31 | 2018-04-20 | 天翼爱音乐文化科技有限公司 | Audio similarity detection method and device |
CN104882147A (en) * | 2015-06-05 | 2015-09-02 | 福建星网视易信息系统有限公司 | Method, device and system for displaying singing score |
CN105244041A (en) * | 2015-09-22 | 2016-01-13 | 百度在线网络技术(北京)有限公司 | Song audition evaluation method and device |
CN107103915A (en) * | 2016-02-18 | 2017-08-29 | 广州酷狗计算机科技有限公司 | A kind of audio data processing method and device |
CN106782600B (en) * | 2016-12-29 | 2020-04-24 | 广州酷狗计算机科技有限公司 | Scoring method and device for audio files |
CN106782600A (en) * | 2016-12-29 | 2017-05-31 | 广州酷狗计算机科技有限公司 | The methods of marking and device of audio file |
CN106878841B (en) * | 2017-03-21 | 2020-01-07 | 北京小米移动软件有限公司 | Microphone assembly |
CN106878841A (en) * | 2017-03-21 | 2017-06-20 | 北京小米移动软件有限公司 | Microphone assembly |
CN107978322A (en) * | 2017-11-27 | 2018-05-01 | 北京酷我科技有限公司 | A kind of K songs marking algorithm |
CN108415942A (en) * | 2018-01-30 | 2018-08-17 | 福建星网视易信息系统有限公司 | Join in the chorus singing marking two-dimensional code generation method, device and system are taught in personalization |
CN108875047A (en) * | 2018-06-28 | 2018-11-23 | 清华大学 | A kind of information processing method and system |
CN110737381B (en) * | 2019-09-17 | 2020-11-10 | 广州优谷信息技术有限公司 | Subtitle rolling control method, system and device |
CN110737381A (en) * | 2019-09-17 | 2020-01-31 | 广州优谷信息技术有限公司 | subtitle rolling control method, system and device |
CN110808069A (en) * | 2019-11-11 | 2020-02-18 | 上海瑞美锦鑫健康管理有限公司 | Evaluation system and method for singing songs |
CN111429949A (en) * | 2020-04-16 | 2020-07-17 | 广州繁星互娱信息科技有限公司 | Pitch line generation method, device, equipment and storage medium |
CN111429949B (en) * | 2020-04-16 | 2023-10-13 | 广州繁星互娱信息科技有限公司 | Pitch line generation method, device, equipment and storage medium |
CN111586430A (en) * | 2020-05-14 | 2020-08-25 | 腾讯科技(深圳)有限公司 | Online interaction method, client, server and storage medium |
CN111770109A (en) * | 2020-07-13 | 2020-10-13 | 兰州城市学院 | Virtual reality music singing practice voice frequency and video frequency transmission method |
CN112201100A (en) * | 2020-10-27 | 2021-01-08 | 暨南大学 | Music singing scoring system and method for evaluating artistic quality of primary and secondary schools |
CN113345470A (en) * | 2021-06-17 | 2021-09-03 | 青岛聚看云科技有限公司 | Karaoke content auditing method, display device and server |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104064180A (en) | Singing scoring method and device | |
CN107220235B (en) | Speech recognition error correction method and device based on artificial intelligence and storage medium | |
CN110263322B (en) | Audio corpus screening method and device for speech recognition and computer equipment | |
CN108831437B (en) | Singing voice generation method, singing voice generation device, terminal and storage medium | |
EP3616190A1 (en) | Automatic song generation | |
CN113053357B (en) | Speech synthesis method, apparatus, device and computer readable storage medium | |
WO2018200268A1 (en) | Automatic song generation | |
CN111161695B (en) | Song generation method and device | |
CN110570876B (en) | Singing voice synthesizing method, singing voice synthesizing device, computer equipment and storage medium | |
US20180158469A1 (en) | Audio processing method and apparatus, and terminal | |
CN112750421B (en) | Singing voice synthesis method and device and readable storage medium | |
CN107978322A (en) | A kind of K songs marking algorithm | |
EP1146504A1 (en) | Vocoder using phonetic decoding and speech characteristics | |
Liu et al. | SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound | |
CN104882146B (en) | The processing method and processing device of audio promotion message | |
CN109657094B (en) | Audio processing method and terminal equipment | |
CN111259188B (en) | Lyric alignment method and system based on seq2seq network | |
CN107025902B (en) | Data processing method and device | |
CN107133344B (en) | Data processing method and device | |
JP4961565B2 (en) | Voice search apparatus and voice search method | |
CN112071299B (en) | Neural network model training method, audio generation method and device and electronic equipment | |
CN114613359A (en) | Language model training method, audio recognition method and computer equipment | |
CN112750422B (en) | Singing voice synthesis method, device and equipment | |
CN113658570B (en) | Song processing method, apparatus, computer device, storage medium, and program product | |
CN113825009B (en) | Audio and video playing method and device, electronic equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C41 | Transfer of patent application or patent right or utility model | |
TA01 | Transfer of patent application right | Effective date of registration: 20160316. Address after: 100027 Haidian District, Qinghe Qinghe East Road, No. 23, building two, floor 2108, No., No. 18. Applicant after: BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd. Address before: 100085 Beijing, Haidian District, No. ten on the street Baidu building, No. 10. Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140924 |
RJ01 | Rejection of invention patent application after publication | |