WO2016192395A1 - Singing score display method, device and system - Google Patents

Singing score display method, device and system

Info

Publication number
WO2016192395A1
Authority
WO
WIPO (PCT)
Prior art keywords
song
voiceprint
display
singing
column
Prior art date
Application number
PCT/CN2016/070111
Other languages
English (en)
French (fr)
Inventor
卓康志
林鎏娟
林剑宇
祖可峰
刘灵辉
陈�胜
Original Assignee
福建星网视易信息系统有限公司
卓康志
林鎏娟
林剑宇
祖可峰
刘灵辉
陈�胜
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 福建星网视易信息系统有限公司, 卓康志, 林鎏娟, 林剑宇, 祖可峰, 刘灵辉, 陈�胜
Publication of WO2016192395A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Definitions

  • The invention relates to the field of singing scoring, and in particular to a method, device and system for displaying a singing score.
  • On existing digital audio-visual equipment, the display shows only the ordinary song MV and each manufacturer's own scoring interface for the singer's performance; there is no interface that shows how similar the singer's voice is to that of the song's original singer. If a user wants to know, while singing, how closely his or her voiceprint matches that of the singer being imitated, the existing interfaces cannot satisfy this requirement.
  • The inventors therefore provide a singing score display method comprising the following steps:
  • The standard voiceprint column of the song is superimposed on the display screen, and the corresponding voiceprint column is filled according to the song progress and the voiceprint similarity.
  • The standard voiceprint column is superimposed on the display screen as a transparent picture.
  • The standard voiceprint column is superimposed on the display screen as a transparent picture with a gradient color.
  • The standard voiceprint column picture has a transparency of 10% to 100%.
  • The standard voiceprint column picture is displayed movably on the display screen, that is, its position can be changed.
  • The standard voiceprint column picture is superimposed in the middle of the song video.
  • The standard voiceprint column picture is 1/3 to 1/2 of the size of the song video display interface.
  • The standard voiceprint column of the song is displayed as an overlay.
  • The step of "superimposing the standard voiceprint column of the song on the display screen and filling the corresponding voiceprint column according to the song progress and the voiceprint similarity" is specifically: the display screen first displays the song video; a transparent background image is superimposed on the song video; the standard voiceprint column of the current sung sentence is displayed on the background image; and the corresponding voiceprint column is filled according to the song progress and the voiceprint similarity.
  • The singing score display method further comprises the step of acquiring different avatars of the singer corresponding to the song according to different voiceprint similarities and displaying them on the display interface.
  • The display interface shows different multimedia resources according to different voiceprint similarities.
  • The multimedia resource is a special-effect picture, text, audio or a special-effect animation.
  • The singing score display method further includes the following step:
  • The total voiceprint score data of the song is computed and displayed on the display interface.
  • The invention also provides a singing score display device comprising the following modules:
  • a real-time audio acquisition module, used to obtain the real-time audio input by the sound collection device;
  • a voiceprint similarity acquisition module, used to obtain the voiceprint similarity of the real-time audio relative to the song;
  • a voiceprint column display module, used to superimpose the standard voiceprint column of the song on the display screen and fill the corresponding voiceprint column according to the song progress and the voiceprint similarity.
  • The standard voiceprint column is displayed on the display screen as a transparent picture overlay.
  • The standard voiceprint column picture is displayed movably on the display screen.
  • The voiceprint column display module is specifically configured to first display the song video on the display screen, superimpose a transparent background image on the song video, display the standard voiceprint column of the current sung sentence on the background image, and fill the corresponding voiceprint column according to the song progress and the voiceprint similarity.
  • The singing score display device further includes a multimedia resource display module: the display interface shows different multimedia resources according to different voiceprint similarities, the multimedia resources being special-effect pictures, text, audio or special-effect animations.
  • The singing score display device further includes the following module:
  • a singer avatar display module, which acquires different avatars of the singer corresponding to the song according to different voiceprint similarities and displays them on the display interface.
  • The invention also provides a singing score display system comprising a singing score display device, a display, a sound playing device and a sound collecting device. The display is connected to the singing score display device, and the sound playing device and the sound collecting device are each connected to the singing score display device.
  • The singing score display device is the singing score display device described above.
  • The above technical solution displays different avatars of the singer according to different voiceprint similarities, showing simply and clearly how similar the current singing voice is to that of the song's original singer, so that the user can see in real time the voiceprint similarity between his or her singing and the original singer.
  • FIG. 1 is a schematic structural view of a system embodiment of the present invention
  • FIG. 4 is a schematic diagram of a display interface according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a display interface according to another embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a display interface according to still another embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a display interface according to still another embodiment of the present invention.
  • This embodiment first provides a singing score display method that shows, on the display interface, the voiceprint similarity between the user's singing voice and the voice of the song's original singer.
  • The method can run in the singing score display device 100 of FIG.
  • The singing score display device 100 can be connected to the display 120, the sound playing device 140 and the sound collecting device 160, and can be implemented as a digital audio-visual device such as a set-top box.
  • The screen of the display 120 serves as the display interface, and the sound playing device 140 plays sound signals to the user.
  • The sound collecting device 160 collects the sound of the user singing and inputs it to the singing score display device 100.
  • The user can read the lyrics of the current song on the display 120 and hear its accompaniment from the sound playing device 140; the user's singing is input through the sound collecting device 160, and the sound playing device 140 can also play the user's singing in real time.
  • Before singing, the user first requests a song.
  • The song can be requested through a karaoke song-request station or a handheld device connected to the singing score display device 100. After the song is requested and playback switches to it, a title sequence can be played to tell the user that the song is about to start, and then the song starts playing.
  • Each song corresponds to an original singer, which is the singer referred to in the embodiments of the present invention. The same song title may have different original singers: for example, the male-voice version corresponds to a male original singer and the female-voice version corresponds to a female original singer. That is, songs with the same title may be different songs because their original singers differ.
  • A song corresponds to one or more fixed original singers: for example, a solo version corresponds to one singer and a duet version corresponds to two singers.
  • What the user cares about is the voice of the singer corresponding to the song, that is, the original singer's voice as the song plays, so that the user can imitate the singer in real time while singing.
  • The singing score display device 100 may first perform step S201: obtaining the real-time audio input by the sound collection device.
  • The real-time audio here is the sound the user sings in real time into the sound collecting device.
  • The singing score display device 100 can then acquire, in step S202, the voiceprint similarity of the real-time audio relative to the song.
  • The voiceprint similarity of the real-time audio relative to the song may be expressed as voiceprint score information: the higher the similarity, the higher the voiceprint score, and the lower the similarity, the lower the voiceprint score.
  • The voiceprint similarity of the real-time audio relative to the song can be obtained by comparing the real-time audio with the voiceprint of the original singer's voice at the current playing moment of the song. The device that performs the comparison can be the singing score display device 100 or a server connected to the singing score display device 100.
  • When the comparison is performed by the singing score display device 100, the device can directly take the collected real-time audio of the user and the original singer's voice at the current moment (that is, the vocal part) and compute the score.
  • When the comparison is performed by a server, the singing score display device 100 can transmit the collected real-time audio to the server; the server compares the real-time audio with the original singer's voice at the current moment and computes the score.
  • The result of the comparison, that is, the voiceprint similarity of the real-time audio relative to the song, is returned to the singing score display device 100, so that the device acquires the voiceprint similarity of the real-time audio relative to the song.
  • Using a server reduces the computation on the singing score display device 100, avoids song playback stuttering caused by excessive computation on the device, and makes it easier to change the similarity calculation method: when the method changes, there is no need to modify each singing score display device 100.
  • There are two methods for voiceprint scoring. One is to extract the MFCC feature coefficients of the singer's voice and of the target voiceprint, and then calculate the similarity between the singer's voiceprint and the standard voiceprint with the DTW algorithm. The other is to extract the MFCC feature coefficients of the target voiceprint, and then match the MFCC feature coefficients of the singer's voiceprint against the target model to obtain the voiceprint similarity.
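  • The first scoring method (MFCC features compared by DTW, that is, dynamic time warping) can be sketched as follows. This is only an illustration under stated assumptions: the MFCC frames are assumed to have been extracted already and are passed in as lists of numbers, and the function names and the mapping from DTW distance to a similarity value in (0, 1] are hypothetical choices, not taken from the patent.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (e.g. MFCC vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]


def voiceprint_similarity(user_frames, standard_frames):
    """Map the DTW distance to (0, 1]; this mapping is an assumption."""
    d = dtw_distance(user_frames, standard_frames)
    normalised = d / (len(user_frames) + len(standard_frames))
    return 1.0 / (1.0 + normalised)
```

  • Because DTW aligns the two sequences in time, a user who sings the same sounds slightly slower or faster than the original singer is not penalised for the timing difference alone.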
  • In step S203, the standard voiceprint column of the song can be superimposed on the display screen, and the corresponding voiceprint column is filled according to the song progress and the voiceprint similarity.
  • That is, the singing score display device 100 proceeds to step S203 to display the standard voiceprint column of the song on the display interface.
  • The standard voiceprint is the sound spectrum of the original singer's voice in the song. Each word in the lyrics has a corresponding voiceprint column, so the voiceprint can be quantized and displayed graphically.
  • Different voiceprint data correspond to different graphic heights.
  • The heights of the different rectangles displayed in area C correspond to the voiceprints of different lyric words, and these rectangles can be regarded as the voiceprint columns.
  • The present invention does not limit the number of voiceprint columns per unit time. If each voiceprint column corresponds to the original singer's sound information over a fixed time interval, then the shorter the interval, the more voiceprint columns there are; if each voiceprint column corresponds to the original singer's sound information for one word of the lyrics, the number of voiceprint columns depends on the lyrics.
  • The display position of each voiceprint column should correspond to the display position of its lyric word. As shown in FIG. 4 to FIG. 7, each lyric word is displayed together with its own voiceprint column, so that the user can conveniently see the voiceprint column of a word while singing that word.
  • While displaying the voiceprint columns, the singing score display device 100 fills them according to the song progress and the voiceprint similarity of the real-time audio acquired in step S202.
  • The voiceprint similarity of the real-time audio is the degree of similarity between the real-time audio and the voiceprint of the original singer's voice in the song; the higher the score, the more similar they are. Different points in the song progress have different voiceprint columns.
  • In FIG. 4 and FIG. 6, the lyric word "small" corresponds to the voiceprint column above the word "small", and the lyric word "wave" corresponds to the voiceprint column above the word "wave".
  • For the word "small", the singing score display device 100 acquires the voiceprint similarity of the real-time audio and fills the voiceprint column above "small" according to the level of the voiceprint similarity.
  • For the word "wave", the singing score display device 100 acquires the voiceprint similarity of the real-time audio and fills the voiceprint column above "wave" in the same way. The higher the voiceprint similarity, the more of the voiceprint column is filled.
  • Area C in FIG. 4 shows the displayed lyrics before they have been sung; after the lyrics are sung, the columns in area C are filled with black (other colors may also be used).
  • FIG. 5 and FIG. 6 show the filling of the voiceprint columns for two different sets of voiceprint score data. In FIG. 5 the voiceprint similarity is lower and the black fill of the voiceprint columns is smaller; in FIG. 6 the voiceprint similarity is higher and the black fill is larger. The voiceprint similarity thus also indicates how similar the real-time audio is to the sound of the song.
  • The standard voiceprint column is displayed on the display screen as a transparent picture overlay.
  • Because the standard voiceprint column is superimposed on the display screen as a transparent picture, the user can see the standard voiceprint column picture, which improves the appearance, and can still faintly see the video playing on the display screen, so the user's singing is not affected.
  • The standard voiceprint column picture has a transparency of 10% to 100%.
  • The transparency of the standard voiceprint column picture can be set according to the user's needs.
  • The standard voiceprint column picture can partially block the video played on the display screen; when a see-through effect is required, the transparency value of the picture can be reduced, so that both the standard voiceprint column picture and the video playing on the display screen can be seen.
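  • The overlay of a semi-transparent picture on the song video can be illustrated with ordinary alpha blending of one pixel. This is a minimal sketch, assuming the percentage setting behaves like an overlay opacity (100% hides the video completely, lower values let the video show through); the text does not define the exact blending rule, and the function name is hypothetical.

```python
def blend_pixel(video_rgb, overlay_rgb, opacity):
    """Alpha-blend one overlay pixel onto one video pixel.

    opacity is a fraction in [0.0, 1.0]; 1.0 shows only the overlay,
    0.0 shows only the video. Treating the patent's percentage as
    opacity is an assumption made for this sketch.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")
    return tuple(
        round(opacity * o + (1.0 - opacity) * v)
        for v, o in zip(video_rgb, overlay_rgb)
    )
```

  • For example, `blend_pixel((100, 100, 100), (200, 200, 200), 0.5)` gives a pixel halfway between the video and the overlay, so both remain visible.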
  • The standard voiceprint column is superimposed on the display screen as a transparent picture with a gradient color.
  • A gradient-colored transparent picture highlights the standard voiceprint column picture and gives the user a more intuitive visual effect.
  • The standard voiceprint column picture is displayed movably on the display screen.
  • That is, the overlay position of the standard voiceprint column picture can be changed: for example, the picture can be superimposed on the upper, lower or middle part of the display screen.
  • The overlay position of the standard voiceprint column picture can be adjusted according to the user's needs.
  • The standard voiceprint column picture is superimposed in the middle of the song video.
  • This display mode is novel and maximizes the display effect of the standard voiceprint column picture. It focuses the user's attention on the voiceprint similarity shown in the picture, so that the similarity between the user's own singing voiceprint and the standard voiceprint can be seen at a glance, which makes it easier to improve the singing level.
  • The standard voiceprint column picture is 1/3 to 1/2 of the size of the song video display interface. The size of the picture can be adjusted for visual effect, and is preferably 1/3 to 1/2 of the song video display interface.
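  • The position and size options described above can be sketched as a small layout helper. The sketch assumes that the 1/3 to 1/2 ratio applies to the height of the video area and that the overlay spans the full width; the text only says "size", so both points, like the function name, are assumptions.

```python
def overlay_rect(video_w, video_h, fraction=0.5, position="middle"):
    """Return (x, y, width, height) of the voiceprint column picture.

    fraction is the share of the video height used by the overlay and
    must lie between 1/3 and 1/2; position is "top", "middle" or "bottom".
    """
    if not (1 / 3 - 1e-9) <= fraction <= 0.5:
        raise ValueError("fraction must be between 1/3 and 1/2")
    height = round(video_h * fraction)
    if position == "top":
        y = 0
    elif position == "middle":
        y = (video_h - height) // 2
    elif position == "bottom":
        y = video_h - height
    else:
        raise ValueError("position must be 'top', 'middle' or 'bottom'")
    return (0, y, video_w, height)
```

  • On a 1920x1080 video, `overlay_rect(1920, 1080, 0.5, "middle")` places a 540-pixel-high band in the vertical center of the picture.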
  • The standard voiceprint column of the song is superimposed on the song video. During singing, the display screen preferably plays the song video of the corresponding song, and the standard voiceprint column picture is then displayed on top of the song video.
  • The step of "superimposing the standard voiceprint column of the song on the display screen and filling the corresponding voiceprint column according to the song progress and the voiceprint similarity" is specifically: the display screen first displays the song video; a transparent background image is superimposed on the song video; the standard voiceprint column of the current sung sentence is displayed on the background image; and the corresponding voiceprint column is filled according to the song progress and the voiceprint similarity.
  • The standard voiceprint column of the next sentence is then displayed on the background image; the user's audio and its voiceprint similarity to the standard audio are obtained as the user's singing progresses, and the standard voiceprint column is filled according to the similarity.
  • The higher the voiceprint similarity, the closer the user's audio is to the voiceprint of the standard audio. For example, if the voiceprint similarity is 90%, the similarity between the user's audio and the standard audio voiceprint is 90%, and the corresponding standard voiceprint column is filled to 90%.
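  • The fill rule in this example (a 90% similarity fills 90% of the column) amounts to scaling the column height by the similarity. A minimal sketch follows; the function names and the text rendering are hypothetical and serve only to make the rule concrete.

```python
def filled_height(column_height, similarity):
    """Number of units of the column to fill for a similarity in [0, 1]."""
    clamped = min(max(similarity, 0.0), 1.0)
    return round(column_height * clamped)


def render_column(column_height, similarity, filled="#", empty="."):
    """Draw one column as text, listed from top to bottom."""
    fill = filled_height(column_height, similarity)
    return empty * (column_height - fill) + filled * fill
```

  • With a column 10 units high and a similarity of 0.9, `render_column(10, 0.9)` leaves one unit empty at the top and fills the remaining nine.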
  • The method further includes step S204: acquiring different avatars of the singer corresponding to the song according to different voiceprint similarities and displaying them on the display interface.
  • A song corresponds to one or more singers, and different voiceprint similarities correspond to different avatars of the singer.
  • The singer's avatar is displayed in area A of the display interface.
  • Area A in FIG. 4 shows the singer's smiling avatar (the avatar can be a cartoon avatar of the singer), and area A in FIG. 5 shows the singer's sad avatar.
  • The singer's smiling avatar may correspond to a high similarity, and the singer's sad avatar to a low similarity.
  • When the similarity is high, the singer's smiling avatar can be displayed; when the similarity is low, the singer's sad avatar can be displayed.
  • The present invention does not limit the number of singer avatar pictures corresponding to different similarities for a song. For example, a high similarity, an average similarity and a very low similarity may each have a corresponding avatar.
  • The avatars for different similarities may be distinguished by the singer's expression; by the avatar's action, for example the singer nods when the similarity is high and shakes his or her head when the similarity is very low; or by a displayed similarity comment, for example "singing good" for a high voiceprint similarity, "general" for an average one and "singing bad" for a low one.
  • Through the display of different avatars of the singer, the user can intuitively learn the voiceprint similarity of his or her own singing and easily understand how similar it is to the singer.
  • The singer's avatar display may be updated according to the similarity in several ways. The voiceprint similarity between the real-time audio and the song may be compared in real time and the avatar updated in real time. Alternatively, all voiceprint similarity data for the part of the song sung so far may be accumulated in real time, their average obtained, and the avatar display updated in real time according to that average. Alternatively, after all the lyrics are finished, all the voiceprint similarity data of the song may be accumulated and averaged, and the avatar display updated according to the average. Alternatively, the similarity may be averaged after each lyric sentence ends, and the avatar display updated according to that average.
  • The similarity may be the voiceprint score data of the real-time audio relative to the original singer's voice in the song: the higher the voiceprint score data, the higher the similarity, and the lower the voiceprint score data, the lower the similarity.
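  • Choosing an avatar from a similarity value can be sketched as a simple threshold lookup. The three levels follow the high, average and very low cases mentioned in the text; the threshold values 0.8 and 0.4, the "neutral" label and the function name are hypothetical. The similarity passed in may be an instantaneous value or any of the averages described above.

```python
def pick_avatar(similarity, high=0.8, low=0.4):
    """Return which singer avatar to show for a similarity in [0, 1].

    The thresholds are illustrative assumptions, not values from the text.
    """
    if similarity >= high:
        return "smiling"
    if similarity >= low:
        return "neutral"
    return "sad"
```

  • Keeping the thresholds as parameters lets the same function serve songs that define more or fewer avatar pictures by chaining additional levels in the same way.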
  • The method further includes step S205: displaying different multimedia resources according to different voiceprint similarities.
  • The singing score display device 100 obtains the voiceprint score data of the real-time audio, or computes the total or staged voiceprint score data of the song, and displays the corresponding multimedia resource on the display interface according to that data.
  • The staged voiceprint score data includes the voiceprint score data for a time period of fixed length or for a fixed number of sentences.
  • Multimedia resources can be special-effect pictures, text, audio or special-effect animations. If the voiceprint similarity is high, an applause animation can be displayed or an applause sound played.
  • If the voiceprint similarity is low, a sound resource such as a "beep" can be played, or text or a picture related to "not good singing" can be displayed.
  • Different multimedia resources may be of different resource types; for example, text and sound are different resource types.
  • Different multimedia resources can also be different contents of the same resource type, such as the applause sound and the "beep" sound, which are both of the sound resource type.
  • The image information may be an applause image, a thumb image, a bubble image, or the like.
  • The animation information may be flowers or stars. As in FIG. 6, stars may emerge at the middle position of the display interface, and the emerging stars may move toward the score data display area B.
  • This prompt information can be displayed in the middle of the display interface. In the middle position of FIG. 7, the words "singing well" are displayed for the user to see.
  • This prompt information can be displayed automatically when the score data is high, which encourages the user and makes singing more fun and motivating.
  • These multimedia resources tell the user how the singing is currently going and make singing more interesting. When the multimedia resource is a sound, a user who is not looking at the display screen is also alerted, which helps the user keep track of the singing situation.
  • The correspondence between the score data and the prompt information may be customized. For example, when the score is above 80, the text "sings well" is displayed, and when the score is above 90, the thumb image is displayed. As an optional embodiment, the above multimedia information can be "sung like", "sing", "bubble-general", "bubble-like" and "bubble-like".
  • The corresponding voiceprint score data can be as shown in Table 1 below:
  • Table 1 Correspondence table between scoring data and multimedia resources
  • Using the score data in Table 1, the staged voiceprint score data can be obtained after a series of scores has accumulated to a certain extent, and the corresponding multimedia resource displayed, giving the user staged encouragement or prompts so that the user can learn the staged result of how closely he or she is imitating the singer.
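  • A customizable correspondence between score data and multimedia resources can be sketched as an ordered rule list. Only the two examples given in the text are used here (the thumb image above 90 and the text "sings well" above 80); the contents of Table 1 are not reproduced, and the rule structure and names are hypothetical.

```python
# Rules are checked from the highest threshold down; each entry is
# (threshold, (resource_type, resource_content)).
DEFAULT_RULES = [
    (90, ("image", "thumb")),
    (80, ("text", "sings well")),
]


def resource_for_score(score, rules=DEFAULT_RULES):
    """Return the resource for the first rule whose threshold is exceeded."""
    for threshold, resource in sorted(rules, reverse=True):
        if score > threshold:
            return resource
    return None  # no prompt for this score
```

  • Passing a different `rules` list is one way to model the customization the text mentions, for example adding a sound resource for low scores.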
  • The singing score display device 100 may further perform step S206: displaying the value of the voiceprint similarity on the display interface. In some embodiments, the value of the voiceprint similarity is the voiceprint score data.
  • The voiceprint score data may be the voiceprint score data of the real-time audio or the total voiceprint score data of the song.
  • The total voiceprint score data indicates the overall voiceprint similarity of the song as sung up to the current moment, and can be obtained by averaging the voiceprint score data.
  • The total voiceprint score data of the song can be computed at fixed time intervals or after each lyric sentence ends.
  • Compared with real-time calculation, computing at fixed intervals or after each lyric sentence reduces the amount of calculation and saves processing resources of the singing score display device 100.
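  • The idea of averaging the voiceprint score data but refreshing the displayed total only at the end of each lyric sentence can be sketched as a small accumulator. The class and method names are hypothetical; the sketch only illustrates why the per-sentence update needs less work than recomputing on every new score.

```python
class TotalScore:
    """Accumulates voiceprint scores; the displayed total changes only
    when end_of_sentence() is called."""

    def __init__(self):
        self._sum = 0.0
        self._count = 0
        self.displayed = None  # nothing shown until the first sentence ends

    def add(self, score):
        """Record one real-time voiceprint score (cheap: no averaging here)."""
        self._sum += score
        self._count += 1

    def end_of_sentence(self):
        """Recompute the average of all scores so far and return it."""
        if self._count:
            self.displayed = self._sum / self._count
        return self.displayed
```

  • The same object could be driven by a timer instead of sentence boundaries to model the fixed-interval variant.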
  • The singing score display device 100 may further perform step S207: displaying information about the singer corresponding to the song on the display interface.
  • The singer's information includes the singer's name or the region to which the singer belongs, such as Hong Kong and Taiwan stars, mainland stars, and so on.
  • The singer's information may be displayed at any position on the display interface, or in the avatar display area, so that the user can conveniently see the singer's information when looking at the singer's avatar.
  • The above embodiments do not limit the positions of the avatar display area, the voiceprint score data display area and the voiceprint column display area.
  • These display areas may be any areas on the display screen, and they may or may not overlap. In some embodiments, the display areas may be display areas A to C as shown in FIG. 4 to FIG. 6: the avatar display area is display area A at the upper left of the display interface, the voiceprint score data display area is display area B at the upper right of the display interface, and the voiceprint column display area is display area C below the middle of the display interface.
  • This arrangement of display areas does not interfere with the user watching the MV played on the display interface, and lets the user conveniently view the information in each display area.
  • The present invention also provides a singing score display device 100, as shown in FIG. 1, comprising the following modules: a real-time audio acquisition module 101, for acquiring the real-time audio input by the sound collection device; a voiceprint similarity acquisition module 102, for obtaining the voiceprint similarity of the real-time audio relative to the song; and a voiceprint column display module 103, for displaying the standard voiceprint column of the song on the display screen and filling the corresponding voiceprint column according to the song progress and the voiceprint similarity.
  • The real-time audio acquisition module 101 can be connected to the sound collecting device 160 to obtain the real-time audio sung by the user.
  • The voiceprint column display module 103 can be connected to the display 120.
  • The standard voiceprint column of the song can thus be superimposed on the display screen, and the corresponding voiceprint column filled according to the song progress and the voiceprint similarity.
  • Through the voiceprint column display module, the user can see at a glance which parts of the singing are similar to the singer and which are not, and can adjust the dissimilar parts so that the singing imitates the singer more closely.
  • The standard voiceprint column is displayed on the display screen as a transparent picture.
  • The standard voiceprint column picture is displayed movably on the display screen.
  • The voiceprint column display module 103 is specifically configured to first display the song video on the display screen, superimpose a transparent background image on the song video, display the standard voiceprint column of the current sung sentence on the background image, and fill the corresponding voiceprint column according to the song progress and the voiceprint similarity.
  • The standard voiceprint column can be displayed movably on the display screen, in the upper, lower or middle part of the screen. Because the standard voiceprint column is superimposed on the display screen as a transparent picture, both the standard voiceprint column picture and the video playing on the display screen can be seen; this does not interfere with the user watching the video, and lets the user directly see the real-time similarity between his or her own singing and the singer's voice, which makes it easier to improve.
  • The device 100 further includes a singer avatar display module 104, which acquires different avatars of the singer corresponding to the song according to different voiceprint similarities and displays them on the display interface. This shows simply and clearly how similar the current singing voice is to that of the song's original singer, so that the user can see the voiceprint similarity between his or her singing and the original singer in real time.
  • The device 100 further includes a multimedia resource display module 105.
  • The display interface shows different multimedia resources according to different voiceprint similarities, the multimedia resources being special-effect pictures, text, audio or special-effect animations.
  • The multimedia resource display module 105 can play several multimedia resources for one voiceprint similarity: for example, when the similarity is high, the applause animation and the "singing well" text can be shown at the same time. The multimedia resource display module 105 thus reminds the user of the current singing situation.
  • The multimedia resource display module 105 can also display the corresponding multimedia resource on the display interface according to the score data of the real-time audio or the total or staged score data of the song.
  • The multimedia resource display module 105 reminds the user of the voiceprint imitation similarity more actively: the user can learn how the singing is going from the prompt information without having to keep staring at the voiceprint columns or the score data.
  • the singing score display device 100 further includes a score data display module 106 for acquiring voiceprint score data of real-time audio and displaying voiceprint score data on the display interface, or a score data display module.
  • 106 is used to count the total score data of the song and display it on the display interface.
  • the display of the score data allows the user to see more detailed data on the similarity of the voiceprint.
  • the total score data of the song can be compiled when a lyric sentence ends: the user has a short idle time after each sentence, so after seeing the score the user can adjust his or her singing in the next sentence and raise the voiceprint similarity to the original singer.
  • the singing score display device 100 further includes a singer information display module 107, which displays information about the singer corresponding to the song on the display interface.
  • the present invention also provides a singing score display system which, as shown in FIG. 1, includes a singing score display device 100, a display 120, a sound playing device 140 and a sound collection device 160. The display is connected to the singing score display device, and the sound playing device and the sound collection device are connected to the singing score display device; the singing score display device is the one described in any of the above embodiments.
  • when the user imitates a singer, the system shows in several ways how similar the user's singing is to the voiceprint of the original singer, so that the user knows how the performance is going and can adjust his or her singing in time.
  • the computer device includes but is not limited to: a personal computer, a server, a general purpose computer, a special purpose computer, a network device, an embedded device, a programmable device, a smart mobile terminal, a smart home device, a wearable smart device, a vehicle smart device, and the like;
  • the storage medium includes, but is not limited to, RAM, ROM, a magnetic disk, a magnetic tape, an optical disk, flash memory, a USB flash drive, a portable hard disk, a memory card, a memory stick, web server storage, network cloud storage, etc.
  • the computer program instructions can also be stored in a memory readable by a computer device and capable of directing the computer device to operate in a particular manner, such that the instructions stored in that memory produce an article of manufacture comprising an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions can also be loaded onto a computer device, so that a series of operational steps is performed on the computer device to produce computer-implemented processing, whereby the instructions executed on the computer device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

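A method, device and system for displaying a singing score, wherein the method comprises the following steps: acquiring real-time audio input by a sound collection device (160); obtaining the voiceprint similarity of the real-time audio relative to the song; superimposing the standard voiceprint bars of the song on the display screen; and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity. By filling the corresponding voiceprint bars according to different voiceprint similarities, the display method shows simply and clearly how similar the current singing is to the voiceprint of the song's original singer, so that the user can see this similarity in real time.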

Description

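Method, device and system for displaying a singing score
Technical field
The present invention relates to the field of singing scoring, and in particular to a method, device and system for displaying a singing score.
Background
On existing digital audio-visual equipment, a requested song is presented on the display either as an ordinary song MV or through each manufacturer's own scoring interface for the performance; there is no interface that shows how similar the performer's voice is to that of the song's original singer. If a user wants to know, while singing, how close his or her voiceprint is to that of the star being imitated, existing interfaces cannot meet this need.
Summary of the invention
For this reason, a method, device and system for displaying a singing score are needed, to solve the problem that existing digital audio-visual equipment cannot display the similarity between the user's singing and the song's original singer.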
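To achieve the above object, the inventors provide a singing score display method comprising the following steps:
acquiring real-time singing audio input by a sound collection device;
obtaining the voiceprint similarity of the real-time singing audio relative to the standard audio of the song;
superimposing the standard voiceprint bars of the song on the display screen, and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity.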
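Further, the standard voiceprint bars are superimposed on the display screen as a transparent picture.
Further, the standard voiceprint bars are superimposed on the display screen as a transparent picture with a colour gradient.
Still further, the transparency of the standard voiceprint bar picture is 10%-100%.
Further, the standard voiceprint bar picture is displayed movably on the display screen.
Further, the standard voiceprint bar picture is superimposed on the middle of the song video.
Further, the standard voiceprint bar picture is 1/3 to 1/2 the size of the song video display interface.
Further, the standard voiceprint bars of the song are superimposed on the song video shown on the display screen.
Still further, "superimposing the standard voiceprint bars of the song on the display screen, and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity" specifically means: the display screen first shows the song video; a transparent background image is superimposed on the song video; the standard voiceprint bars of the sentence currently being sung are displayed on the background image; and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity.
Further, the singing score display method also includes the following step: obtaining different avatars of the star corresponding to the song according to different voiceprint similarities and displaying them on the display interface.
Further, the interface displays different multimedia resources according to different voiceprint similarities. The multimedia resources are special-effect pictures, text, audio or special-effect animations.
Further, the singing score display method also includes the following steps:
acquiring the voiceprint score data of the real-time audio and displaying it on the display interface;
or compiling the total voiceprint score data of the song and displaying it on the display interface.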
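The present invention also provides a singing score display device comprising the following modules:
a real-time audio acquisition module, for acquiring real-time audio input by a sound collection device;
a voiceprint similarity acquisition module, for obtaining the voiceprint similarity of the real-time audio relative to the song;
a voiceprint bar display module, for superimposing the standard voiceprint bars of the song on the display screen and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity.
Further, in the voiceprint bar display module, the standard voiceprint bars are superimposed on the display screen as a transparent picture.
Further, in the voiceprint bar display module, the standard voiceprint bar picture is displayed movably on the display screen.
Still further, the voiceprint bar display module is specifically configured so that the display screen first shows the song video, a transparent background image is superimposed on the song video, the standard voiceprint bars of the sentence currently being sung are displayed on the background image, and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity.
Further, the singing score display device also includes a multimedia resource display module: the interface displays different multimedia resources according to different voiceprint similarities, the multimedia resources being special-effect pictures, text, audio or special-effect animations.
Further, the singing score display device also includes the following module:
a star avatar display module, which obtains different avatars of the star corresponding to the song according to different voiceprint similarities and displays them on the display interface.
The present invention also provides a singing score display system comprising a singing score display device, a display, a sound playing device and a sound collection device, the display being connected to the singing score display device and the sound playing device and sound collection device being connected to the singing score display device, characterised in that the singing score display device is the singing score display device described above.
Unlike the prior art, the above technical solution displays different avatars of the star according to different voiceprint similarities, showing simply and clearly how similar the current singing is to the voiceprint of the song's original singer, so that the user can see this similarity in real time.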
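Brief description of the drawings
FIG. 1 is a structural diagram of a system embodiment of the present invention;
FIG. 2 is a flowchart of a method embodiment of the present invention;
FIG. 3 is a flowchart of another method embodiment of the present invention;
FIG. 4 is a schematic diagram of the display interface in one embodiment of the present invention;
FIG. 5 is a schematic diagram of the display interface in another embodiment of the present invention;
FIG. 6 is a schematic diagram of the display interface in a further embodiment of the present invention;
FIG. 7 is a schematic diagram of the display interface in yet another embodiment of the present invention.
Description of reference numerals:
100, singing score display device,
120, display,
140, sound playing device,
160, sound collection device.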
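Detailed description
To explain in detail the technical content, structural features, objects and effects of the technical solution, a detailed description is given below with reference to specific embodiments and the accompanying drawings.
Referring to FIG. 1 to FIG. 7, this embodiment first provides a singing score display method for showing, on a display interface, the voiceprint similarity between the user's voice while singing and the song's original singer. The method can run on the singing score display device 100 of FIG. 1. The singing score display device 100 can be connected to a display 120, a sound playing device 140 and a sound collection device 160, and can be implemented as a digital audio-visual device such as a set-top box. The screen of the display 120 serves as the display interface; the sound playing device 140 plays sound signals to the user; and the sound collection device 160 collects the user's singing voice and feeds it into the singing score display device 100. While singing, the user can read the lyrics of the current song on the display 120, hear the accompaniment through the sound playing device 140, and sing into the sound collection device 160; the sound playing device 140 can also play back the user's singing in real time.
Before singing, the user first requests a song, either from a song-request console connected to the singing score display device 100 or from a handheld device. When the requested song's turn comes, an opening clip can be played to tell the user that the song is about to start, and then the song begins to play. Each song has an original singer, referred to as the star in the embodiments of the present invention. Songs with the same title can have different original singers: the male version corresponds to a male original singer and the female version to a female original singer. In other words, songs with the same title can be different songs because their original singers differ. A given song also has a fixed set of one or more original singers: a solo version corresponds to one star, a duet version to two. While singing, the user cares about the voice of the star for that song, that is, the original singer's voice at the current playback position, which makes it convenient to imitate the star in real time.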
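When the method displays the user's singing score, as shown in FIG. 2, the singing score display device 100 first enters step S201 and acquires the real-time audio input by the sound collection device. The real-time audio here is the voice the user is singing into the sound collection device in real time.
Then, in step S202, the singing score display device 100 obtains the voiceprint similarity of the real-time audio relative to the song. This similarity can take the form of voiceprint score information: the higher the similarity, the higher the voiceprint score, and the lower the similarity, the lower the score. It can be obtained by comparing the voiceprint of the real-time audio with that of the original singer's voice at the song's current playback moment. The comparison can be performed by the singing score display device 100 itself or by a server connected to it. When the device 100 performs the comparison, it directly collects the user's real-time audio and the original singer's voice (the vocal part) at the current moment and compares them for scoring. When a server performs the comparison, the device 100 sends the collected real-time audio to the server; the server compares it with the original singer's voice at the current moment and returns the result, i.e. the voiceprint similarity of the real-time audio relative to the song, to the device 100, which thereby obtains the similarity. Using a server reduces the computational load on the device 100, avoids stutter in song playback caused by excessive computation, and makes it easy to change the comparison method, since no change to each individual device 100 is needed when the method changes.
There are generally two voiceprint scoring methods. The first extracts the MFCC (Mel-frequency cepstral coefficient) features of the performer's voiceprint and of the target voiceprint, and then computes the similarity between the performer's voiceprint and the standard voiceprint with the DTW (dynamic time warping) algorithm. The second builds a model from the MFCC features of the target voiceprint and then matches the MFCC features of the performer's voiceprint against the target model to obtain the voiceprint similarity.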
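The first scoring method above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the MFCC frames have already been extracted by some other component and are passed in as sequences of equal-length numeric vectors, and `similarity_from_distance` with its `scale` parameter is a hypothetical mapping from DTW distance to a value in (0, 1] that the source does not specify.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic time warping between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between two frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def similarity_from_distance(distance, n_frames, scale=1.0):
    """Hypothetical mapping (an assumption): 1.0 for identical input, approaching 0 as distance grows."""
    per_frame = distance / max(n_frames, 1)
    return 1.0 / (1.0 + per_frame / scale)
```

Because DTW allows frames to be repeated during alignment, a performance that is slightly slower or faster than the reference is not penalised merely for its timing.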
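After obtaining the voiceprint similarity, the singing score display device 100 can, in step S203, superimpose the standard voiceprint bars of the song on the display screen and fill the corresponding bar according to the song progress and the voiceprint similarity.
To make it even easier for the user to follow, in real time, how similar the performance is to the star's voiceprint, as shown in FIG. 2, the method of the above embodiment can further include the following: the device 100 enters step S203 and displays the song's standard voiceprint bars on the display interface. The standard voiceprint is the sound-wave spectrum of the original singer's voice, and every character of the lyrics has a corresponding voiceprint bar. A voiceprint can be quantised and shown graphically; in this embodiment, different voiceprint data correspond to different graphic heights. In FIG. 4 to FIG. 7, the rectangles of different heights in region C correspond to the different voiceprints of the lyrics, and each rectangle can be regarded as a voiceprint bar. The invention does not limit the number of bars per unit time: if each bar represents the original singer's voice over a fixed time interval, then the shorter the interval, the more bars there are; if each bar represents the original singer's voice for one lyric character, the number of bars depends on the lyrics. To make comparison easy, each bar should be displayed at a position corresponding to the lyric it belongs to. In FIG. 4 to FIG. 7 each bar corresponds to one lyric character and is displayed above that character, so that when the user reaches the character the bar is easy to see.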
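The quantisation of voiceprint data into bar heights described above can be illustrated with a small sketch. The source does not say how a voiceprint value is turned into a height, so the linear min-max scaling, the function name `bar_heights` and the pixel limits are all assumptions made for illustration.

```python
def bar_heights(values, max_height_px=120, min_height_px=8):
    """Map one voiceprint value per lyric character to a bar height in pixels."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    span = hi - lo
    heights = []
    for v in values:
        # Normalise to [0, 1]; if all values are equal, use mid-height bars.
        t = (v - lo) / span if span > 0 else 0.5
        heights.append(round(min_height_px + t * (max_height_px - min_height_px)))
    return heights
```

With one value per lyric character, the returned list has one height per character, which matches the per-character layout shown in region C of the figures.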
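After displaying the voiceprint bars and obtaining the voiceprint similarity of the real-time audio in step S202, the singing score display device 100 fills the corresponding bar according to the song progress and the level of similarity. The voiceprint similarity of the real-time audio is the degree to which it resembles the original singer's voice in the song; the higher the score, the closer the match. Different points in the song correspond to different bars: in FIG. 4 and FIG. 6, the lyric character "小" corresponds to the bar above "小", and the character "浪" to the bar above "浪". When the song reaches "小", the device 100 obtains the voiceprint similarity of the real-time audio and fills the bar above "小" accordingly; when the song reaches "浪", it does the same for the bar above "浪". The higher the similarity, the larger the filled proportion of the bar. Region C of FIG. 4 shows the bars before the displayed lyrics have been sung; region C of FIG. 6 shows the bars filled in black (another colour could be used) after the lyrics have been sung. Region C of FIG. 5 and region C of FIG. 6 show the filling for two different sets of voiceprint score data: in FIG. 5 the similarity is lower and the black fill occupies a smaller proportion of each bar, while in FIG. 6 the similarity is higher and the fill occupies a larger proportion. Through this real-time filling of the standard voiceprint bars, the user can tell where the similarity is high and where it is low, and can change the way he or she sings at the low-similarity positions, which helps raise the voiceprint similarity to the star being imitated.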
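The fill rule described above (the higher the similarity, the larger the filled proportion of the bar at the current lyric character) might look like the following. The `Bar` type, the `on_progress` callback and the assumption that similarity is a value in [0, 1] are illustrative choices, not details taken from the source.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    char: str           # lyric character the bar belongs to
    height_px: int      # height of the standard (reference) bar
    filled_px: int = 0  # filled portion, set once the character has been sung

def fill_bar(bar, similarity):
    """Fill the bar in proportion to a similarity clamped to [0, 1]."""
    s = max(0.0, min(1.0, similarity))
    bar.filled_px = round(bar.height_px * s)
    return bar

def on_progress(bars, index, similarity):
    """Hypothetical callback: playback has reached the character at `index`."""
    if 0 <= index < len(bars):
        fill_bar(bars[index], similarity)
```

For example, with bars for "小" and "浪", calling `on_progress(bars, 0, 0.9)` fills nine tenths of the first bar and leaves the second untouched until the song reaches it.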
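In another method embodiment, as shown in FIG. 3, the standard voiceprint bars are superimposed on the display screen as a transparent picture. This gives a better display effect: the user sees the standard voiceprint bar picture, which looks more attractive, and can still faintly see the video playing on the display screen, so singing is not disturbed.
As shown in FIG. 3, the transparency of the standard voiceprint bar picture is 10%-100% and can be set according to the user's needs. When the value is set to 100%, the picture hides the part of the video it covers; when a see-through effect is wanted, the value can be reduced, so that both the standard voiceprint picture and the video playing on the display screen are visible. As described here, the value therefore behaves like an opacity, with 100% being fully opaque.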
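The effect of this setting can be illustrated with standard "over" alpha compositing of a single pixel. The source does not describe the compositing method, so this is only one plausible reading: the 10%-100% value is treated as an opacity (100% hides the video underneath), and `blend_pixel` is a hypothetical helper name.

```python
def blend_pixel(overlay_rgb, video_rgb, opacity):
    """Composite one overlay pixel onto one video pixel; opacity is in [0.10, 1.0]."""
    if not 0.10 <= opacity <= 1.0:
        raise ValueError("opacity outside the 10%-100% range stated in the text")
    # Weighted average per channel: opacity 1.0 shows only the overlay.
    return tuple(
        round(opacity * o + (1.0 - opacity) * v)
        for o, v in zip(overlay_rgb, video_rgb)
    )
```

At an opacity of 1.0 the result is the overlay colour itself, and at 0.5 each channel is the midpoint between overlay and video, which is the "both remain visible" case the text describes.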
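The standard voiceprint bars are superimposed on the display screen as a transparent picture with a colour gradient. The gradient makes the standard voiceprint bar picture stand out and gives the user a more intuitive visual effect.
The standard voiceprint bar picture is displayed movably on the display screen: the position at which it is superimposed can vary, for example the upper, lower or middle part of the display screen, and can be adjusted according to the user's needs.
Preferably, the standard voiceprint bar picture is superimposed on the middle of the song video. This arrangement is not only novel but also gives the bar picture maximum prominence, focusing the user's attention on the voiceprint similarity shown in it; the user can see at a glance how similar his or her own voiceprint is to the standard voiceprint, which helps improve singing.
The standard voiceprint bar picture is 1/3 to 1/2 the size of the song video display interface. Its size can be adjusted for visual effect, with 1/3 to 1/2 of the song video display interface being the preferred choice.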
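The position and size options above can be combined in a small layout helper. The source does not say whether "1/3 to 1/2 the size" refers to height, width or area; the sketch below assumes a full-width overlay whose height is that fraction of the video height, and the name `overlay_rect` and its parameters are hypothetical.

```python
from fractions import Fraction

def overlay_rect(video_w, video_h, fraction=Fraction(1, 3), position="middle"):
    """Return (x, y, w, h) of the bar overlay inside the video frame."""
    if not Fraction(1, 3) <= fraction <= Fraction(1, 2):
        raise ValueError("fraction outside the preferred 1/3 to 1/2 range")
    h = int(video_h * fraction)
    # Vertical offset for the three positions named in the text.
    y = {"top": 0, "middle": (video_h - h) // 2, "bottom": video_h - h}[position]
    return (0, y, video_w, h)
```

For a 1920x1080 video, the default call places a 360-pixel-high overlay centred vertically, which corresponds to the preferred "middle of the song video" arrangement.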
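The standard voiceprint bars of the song are superimposed on the song video shown on the display screen. During singing, the display screen preferably plays the song video of the corresponding song, and the standard voiceprint bar picture is then superimposed on the song video.
"Superimposing the standard voiceprint bars of the song on the display screen, and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity" specifically means: the display screen first shows the song video; a transparent background image is superimposed on the song video; the standard voiceprint bars of the sentence currently being sung are displayed on the background image; and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity. During singing, when the sentence changes, the standard voiceprint bars of the next sentence are first displayed on the background image; then, following the user's singing progress, the user's audio and its voiceprint similarity to the standard audio are obtained, and the standard voiceprint bars are filled according to that similarity. The higher the voiceprint similarity, the more closely the user's audio resembles the standard audio; for example, when the voiceprint similarity is 90%, the corresponding standard voiceprint bar is filled to 90%. When that sentence has been sung and the user goes on to the following one, the standard voiceprint bars of the following sentence are displayed afresh and filled according to the score for that sentence.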
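The present invention further includes step S204: obtaining different avatars of the star corresponding to the song according to different voiceprint similarities and displaying them on the display interface. A song corresponds to one or more stars, and different voiceprint similarities correspond to different avatars of the star. As shown in FIG. 4 and FIG. 5, the star's avatar is displayed in region A of the display interface. Region A of FIG. 4 shows a smiling avatar of the star (it can be a cartoon avatar), and region A of FIG. 5 shows a sad one. The smiling avatar can correspond to high similarity and the sad avatar to low similarity, so that a smiling avatar is shown when the similarity is high and a sad one when it is low. The invention does not limit the number of avatar pictures associated with a song's similarity levels: for example, there can be one avatar for very high similarity, one for average similarity, and one for very low similarity. The avatars for different similarity levels can be distinguished by the star's expression; by the avatar's action, such as the star nodding when the similarity is very high and shaking his or her head when it is very low; or by a comment shown on the avatar, such as "well sung" for high similarity, "average" for medium similarity and "not well sung" for low similarity. By showing different avatars of the star during singing, the user can read off the voiceprint similarity of the performance directly from the avatar and so easily see how closely he or she is imitating the star.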
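The three-level avatar example above can be sketched as a simple threshold lookup. The threshold values (0.8 and 0.5), the file names and the function name `pick_avatar` are assumptions for illustration; the source only states that high, average and low similarity can each have their own avatar and comment.

```python
def pick_avatar(similarity, high=0.8, low=0.5):
    """Return (avatar file, comment) for a similarity in [0, 1]; thresholds are hypothetical."""
    if similarity >= high:
        return ("star_smiling.png", "Well sung")
    if similarity >= low:
        return ("star_neutral.png", "Average")
    return ("star_sad.png", "Not well sung")
```

Keeping the thresholds as parameters reflects the point that the number of levels and their boundaries are not fixed by the invention.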
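In the above embodiment, the star's avatar can be updated from the similarity in several ways: the voiceprint similarity between the real-time audio and the song can be compared in real time and the avatar updated in real time; all voiceprint similarity data so far in the song can be averaged in real time and the avatar updated in real time from that average; after a lyric sentence ends, all voiceprint similarity data so far in the song can be averaged and the avatar updated from that average; or, after a lyric sentence ends, all similarities for that sentence can be averaged and the avatar updated from that average. All of these similarities derive from the voiceprint similarity of the real-time audio, so the similarity that selects the star's avatar can be regarded as a voiceprint similarity related to the real-time audio. In some embodiments, the similarity can be the voiceprint score data of the real-time audio relative to the original singer's voice: the higher the voiceprint score data, the higher the similarity, and the lower the score data, the lower the similarity.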
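The update strategies listed above differ only in which values are averaged and when. A minimal sketch, assuming similarities arrive one at a time as numbers and that some other component signals the end of each lyric sentence; the class and method names are hypothetical.

```python
class SimilarityTracker:
    """Collects similarities and exposes the latest value and the averages described above."""

    def __init__(self):
        self.song_values = []      # every similarity seen so far in the song
        self.sentence_values = []  # similarities of the current lyric sentence

    def add(self, similarity):
        self.song_values.append(similarity)
        self.sentence_values.append(similarity)

    def latest(self):
        """Real-time strategy: the most recent similarity, or None if nothing was added."""
        return self.song_values[-1] if self.song_values else None

    def song_average(self):
        """Average over everything sung so far in the song."""
        return _mean(self.song_values)

    def end_sentence(self):
        """Return the finished sentence's average and start a new sentence."""
        avg = _mean(self.sentence_values)
        self.sentence_values = []
        return avg

def _mean(values):
    return sum(values) / len(values) if values else None
```

Calling `latest()` on every frame gives the real-time strategy, while calling `song_average()` or `end_sentence()` only when a sentence finishes gives the two per-sentence strategies and does less work per frame.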
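In some embodiments, as shown in FIG. 2, the method further includes step S205: the interface displays different multimedia resources according to different voiceprint similarities. Specifically, the singing score display device 100 can obtain the voiceprint score data of the real-time audio, or compile the total or periodic voiceprint score data of the song, and display the corresponding multimedia resource on the display interface according to that data. Periodic voiceprint score data covers either a fixed length of time or a fixed number of sentences. The multimedia resource can be a special-effect picture, text, audio or a special-effect animation. For example, when the voiceprint similarity is high, a clapping animation can be shown or applause played; when it is low, a "boo" sound can be played, or text or a picture to the effect of "not well sung" can be shown. Different multimedia resources can differ in resource type (text and sound are different types) or can be different content of the same type (applause and a "boo" are both sound). As a further example, depending on the level of the voiceprint score data, the text can be English messages such as good, excellent, wonderful, fantastic, unbelievable, or Chinese messages meaning good, very good, extremely good, and so on. Image information can be a clapping image, a thumbs-up image, a bubble image and the like. Animation information can be scattering flowers or, as in FIG. 6, stars popping up: once the voiceprint score data exceeds a preset value, stars can pop up at the centre of the display interface of FIG. 6 and move towards the score data display region B. Such prompts can be shown at the centre of the display interface; in FIG. 7, for instance, the four characters "唱的很好" ("sung very well") are shown at the centre so that the user can easily see them. The prompts can appear automatically when the score data is high, encouraging the user and making singing both more fun and more motivating. These multimedia resources tell the user how the current performance is going and add to the fun of singing. When the resource type is sound, it can also let users who are not looking at the display screen know how the current performer is doing.
In the above embodiment the correspondence between score data and prompt information can be user-defined; for example, "sung very well" is displayed when the score is 80 or above, and a thumbs-up image when the score is 90 or above. As an optional embodiment, the multimedia information can be: sung very much alike, sung not alike, bubble (average), bubble (not alike) and bubble (alike). The corresponding voiceprint score data can be as shown in Table 1 below: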
Figure PCTCN2016070111-appb-000001
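Table 1: Correspondence between score data and multimedia resources
The score data of Table 1 can be applied once the scores of several consecutive sentences have accumulated to a certain level: the periodic voiceprint score data is obtained and the corresponding multimedia resource is displayed, giving the user periodic encouragement or prompts and making it easy to get periodic results on how closely the star is being imitated.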
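Since Table 1 survives only as an image reference, its actual values are not reproduced here. The sketch below instead uses the two example thresholds given in the text (80 or above, 90 or above) and treats both as inclusive, in line with the document's own statement that "or above" includes the stated number. Showing both prompts at 90 or above is an assumption, and `prompts_for` is a hypothetical name.

```python
def prompts_for(score):
    """Return the prompts to show for a 0-100 score, using the example thresholds."""
    prompts = []
    if score >= 80:  # "80 or above" is inclusive
        prompts.append("text: sung very well")
    if score >= 90:  # "90 or above" is inclusive
        prompts.append("image: thumbs up")
    return prompts
```

Returning a list rather than a single prompt matches the description elsewhere that several multimedia resources can be shown for one similarity level.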
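To let the user understand more concretely how similar the performance is to the star's voiceprint, in some embodiments the singing score display device 100 can further perform step S206: displaying the value of the voiceprint similarity on the display interface; in some embodiments this value is the voiceprint score data. In region B of the interfaces shown in FIG. 4 to FIG. 7, the numerical value of the corresponding voiceprint score data is displayed. As explained above for similarity, the voiceprint score data can be that of the real-time audio or the total compiled for the song. The total voiceprint score data indicates the overall voiceprint similarity of the song up to the current point and can be obtained by accumulating the voiceprint score data and taking the average. By displaying the voiceprint score data, the user can see more directly how high or low the similarity is and know more precisely how closely he or she is currently imitating the star.
The total voiceprint score data can be computed in real time, at fixed time intervals, or after a lyric sentence ends. Compared with real-time computation, computing at fixed intervals or at the end of a lyric sentence reduces the amount of computation and saves processing resources of the singing score display device 100.
To help the user learn about the star, the singing score display device 100 can further perform step S207: displaying information about the star corresponding to the song on the display interface. This information includes the star's name or the region the star belongs to, such as Hong Kong/Taiwan star or mainland star. It can be displayed anywhere on the display interface, or within the avatar display region so that the user picks it up while looking at the star's avatar.
The above embodiments do not limit the positions of the avatar display region, the voiceprint score data display region and the voiceprint bar display region. These can be any regions of the display screen and may or may not overlap. In some embodiments they can be the display regions A to C of FIG. 4 to FIG. 6: the avatar display region is region A at the upper left of the display interface, the voiceprint score data display region is region B at the upper right, and the voiceprint bar display region is region C in the lower middle. This layout does not interfere with watching the MV playing on the display interface, while keeping the information in each region easy to read.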
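The present invention also provides a singing score display device 100 which, as shown in FIG. 1, includes the following modules: a real-time audio acquisition module 101, for acquiring real-time audio input by the sound collection device; a voiceprint similarity acquisition module 102, for obtaining the voiceprint similarity of the real-time audio relative to the song; and a voiceprint bar display module 103, for superimposing the standard voiceprint bars of the song on the display screen and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity. The real-time audio acquisition module can be connected to the sound collection device 160 to acquire the real-time audio of the user's singing.
The voiceprint bar display module 103 can be connected to the display 120; it can superimpose the standard voiceprint bars of the song on the display screen and fill the corresponding bar according to the song progress and the voiceprint similarity. Through the voiceprint bar display module, the user can see at a glance which parts of the performance are highly similar to the star and which parts are not, so that the user can adjust the low-similarity parts and bring the performance closer to that of the star being imitated.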
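In the voiceprint bar display module 103, the standard voiceprint bars are superimposed on the display screen as a transparent picture.
In the voiceprint bar display module 103, the standard voiceprint bar picture is displayed movably on the display screen.
The voiceprint bar display module 103 is specifically configured so that the display screen first shows the song video, a transparent background image is superimposed on the song video, the standard voiceprint bars of the sentence currently being sung are displayed on the background image, and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity.
To make the standard voiceprint bars stand out more, they can be displayed movably on the display screen, in its upper, lower or middle part. The standard voiceprint bars are superimposed on the display screen as a transparent picture, so that both the bar picture and the video playing on the screen remain visible; this does not interfere with watching the video, and the user can directly see the real-time similarity between his or her own singing and the star's timbre, which makes it easier to improve.
In some embodiments, the device 100 further includes a star avatar display module 104, which obtains different avatars of the star corresponding to the song according to different voiceprint similarities and displays them on the display interface. This shows simply and clearly how similar the current singing is to the voiceprint of the song's original singer, so that the user can see this similarity in real time.
In some embodiments, the device 100 further includes a multimedia resource display module 105: the interface displays different multimedia resources according to different voiceprint similarities, the multimedia resources being special-effect pictures, text, audio or special-effect animations. The multimedia resource display module 105 can play several multimedia resources for one voiceprint similarity; for example, when the similarity is high, a clapping animation and the text "sung very well" can be shown at the same time. Through the multimedia resource display module 105, the user is reminded of how the current performance is going. The module can also obtain the score data of the real-time audio, or compile the total or periodic score data of the song, and display the corresponding multimedia resource on the display interface according to that data. The multimedia resource display module 105 can remind the user more actively of how similar the imitation is; the user does not have to keep staring at the voiceprint bars or the score data and can learn how the performance is going from the prompts.
On the basis of any of the above device embodiments, the singing score display device 100 further includes a score data display module 106, which either acquires the voiceprint score data of the real-time audio and displays it on the display interface, or compiles the total score data of the song and displays it on the display interface. Displaying the score data lets the user see more detailed data on the voiceprint similarity. The total score data of the song can be compiled when a lyric sentence ends: the user has a short idle time after each sentence, so after seeing the score the user can adjust his or her singing in the next sentence and raise the voiceprint similarity to the original singer.
To help the user learn more about the star, the singing score display device 100 further includes a star information display module 107, which displays information about the star corresponding to the song on the display interface.
The present invention also provides a singing score display system which, as shown in FIG. 1, includes a singing score display device 100, a display 120, a sound playing device 140 and a sound collection device 160. The display is connected to the singing score display device, and the sound playing device and the sound collection device are connected to the singing score display device; the singing score display device is the one described in any of the above embodiments. When the user imitates a star, the system shows in several ways how similar the user's singing is to the voiceprint of the original singer, so that the user knows how the performance is going and can adjust his or her singing in time.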
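It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including ..." or "comprising ..." does not exclude the presence of additional elements in the process, method, article or terminal device that includes the element. In addition, in this document, "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "or above", "or below", "within" and the like are understood as including it.
Those skilled in the art will understand that the above embodiments can be provided as a method, a device or a computer program product. These embodiments can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. All or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware; the program can be stored in a storage medium readable by a computer device and is used to execute all or some of the steps described in the methods of the above embodiments. The computer device includes, but is not limited to: a personal computer, a server, a general-purpose computer, a special-purpose computer, a network device, an embedded device, a programmable device, a smart mobile terminal, a smart home device, a wearable smart device, an in-vehicle smart device, etc. The storage medium includes, but is not limited to: RAM, ROM, a magnetic disk, a magnetic tape, an optical disk, flash memory, a USB flash drive, a portable hard disk, a memory card, a memory stick, web server storage, network cloud storage, etc.
The above embodiments are described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a computer device to produce a machine, such that the instructions executed by the processor of the computer device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be stored in a memory readable by a computer device and capable of directing the computer device to operate in a particular manner, such that the instructions stored in that memory produce an article of manufacture comprising an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be loaded onto a computer device, so that a series of operational steps is performed on the computer device to produce computer-implemented processing, whereby the instructions executed on the computer device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although the above embodiments have been described, those skilled in the art can make further changes and modifications to them once they have grasped the basic inventive concept. The above are therefore only embodiments of the present invention and do not limit its scope of patent protection; any equivalent structure or equivalent process transformation made using the content of this specification and the accompanying drawings, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.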

Claims (20)

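  1. A singing score display method, characterised by comprising the following steps:
    acquiring real-time singing audio input by a sound collection device;
    obtaining the voiceprint similarity of the real-time singing audio relative to the standard audio of the song;
    superimposing the standard voiceprint bars of the song on a display screen, and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity.
  2. The singing score display method according to claim 1, characterised in that the standard voiceprint bars are superimposed on the display screen as a transparent picture.
  3. The singing score display method according to claim 1, characterised in that the standard voiceprint bars are superimposed on the display screen as a transparent picture with a colour gradient.
  4. The singing score display method according to claim 2 or 3, characterised in that the transparency of the standard voiceprint bar picture is 10%-100%.
  5. The singing score display method according to any one of claims 1 to 3, characterised in that the standard voiceprint bar picture is displayed movably on the display screen.
  6. The singing score display method according to claim 5, characterised in that the standard voiceprint bar picture is superimposed on the middle of the song video.
  7. The singing score display method according to any one of claims 1 to 3, characterised in that the standard voiceprint bar picture is 1/3 to 1/2 the size of the song video display interface.
  8. The singing score display method according to any one of claims 1 to 3, characterised in that the standard voiceprint bars of the song are superimposed on the song video shown on the display screen.
  9. The singing score display method according to any one of claims 1 to 3, characterised in that "superimposing the standard voiceprint bars of the song on a display screen, and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity" specifically means: the display screen first shows the song video; a transparent background image is superimposed on the song video; the standard voiceprint bars of the sentence currently being sung are displayed on the background image; and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity.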
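  10. The singing score display method according to claim 1, characterised in that
    different avatars of the star corresponding to the song are obtained according to different voiceprint similarities and displayed on the display interface.
  11. The singing score display method according to claim 1, characterised by further comprising the step of: displaying different multimedia resources on the interface according to different voiceprint similarities.
  12. The singing score display method according to claim 11, characterised in that the multimedia resources are special-effect pictures, text, audio or special-effect animations.
  13. The singing score display method according to claim 1, characterised by further comprising the following steps:
    acquiring the voiceprint score data of the real-time audio and displaying it on the display interface;
    or,
    compiling the total voiceprint score data of the song and displaying it on the display interface.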
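  14. A singing score display device, characterised by comprising the following modules:
    a real-time audio acquisition module, for acquiring real-time audio input by a sound collection device;
    a voiceprint similarity acquisition module, for obtaining the voiceprint similarity of the real-time audio relative to the song;
    a voiceprint bar display module, for superimposing the standard voiceprint bars of the song on a display screen and filling the corresponding voiceprint bar according to the song progress and the voiceprint similarity.
  15. The singing score display device according to claim 14, characterised in that, in the voiceprint bar display module, the standard voiceprint bars are superimposed on the display screen as a transparent picture.
  16. The singing score display device according to claim 14, characterised in that, in the voiceprint bar display module, the standard voiceprint bar picture is displayed movably on the display screen.
  17. The singing score display device according to any one of claims 14, 15 or 16, characterised in that the voiceprint bar display module is configured so that the display screen first shows the song video, a transparent background image is superimposed on the song video, the standard voiceprint bars of the sentence currently being sung are displayed on the background image, and the corresponding voiceprint bar is filled according to the song progress and the voiceprint similarity.
  18. The singing score display device according to claim 14, characterised by further comprising a multimedia resource display module: the interface displays different multimedia resources according to different voiceprint similarities, the multimedia resources being special-effect pictures, text, audio or special-effect animations.
  19. The singing score display device according to claim 14, characterised by further comprising the following module:
    a star avatar display module, which obtains different avatars of the star corresponding to the song according to different voiceprint similarities and displays them on the display interface.
  20. A singing score display system, comprising a singing score display device, a display, a sound playing device and a sound collection device, the display being connected to the singing score display device, and the sound playing device and the sound collection device being connected to the singing score display device, characterised in that the singing score display device is the singing score display device according to any one of claims 14 to 19.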
PCT/CN2016/070111 2015-06-05 2016-01-05 Method, device and system for displaying a singing score WO2016192395A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510305092.9A CN104882147A (zh) 2015-06-05 2015-06-05 Method, device and system for displaying a singing score
CN201510305092.9 2015-06-05

Publications (1)

Publication Number Publication Date
WO2016192395A1 true WO2016192395A1 (zh) 2016-12-08

Family

ID=53949615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/070111 WO2016192395A1 (zh) 2015-06-05 2016-01-05 Method, device and system for displaying a singing score

Country Status (2)

Country Link
CN (1) CN104882147A (zh)
WO (1) WO2016192395A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503961A (zh) * 2019-09-03 2019-11-26 北京字节跳动网络技术有限公司 Audio recognition method and device, storage medium and electronic device
CN112102835A (zh) * 2020-11-18 2020-12-18 北京声智科技有限公司 Large-screen voice response method and device, electronic device and storage medium
US11170043B2 (en) * 2019-04-08 2021-11-09 Deluxe One Llc Method for providing visualization of progress during media search
TWI745338B (zh) * 2017-01-19 2021-11-11 香港商阿里巴巴集團服務有限公司 Method and device for providing accompaniment music

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
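CN104882147A (zh) * 2015-06-05 2015-09-02 福建星网视易信息系统有限公司 Method, device and system for displaying a singing score
JP6759545B2 (ja) * 2015-09-15 2020-09-23 ヤマハ株式会社 Evaluation device and program
CN105260154A (zh) * 2015-10-15 2016-01-20 桂林电子科技大学 Multimedia data display method and display device
CN105976849B (zh) * 2016-05-05 2019-05-03 广州酷狗计算机科技有限公司 Method and device for playing audio data
CN106250400B (zh) * 2016-07-19 2021-03-26 腾讯科技(深圳)有限公司 Audio data processing method, device and system
CN107347167A (zh) * 2017-05-27 2017-11-14 福建星网视易信息系统有限公司 Method, system and application for implementing a networked singing competition
CN107221340B (zh) * 2017-05-31 2021-01-15 福建星网视易信息系统有限公司 Real-time scoring method based on multi-channel audio, storage device and application
CN107257338B (zh) * 2017-06-16 2018-09-28 腾讯科技(深圳)有限公司 Media data processing method, device and storage medium
CN107393519B (zh) * 2017-08-03 2020-09-15 腾讯音乐娱乐(深圳)有限公司 Singing score display method, device and storage medium
CN109905789A (zh) * 2017-12-10 2019-06-18 张德明 Karaoke microphone
CN108182946B (zh) * 2017-12-25 2021-04-13 广州势必可赢网络科技有限公司 Vocal mode selection method and device based on voiceprint recognition
CN108492807B (zh) * 2018-03-30 2020-09-11 北京小唱科技有限公司 Method and device for displaying voice-tuning status
CN108647313A (zh) * 2018-05-10 2018-10-12 福建星网视易信息系统有限公司 Method and system for generating singing videos in real time
CN108848419B (zh) * 2018-06-07 2020-12-11 康佳集团股份有限公司 Television interaction method based on biometric recognition, smart television and storage medium
CN111383620B (zh) * 2018-12-29 2022-10-11 广州市百果园信息技术有限公司 Audio correction method, apparatus, device and storage medium
CN111666445A (zh) * 2019-03-06 2020-09-15 深圳市冠旭电子股份有限公司 Scene lyrics display method and device, and speaker equipment
CN110010159B (zh) * 2019-04-02 2021-12-10 广州酷狗计算机科技有限公司 Sound similarity determination method and device
CN113596590B (zh) * 2020-04-30 2022-08-26 聚好看科技股份有限公司 Display device and playback control method
CN112951274A (zh) * 2021-02-07 2021-06-11 脸萌有限公司 Voice similarity determination method and device, and program product
CN113707113B (zh) * 2021-08-24 2024-02-23 北京达佳互联信息技术有限公司 Method and device for tuning a user's singing voice, and electronic device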

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
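CN102915725A (zh) * 2012-09-10 2013-02-06 福建星网视易信息系统有限公司 Human-computer interactive song singing system and method
CN103871426A (zh) * 2012-12-13 2014-06-18 上海八方视界网络科技有限公司 Method and system for comparing the similarity between user audio and original-singer audio
CN103971674A (zh) * 2014-05-22 2014-08-06 天格科技(杭州)有限公司 Real-time singing scoring method with accurate scoring and good user experience
JP2014194471A (ja) * 2013-03-28 2014-10-09 Xing Inc Karaoke device, karaoke program, and recording medium
JP2015045671A (ja) * 2013-08-27 2015-03-12 株式会社第一興商 Singing part determination system
CN104882147A (zh) * 2015-06-05 2015-09-02 福建星网视易信息系统有限公司 Method, device and system for displaying a singing score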

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03287289A (ja) * 1990-04-03 1991-12-17 Brother Ind Ltd 採点順位表示機能付きカラオケ装置
JP3785232B2 (ja) * 1996-10-16 2006-06-14 株式会社エクシング カラオケ採点装置及びカラオケ装置
US7806759B2 (en) * 2004-05-14 2010-10-05 Konami Digital Entertainment, Inc. In-game interface with performance feedback
CN201611570U (zh) * 2009-12-24 2010-10-20 盛大计算机(上海)有限公司 音频评测装置
CN104064180A (zh) * 2014-06-06 2014-09-24 百度在线网络技术(北京)有限公司 演唱评分方法及装置

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
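TWI745338B (zh) * 2017-01-19 2021-11-11 香港商阿里巴巴集團服務有限公司 Method and device for providing accompaniment music
US11170043B2 (en) * 2019-04-08 2021-11-09 Deluxe One Llc Method for providing visualization of progress during media search
CN110503961A (zh) * 2019-09-03 2019-11-26 北京字节跳动网络技术有限公司 Audio recognition method and device, storage medium and electronic device
CN112102835A (zh) * 2020-11-18 2020-12-18 北京声智科技有限公司 Large-screen voice response method and device, electronic device and storage medium
CN112102835B (zh) * 2020-11-18 2023-02-17 北京声智科技有限公司 Large-screen voice response method and device, electronic device and storage medium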

Also Published As

Publication number Publication date
CN104882147A (zh) 2015-09-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16802318

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16802318

Country of ref document: EP

Kind code of ref document: A1