WO2019223393A1 - Method, apparatus, electronic device and storage medium for generating lyrics and displaying lyrics - Google Patents


Info

Publication number
WO2019223393A1
WO2019223393A1 (PCT/CN2019/076815, CN2019076815W)
Authority
WO
WIPO (PCT)
Prior art keywords
characters
marked
lyrics
character
word
Prior art date
Application number
PCT/CN2019/076815
Other languages
English (en)
French (fr)
Inventor
冯穗豫
Original Assignee
腾讯音乐娱乐科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯音乐娱乐科技(深圳)有限公司 filed Critical 腾讯音乐娱乐科技(深圳)有限公司
Priority to SG11202011712WA
Publication of WO2019223393A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a method, a device, an electronic device, and a storage medium for generating and displaying lyrics.
  • The karaoke (K song) service refers to playing the musical accompaniment of a song in a music player while displaying the lyrics on the current playback interface, so that users can sing along with the music while watching the lyrics.
  • The terminal usually displays the line of lyrics corresponding to the current playback time in karaoke style, and dynamically marks with color which character of the line is currently being played.
  • The lyrics usually include some polyphonic characters, that is, characters that have different pronunciations in different words, which may cause users to sing them incorrectly. Therefore, a method for displaying lyrics is urgently needed so that users can sing the pronunciation of each character correctly.
  • Embodiments of the present invention provide a method, an apparatus, an electronic device, and a storage medium for generating lyrics and displaying lyrics, which can solve the problem that a user sings a character incorrectly in the lyrics in the related art.
  • the technical solution is as follows:
  • In a first aspect, a method for generating lyrics is provided, the method including: obtaining lyrics of a target song, the lyrics including a plurality of characters; and determining characters to be marked among the plurality of characters;
  • for each character to be marked, querying, according to the word in which the character to be marked is located and a preset query principle, the pronunciation of the character to be marked in the word, and determining that pronunciation as the corresponding pronunciation of the character to be marked in the target song.
  • In a second aspect, a method for displaying lyrics is provided.
  • the method is applied on a terminal, and the method includes:
  • when a lyrics display instruction is received, obtaining a first lyrics file of the target song, where the lyrics display instruction is used to instruct display of the lyrics of the target song;
  • the corresponding pronunciation of the character to be marked in the target song is marked at a target position of the character to be marked, the target position being above the character to be marked.
  • an apparatus for generating lyrics includes:
  • a determining module configured to determine characters to be marked among a plurality of characters of the lyrics
  • a query module configured to query, according to the word in which the character to be marked is located and a preset query principle, the pronunciation of the character to be marked in the word, and to determine that pronunciation as the corresponding pronunciation of the character to be marked in the target song;
  • a generating module is configured to generate a first lyrics file of the target song according to the corresponding pronunciation of the plurality of characters and the characters to be marked in the target song.
  • an apparatus for displaying lyrics includes:
  • An obtaining module configured to obtain a first lyrics file of a target song when the lyrics display instruction is received, and the lyrics display instruction is used to display the lyrics of the target song;
  • the obtaining module is further configured to obtain the lyrics of the target song from the first lyrics file, and the corresponding pronunciation of the characters to be marked in the multiple characters of the lyrics in the target song;
  • a display module for displaying a plurality of characters of the lyrics
  • a labeling module configured to mark, when a character to be marked is displayed, the corresponding pronunciation of the character to be marked in the target song at a target position, the target position being above the character to be marked.
  • an electronic device includes a processor and a memory.
  • the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the method described in the first aspect.
  • A computer-readable storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed by the method for generating lyrics according to the first aspect, or by the method for displaying lyrics according to the second aspect.
  • In the embodiment of the present invention, the server may determine, based on the lyrics of the target song, the corresponding pronunciation of each character to be marked in the target song, and further generate a first lyrics file, so that a corresponding pronunciation is bound to each character to be marked; when the lyrics are subsequently displayed, the pronunciation can be displayed synchronously, ensuring that the user can sing the pronunciation of each character of the target song correctly.
  • the terminal when displaying lyrics, can also mark the pronunciation above the corresponding character to be marked, so that the pronunciation is clearly visible, and the user can accurately and quickly find the corresponding pronunciation of the character to be marked, which improves the accuracy of displaying the lyrics.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention.
  • FIG. 2 is a flowchart of a method for generating lyrics according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for displaying lyrics according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an interface for displaying lyrics according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an interface for displaying lyrics according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a device for generating lyrics according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a device for displaying lyrics according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention.
  • the implementation environment includes a terminal 101 and a server 102.
  • the terminal 101 and the server 102 are connected through a network.
  • An application is installed on the terminal 101, and the terminal 101 can obtain a target song from the server 102 based on the application, and perform target song playback, lyrics display, and the like.
  • The server 102 may, in advance, query the pronunciations of the characters to be marked in the lyrics of the target song from a dictionary engine, thereby obtaining the corresponding pronunciation of each character to be marked in the target song.
  • The lyrics and the corresponding pronunciations of the characters to be marked in the target song may then be stored, and the lyrics of the target song and the pronunciations of the characters to be marked are sent to the terminal 101.
  • the terminal 101 plays the target song, the lyrics of the target song are displayed synchronously, and the pronunciation corresponding to the character to be marked is marked above the character to be marked.
  • The above process of obtaining the pronunciations of the characters to be marked may also be performed by the terminal 101; that is, after the terminal 101 obtains the lyrics of the target song from the server 102, it determines the pronunciations of the characters to be marked in the lyrics and marks them.
  • The application may be a music player or an application installed with a music playback plug-in.
  • the terminal may be a mobile phone terminal, a PAD (Portable Android Device, tablet computer) terminal, or a computer terminal.
  • the server 102 is a background server of the application.
  • the server 102 may be a server, or a server cluster composed of several servers, or a cloud computing server center.
  • FIG. 2 is a flowchart of a method for generating lyrics according to an embodiment of the present invention. This embodiment of the present invention is executed by a server or a terminal. This embodiment of the present invention is described by using only a server as an example. Referring to FIG. 2, the method includes:
  • the server obtains the lyrics of the target song.
  • the server may first obtain the lyrics file of the target song, and obtain the lyrics of the target song from the lyrics file, and the lyrics include multiple characters.
  • The target song is any song for which a lyrics file needs to be generated, generally a song whose lyrics include Chinese characters.
  • The Chinese characters include, but are not limited to, simplified Chinese characters, traditional Chinese characters, and the Chinese characters used in Japanese, including their traditional forms.
  • For example, for the character meaning "hidden", the simplified Chinese, traditional Chinese, and Japanese forms are, in order: 隐, 隱, and 隠.
  • Simplified and traditional Chinese characters include not only the characters used in Mandarin but also characters used in some regional languages. For example, the Cantonese character " ⁇ ", which belongs to the simplified Chinese characters, is pronounced "mao" and corresponds to "no" in Mandarin.
  • A lyrics file often also includes the display times of the multiple characters; a display time indicates when each character is to be displayed during playback of the song.
  • Each character is stored together with its display time, so display times are often interleaved between the characters.
  • For example, the actual storage form of the lyric line "雨 ⁇ 空港" in the lyrics file is "(39112)雨(39803) ⁇ (40356)空(40606)港(41176)", where the number in parentheses indicates the display time of a character and separates adjacent characters. The server can obtain the second lyrics file of the target song, extract from it the lyrics of the target song and the display times of the multiple characters in the lyrics, and establish an index relationship between each character and its display time, ensuring that subsequent display can be based on the display times.
  • the second lyrics file is an original lyrics file of the target song, and includes a lyrics of the target song and a display time of multiple characters of the lyrics.
  • the server may store the characters and the display time separately in the form of an array, so that the characters and the display time may correspond one-to-one.
  • Specifically, the step of the server obtaining the lyrics of the target song may be as follows: the server obtains the second lyrics file of the target song and establishes a first array and a third array, where the first array is used to store the multiple characters in the lyrics, and the third array is used to store the display times of the multiple characters.
  • the server writes the multiple characters into the first array, writes the display times of the multiple characters into the third array, and determines the multiple characters in the first array as the lyrics of the target song.
  • The server may store the display time of each character in the third array according to the storage order of that character in the first array. Based on this storage order, each character in the first array corresponds one-to-one with its display time in the third array, which improves the accuracy of the correspondence between the lyrics and the display times.
  • the server extracts characters so that subsequent pronunciation can be determined based on the characters.
  • The server establishes an index relationship between each character and its display time, extracting the lyrics without destroying the original content of the second lyrics file, which ensures that an accurate lyrics file can ultimately be obtained.
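The extraction of parallel character and display-time arrays described above can be sketched as follows. This is a minimal illustration, assuming a storage convention in which each display time in parentheses precedes the character it times (the patent does not fix an exact on-disk format); the function name and the sample lyric are hypothetical.

```python
import re

def parse_lyric_line(line):
    """Split a stored lyric line into the patent's "first array" (characters)
    and "third array" (display times), index-aligned one-to-one."""
    # Assumed convention: each "(time)" immediately precedes its character.
    pairs = re.findall(r'\((\d+)\)([^()\s])', line)
    first_array = [ch for _, ch in pairs]        # characters of the lyric
    third_array = [int(t) for t, _ in pairs]     # display time of each character
    return first_array, third_array

# The character at index i of the first array is displayed at
# third_array[i] milliseconds into playback.
chars, times = parse_lyric_line("(39112)雨(39803)空(40606)港")
```

Storing the two arrays in the same order yields the one-to-one index relationship between each character and its display time without modifying the original file.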
  • After obtaining the lyrics, the server may first determine the characters to be marked among the plurality of characters through the following step 202, and then query the pronunciation of each character to be marked through step 203.
  • the server determines a character to be labeled among a plurality of characters of the lyrics.
  • the server may recognize a Chinese character in the plurality of characters by using a preset recognition algorithm, and determine the Chinese character as the character to be marked.
  • The server generally stores encoded data of the plurality of characters, and the preset recognition algorithm may identify the plurality of characters based on an encoding table of the encoding manner used for the characters.
  • the embodiment of the present invention does not limit the encoding method of characters.
  • For example, the encoding table may be the code table of the Unicode encoding method.
  • The server first recognizes the characters to be marked among the multiple characters and determines pronunciations directly based on them, which reduces the number of characters to be processed subsequently and further improves the efficiency of determining pronunciations.
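As a rough sketch of such a recognition step, the check below marks a character as "to be marked" when its code point falls in the main CJK Unified Ideographs blocks of Unicode. The exact ranges used by the patent's preset recognition algorithm are not specified, so these ranges (which omit the rarer extension blocks) are an assumption.

```python
def is_cjk(ch):
    """Return True if ch is a Han character in the common Unicode blocks."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
            or 0x3400 <= cp <= 0x4DBF)  # CJK Extension A

def characters_to_mark(lyric_chars):
    # Keep only the Han characters; kana, Latin letters and digits
    # pass through without being marked.
    return [c for c in lyric_chars if is_cjk(c)]

characters_to_mark(list("雨の空港A1"))  # the hiragana の and "A1" are skipped
```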
  • The embodiment of the present invention uses Chinese characters as the characters to be marked and subsequently marks the pronunciations of the Chinese characters in the lyrics. For users whose mother tongue is neither Chinese nor Japanese, this makes it convenient to learn a foreign language by singing, which not only provides listening pleasure but also achieves the effect of learning a foreign language, greatly enriching the user experience and thereby greatly improving the practical value of this method.
  • The server queries, according to the word in which the character to be marked is located and a preset query principle, the pronunciation of the character to be marked in the word, and determines the pronunciation of the character to be marked in the word as the corresponding pronunciation of the character to be marked in the target song.
  • the server may establish and store a dictionary engine in advance.
  • the dictionary engine includes multiple characters and pronunciations of multiple characters.
  • In implementation, the server obtains the pronunciation of the character to be marked from the dictionary engine according to the character to be marked, and determines the queried pronunciation as the corresponding pronunciation of the character to be marked in the target song.
  • Certain characters may have multiple pronunciations: the pronunciation may differ in different words, or even within the same word when the voice differs. The server may therefore also query based on the word and/or the voice of the character to be marked, so this step can include the following three cases.
  • In the first case, the server inputs the character to be marked into the dictionary engine, and the dictionary engine outputs the corresponding pronunciation of the character to be marked in the target song.
  • the server may directly input the character to be marked into the dictionary engine, and when the server finds the unique pronunciation of the character to be marked based on the dictionary engine, directly output the pronunciation of the character to be marked, and output the The pronunciation is determined as the corresponding pronunciation of the character to be marked in the target song.
  • the server may obtain the unique pronunciation of the character to be marked based on the character to be marked.
  • In the second case, the server determines, in the dictionary engine, the pronunciation of the character to be marked in the word according to the word in which the character to be marked is located.
  • a character may have multiple pronunciations and different pronunciations in different words.
  • Chinese characters in Japanese have two or more pronunciations in most cases.
  • the server may further lock the pronunciation of the character to be marked in the target song in the dictionary engine based on the word where the character to be marked is located.
  • Specifically, the server performs phrase division on the multiple characters to determine multiple phrases; the server inputs the multiple phrases into the dictionary engine and performs a word-by-word increase query on the multiple phrases in the dictionary engine, to query multiple candidate words matching the word in which the character to be marked is located; the server then selects, according to the number of characters included in the multiple candidate words, the candidate word with the most characters, and outputs the pronunciation of the character to be marked in that candidate word.
  • The multiple phrases include the characters to be marked; the terminal may divide one line of lyrics of the target song into one phrase, and each candidate word includes at least the characters included in the word in which the character to be marked is located.
  • The process by which the server performs the word-by-word increase query in the dictionary engine and filters among the queried candidate words may be as follows: for each phrase, the server starts in the dictionary engine from the first character to be marked in the phrase and determines its initial pronunciation; it then extends word by word with the subsequent characters of the phrase, taking the first character to be marked together with the added characters as the word in which the character to be marked is located.
  • In the dictionary engine, the server searches for candidate words matching the word in which the character to be marked is located; during the query, when a candidate word containing that word is matched, the pronunciation of the character to be marked is determined based on the candidate word.
  • Based on this query, the server can determine that the pronunciation of each character to be marked in the line of lyrics is, for example, " ⁇ Wo ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ".
  • In Chinese, the pronunciation of a character to be marked may be given as Chinese pinyin; in Japanese, it may be given as hiragana.
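The word-by-word increase query can be sketched as a longest-match lookup: starting from a character to be marked, the candidate word is extended one character at a time, and the candidate with the most characters wins. The toy dictionary below stands in for the dictionary engine; its entries and readings are illustrative only.

```python
# Hypothetical dictionary engine: word -> reading (hiragana for Japanese).
DICTIONARY = {
    "空": "そら",
    "港": "みなと",
    "空港": "くうこう",  # the two-character word overrides the single characters
}

def longest_match_reading(phrase, start):
    """Extend the word character by character from `start` and keep the
    candidate with the largest number of characters found in the dictionary."""
    best_word, best_reading = None, None
    for end in range(start + 1, len(phrase) + 1):
        candidate = phrase[start:end]
        if candidate in DICTIONARY:
            best_word, best_reading = candidate, DICTIONARY[candidate]
    return best_word, best_reading

longest_match_reading("空港", 0)  # selects 空港/くうこう rather than 空/そら
```

Preferring the longest candidate is what lets a compound word's reading replace the isolated readings of its individual characters.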
  • In the third case, the server determines, in the dictionary engine, the pronunciation of the character to be marked in the word according to the word in which the character to be marked is located and the key character of the voice of the character to be marked.
  • the pronunciation of characters may also be affected by the voice, and the pronunciations of characters with different voices are different.
  • Specifically, the server performs phrase division on the multiple characters, determines multiple phrases, and inputs the multiple phrases into the dictionary engine.
  • The server performs a word-by-word increase query on the multiple phrases in the dictionary engine, to query multiple candidate words matching the word in which the character to be marked is located.
  • the server selects from the plurality of candidate words the candidate word that includes the most characters and has the same voice as the to-be-marked characters.
  • Each candidate word includes at least the characters included in the word in which the character to be marked is located, and the multiple phrases include the character to be marked and the key character of the voice of the character to be marked.
  • the server performs a query based on the word and the voice of the character to be marked, and the candidate word filtered out includes the largest number of characters, and the voice is consistent with the character to be marked.
  • the server performs a query in the dictionary engine.
  • the selection process is the same process as in the second case, and is not repeated here.
  • the key characters based on this voice can be used to further match the accurate pronunciation of the characters to be marked.
  • the above steps 202-203 are to determine the characters to be marked first, and then perform the pronunciation query process of the characters to be marked.
  • Optionally, the server can also directly input the multiple characters into the dictionary engine for query. Since the dictionary engine stores only the pronunciations of Chinese characters, it outputs no pronunciation for characters other than the characters to be marked; the server can therefore query directly based on the multiple characters and still obtain the corresponding pronunciations of the characters to be marked in the target song.
  • the server generates a first lyrics file of the target song according to the corresponding pronunciation of the plurality of characters and the character to be marked in the target song.
  • Specifically, the server creates the first lyrics file, stores in it the multiple characters and the corresponding pronunciations of the characters to be marked in the target song, and establishes an index relationship between each character to be marked and its corresponding pronunciation in the target song.
  • the server may establish an index relationship between the character and the pronunciation based on the array.
  • the server establishes a second array, and writes the corresponding pronunciation of the character to be marked in the target song into the second array.
  • The first array and the second array are then added to the first lyrics file.
  • For each character to be marked, the server stores the pronunciation corresponding to the character to be marked at a second storage position of the second array according to the first storage position of the character to be marked in the first array, where:
  • the first storage position and the second storage position may both be associated positions in the first array and the second array, for example, both the first byte.
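The index relationship between the first and second arrays can be sketched with position-aligned arrays: the pronunciation of the character at position i of the first array is stored at position i of the second array. Using None as the placeholder for characters that need no annotation is an assumption of this sketch, as are the function name and sample readings.

```python
def build_second_array(first_array, readings_by_position):
    """readings_by_position maps a character's position in the first array
    to its pronunciation; positions of unmarked characters get a placeholder."""
    return [readings_by_position.get(i) for i in range(len(first_array))]

# の (index 1) is not a character to be marked, so its slot stays empty.
second_array = build_second_array(list("雨の空港"), {0: "あめ", 2: "くう", 3: "こう"})
```

At display time, the reader looks up `second_array[i]` for the character at `first_array[i]` and annotates it only when a pronunciation is present.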
  • the server may further obtain the display time of the plurality of characters.
  • The server may further add the display times to the first lyrics file. In this case, the step of generating the first lyrics file of the target song according to the multiple characters and the corresponding pronunciations of the characters to be marked in the target song may be as follows: the server creates the first lyrics file, establishes a second array, and writes the corresponding pronunciations of the characters to be marked in the target song into the second array; the server then adds the first array, the second array, and the third array to the first lyrics file.
  • Optionally, the server can also determine the display time of a target word according to the display times of the at least two adjacent characters to be marked included in the target word, and update the third array with the display time of the target word; finally, the server adds the first array, the second array, and the updated third array to the first lyrics file. The target word includes at least two adjacent characters to be marked, and the number of those characters is not equal to the number of pronunciations of the target word.
  • the server determines the display time of the target word by combining the display times of at least two adjacent characters to be marked included in the target word, and determining the combined display time as the display time of the target word.
  • For many words, the number of pronunciations is not equal to the number of characters they include, for example ⁇ , ⁇ , and so on.
  • the display time stored in the second lyrics file is shown in Table 1 below:
  • The server merges the display times of the multiple adjacent characters in the target word, so that the adjacent characters are treated as one word corresponding to a single display time. This avoids the problem that, during actual playback, displaying each character according to its own display time would not match the actually sung pronunciation. The display time of each word thus matches the pronunciation of the word, so the display times of the lyrics in the first lyrics file accurately match the pronunciation of the lyrics, which further improves the accuracy of the finally obtained lyrics file.
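The merging step can be sketched as follows: adjacent characters forming a target word are collapsed into one unit that keeps the display time of the word's first character, so one unit corresponds to one display time. Keeping the first character's time as the merged time is an assumption of this sketch; the patent only states that the display times are combined.

```python
def merge_display_times(chars, times, word_start, word_len):
    """Collapse chars[word_start : word_start+word_len] into one target word
    with a single display time, rebuilding both arrays in step."""
    word = "".join(chars[word_start:word_start + word_len])
    merged_chars = chars[:word_start] + [word] + chars[word_start + word_len:]
    merged_times = times[:word_start] + [times[word_start]] + times[word_start + word_len:]
    return merged_chars, merged_times

# Two adjacent characters are merged into one target word with one display time.
merge_display_times(["雨", "空", "港"], [39112, 39803, 40606], 1, 2)
```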
  • In the embodiment of the present invention, the server may determine, based on the lyrics of the target song, the corresponding pronunciation of each character to be marked in the target song, and further generate the first lyrics file of the target song based on the multiple characters and the pronunciations of the characters to be marked, so that a corresponding pronunciation is bound to each character to be marked; when the lyrics are subsequently displayed, the pronunciation can be displayed synchronously, ensuring that the user can sing the pronunciation of each character of the target song correctly.
  • FIG. 3 is a flowchart of a method for displaying lyrics according to an embodiment of the present invention.
  • the embodiment of the present invention is executed by a terminal. Referring to FIG. 3, the method includes:
  • When receiving a lyrics display instruction, the terminal obtains a first lyrics file of a target song.
  • the lyrics display instruction is used to display the lyrics of the target song.
  • The terminal can obtain the first lyrics file of the target song locally or from the server according to the identifier of the target song.
  • the lyrics display instruction may be obtained when the user triggers the terminal to play the target song, or may be obtained when the user triggers the display of the lyrics file.
  • The first lyrics file is a lyrics file generated in advance based on the lyrics and the corresponding pronunciations of the multiple characters.
  • the specific generation process is shown in steps 201-204 above.
  • The first lyrics file includes at least the lyrics of the target song.
  • the terminal obtains, from the first lyrics file, the lyrics of the target song and the corresponding pronunciation of the characters to be marked in the multiple characters of the lyrics in the target song.
  • the lyrics and the pronunciation of the characters to be marked may be stored in the form of an array, respectively.
  • Specifically, the terminal obtains a first array and a second array from the first lyrics file, reads the multiple characters of the lyrics from the first array, and reads, from the second array, the pronunciations of the characters to be marked among the multiple characters.
  • The terminal may determine the associated second storage position in the second array based on the first storage position of each character to be marked in the first array, and read the pronunciation of the character to be marked from the second storage position.
  • the first lyrics file may further include display times of multiple characters in the lyrics, and the terminal may also obtain the display times of the multiple characters from the first lyrics file.
  • the process may be that the terminal obtains a third array from the first lyrics file, and obtains the display time of the plurality of characters from the third array.
  • the display time of the plurality of characters includes a display time of a target word in the lyrics and a display time of each character except the target word.
  • the target word includes at least two adjacent characters to be marked, and the number of characters to be marked is not equal to the number of pronunciations of the target word.
  • The display time of the target word is the display time obtained after merging the display times of the at least two adjacent characters to be marked included in the target word.
  • the terminal displays multiple characters of the lyrics.
  • The terminal highlights the currently played target word among the multiple characters according to the display time of the target word, and highlights the currently played character among the characters other than the target word according to the display time of each such character.
  • The terminal can highlight the currently played character or target word by font color; for example, characters or target words that have been played are displayed in a first color, and those that have not been played are displayed in a second color.
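The color decision per display unit (a character, or a merged target word) can be sketched as a comparison of each unit's display time against the current playback position; the color names here are placeholders for whatever the interface actually uses.

```python
def highlight_states(display_times, playback_ms):
    """Units whose display time has been reached get the 'played' color."""
    return ["first_color" if t <= playback_ms else "second_color"
            for t in display_times]

# At 40000 ms, the first two units have started displaying.
highlight_states([39112, 39803, 40606], 40000)
```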
  • the terminal marks the corresponding pronunciation of the character to be marked in the target song at the target position of the character to be marked.
  • the target position is above the character to be marked.
  • When displaying a character to be marked, the terminal highlights, according to the display time of the character to be marked, the currently played character to be marked together with its pronunciation.
  • the to-be-marked characters may include a target word, and the terminal may highlight the currently played target word and the corresponding pronunciation of the currently played target word in the target song according to the display time of the target word.
  • In the related art, the pronunciation is marked in brackets after the characters, as shown in FIG. 4, for example "who is the man and woman ( ⁇ ⁇ ) ⁇ ⁇ ( ⁇ ⁇ ) ⁇ ⁇ ⁇ ⁇ ⁇ ". Such labeling can mislead the user, who may, for example, think that "female" alone is pronounced " ⁇ ⁇ " while the pronunciation of "male" remains unknown. This is especially true for compound words, for example "time travel ( ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ )": after browsing, the user cannot tell whether the pronunciation in brackets corresponds to "travel" or to "time travel".
  • the terminal can display the pronunciation above the character to be marked.
  • In this way, the pronunciation of the character to be marked can be found clearly and accurately, which improves the efficiency with which the user learns the pronunciations of the characters to be marked and improves the accuracy of the lyrics display.
  • The user sings based on the pronunciation, ensuring that the pronunciation of each character is sung accurately.
  • In the embodiment of the present invention, when the lyrics display instruction is received, the terminal may display the lyrics of the target song and the corresponding pronunciations, in the target song, of the characters to be marked among the multiple characters of the lyrics, thereby ensuring that the user can sing the pronunciation of each character accurately. The terminal can also mark the pronunciation above the corresponding character to be marked, so that the pronunciation is clearly visible and the user can accurately and quickly find the corresponding pronunciation of the character to be marked, which improves the accuracy of displaying the lyrics.
  • FIG. 6 is a schematic structural diagram of an apparatus for generating lyrics according to an embodiment of the present invention.
  • the apparatus includes: an acquisition module 601, a determination module 602, a query module 603, and a generation module 604.
  • An acquisition module 601 configured to acquire lyrics of a target song
  • a determining module 602 configured to determine a character to be marked among a plurality of characters of the lyrics
  • the query module 603 is configured to query, according to the word in which the character to be marked is located and according to a preset query principle, the pronunciation of the character to be marked in the word, and determine the pronunciation of the character to be marked in the word as the pronunciation corresponding to the character to be marked in the target song
  • a generating module 604 is configured to generate a first lyrics file of the target song according to the corresponding pronunciations of the plurality of characters and the character to be marked in the target song.
  • the query module 603 includes:
  • a determining unit configured to perform phrase division on the plurality of characters, and determine a plurality of phrases, where the plurality of phrases include characters to be marked;
  • An input unit configured to input the plurality of phrases into a dictionary engine, where the dictionary engine includes multiple characters and pronunciations of multiple characters;
  • a query unit configured to perform a word-by-word incremental query in the dictionary engine for the plurality of phrases, and query a plurality of candidate words matching the word in which the character to be marked is located;
  • a screening unit configured to filter out, based on the number of characters included in each of the plurality of candidate words, the candidate word that includes the largest number of characters, and output the pronunciation of the character to be marked in that candidate word.
  • Each candidate word includes at least the characters included in the word where the character to be marked is located.
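The screening described above amounts to a longest-match rule. A minimal sketch, assuming a toy candidate list in which each entry pairs a candidate word with the readings of its marked characters (the data shape and helper name are illustrative, not the engine's actual API):

```python
# Sketch of the screening unit: all candidates already contain the word in
# which the character to be marked sits; keep the candidate with the most
# characters and read the marked character's pronunciation from it.
def pick_longest_candidate(candidates):
    """candidates: list of (word, {marked_char: reading}) pairs."""
    return max(candidates, key=lambda c: len(c[0]))

# Toy candidates for the marked character 出 inside 思い出して:
candidates = [
    ("思い出", {"思": "おも", "出": "で"}),
    ("思い出す", {"思": "おも", "出": "だ"}),
]
word, readings = pick_longest_candidate(candidates)
print(word, readings["出"])  # 思い出す だ — the longest entry wins
```

Ties are broken arbitrarily here; the description does not specify a tie-breaking rule, so a real engine would need its own.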
  • the query module 603 includes:
  • a determining unit configured to divide the plurality of characters into phrases and determine a plurality of phrases, where the plurality of phrases include a character to be marked and a key character of the voice of the character to be marked;
  • An input unit configured to input the plurality of phrases into a dictionary engine, where the dictionary engine includes multiple characters and pronunciations of multiple characters;
  • a query unit configured to perform a word-by-word incremental query in the dictionary engine for the plurality of phrases, and query a plurality of candidate words matching the word in which the character to be marked is located;
  • a screening unit configured to filter out, based on the number of characters included in each of the plurality of candidate words and the voice key character of the character to be marked, the candidate word that includes the largest number of characters and has the same voice as the character to be marked, and output the pronunciation of the character to be marked in that candidate word.
  • Each candidate word includes at least the characters of the word in which the character to be marked is located.
  • the Chinese characters include: simplified Chinese characters, traditional Chinese characters, Chinese characters used in Japanese, and traditional characters used in Japanese.
  • the obtaining module 601 includes:
  • An obtaining unit configured to obtain a second lyrics file of the target song, where the second lyrics file includes a lyrics of the target song and a display time of a plurality of characters of the lyrics;
  • a building unit configured to build a first array and a third array, write the plurality of characters into the first array, write the display times of the plurality of characters into the third array, and determine the plurality of characters in the first array as the lyrics of the target song.
  • the generation module 604 is configured to establish a second array, and write the pronunciation corresponding to the character to be marked in the target song into the second array; when a target word exists in the lyrics, determine the display time of the target word according to the display times of at least two adjacent characters to be marked included in the target word, where the target word includes at least two adjacent characters to be marked and the number of characters to be marked it includes is not equal to the number of pronunciations of the target word; write the display time of the target word into the third array; and add the first array, the second array, and the third array to the first lyrics file;
  • the manner of determining the display time of the target word is: merging the display times of the at least two adjacent characters to be marked included in the target word, and determining the merged display time as the display time of the target word.
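The display-time merge this module performs can be sketched directly; representing each character's slot as a (start, end) interval is an assumption for illustration, since the lyrics format stores successive timestamps rather than explicit intervals:

```python
# Sketch of determining a target word's display time: merge the display
# intervals of its adjacent characters to be marked into one interval
# spanning from the earliest start to the latest end.
def merge_display_times(times):
    """times: list of (start_ms, end_ms) for adjacent marked characters."""
    starts, ends = zip(*times)
    return (min(starts), max(ends))

# 欠 and 片 each had their own slot; merged, they act as one word 欠片.
print(merge_display_times([(39112, 39803), (39803, 40356)]))  # (39112, 40356)
```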
  • the generation module 604 is configured to establish a second array, write the pronunciation corresponding to the character to be marked in the target song into the second array, and add the first array and the second array to the first lyrics file, where the first array is used to store the multiple characters of the lyrics.
  • the server may determine, based on the lyrics of the target song, the pronunciation corresponding to each character to be marked in the target song, and further generate the first lyrics file of the target song based on the multiple characters and the pronunciations of the characters to be marked, so that a corresponding pronunciation is bound to each character to be marked; thus, when the lyrics are subsequently displayed, the pronunciations can be displayed synchronously, ensuring that the user can sing the pronunciation of each character of the target song correctly.
  • FIG. 7 is a schematic structural diagram of a device for displaying lyrics according to an embodiment of the present invention.
  • the device is applied to a terminal.
  • the device includes an obtaining module 701, a display module 702, and a labeling module 703.
  • An obtaining module 701 is configured to obtain a first lyrics file of a target song when a lyrics display instruction is received, and the lyrics display instruction is used to display the lyrics of the target song;
  • the obtaining module 701 is further configured to obtain, from the first lyrics file, the lyrics of the target song and the corresponding pronunciations of the characters to be marked in the multiple characters of the lyrics in the target song;
  • the labeling module 703 is configured to, when the character to be marked is displayed, mark the pronunciation corresponding to the character to be marked in the target song at a target position, the target position being above the character to be marked.
  • the display module 702 includes:
  • a first display unit configured to highlight the currently played target word in the plurality of characters according to the display time of the target word in the plurality of characters
  • a second display unit configured to highlight the currently playing character among the characters other than the target word according to the display time of each character other than the target word in the plurality of characters;
  • the target word includes at least two adjacent characters to be marked, and the number of characters to be marked is not equal to the number of pronunciations of the target word; the display time of the target word is the display time obtained by merging the display times of the at least two adjacent characters to be marked included in the target word.
  • the second display unit is configured to, when the character to be marked is displayed, highlight the currently played character to be marked and its corresponding pronunciation in the target song according to the display time of the character to be marked.
  • when the lyrics display instruction is received, the terminal may display the lyrics of the target song and the pronunciations corresponding, in the target song, to the characters to be marked among the multiple characters of the lyrics, thereby ensuring that the user can sing the pronunciation of each character accurately; the terminal can also mark the pronunciation above the corresponding character to be marked, so that the pronunciation is clearly visible and the user can accurately and quickly find it, which improves the accuracy of the lyrics display.
  • when the lyrics generating device provided in the foregoing embodiments generates lyrics, or when the lyrics displaying device displays lyrics, the above division of functional modules is merely used as an example; in practical applications, the above functions may be allocated to different functional modules as required, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • the embodiments of the device for generating lyrics and the method for generating lyrics provided by the foregoing embodiments, and the embodiments of the device for displaying lyrics and the method for displaying lyrics belong to the same concept. For specific implementation processes, refer to the method embodiments, and details are not described herein again.
  • FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • the terminal 800 can be: a smartphone, a tablet, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • the terminal 800 may also be called other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and the like.
  • the terminal 800 includes a processor 801 and a memory 802.
  • the processor 801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 801 may be implemented in at least one hardware form of DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array).
  • the processor 801 may also include a main processor and a coprocessor.
  • the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit).
  • the coprocessor is a low-power processor for processing data in the standby state.
  • the processor 801 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display.
  • the processor 801 may further include an AI (Artificial Intelligence) processor, and the AI processor is configured to process computing operations related to machine learning.
  • the memory 802 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 802 may also include high-speed random access memory, and non-volatile memory, such as one or more disk storage devices, flash storage devices.
  • a non-transitory computer-readable storage medium in the memory 802 is used to store at least one instruction, and the at least one instruction is executed by the processor 801 to implement the method for generating lyrics or the method for displaying lyrics provided by the method embodiments of this application.
  • the terminal 800 may optionally include a peripheral device interface 803 and at least one peripheral device.
  • the processor 801, the memory 802, and the peripheral device interface 803 may be connected through a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 803 through a bus, a signal line, or a circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 804, a touch display screen 805, a camera 806, an audio circuit 807, a positioning component 808, and a power supply 809.
  • the peripheral device interface 803 may be used to connect at least one peripheral device related to I / O (Input / Output) to the processor 801 and the memory 802.
  • in some embodiments, the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 804 is used to receive and transmit an RF (Radio Frequency) signal, also called an electromagnetic signal.
  • the radio frequency circuit 804 communicates with a communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the radio frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • the radio frequency circuit 804 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes, but is not limited to, a metropolitan area network, mobile communication networks of various generations (2G, 3G, 4G, and 5G), a wireless local area network, and / or a WiFi (Wireless Fidelity) network.
  • the radio frequency circuit 804 may further include NFC (Near Field Communication) circuits, which are not limited in this application.
  • the display screen 805 is used to display a UI (User Interface).
  • the UI may include graphics, text, icons, videos, and any combination thereof.
  • the display screen 805 also has the ability to collect touch signals on or above the surface of the display screen 805.
  • the touch signal can be input to the processor 801 as a control signal for processing.
  • the display screen 805 may also be used to provide a virtual button and / or a virtual keyboard, which is also called a soft button and / or a soft keyboard.
  • in some embodiments, there may be one display screen 805, provided on the front panel of the terminal 800.
  • in other embodiments, there may be at least two display screens 805, respectively provided on different surfaces of the terminal 800 or in a folded design; in still other embodiments, the display screen 805 may be a flexible display screen disposed on a curved or folded surface of the terminal 800; furthermore, the display screen 805 may even be set as a non-rectangular irregular figure, that is, a special-shaped screen.
  • the display screen 805 can be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
  • the camera component 806 is used for capturing images or videos.
  • the camera component 806 includes a front camera and a rear camera.
  • the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the back of the terminal.
  • the camera assembly 806 may further include a flash.
  • the flash can be a monochrome temperature flash or a dual color temperature flash.
  • a dual color temperature flash is a combination of a warm light flash and a cold light flash, which can be used for light compensation at different color temperatures.
  • the audio circuit 807 may include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 801 for processing, or input them to the radio frequency circuit 804 to implement voice communication.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves.
  • the speaker can be a traditional film speaker or a piezoelectric ceramic speaker.
  • when the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging.
  • the audio circuit 807 may further include a headphone jack.
  • the positioning component 808 is used to locate the current geographic position of the terminal 800 to implement navigation or LBS (Location Based Service).
  • the positioning component 808 may be a positioning component based on the United States' GPS (Global Positioning System), China's BeiDou system, Russia's GLONASS system, or the European Union's Galileo system.
  • the power supply 809 is used to power various components in the terminal 800.
  • the power source 809 may use alternating current, direct current, disposable batteries, or rechargeable batteries.
  • the rechargeable battery may support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • the terminal 800 further includes one or more sensors 810.
  • the one or more sensors 810 include, but are not limited to, an acceleration sensor 811, a gyroscope sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
  • the acceleration sensor 811 can detect the magnitude of acceleration on three coordinate axes of the coordinate system established by the terminal 800.
  • the acceleration sensor 811 may be used to detect components of the acceleration of gravity on three coordinate axes.
  • the processor 801 may control the touch display screen 805 to display a user interface in a horizontal view or a vertical view according to a gravity acceleration signal collected by the acceleration sensor 811.
  • the acceleration sensor 811 may also be used for collecting motion data of a game or a user.
  • the gyro sensor 812 can detect the body direction and rotation angle of the terminal 800, and the gyro sensor 812 can cooperate with the acceleration sensor 811 to collect a 3D motion of the user on the terminal 800. Based on the data collected by the gyro sensor 812, the processor 801 can implement the following functions: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 813 may be disposed on a side frame of the terminal 800 and / or a lower layer of the touch display screen 805.
  • when the pressure sensor 813 is disposed on the side frame of the terminal 800, a user's holding signal on the terminal 800 can be detected, and the processor 801 can perform left-right hand recognition or a quick operation according to the holding signal collected by the pressure sensor 813.
  • when the pressure sensor 813 is disposed on the lower layer of the touch display screen 805, the processor 801 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 805.
  • the operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 814 is used to collect a user's fingerprint, and the processor 801 identifies the user based on the fingerprint collected by the fingerprint sensor 814; when the user's identity is identified as a trusted identity, the processor 801 authorizes the user to perform related sensitive operations, such as unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • the fingerprint sensor 814 may be provided on the front, back, or side of the terminal 800. When a physical button or a manufacturer's logo is set on the terminal 800, the fingerprint sensor 814 may be integrated with the physical button or the manufacturer's logo.
  • the optical sensor 815 is used to collect ambient light intensity.
  • the processor 801 may control the display brightness of the touch display screen 805 according to the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 805 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 805 is decreased.
  • the processor 801 may also dynamically adjust the shooting parameters of the camera component 806 according to the ambient light intensity collected by the optical sensor 815.
  • the proximity sensor 816 also called a distance sensor, is usually disposed on the front panel of the terminal 800.
  • the proximity sensor 816 is used to collect the distance between the user and the front of the terminal 800.
  • when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually decreases, the processor 801 controls the touch display screen 805 to switch from the bright-screen state to the off-screen state; when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually increases, the processor 801 controls the touch display screen 805 to switch from the off-screen state to the bright-screen state.
  • FIG. 8 does not constitute a limitation on the terminal 800, and may include more or fewer components than shown, or combine certain components, or adopt different component arrangements.
  • FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • the server 900 may vary greatly due to different configurations or performance, and may include one or more processors (CPUs) 901 and one or more memories 902, where the memory 902 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 901 to implement the method for generating lyrics provided by the foregoing method embodiments.
  • the server may also have wired or wireless network interfaces, keyboards, and input / output interfaces for input and output.
  • the server may also include other components for implementing device functions, and details are not described herein.
  • a computer-readable storage medium such as a memory including instructions, which can be executed by a processor in a terminal to complete the method for generating lyrics or the method for displaying lyrics in the above embodiments.
  • the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
  • the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned may be a read-only memory, a magnetic disk or an optical disk.


Abstract

The present invention discloses a method, apparatus, electronic device, and storage medium for generating and displaying lyrics, belonging to the field of Internet technologies. The method includes: acquiring lyrics of a target song; determining a character to be marked among multiple characters of the lyrics; querying, according to the word in which the character to be marked is located and according to a preset query principle, the pronunciation of the character to be marked in the word, and determining the pronunciation of the character to be marked in the word as the pronunciation corresponding to the character to be marked in the target song; and generating a first lyrics file of the target song according to the multiple characters and the pronunciation corresponding to the character to be marked in the target song, so that when the lyrics are subsequently displayed, the pronunciations can be displayed synchronously, ensuring that the user can sing the pronunciation of each character of the target song correctly. Moreover, when displaying the lyrics, the terminal can also mark the pronunciation above the corresponding character to be marked, making the pronunciation clearly visible and improving the accuracy of the lyrics display.

Description

Method, apparatus, electronic device, and storage medium for generating and displaying lyrics
This application claims priority to Chinese Patent Application No. 201810513546.5, filed on May 25, 2018 and entitled "Method, apparatus, electronic device, and storage medium for generating and displaying lyrics", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of Internet technologies, and in particular to a method, apparatus, electronic device, and storage medium for generating and displaying lyrics.
Background
With the development of Internet technologies, many music players not only support online playback of a massive number of songs but can also provide users with a karaoke (K song) service. The K song service means that the music player plays the musical accompaniment of a song and displays the lyrics on the current playback interface, so that the user can sing along with the accompaniment while watching the lyrics.
At present, on the current playback interface, the terminal usually displays, in karaoke style, the line of lyrics corresponding to the current playback time, and dynamically marks with color which character of that line is currently being played. However, lyrics usually include some polyphonic characters, that is, characters that have multiple pronunciations in different words, which may cause the user to sing them incorrectly. Therefore, a method for displaying lyrics is urgently needed so that the user can sing the pronunciation of each character correctly.
Summary
Embodiments of the present invention provide a method, apparatus, electronic device, and storage medium for generating and displaying lyrics, which can solve the problem in the related art that users sing the pronunciations of characters in lyrics incorrectly. The technical solutions are as follows:
According to a first aspect, a method for generating lyrics is provided, the method including:
acquiring lyrics of a target song;
determining a character to be marked among a plurality of characters of the lyrics;
querying, according to the word in which the character to be marked is located and according to a preset query principle, the pronunciation of the character to be marked in the word, and determining the pronunciation of the character to be marked in the word as the pronunciation corresponding to the character to be marked in the target song; and
generating a first lyrics file of the target song according to the plurality of characters and the pronunciation corresponding to the character to be marked in the target song.
According to a second aspect, a method for displaying lyrics is provided, the method being applied to a terminal and including:
when a lyrics display instruction is received, acquiring a first lyrics file of a target song, the lyrics display instruction being used to display the lyrics of the target song;
acquiring, from the first lyrics file, the lyrics of the target song and the pronunciations corresponding, in the target song, to the characters to be marked among the plurality of characters of the lyrics;
displaying the plurality of characters of the lyrics; and
when displaying the character to be marked, marking, at a target position of the character to be marked, the pronunciation corresponding to the character to be marked in the target song, the target position being above the character to be marked.
According to a third aspect, an apparatus for generating lyrics is provided, the apparatus including:
an acquisition module configured to acquire lyrics of a target song;
a determination module configured to determine a character to be marked among a plurality of characters of the lyrics;
a query module configured to query, according to the word in which the character to be marked is located and according to a preset query principle, the pronunciation of the character to be marked in the word, and determine the pronunciation of the character to be marked in the word as the pronunciation corresponding to the character to be marked in the target song; and
a generation module configured to generate a first lyrics file of the target song according to the plurality of characters and the pronunciation corresponding to the character to be marked in the target song.
According to a fourth aspect, an apparatus for displaying lyrics is provided, the apparatus including:
an acquisition module configured to acquire a first lyrics file of a target song when a lyrics display instruction is received, the lyrics display instruction being used to display the lyrics of the target song;
the acquisition module being further configured to acquire, from the first lyrics file, the lyrics of the target song and the pronunciations corresponding, in the target song, to the characters to be marked among the plurality of characters of the lyrics;
a display module configured to display the plurality of characters of the lyrics; and
a labeling module configured to, when the character to be marked is displayed, mark, at a target position of the character to be marked, the pronunciation corresponding to the character to be marked in the target song, the target position being above the character to be marked.
According to a fifth aspect, an electronic device is provided, the electronic device including a processor and a memory, the memory storing at least one instruction, and the instruction being loaded and executed by the processor to implement the operations performed by the method for generating lyrics according to the first aspect or the method for displaying lyrics according to the second aspect.
According to a sixth aspect, a computer-readable storage medium is provided, the storage medium storing at least one instruction, and the instruction being loaded and executed by a processor to implement the operations performed by the method for generating lyrics according to the first aspect or the method for displaying lyrics according to the second aspect.
The beneficial effects brought by the technical solutions provided in the embodiments of the present invention are as follows:
In the embodiments of the present invention, the server can determine, based on the lyrics of the target song, the pronunciation corresponding in the target song to each character to be marked in the lyrics, and further generate the first lyrics file of the target song according to the multiple characters and the pronunciations of the characters to be marked, thereby binding a corresponding pronunciation to each character to be marked, so that when the lyrics are subsequently displayed, the pronunciations can be displayed synchronously, ensuring that the user can sing the pronunciation of each character of the target song correctly. Moreover, when displaying the lyrics, the terminal can also mark the pronunciation above the corresponding character to be marked, making the pronunciation clearly visible so that the user can accurately and quickly find the pronunciation corresponding to the character to be marked, which improves the accuracy of the lyrics display.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the accompanying drawings in the following description are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for generating lyrics according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for displaying lyrics according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a lyrics display interface according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a lyrics display interface according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an apparatus for generating lyrics according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an apparatus for displaying lyrics according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention. The implementation environment includes a terminal 101 and a server 102, which are connected through a network. An application is installed on the terminal 101, and based on this application the terminal 101 can obtain a target song from the server 102 and perform playback of the target song, lyrics display, and so on.
The server 102 may, in advance and based on the characters to be marked among the multiple characters in the lyrics of the target song, query the pronunciations of the characters to be marked from a dictionary engine, thereby obtaining the pronunciations corresponding to the characters to be marked in the target song. The server may store the lyrics and the pronunciations corresponding to the characters to be marked in the target song, and send the lyrics of the target song and the pronunciations corresponding to the characters to be marked to the terminal 101. When playing the target song, the terminal 101 displays the lyrics of the target song synchronously and marks, above each character to be marked, the pronunciation corresponding to that character.
The above process of obtaining the pronunciations of the characters to be marked from the multiple characters may also be performed by the terminal 101; that is, after obtaining the lyrics of the target song from the server 102, the terminal 101 itself determines and marks the pronunciations of the characters to be marked in the lyrics.
It should be noted that the application may be a music player or an application installed with a music playback plug-in, and the terminal may be a mobile phone terminal, a PAD (Portable Android Device, tablet) terminal, a computer terminal, or the like. The server 102 is the background server of the application, and may be one server, a server cluster composed of several servers, or a cloud computing server center.
FIG. 2 is a flowchart of a method for generating lyrics according to an embodiment of the present invention. The execution subject of this embodiment of the invention is a server or a terminal; this embodiment of the present invention is described using only the server as an example. Referring to FIG. 2, the method includes:
201. The server acquires lyrics of a target song.
In this embodiment of the present invention, the server may first acquire the lyrics file of the target song and acquire from it the lyrics of the target song, which include multiple characters. The target song is any song for which lyrics need to be generated, generally a song whose lyrics include Han characters. The Han characters include, but are not limited to, simplified Chinese characters, traditional Chinese characters, Han characters used in Japanese, traditional characters used in Japanese, and so on. For example, for 隐 in 隐藏, the simplified Chinese, traditional Chinese, and Japanese traditional forms of the word are, in order: 隐藏, 隱藏, 隠す. It should be noted that simplified and traditional Chinese characters include not only the characters used in Mandarin but also characters used in some regional languages; for example, 冇 in Cantonese is a simplified Chinese character, pronounced "mao", corresponding to 没有 ("not have") in Mandarin.
The lyrics files of some songs often also include the display times of the multiple characters, where a display time indicates when each character is to be displayed during playback of the song. Moreover, in the lyrics file, each character and its display time are stored concatenated, and display times often also appear between characters. For example, the lyric 雨の空港 is actually stored in the lyrics file in the form (39112)雨(39803)の(40356)空(40606)港(41176), where the numbers in brackets represent the display times of the characters, and the characters are separated from one another by display times. The server may acquire a second lyrics file of the target song, extract from it the lyrics of the target song and the display times of the multiple characters in the lyrics, and establish an index relationship between each character and its display time, so that subsequent display can also be based on the display times. The second lyrics file is the original lyrics file of the target song, including the lyrics of the target song and the display times of the multiple characters of the lyrics.
The server may store the characters and the display times separately in the form of arrays, so that the characters and the display times correspond one to one. In one possible design, the step of the server acquiring the lyrics of the target song may be: the server acquires the second lyrics file of the target song and establishes a first array and a third array, where the first array is used to store the multiple characters in the lyrics and the third array is used to store the display times of the multiple characters. The server writes the multiple characters into the first array, writes the display times of the multiple characters into the third array, and determines the multiple characters in the first array as the lyrics of the target song. In the third array, the server may store the display time of each character according to the storage order of the characters in the first array, so that, based on the storage order, each character in the first array corresponds one to one with its display time in the third array, which improves the accuracy of the lyrics and the display times.
It should be noted that the server extracts the characters so that the pronunciations can subsequently be determined based on the characters; at the same time, by establishing the index relationship between each character and its display time, the server extracts the lyrics without destroying the original content of the second lyrics file, ensuring that an accurate lyrics file can finally be obtained.
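The storage form quoted above, with characters interleaved between bracketed display times, can be split into the first and third arrays with a small parser. A sketch, under the assumption that every character is preceded by its display time (the function name is illustrative):

```python
import re

# Parse a lyric line stored as "(39112)雨(39803)の(40356)空(40606)港(41176)"
# into parallel arrays: first_array holds the characters, third_array holds
# the display time that precedes each character.
def parse_lyric_line(line):
    first_array, third_array = [], []
    pending_time = None
    for time_str, ch in re.findall(r"\((\d+)\)|([^()])", line):
        if time_str:
            pending_time = int(time_str)
        else:
            first_array.append(ch)
            third_array.append(pending_time)
    return first_array, third_array

chars, times = parse_lyric_line("(39112)雨(39803)の(40356)空(40606)港(41176)")
print(chars)  # ['雨', 'の', '空', '港']
print(times)  # [39112, 39803, 40356, 40606]
```

The positional alignment of the two lists mirrors the index relationship between the first and third arrays described above.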
In this embodiment of the present invention, after the server acquires the lyrics, the server may first determine the characters to be marked among the multiple characters through the following step 202, perform queries based on the characters to be marked, and then query the pronunciations of the characters to be marked through step 203.
202. The server determines the characters to be marked among the multiple characters of the lyrics.
In this embodiment of the present invention, the server may identify the Han characters among the multiple characters through a preset identification algorithm and determine the Han characters as the characters to be marked. The server generally stores the encoded data of the multiple characters, and the preset identification algorithm may identify the multiple characters based on an encoding table of the encoding scheme of the multiple characters. Of course, this embodiment of the present invention does not limit the encoding scheme of the characters; for example, the encoding table may be an encoding table of the Unicode encoding scheme.
It should be noted that the server first identifies the multiple characters to pick out the characters to be marked, and then determines pronunciations directly based on the characters to be marked among the multiple characters, which reduces the number of characters to be processed subsequently and further improves the efficiency of determining the pronunciations. Moreover, by taking Han characters as the characters to be marked and subsequently marking the pronunciations of the Han characters in the lyrics, this embodiment of the present invention makes it convenient for users whose native language is neither Chinese nor Japanese to learn a foreign language by singing, which provides both listening pleasure and the effect of learning a foreign language, greatly enriches the user experience, and thus greatly improves the practical value of this method.
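The encoding-based identification of the characters to be marked can be sketched with Unicode code point ranges; the ranges below are the common CJK Unified Ideographs blocks and are an illustrative subset, not the exact encoding table described above:

```python
# Sketch: treat a character as a Han character (simplified, traditional,
# or Japanese kanji) if its code point falls in the main CJK blocks.
def is_han(ch):
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
            or 0x3400 <= cp <= 0x4DBF     # Extension A
            or 0xF900 <= cp <= 0xFAFF)    # Compatibility Ideographs

def chars_to_mark(lyric):
    return [ch for ch in lyric if is_han(ch)]

# Kana and Latin characters are skipped; only the Han characters remain.
print(chars_to_mark("思い出を思い出してください"))  # ['思', '出', '思', '出']
```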
203. The server queries, according to the word in which the character to be marked is located and according to a preset query principle, the pronunciation of the character to be marked in the word, and determines the pronunciation of the character to be marked in the word as the pronunciation corresponding to the character to be marked in the target song.
In this embodiment of the present invention, the server may establish and store a dictionary engine in advance, where the dictionary engine includes multiple characters and the pronunciations of the multiple characters. In this step, the server obtains the pronunciation of the character to be marked from the dictionary engine according to the character to be marked, and determines the queried pronunciation as the pronunciation corresponding to the character in the target song. In some languages, certain characters may have multiple pronunciations, which differ in different words; or, even within the same word, the pronunciation may differ when the voice differs. When querying, the server may therefore also query based on the word in which the character to be marked is located and/or its voice. Accordingly, this step may include the following three cases.
Case 1: The server inputs the character to be marked into the dictionary engine and outputs the pronunciation corresponding to the character to be marked in the target song.
In this step, the server may directly input the character to be marked into the dictionary engine. When the server finds, based on the dictionary engine, a uniquely determined pronunciation of the character to be marked, it directly outputs the pronunciation of the character to be marked and determines the output pronunciation as the pronunciation corresponding to the character to be marked in the target song.
It should be noted that when the character to be marked is a non-polyphonic Chinese character, or a Han character that has only one pronunciation in Japanese, the server can obtain the unique pronunciation of the character to be marked by querying based on the character alone.
Case 2: The server determines, in the dictionary engine, the pronunciation of the character to be marked in the word according to the word in which the character to be marked is located.
In some languages, a character may have multiple pronunciations that differ in different words; for example, Han characters in Japanese have, in most cases, two or more pronunciations. In this embodiment of the present invention, the server may further pin down, in the dictionary engine, the pronunciation corresponding to the character to be marked in the target song based on the word in which the character is located.
In this step, the server divides the multiple characters into phrases and determines multiple phrases; the server inputs the multiple phrases into the dictionary engine and performs a word-by-word incremental query on the multiple phrases in the dictionary engine, querying multiple candidate words matching the word in which the character to be marked is located; the server then filters out, according to the number of characters included in the multiple candidate words, the candidate word that includes the largest number of characters, and outputs the pronunciation of the character to be marked in that candidate word.
The multiple phrases include the characters to be marked; the terminal may divide one line of the lyrics of the target song into one phrase, and each candidate word includes at least the characters of the word in which the character to be marked is located.
In one possible design, the process in which the server performs the word-by-word incremental query in the dictionary engine and filters among the queried candidate words may be as follows: for each phrase, the server starts, in the dictionary engine, from the first character to be marked of the phrase and determines the initial pronunciation of that character; it then extends character by character with the subsequent characters of the phrase, taking the first character to be marked together with the added characters as the word in which the character is located, and queries the dictionary engine for candidate words matching that word. During the query, when a candidate word including the word in which the character is located is matched, the pronunciation of the character is determined based on that candidate word; the server then continues to extend character by character with the subsequent characters of the phrase, updates the word in which the character to be marked is located, and continues matching according to the above process. When a candidate word with a larger number of characters is matched, according to the principle that the longest entry is matched first, the candidate word with more characters overrides the earlier candidate word with fewer characters, and the pronunciation of the character to be marked is determined based on the candidate word with more characters. The query then continues according to the longest-entry-first principle until the pronunciation of every character to be marked in the phrase has been queried.
For example, in the Japanese lyric 思い出を思い出してください, the characters to be marked are the two 思 and the two 出, where the two 出 are located in different words. The server queries the dictionary engine as follows:
a. First query 思: the pronunciation of 思 cannot be uniquely determined, so it is treated as a single character and its initial pronunciation し is tentatively determined;
b. Query the word 思い in which 思 is located: based on the candidate word 思(おも)う stored in the dictionary engine, the uniquely determined pronunciation おも of 思 is obtained;
c. Continue to query 出 in 思い出: similarly, the pronunciation of 出 cannot be uniquely determined, so it is treated as a single character and its candidate pronunciation で is tentatively determined;
d. Query the word 思い出を in which 出 is located: based on the candidate word 思い出(おもいで), the uniquely determined pronunciation で of 出 is obtained;
e. Query the second 思 in 思い出を思: as with the first 思, the candidate pronunciation し is tentatively determined;
f. Continue with the word 思い in which the second 思 is located in 思い出を思い: based on the candidate word 思(おも)う in the dictionary engine, the uniquely determined pronunciation おも of 思 is obtained;
g. Query the second 出 in 思い出を思い出: the candidate pronunciation で is tentatively determined based on the candidate word 思い出(おもいで);
h. Continue to query the word 思い出し in which the second 出 is located in 思い出を思い出し: the candidate word 思い出(おもいだ)す, which includes more characters, is found, and the uniquely determined pronunciation だ of 出 is obtained.
Through the above steps a-h, the server can determine that the pronunciations of the characters to be marked in this lyric line are: 思(おも)い出(で)を思(おも)い出(だ)してください.
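The incremental longest-entry-first query in steps a-h can be sketched as a scan that, for each character to be marked, prefers the reading from the longest dictionary entry covering it. The toy dictionary below holds only the entries the example uses, keyed to per-character readings; it is an illustrative structure, not the actual engine:

```python
# Toy dictionary engine: entry -> readings of the marked characters in it.
DICT = {
    "思": {"思": "し"},
    "思い": {"思": "おも"},
    "出": {"出": "で"},
    "思い出": {"思": "おも", "出": "で"},
    "思い出し": {"思": "おも", "出": "だ"},
    "思い出す": {"思": "おも", "出": "だ"},
}

# Longest-entry-first: check every substring of the phrase that contains
# the marked character at index i and keep the reading from the longest
# matching dictionary entry (a longer entry overrides a shorter one).
def reading_at(phrase, i):
    ch, best_len, best = phrase[i], 0, None
    for s in range(i + 1):
        for e in range(i, len(phrase)):
            entry = DICT.get(phrase[s:e + 1])
            if entry and ch in entry and e - s + 1 > best_len:
                best_len, best = e - s + 1, entry[ch]
    return best

phrase = "思い出を思い出してください"
print(reading_at(phrase, 0))  # おも (思 covered by the entry 思い出)
print(reading_at(phrase, 6))  # だ  (second 出 covered by 思い出し)
```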
It should be noted that, in Chinese, the pronunciation of the character to be marked may be hanyu pinyin; in Japanese, the pronunciation of the character to be marked may be furigana.
Case 3: The server determines, in the dictionary engine, the pronunciation of the character to be marked in the word according to the word in which the character is located and the voice key character of the character to be marked.
In this embodiment of the present invention, in some languages the pronunciation of a character may also be affected by voice, and characters in different voices correspond to different pronunciations. In this step, the server divides the multiple characters into phrases, determines multiple phrases, and inputs the multiple phrases into the dictionary engine; the server performs a word-by-word incremental query on the multiple phrases in the dictionary engine, querying multiple candidate words matching the word in which the character to be marked is located. According to the number of characters included in the multiple candidate words and the voice key character of the character to be marked, the server filters out, from the multiple candidate words, the candidate word that includes the largest number of characters and has the same voice as the character to be marked, and outputs the pronunciation of the character to be marked in that candidate word. Each candidate word includes at least the characters of the word in which the character to be marked is located, and the multiple phrases include the characters to be marked and the voice key characters of the characters to be marked.
In this step, the server queries based on both the word in which the character to be marked is located and the voice; the filtered-out candidate word includes the largest number of characters, and its voice is consistent with that of the character to be marked. The process in which the server queries and filters in the dictionary engine is the same as that in Case 2 above, and is not repeated here.
For example, in the lyric 瞬いたと瞬いだの…と思うのよ, the first 瞬 and the second 瞬 are both located in the word 瞬い; however, the voice key characters た and だ of 瞬 are present in the sentence, and based on these voice key characters the server further determines that the pronunciation of the first 瞬 is またた and the pronunciation of the second 瞬 is まじろ.
It should be noted that, in Japanese, both occurrences of 瞬 here are godan verbs. Querying the dictionary engine shows that 瞬 has two pronunciations, またたく and まじろぐ, but the pronunciation changes as follows depending on the voice:
For またたく: またたく plus た becomes またたき + た;
For まじろぐ: まじろぐ plus た becomes まじろぎ + た;
Since both き + た and ぎ + た of godan verbs undergo sound changes, the following results are obtained respectively:
For またたく: またたき becomes またたい + た;
For まじろぐ: まじろぎ becomes まじろい + だ;
That is, the past-tense auxiliary in fact becomes voiced, changing from た to だ.
Based on the pronunciations obtained after the sound change, the pronunciations of the two 瞬 are determined respectively: in the first 瞬いた, based on the voice key character た, the pronunciation of 瞬 in 瞬い is determined to be またた; in the second 瞬いだ, based on the voice key character だ, the pronunciation of 瞬 in 瞬い is determined to be まじろ.
Therefore, in the actual query, the accurate pronunciation of the character to be marked can only be further matched based on the voice key character.
It should be noted that the above steps 202-203 first determine the characters to be marked and then perform the pronunciation query for them. However, the server may also directly input the multiple characters into the dictionary engine for querying: the server may store, in the dictionary engine, only the pronunciations of Han characters, and for characters other than the characters to be marked, the dictionary engine does not output any pronunciation. Therefore, the server can query directly based on the multiple characters and likewise obtain the pronunciations corresponding to the characters to be marked in the target song.
204. The server generates the first lyric file of the target song according to the plurality of characters and the readings corresponding to the characters to be annotated in the target song.
In this step, the server creates the first lyric file, stores the plurality of characters and the readings corresponding to the characters to be annotated in the target song in the first lyric file, and establishes an index relationship between each character to be annotated and its corresponding reading in the target song. The server may establish this character-to-reading index relationship using arrays: the server establishes a second array, writes the readings corresponding to the characters to be annotated in the target song into the second array, and adds the first array and the second array to the first lyric file. For each character to be annotated, according to the first storage position of that character in the first array, the server stores its corresponding reading at the associated second storage position in the second array; the first storage position and the second storage position may both be associated bytes of the first array and the second array, for example the first byte of each.
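The position-associated layout of the first array (characters) and the second array (readings) can be sketched as follows; the position-aligned representation and the function name are assumptions for illustration:

```python
def build_reading_arrays(lyric_chars, readings_by_index):
    """lyric_chars: the lyric's characters (the first array).
    readings_by_index: {position in the first array: reading} for the
    characters to be annotated.  Returns (first_array, second_array), where
    the second array holds each reading at the storage position associated
    with the character's position, and None for characters without one."""
    first_array = list(lyric_chars)
    second_array = [None] * len(first_array)
    for pos, reading in readings_by_index.items():
        second_array[pos] = reading  # associated storage positions
    return first_array, second_array
```

Reading back a character's annotation is then just an index lookup into the second array at the character's position in the first array.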
In the embodiments of the present invention, in step 201 the server may also obtain the display times of the plurality of characters, and in this step the server may add those display times to the first lyric file. Accordingly, the step of generating the first lyric file of the target song according to the plurality of characters and the readings of the characters to be annotated may also be: the server creates the first lyric file and establishes a second array, writes the readings corresponding to the characters to be annotated in the target song into the second array, and adds the first array, the second array, and the third array to the first lyric file.
In one possible design, in some languages there may be cases where the number of characters is not equal to the number of readings; for example, the two characters of the Japanese word "欠片 (かけら)" actually correspond to three readings. Therefore, when a target word exists in the lyrics, the server may also determine the display time of the target word according to the display times of the at least two adjacent characters to be annotated included in the target word, and update the third array according to the display time of the target word; finally, the server adds the first array, the second array, and the updated third array to the first lyric file. The target word includes at least two adjacent characters to be annotated, and the number of characters to be annotated that it includes is not equal to the number of readings of the target word. The server determines the display time of the target word by merging the display times of the at least two adjacent characters to be annotated included in the target word and taking the merged display time as the display time of the target word.
In Japanese, the readings of many words are not equal in number to the characters they include, "欠片 (かけら)" being one such example among others. Taking "欠片" as an example, before the third array is updated, the display times stored in the second lyric file are as shown in Table 1 below:
Table 1

| Third array | 39112 | 39803 | 40356 | 41176 |
| First array | (lyric characters including "欠" and "片", one character per display time) |

Obviously, "欠" and "片" each correspond to their own display time. In actual display, however, since 欠片 as a whole corresponds to three readings, displaying "欠" and "片" separately according to two display times would clearly cause display errors. By merging the two display times, the display times stored in the updated third array are obtained, as shown in Table 2 below:

Table 2

(updated arrays in which "欠" and "片" are merged into the single word "欠片" with one merged display time)
It should be noted that, by merging the display times of multiple adjacent characters within a target word, the server allows those adjacent characters to act as one word with a single display time. This avoids the problem that, when the song is actually played, displaying each character according to its own display time would make each character's display time mismatch the reading actually sung. As a result, the display time of every character matches its actual reading and the display time of every word matches that word's reading, so the display times of the lyrics in the first lyric file all accurately match the readings of the lyrics, further improving the accuracy of the resulting lyric file.
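The display-time merge for a target word such as 欠片 can be sketched as follows; the interval representation and the sample timestamps (loosely based on Table 1) are assumptions for illustration:

```python
def merge_target_word_times(times, start, end):
    """times: per-character display intervals (start_ms, end_ms), i.e. the
    third array.  [start, end) is the index span of a target word whose
    character count differs from its reading count; the intervals of its
    characters are merged into a single interval covering all of them."""
    merged = (times[start][0], times[end - 1][1])
    return times[:start] + [merged] + times[end:]
```

After the merge, the target word occupies one slot in the third array, so it is highlighted as one unit during playback.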
In the embodiments of the present invention, the server can determine, based on the lyrics of the target song, the reading corresponding to each character to be annotated in the target song, and further generate the first lyric file of the target song according to the plurality of characters and the readings of the characters to be annotated. Each character to be annotated is thus bound to its corresponding reading, so that when the lyrics are subsequently displayed the readings can be displayed synchronously, ensuring that the user can correctly sing the reading of every character of the target song.
FIG. 3 is a flowchart of a method for displaying lyrics according to an embodiment of the present invention. The method is performed by a terminal. Referring to FIG. 3, the method includes:
301. When a lyric display instruction is received, the terminal obtains the first lyric file of the target song.
In the embodiments of the present invention, the lyric display instruction is used to display the lyrics of the target song. When the terminal receives the lyric display instruction, it may obtain the first lyric file of the target song locally or from the server according to the identifier of the target song. The lyric display instruction may be generated when the user triggers the terminal to play the target song, or when the user triggers display of the lyric file.
It should be noted that the first lyric file is a lyric file generated in advance based on the lyrics and the target readings of the plurality of characters; the specific generation process is as shown in steps 201–204 above. The first lyric file includes at least the lyrics of the target song and the readings corresponding, in the target song, to the characters to be annotated among the plurality of characters of the lyrics.
302. The terminal obtains, from the first lyric file, the lyrics of the target song and the readings corresponding, in the target song, to the characters to be annotated among the plurality of characters of the lyrics.
In the embodiments of the present invention, the lyrics and the readings of the characters to be annotated may be stored in the first lyric file in the form of arrays. The terminal obtains the first array and the second array from the first lyric file, reads the plurality of characters of the lyrics from the first array, and reads the readings of the characters to be annotated from the second array. The terminal may determine, based on the first storage position of each character to be annotated in the first array, the associated second storage position in the second array, and read the reading of the character from that second storage position.
In one possible design, the first lyric file may further include the display times of the plurality of characters of the lyrics, and the terminal may also obtain those display times from the first lyric file. This process may be: the terminal obtains the third array from the first lyric file and obtains the display times of the plurality of characters from the third array. The display times of the plurality of characters include the display time of the target word in the lyrics and the display time of each character other than the target word.
The target word includes at least two adjacent characters to be annotated, the number of characters to be annotated that it includes is not equal to the number of readings of the target word, and the display time of the target word is the merged display time of the display times of the at least two adjacent characters to be annotated included in the target word.
303. The terminal displays the plurality of characters of the lyrics.
In the embodiments of the present invention, the terminal highlights the currently played target word among the plurality of characters according to the display time of the target word, and highlights the currently played character among the characters other than the target word according to the display time of each such character. The terminal may highlight the character or target word being played by means of font color, for example displaying characters or target words that have already been played in a first color and those not yet played in a second color.
304. When displaying a character to be annotated, the terminal annotates, at a target position of the character to be annotated, the reading corresponding to that character in the target song.
In the embodiments of the present invention, the target position is above the character to be annotated. When displaying the character to be annotated, the terminal highlights, according to the display time of the character, the currently played character to be annotated and the reading corresponding to it in the target song. Of course, the characters to be annotated may include a target word, in which case the terminal may highlight, according to the display time of the target word, the currently played target word and the reading corresponding to it in the target song.
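Karaoke-style highlighting driven by display times can be sketched as follows; the rendering-unit representation (a merged target word counts as one unit) and the two-color split are assumptions for illustration:

```python
def split_by_playback(units, now_ms):
    """units: list of (text, start_ms) rendering units sorted by start time,
    where a merged target word (e.g. 欠片) is a single unit.  Returns the
    texts to render in the 'already played' first color and those to render
    in the 'not yet played' second color at playback time now_ms."""
    played = [text for text, start in units if start <= now_ms]
    pending = [text for text, start in units if start > now_ms]
    return played, pending
```

Because a target word is one unit with one merged start time, its characters and its multi-syllable reading change color together, avoiding the mismatch described for parenthesized annotations below.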
It should be noted that in some prior art, to simplify the logic, readings specific to the lyrics are annotated with parentheses, as shown in FIG. 4, for example "誰でももし男女(ひと)が人間(ひと)でありたいなら". Such annotation can mislead the user, who may for instance think that "女" alone is pronounced ひと while the reading of "男" is unknown. This is especially true for compound words: for example, in "時間旅行(タイムトリップ)", the user cannot tell from reading it whether the parenthesized reading refers to 旅行 or to 時間旅行. At the same time, when highlighting the lyrics, there is no guarantee that the characters and the reading are highlighted at the same display rhythm. For example, "男女(ひと)": 男女 has only two syllables, but on the text line it becomes six units (including the parentheses); when "男" is highlighted according to its display time, "男女(" has to be rendered in the first color, and when "女" is displayed the remaining part "ひと)" is correspondingly rendered in the first color, resulting in display errors and a poor user experience.
In the embodiments of the present invention, as shown in FIG. 5, the terminal may display the reading above the character to be annotated. When the user reads the character to be annotated, the user can clearly and accurately find its reading, which improves the efficiency with which the user learns the reading of the character and improves the accuracy of the lyric display. At the same time, by singing based on the displayed reading, the user is assured of singing the reading of every character accurately.
In the embodiments of the present invention, when a lyric display instruction is received, the terminal can display the lyrics of the target song and the readings corresponding, in the target song, to the characters to be annotated among the plurality of characters of the lyrics, thereby ensuring that the user can accurately sing the reading of every character. In addition, the terminal can annotate the reading above the corresponding character to be annotated so that the reading is clearly visible and the user can accurately and quickly find the reading corresponding to the character, improving the accuracy of lyric display.
FIG. 6 is a schematic structural diagram of an apparatus for generating lyrics according to an embodiment of the present invention. Referring to FIG. 6, the apparatus includes: an obtaining module 601, a determining module 602, a querying module 603, and a generating module 604.
The obtaining module 601 is configured to obtain the lyrics of a target song;
the determining module 602 is configured to determine characters to be annotated among the plurality of characters of the lyrics;
the querying module 603 is configured to query, according to the word containing the character to be annotated and in accordance with a preset query principle, the reading of the character to be annotated in that word, and determine the reading of the character to be annotated in the word as the reading corresponding to the character in the target song;
the generating module 604 is configured to generate the first lyric file of the target song according to the plurality of characters and the readings corresponding to the characters to be annotated in the target song.
Optionally, the querying module 603 includes:
a determining unit, configured to divide the plurality of characters into phrases to determine a plurality of phrases, the plurality of phrases including characters to be annotated;
an input unit, configured to input the plurality of phrases into a dictionary engine, the dictionary engine including a plurality of characters and readings of the plurality of characters;
a querying unit, configured to perform a character-by-character incremental query on the plurality of phrases in the dictionary engine to query the plurality of candidate words matching the word containing the character to be annotated;
a filtering unit, configured to select, from the plurality of candidate words according to the number of characters they include, the candidate word including the largest number of characters, and output the reading of the character to be annotated in the selected candidate word including the largest number of characters, each candidate word including at least the characters of the word containing the character to be annotated.
Optionally, the querying module 603 includes:
a determining unit, configured to divide the plurality of characters into phrases to determine a plurality of phrases, the plurality of phrases including characters to be annotated and the voice key characters of the characters to be annotated;
an input unit, configured to input the plurality of phrases into a dictionary engine, the dictionary engine including a plurality of characters and readings of the plurality of characters;
a querying unit, configured to perform a character-by-character incremental query on the plurality of phrases in the dictionary engine to query the plurality of candidate words matching the word containing the character to be annotated;
a filtering unit, configured to select, from the plurality of candidate words according to the number of characters they include and the voice key character of the character to be annotated, the candidate word that includes the largest number of characters and has the same voice as the character to be annotated, and output the reading of the character to be annotated in that candidate word, each candidate word including at least the characters of the word containing the character to be annotated.
Optionally, the Chinese characters include: simplified Chinese characters, traditional Chinese characters, Chinese characters used in Japanese, and traditional characters used in Japanese.
Optionally, the obtaining module 601 includes:
an obtaining unit, configured to obtain a second lyric file of the target song, the second lyric file including the lyrics of the target song and the display times of the plurality of characters of the lyrics;
an establishing unit, configured to establish a first array and a third array, write the plurality of characters into the first array, write the display times of the plurality of characters into the third array, and determine the plurality of characters in the first array as the lyrics of the target song.
Optionally, the generating module 604 is configured to establish a second array and write the readings corresponding to the characters to be annotated in the target song into the second array; when a target word exists in the lyrics, determine the display time of the target word according to the display times of the at least two adjacent characters to be annotated included in the target word, the target word including at least two adjacent characters to be annotated and the number of characters to be annotated that it includes being unequal to the number of readings of the target word; write the display time of the target word into the third array; and add the first array, the second array, and the third array to the first lyric file;
the display time of the target word is determined by merging the display times of the at least two adjacent characters to be annotated included in the target word and determining the merged display time as the display time of the target word.
Optionally, the generating module 604 is configured to establish a second array, write the readings corresponding to the characters to be annotated in the target song into the second array, and add the first array and the second array to the first lyric file, the first array being used to store the plurality of characters of the lyrics.
In the embodiments of the present invention, the server can determine, based on the lyrics of the target song, the reading corresponding to each character to be annotated in the target song, and further generate the first lyric file of the target song according to the plurality of characters and the readings of the characters to be annotated, thereby binding each character to be annotated to its corresponding reading, so that when the lyrics are subsequently displayed the readings can be displayed synchronously, ensuring that the user can correctly sing the reading of every character of the target song.
FIG. 7 is a schematic structural diagram of an apparatus for displaying lyrics according to an embodiment of the present invention. The apparatus is applied to a terminal. Referring to FIG. 7, the apparatus includes: an obtaining module 701, a display module 702, and an annotating module 703.
The obtaining module 701 is configured to obtain, when a lyric display instruction is received, the first lyric file of a target song, the lyric display instruction being used to display the lyrics of the target song;
the obtaining module 701 is further configured to obtain, from the first lyric file, the lyrics of the target song and the readings corresponding, in the target song, to the characters to be annotated among the plurality of characters of the lyrics;
the display module 702 is configured to display the plurality of characters of the lyrics;
the annotating module 703 is configured to annotate, when a character to be annotated is displayed, the reading corresponding to the character in the target song at a target position of the character, the target position being above the character to be annotated.
Optionally, the display module 702 includes:
a first display unit, configured to highlight the currently played target word among the plurality of characters according to the display time of the target word among the plurality of characters;
a second display unit, configured to highlight, according to the display time of each character other than the target word among the plurality of characters, the currently played character among the characters other than the target word;
the target word includes at least two adjacent characters to be annotated, the number of characters to be annotated that it includes is unequal to the number of readings of the target word, and the display time of the target word is the merged display time of the display times of the at least two adjacent characters to be annotated included in the target word.
Optionally, the second display unit is configured to highlight, when the character to be annotated is displayed and according to the display time of the character to be annotated, the currently played character to be annotated and the reading corresponding to it in the target song.
In the embodiments of the present invention, when a lyric display instruction is received, the terminal can display the lyrics of the target song and the readings corresponding, in the target song, to the characters to be annotated among the plurality of characters of the lyrics, thereby ensuring that the user can accurately sing the reading of every character. In addition, the terminal can annotate the reading above the corresponding character to be annotated so that the reading is clearly visible and the user can accurately and quickly find the corresponding reading, improving the accuracy of lyric display.
All the optional technical solutions above may be combined in any manner to form optional embodiments of the present disclosure, which are not described one by one here.
It should be noted that when the apparatus for generating lyrics provided in the above embodiments generates lyrics, or the apparatus for displaying lyrics displays lyrics, the division into the above functional modules is used only as an example. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for generating lyrics provided in the above embodiments belongs to the same concept as the method for generating lyrics, and the apparatus for displaying lyrics belongs to the same concept as the method for displaying lyrics; their specific implementation processes are detailed in the method embodiments and are not repeated here.
FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention. The terminal 800 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 800 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
Generally, the terminal 800 includes a processor 801 and a memory 802.
The processor 801 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 801 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 801 may also include a main processor and a coprocessor: the main processor is a processor for processing data in a wake-up state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 801 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 801 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 802 may include one or more computer-readable storage media, which may be non-transitory. The memory 802 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 802 is used to store at least one instruction, and the at least one instruction is executed by the processor 801 to implement the method for generating lyrics and the method for displaying lyrics provided by the method embodiments of the present application.
In some embodiments, the terminal 800 may optionally further include a peripheral device interface 803 and at least one peripheral device. The processor 801, the memory 802, and the peripheral device interface 803 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 803 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 804, a touch display screen 805, a camera 806, an audio circuit 807, a positioning component 808, and a power supply 809.
The peripheral device interface 803 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 801 and the memory 802. In some embodiments, the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 804 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 804 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 804 can communicate with other terminals through at least one wireless communication protocol, including but not limited to metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 804 may further include NFC (Near Field Communication) related circuitry, which is not limited in this application.
The display screen 805 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 805 is a touch display screen, the display screen 805 also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 801 as a control signal for processing. In this case, the display screen 805 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 805, disposed on the front panel of the terminal 800; in other embodiments, there may be at least two display screens 805, respectively disposed on different surfaces of the terminal 800 or in a folded design; in still other embodiments, the display screen 805 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 800. The display screen 805 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 805 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera component 806 is used to capture images or video. Optionally, the camera component 806 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background-blur function by fusing the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting by fusing the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, the camera component 806 may further include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 807 may include a microphone and a speaker. The microphone is used to collect sound waves of the user and the environment and convert them into electrical signals to be input to the processor 801 for processing, or to the radio frequency circuit 804 for voice communication. For stereo collection or noise reduction, there may be multiple microphones disposed at different parts of the terminal 800. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves. The speaker may be a traditional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 807 may further include a headphone jack.
The positioning component 808 is used to locate the current geographic position of the terminal 800 to implement navigation or LBS (Location Based Service). The positioning component 808 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 809 is used to supply power to the components in the terminal 800. The power supply 809 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 809 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging and may also support fast-charging technology.
In some embodiments, the terminal 800 further includes one or more sensors 810, including but not limited to an acceleration sensor 811, a gyroscope sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
The acceleration sensor 811 can detect the magnitudes of acceleration on the three coordinate axes of the coordinate system established with the terminal 800; for example, it may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 801 may control the touch display screen 805 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 811. The acceleration sensor 811 may also be used for collecting motion data of games or of the user.
The gyroscope sensor 812 can detect the body direction and rotation angle of the terminal 800 and may cooperate with the acceleration sensor 811 to collect the user's 3D actions on the terminal 800. Based on the data collected by the gyroscope sensor 812, the processor 801 can implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 813 may be disposed on the side frame of the terminal 800 and/or the lower layer of the touch display screen 805. When disposed on the side frame, it can detect the user's grip signal on the terminal 800, and the processor 801 performs left/right-hand recognition or quick operations according to the grip signal collected by the pressure sensor 813. When the pressure sensor 813 is disposed on the lower layer of the touch display screen 805, the processor 801 controls operable controls on the UI according to the user's pressure operation on the touch display screen 805. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 814 is used to collect the user's fingerprint. The processor 801 identifies the user according to the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 801 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 814 may be disposed on the front, the back, or a side of the terminal 800. When a physical button or a manufacturer logo is provided on the terminal 800, the fingerprint sensor 814 may be integrated with the physical button or the manufacturer logo.
The optical sensor 815 is used to collect the ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the touch display screen 805 according to the ambient light intensity collected by the optical sensor 815: when the ambient light intensity is high, the display brightness of the touch display screen 805 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 805 is decreased. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera component 806 according to the ambient light intensity collected by the optical sensor 815.
The proximity sensor 816, also called a distance sensor, is usually disposed on the front panel of the terminal 800 and is used to collect the distance between the user and the front of the terminal 800. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually decreases, the processor 801 controls the touch display screen 805 to switch from the screen-on state to the screen-off state; when the proximity sensor 816 detects that the distance gradually increases, the processor 801 controls the touch display screen 805 to switch from the screen-off state to the screen-on state.
Those skilled in the art can understand that the structure shown in FIG. 8 does not constitute a limitation on the terminal 800, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention. The server 900 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 901 and one or more memories 902, where the memory 902 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 901 to implement the method for generating lyrics provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described here.
In an exemplary embodiment, a computer-readable storage medium is further provided, for example a memory including instructions, where the instructions are executable by a processor in a terminal to complete the method for generating lyrics or the method for displaying lyrics in the above embodiments. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A person of ordinary skill in the art can understand that all or part of the steps for implementing the above embodiments may be completed by hardware, or by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (15)

  1. A method for generating lyrics, characterized in that the method comprises:
    obtaining lyrics of a target song;
    determining a character to be annotated among a plurality of characters of the lyrics;
    querying, according to a word containing the character to be annotated and in accordance with a preset query principle, a reading of the character to be annotated in the word, and determining the reading of the character to be annotated in the word as a reading corresponding to the character to be annotated in the target song;
    generating a first lyric file of the target song according to the plurality of characters and the reading corresponding to the character to be annotated in the target song.
  2. The method according to claim 1, characterized in that the querying, according to the word containing the character to be annotated and in accordance with the preset query principle, the reading of the character to be annotated in the word comprises:
    dividing the plurality of characters into phrases to determine a plurality of phrases, the plurality of phrases comprising the character to be annotated;
    inputting the plurality of phrases into a dictionary engine, the dictionary engine comprising a plurality of characters and readings of the plurality of characters;
    performing a character-by-character incremental query on the plurality of phrases in the dictionary engine, to query a plurality of candidate words matching the word containing the character to be annotated;
    selecting, from the plurality of candidate words according to the numbers of characters comprised in the plurality of candidate words, a candidate word comprising the largest number of characters, and outputting the reading of the character to be annotated in the selected candidate word comprising the largest number of characters, each candidate word comprising at least the characters comprised in the word containing the character to be annotated.
  3. The method according to claim 1, characterized in that the querying, according to the word containing the character to be annotated and in accordance with the preset query principle, the reading of the character to be annotated in the word comprises:
    dividing the plurality of characters into phrases to determine a plurality of phrases, the plurality of phrases comprising the character to be annotated and a voice key character of the character to be annotated;
    inputting the plurality of phrases into a dictionary engine, the dictionary engine comprising a plurality of characters and readings of the plurality of characters;
    performing a character-by-character incremental query on the plurality of phrases in the dictionary engine, to query a plurality of candidate words matching the word containing the character to be annotated;
    selecting, from the plurality of candidate words according to the numbers of characters comprised in the plurality of candidate words and the voice key character of the character to be annotated, a candidate word that comprises the largest number of characters and has the same voice as the character to be annotated, and outputting the reading of the character to be annotated in the selected candidate word that comprises the largest number of characters and has the same voice as the character to be annotated, each candidate word comprising at least the characters comprised in the word containing the character to be annotated.
  4. The method according to claim 1, characterized in that the Chinese characters comprise: simplified Chinese characters, traditional Chinese characters, Chinese characters used in Japanese, and traditional characters used in Japanese.
  5. The method according to claim 1, characterized in that the obtaining the lyrics of the target song comprises:
    obtaining a second lyric file of the target song, the second lyric file comprising the lyrics of the target song and display times of the plurality of characters of the lyrics;
    establishing a first array and a third array, writing the plurality of characters into the first array, writing the display times of the plurality of characters into the third array, and determining the plurality of characters in the first array as the lyrics of the target song.
  6. The method according to claim 5, characterized in that the generating the first lyric file of the target song according to the plurality of characters and the reading corresponding to the character to be annotated in the target song comprises:
    establishing a second array, and writing the reading corresponding to the character to be annotated in the target song into the second array;
    when a target word exists in the lyrics, determining a display time of the target word according to display times of at least two adjacent characters to be annotated comprised in the target word, the target word comprising at least two adjacent characters to be annotated, and the number of characters to be annotated comprised in the target word being unequal to the number of readings of the target word;
    writing the display time of the target word into the third array;
    adding the first array, the second array, and the third array to the first lyric file;
    wherein the display time of the target word is determined by merging the display times of the at least two adjacent characters to be annotated comprised in the target word, and determining the merged display time as the display time of the target word.
  7. The method according to claim 1, characterized in that the generating the first lyric file of the target song according to the plurality of characters and the reading corresponding to the character to be annotated in the target song comprises:
    establishing a second array, and writing the reading corresponding to the character to be annotated in the target song into the second array;
    adding a first array and the second array to the first lyric file, the first array being used to store the plurality of characters of the lyrics.
  8. A method for displaying lyrics, characterized in that the method is applied to a terminal and comprises:
    when a lyric display instruction is received, obtaining a first lyric file of a target song, the lyric display instruction being used to display the lyrics of the target song;
    obtaining, from the first lyric file, the lyrics of the target song and a reading corresponding, in the target song, to a character to be annotated among a plurality of characters of the lyrics;
    displaying the plurality of characters of the lyrics;
    when displaying the character to be annotated, annotating, at a target position of the character to be annotated, the reading corresponding to the character to be annotated in the target song, the target position being above the character to be annotated.
  9. The method according to claim 8, characterized in that the first lyric file further comprises display times of the plurality of characters, and correspondingly the displaying the plurality of characters of the lyrics comprises:
    highlighting a currently played target word among the plurality of characters according to a display time of the target word among the plurality of characters;
    highlighting, according to a display time of each character other than the target word among the plurality of characters, a currently played character among the characters other than the target word;
    wherein the target word comprises at least two adjacent characters to be annotated, the number of characters to be annotated comprised in the target word is unequal to the number of readings of the target word, and the display time of the target word is the merged display time of the display times of the at least two adjacent characters to be annotated comprised in the target word.
  10. The method according to claim 8, characterized in that the annotating, when displaying the character to be annotated, the reading corresponding to the character to be annotated in the target song at the target position of the character to be annotated comprises:
    when displaying the character to be annotated, highlighting, according to the display time of the character to be annotated, the currently played character to be annotated and the reading corresponding to the currently played character to be annotated in the target song;
    wherein the character to be annotated is a Chinese character, and the Chinese characters comprise: simplified Chinese characters, traditional Chinese characters, Chinese characters used in Japanese, and traditional characters used in Japanese.
  11. An apparatus for generating lyrics, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain lyrics of a target song;
    a determining module, configured to determine a character to be annotated among a plurality of characters of the lyrics;
    a querying module, configured to query, according to a word containing the character to be annotated and in accordance with a preset query principle, a reading of the character to be annotated in the word, and determine the reading of the character to be annotated in the word as a reading corresponding to the character to be annotated in the target song;
    a generating module, configured to generate a first lyric file of the target song according to the plurality of characters and the reading corresponding to the character to be annotated in the target song.
  12. The apparatus according to claim 11, characterized in that the querying module comprises:
    a determining unit, configured to divide the plurality of characters into phrases to determine a plurality of phrases, the plurality of phrases comprising the character to be annotated;
    an input unit, configured to input the plurality of phrases into a dictionary engine, the dictionary engine comprising a plurality of characters and readings of the plurality of characters;
    a querying unit, configured to perform a character-by-character incremental query on the plurality of phrases in the dictionary engine, to query a plurality of candidate words matching the word containing the character to be annotated;
    a filtering unit, configured to select, from the plurality of candidate words according to the numbers of characters comprised in the plurality of candidate words, a candidate word comprising the largest number of characters, and output the reading of the character to be annotated in the selected candidate word, each candidate word comprising at least the characters comprised in the word containing the character to be annotated.
  13. An apparatus for displaying lyrics, characterized in that the apparatus is applied to a terminal and comprises:
    an obtaining module, configured to obtain, when a lyric display instruction is received, a first lyric file of a target song, the lyric display instruction being used to display the lyrics of the target song;
    the obtaining module being further configured to obtain, from the first lyric file, the lyrics of the target song and a reading corresponding, in the target song, to a character to be annotated among a plurality of characters of the lyrics;
    a display module, configured to display the plurality of characters of the lyrics;
    an annotating module, configured to annotate, when the character to be annotated is displayed, the reading corresponding to the character to be annotated in the target song at a target position of the character to be annotated, the target position being above the character to be annotated.
  14. An electronic device, characterized in that the electronic device comprises a processor and a memory, the memory storing at least one instruction, and the instruction being loaded and executed by the processor to implement the operations performed by the method for generating lyrics according to any one of claims 1 to 7, or by the method for displaying lyrics according to any one of claims 8 to 10.
  15. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed by the method for generating lyrics according to any one of claims 1 to 7, or by the method for displaying lyrics according to any one of claims 8 to 10.
PCT/CN2019/076815 2018-05-25 2019-03-04 生成歌词、显示歌词的方法、装置、电子设备及存储介质 WO2019223393A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
SG11202011712WA SG11202011712WA (en) 2018-05-25 2019-03-04 Method and apparatus for generating lyrics, method and apparatus for displaying lyrics, electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810513546.5A CN108763441B (zh) 2018-05-25 2018-05-25 生成歌词、显示歌词的方法、装置、电子设备及存储介质
CN201810513546.5 2018-05-25

Publications (1)

Publication Number Publication Date
WO2019223393A1 true WO2019223393A1 (zh) 2019-11-28

Family

ID=64006568

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/076815 WO2019223393A1 (zh) 2018-05-25 2019-03-04 生成歌词、显示歌词的方法、装置、电子设备及存储介质

Country Status (3)

Country Link
CN (1) CN108763441B (zh)
SG (1) SG11202011712WA (zh)
WO (1) WO2019223393A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112786020A (zh) * 2021-01-21 2021-05-11 腾讯音乐娱乐科技(深圳)有限公司 一种歌词时间戳生成方法及存储介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763441B (zh) * 2018-05-25 2022-05-17 腾讯音乐娱乐科技(深圳)有限公司 生成歌词、显示歌词的方法、装置、电子设备及存储介质
CN111368057B (zh) * 2020-03-05 2023-08-22 腾讯科技(深圳)有限公司 词组查询方法、装置、计算机设备以及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010060629A (ja) * 2008-09-01 2010-03-18 Bmb Corp カラオケ装置
CN103793364A (zh) * 2014-01-23 2014-05-14 北京百度网讯科技有限公司 对文本进行自动注音处理及显示的方法和装置
CN103810993A (zh) * 2012-11-14 2014-05-21 北京百度网讯科技有限公司 一种文本注音方法及装置
CN106570001A (zh) * 2016-10-24 2017-04-19 广州酷狗计算机科技有限公司 一种音译文字的方法及装置
CN108763441A (zh) * 2018-05-25 2018-11-06 腾讯音乐娱乐科技(深圳)有限公司 生成歌词、显示歌词的方法、装置、电子设备及存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101482867B (zh) * 2008-01-09 2012-07-04 北大方正集团有限公司 一种自动为汉字添加拼音的方法及装置
CN103365925B (zh) * 2012-04-09 2016-12-14 高德软件有限公司 获取多音字拼音、基于拼音检索的方法及其相应装置
CN104142909B (zh) * 2014-05-07 2016-04-27 腾讯科技(深圳)有限公司 一种汉字注音方法及装置



Also Published As

Publication number Publication date
SG11202011712WA (en) 2021-01-28
CN108763441B (zh) 2022-05-17
CN108763441A (zh) 2018-11-06

