CN101615417A - A word-accurate method for synchronously displaying Chinese lyrics
- Publication number: CN101615417A, CN200910089572A
- Authority: CN (China)
- Prior art keywords: lyrics, word, accurate, voice, chinese
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Reverberation, Karaoke And Other Acoustics (AREA)
Abstract
The present invention relates to the field of audio playback, and in particular to a method for synchronously displaying Chinese lyrics that is accurate to the word. The voice of each lyric line is divided into several parts, the number of parts being equal to the number of words in the line plus one ending breath. The voice parts of each candidate segmentation are matched, yielding a matching attribute $\alpha_x$. The voice parts obtained by each segmentation are then phoneme-matched, in order, against the words of the lyric line, yielding a corresponding matching degree $\beta_x$. Finally, the segmentation with the largest value of $\lambda \alpha_x + (1-\lambda)\beta_x$ is chosen as the optimal division, where $\lambda$ is a weight coefficient satisfying $0 \le \lambda \le 1$. The method solves the problem that synchronized lyric display cannot be accurate to the word, and has significant application value in karaoke and other equipment that needs synchronized lyric display.
Description
Technical field
The present invention relates to the field of audio playback, and in particular to a method for synchronously displaying lyrics in an audio playback system.
Background technology
The lyric display function of player software lets people see the lyrics of an audio file while listening to the music, and many current players can display lyrics synchronously. Concretely, the lyrics are stored in a plain-text file, and each lyric line is preceded by a time tag in [MM:SS] format, where MM is the minutes value and SS the seconds value of the playback position. When playback reaches MM minutes SS seconds, the player displays that lyric line, so that the displayed lyrics are synchronized with the voice.
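As an illustration of the conventional line-level format described above, the following minimal sketch parses one `[MM:SS]` line. The function name `parse_line` and the tolerance for one to three minute digits are assumptions made for this example, not part of the patent.

```python
import re

# A leading [MM:SS] tag followed by the lyric text of that line.
TAG = re.compile(r"^\[(\d{1,3}):(\d{2})\](.*)$")

def parse_line(line):
    """Return (start_seconds, text) for a '[MM:SS]text' line, or None if untagged."""
    m = TAG.match(line.strip())
    if m is None:
        return None
    minutes, seconds, text = int(m.group(1)), int(m.group(2)), m.group(3)
    return minutes * 60 + seconds, text
```

A player would call this once per line and display the text when the playback position reaches the returned start time.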
This conventional method records a time only per sentence, then divides the sentence's duration equally and assigns the resulting times to the individual words. The display is therefore accurate to the sentence but not to the word. Yet many applications, such as karaoke (television song accompaniment equipment), need player software that displays lyrics correctly word by word. The precision of current synchronized display methods is poor, and they can hardly show the correct time of each word in the lyrics.
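The equal-division baseline criticized here can be sketched in a few lines. This is only an illustration: the name `equal_division_times` is hypothetical, and taking the end of the line from the next line's time tag is an assumption.

```python
def equal_division_times(line_start, line_end, text):
    """Spread the duration of one lyric line evenly over its characters."""
    n = len(text)
    if n == 0:
        return []
    step = (line_end - line_start) / n
    # Every character gets the same share of time, regardless of how it is sung.
    return [(ch, line_start + i * step) for i, ch in enumerate(text)]
```

Because every character receives the same duration, a held note or a pause inside the line shifts all later characters away from their true onset, which is the inaccuracy the invention addresses.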
Summary of the invention
The invention provides a word-accurate method for synchronously displaying Chinese lyrics in an audio playback system, which overcomes the above problems.
In a first aspect, the invention provides a word-accurate method for synchronously displaying Chinese lyrics. The method first divides the voice of each lyric line into several parts, the number of parts being equal to the number of words in the line plus one ending breath, and matches the voice parts of each candidate segmentation to obtain a matching attribute $\alpha_x$. The voice parts obtained by each segmentation are then phoneme-matched, in order, against the words of the lyric line to obtain a corresponding matching degree $\beta_x$. Finally, the segmentation with the largest value of $\lambda \alpha_x + (1-\lambda)\beta_x$ is chosen as the optimal division, where $\lambda$ is a weight coefficient satisfying $0 \le \lambda \le 1$.
In one embodiment of the invention, the start time of each part in the optimal division is taken as the start time of the corresponding word in the lyrics, and these times are saved in the plain-text file that stores the lyrics.
In another embodiment of the invention, the start times of lyric words in the plain-text file are adjusted manually so that the display time of each word is better synchronized with the sung word.
The invention starts from original lyrics that are already accurate to the sentence, divides the voice of each lyric line into as many segments as the line has syllables (plus the ending breath), and combines the segmentation matching result with the phoneme matching degree to obtain the optimal division. It thereby solves the problem that synchronized lyric display cannot be accurate to the word, and has significant application value in karaoke and other equipment that needs word-level synchronized lyric display.
Description of drawings
Specific embodiments of the invention are described in detail below with reference to the accompanying drawing, in which:
Fig. 1 is a flowchart of the word-accurate synchronized display of Chinese lyrics.
Embodiment
Fig. 1 is a flowchart of the word-accurate synchronized display of Chinese lyrics.
In step 110, the song is divided into several sentences, each corresponding to one lyric line.
Preferably, in step 111, a music-elimination algorithm is applied to each sentence of the song to eliminate or weaken the musical accompaniment and emphasize the voice. Any speech enhancement algorithm may serve as the music-elimination algorithm.
In step 120, the number of segments of each lyric line is counted from the lyric text. This count includes the ending breath (the breath taken at the end of the sung line), i.e. the number of segments equals the number of words in the line plus one ending breath.
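The count in step 120 can be sketched as follows. The function name `count_segments` is hypothetical, and treating only code points in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) as words, so that punctuation and spaces are ignored, is an assumption of this example.

```python
def count_segments(lyric_line):
    """Segment count for one lyric line: one per Chinese character plus one ending breath."""
    # Assumption: only basic CJK ideographs are sung words; punctuation is skipped.
    chars = [ch for ch in lyric_line if "\u4e00" <= ch <= "\u9fff"]
    return len(chars) + 1
```

For a three-character line this gives four segments: three sung characters and the final breath.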
In step 130, the voice of each lyric line is divided into the number of segments counted in step 120, and the resulting voice segments are matched, yielding multiple matching attributes.
Specifically, a speech recognition algorithm divides the voice of each lyric line into several parts, the number of which equals the segment count obtained for that line in step 120. In the optimal segmentation, each part contains exactly one complete syllable (i.e. one Chinese character) or the ending breath.
When the speech recognition algorithm segments the voice of a lyric line, there are n different feasible segmentations. The syllables produced by each segmentation have a corresponding matching attribute α, so the voice of the lyric line yields the matching attributes $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_n$. The α value measures the quality of the corresponding segmentation: the larger α is, the more accurate the segmentation.
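The text attributes the n feasible segmentations to the speech recognition algorithm without saying how they are produced. As a minimal sketch of one possibility, assume a list of candidate boundary times is already available (for example from pauses in the voice); each segmentation is then a choice of cut points from that list. The names `candidate_segmentations` and `boundaries` are hypothetical.

```python
from itertools import combinations

def candidate_segmentations(start, end, boundaries, num_segments):
    """Yield every way to cut [start, end] into num_segments parts,
    using cut points taken from the candidate boundary times."""
    inner = sorted(b for b in boundaries if start < b < end)
    for cuts in combinations(inner, num_segments - 1):
        edges = [start, *cuts, end]
        # Consecutive edges form the (start, end) pair of each segment.
        yield list(zip(edges[:-1], edges[1:]))
```

With m usable boundaries and k segments, this enumerates C(m, k-1) segmentations, so in practice the boundary list would need to be kept short.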
In step 140, phoneme matching is performed against each Chinese character, yielding the different matching degrees β.
Specifically, for each of the n segmentations from step 130, the resulting syllables are matched in order against the phonemes of the words in the lyric line, and the resulting matching degree is β. The n segmentations therefore yield the matching degrees $\beta_1, \beta_2, \beta_3, \ldots, \beta_n$. Any voice matching algorithm may be used to compute the matching degree.
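Since any voice matching algorithm may be used, the following sketch leaves the per-character scorer to the caller and only shows how one β value could be formed for one segmentation. Averaging the per-character scores and skipping the ending breath segment are assumptions of this example, as are the names `matching_degree` and `score`.

```python
def matching_degree(segments, lyric_chars, score):
    """Average per-character match score for one segmentation.

    segments: list of (start, end) pairs; the last one is the ending breath.
    score: caller-supplied function (segment, char) -> float in [0, 1].
    """
    voiced = segments[:-1]  # assumption: the ending breath has no character to match
    if len(voiced) != len(lyric_chars) or not voiced:
        raise ValueError("need exactly one voiced segment per character")
    total = sum(score(seg, ch) for seg, ch in zip(voiced, lyric_chars))
    return total / len(voiced)
```

Calling this once per candidate segmentation produces the list of β values used in the next step.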
In step 150, α and β are combined according to a weight, and thresholds are set, in order to determine the optimal division.
Specifically, a minimum threshold $\alpha_{\min}$ is chosen for α and a minimum threshold $\beta_{\min}$ for β, and a weight coefficient $\lambda$ ($0 \le \lambda \le 1$) is set. The division corresponding to the x that maximizes $\lambda \alpha_x + (1-\lambda)\beta_x$ while satisfying $\alpha_x > \alpha_{\min}$ and $\beta_x > \beta_{\min}$ is chosen as the optimal division. If no x for the lyric line satisfies both $\alpha_x > \alpha_{\min}$ and $\beta_x > \beta_{\min}$, the division corresponding to the x that maximizes $\lambda \alpha_x + (1-\lambda)\beta_x$ is chosen directly as the optimal division.
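The selection rule of step 150, including the fallback when no segmentation passes both thresholds, can be sketched as below. The function name `choose_optimal` and the 0-based index it returns are choices made for this example; the patent numbers the segmentations from 1.

```python
def choose_optimal(alphas, betas, lam, alpha_min, beta_min):
    """Return the 0-based index of the optimal division."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(alphas) != len(betas) or not alphas:
        raise ValueError("alphas and betas must be non-empty and equally long")

    def combined(x):
        # Weighted score lam * alpha_x + (1 - lam) * beta_x.
        return lam * alphas[x] + (1.0 - lam) * betas[x]

    indices = range(len(alphas))
    passing = [x for x in indices if alphas[x] > alpha_min and betas[x] > beta_min]
    # Prefer segmentations above both thresholds; otherwise fall back to all of them.
    return max(passing or indices, key=combined)
```

Setting `lam` to 1 ranks segmentations by α alone and setting it to 0 ranks them by β alone; values in between trade the two off. Note that the thresholds can exclude the segmentation with the highest combined score if one of its two values is too low.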
In step 160, the start time of each word in the lyrics is determined.
Specifically, the start time of each part in the optimal division obtained in step 150 is taken as the start time of the corresponding word in the lyrics, and these times are saved in the plain-text file that stores the lyrics.
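The patent does not specify how per-word start times are laid out in the text file. The sketch below assumes one possible layout, in which each character is prefixed with its own `[MM:SS.cc]` tag; the centisecond field and the names `format_tag` and `format_line` are hypothetical.

```python
def format_tag(seconds):
    """Format a time in seconds as [MM:SS.cc] (centiseconds are an assumption)."""
    total_cs = round(seconds * 100)
    minutes, rest = divmod(total_cs, 6000)
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"

def format_line(chars, segments):
    """Prefix each character with the start time of its segment.

    segments may contain one extra trailing entry for the ending breath;
    zip() stops at the last character, so that entry is not written.
    """
    return "".join(format_tag(start) + ch for ch, (start, _end) in zip(chars, segments))
```

Sub-second resolution is used here because whole seconds, as in the line-level [MM:SS] tag, would be too coarse to separate consecutive characters.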
Preferably, in step 161, the start times of certain lyric words (those whose times are inaccurate) in the text file storing the lyrics are adjusted manually, so that the lyrics are displayed in closer synchronization.
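A manual correction of this kind amounts to replacing one stored start time. The helper below is a hypothetical sketch; the check that the new time stays between the neighbouring words is an added safeguard assumed for this example, not something the patent requires.

```python
def adjust_start(starts, index, new_time):
    """Return a copy of starts with starts[index] replaced by new_time,
    rejecting values that would break the left-to-right order of the words."""
    lo = starts[index - 1] if index > 0 else float("-inf")
    hi = starts[index + 1] if index + 1 < len(starts) else float("inf")
    if not lo <= new_time <= hi:
        raise ValueError("adjusted time must stay between neighbouring words")
    return starts[:index] + [new_time] + starts[index + 1:]
```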
Obviously, many variations of the invention described here are possible without departing from its true spirit and scope. All changes that would be obvious to those skilled in the art are therefore intended to be included within the scope of the claims. The scope of protection sought for the invention is limited only by the claims.
Claims (7)
1. A word-accurate method for synchronously displaying Chinese lyrics, comprising:
Step a: dividing the voice of each lyric line into several parts, the number of parts being equal to the number of words in the line plus one ending breath, and matching the voice parts of each segmentation to obtain a matching attribute $\alpha_x$;
Step b: phoneme-matching the voice parts obtained by each segmentation, in order, against the words of the lyric line to obtain a corresponding matching degree $\beta_x$;
Step c: choosing the segmentation with the largest value of $\lambda \alpha_x + (1-\lambda)\beta_x$ as the optimal division, where $\lambda$ is a weight coefficient satisfying $0 \le \lambda \le 1$.
2. The word-accurate method for synchronously displaying Chinese lyrics according to claim 1, characterized in that it comprises, before step a:
Step d: dividing the song into several sentences, each corresponding to one lyric line, and applying a music-elimination algorithm to each sentence of the song to weaken the music and emphasize the voice.
3. The word-accurate method for synchronously displaying Chinese lyrics according to claim 1, characterized in that, in the optimal segmentation of step a, each part contains one complete syllable.
4. The word-accurate method for synchronously displaying Chinese lyrics according to claim 1, characterized in that, in step c, a minimum threshold $\alpha_{\min}$ is set for α and a minimum threshold $\beta_{\min}$ for β, and the chosen division satisfies $\alpha_x > \alpha_{\min}$ and $\beta_x > \beta_{\min}$.
5. The word-accurate method for synchronously displaying Chinese lyrics according to claim 1, characterized in that it comprises, after step c:
Step e: taking the start time of each part in the optimal division as the start time of the corresponding word in the lyrics, and saving these times in the plain-text file that stores the lyrics.
6. The word-accurate method for synchronously displaying Chinese lyrics according to claim 5, characterized in that it comprises, after step e:
Step f: manually adjusting the start times of lyric words in the plain-text file so that the display time of each word is better synchronized with the sung word.
7. A word-accurate device for synchronously displaying Chinese lyrics, comprising:
a module that divides the voice of each lyric line into several parts and matches the voice parts of each segmentation to obtain a matching attribute $\alpha_x$, the number of parts being equal to the number of words in the line plus one ending breath;
a module that phoneme-matches the voice parts obtained by each segmentation, in order, against the words of the lyric line to obtain a corresponding matching degree $\beta_x$;
and a module that takes the segmentation with the largest value of $\lambda \alpha_x + (1-\lambda)\beta_x$ as the optimal division, where $\lambda$ is a weight coefficient satisfying $0 \le \lambda \le 1$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009100895720A CN101615417B (en) | 2009-07-24 | 2009-07-24 | Synchronous Chinese lyrics display method which is accurate to words |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101615417A (en) | 2009-12-30 |
CN101615417B CN101615417B (en) | 2011-01-26 |
Family
ID=41495014
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2009100895720A Expired - Fee Related CN101615417B (en) | 2009-07-24 | 2009-07-24 | Synchronous Chinese lyrics display method which is accurate to words |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101615417B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20110126; Termination date: 20180724 |