TW529018B - Terminal apparatus, guide voice reproducing method, and storage medium

Info

Publication number: TW529018B
Application number: TW090114151A
Authority: TW
Inventors: Akitoshi Saito
Original assignee: Yamaha Corp
Priority date: 2000-06-12
Filing date: 2001-06-12
Publication date: 2003-04-21
Also published as: WO2001097209A1; JP2001356784A; HK1054460A1; CN1436345A; KR20030010696A; CN100461262C; KR100530916B1; AU2001264240A1

Abstract

There is provided a terminal apparatus that eliminates the need for preparing information of pitch and intonation of a guide voice. Content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicative of respective syllables of lyrics attached to the performance data is distributed to the terminal apparatus. Musical tones are reproduced from the performance data. A guide voice is synthesized based on the voice symbol data. The performance data is pre-read to control the voice synthesis section, whereby characteristics of the synthesized guide voice are changed according to the performance data.

Description

Technical Field

The present invention relates to a terminal apparatus that can receive content data to allow a user to perform karaoke and that can be suitably applied to a karaoke apparatus, a cellular phone handset, or a similar device; to a guide voice reproducing method; and to a storage medium storing a program for executing the guide voice reproducing method.

Background Art

Cellular telephone systems such as PDC (Personal Digital Cellular telecommunication system) and PHS (Personal Handyphone System), which are known as digital cellular systems, occupy relatively narrow frequency bands, so these systems transmit data at low bit rates. Speech signals therefore have to be transmitted in a state compressed by high-efficiency compression encoding. Analysis-synthesis coding is known as one method of high-efficiency compression encoding; it uses a speech synthesis model comprising a sound source model and a vocal tract model. Analysis-synthesis coding methods include the MPC (Multi-Pulse Excited LPC) method and the CELP (Code Excited LPC) method, which vector-quantizes speech data using a code book. The CELP method has been put to practical use in certain types of digital cellular systems.

Karaoke systems have conventionally been proposed that reproduce karaoke music from distributed karaoke data to allow a user to perform karaoke, that is, to sing along with the reproduced karaoke music. Such a system is generally called a "communication karaoke system", and includes a type in which karaoke data is distributed even to homes. In this system, music data of a requested piece, guide lyrics data for displaying the lyrics as visible cues, and, as required, image data for use as a background image are distributed as karaoke data. The user sings to the musical sound reproduced from the music data while watching the guide lyrics (visible cues) reproduced from the distributed guide lyrics data and shown on a display.

However, this karaoke system has a problem: when performing karaoke, the user has to sing while watching the guide lyrics shown on the display, whose color changes progressively as the music performance proceeds. Consequently, if the user cannot see the display, it is usually difficult for the user to perform karaoke. This can happen, for example, when the user is driving a car, when there is no display, or when the display is so small that the characters on the screen are illegible.

To solve this problem, another type of communication karaoke system has been proposed in Japanese Laid-Open Patent Publication (Kokai) No. 11-167392, in which karaoke data composed of music data, background image data, and lyrics display data serving as guide lyrics is transmitted together with vocal-cue lyrics data attached to it. A karaoke apparatus receives these data, reproduces karaoke music from the music data, and, as reproduction of the karaoke music proceeds, displays guide lyrics based on the lyrics display data on a screen showing a background image based on the background image data. Further, a synthesized voice is formed based on information on accent, strength of intonation, and pitch (voice quality) contained in the vocal-cue lyrics data, and is then output according to reading timing information also contained in the vocal-cue lyrics data. The user can thus sing by listening to the vocal cues output by voice synthesis, without needing to watch the display.

With this system, however, each vocal cue has to be read aloud or sung before the user sings the corresponding part of the song, and the voice synthesized from the vocal-cue lyrics data has to have a pitch and intonation matching the part of the melody the user is about to sing, so that the user can grasp that part by listening to it. The vocal-cue lyrics data therefore has to contain accent information, intonation information, the pitch (voice quality) of the synthesized voice, and reading timing information, and this information has to be prepared by analyzing the melody and similar parts of each song.

Further, when such a system is used with cellular phones, the transmission rate is limited to a low bit rate, and hence the transmission capacity is limited as well. Transmitting karaoke data with vocal-cue lyrics data attached therefore takes a long time, which increases the call charges for using the system. Moreover, the karaoke data is distributed to the user only after the user has requested the piece of music. If transmitting the data takes a long time, the user has to endure a long wait between sending the request and reproduction of the piece, which may cause the user to lose interest in performing karaoke.

In addition, the cellular phone has to be equipped with a voice synthesis device for synthesizing a voice from the vocal-cue lyrics data. This not only makes the phone expensive but also restricts its downsizing, because the voice synthesis device requires space.

The present invention has been made in view of these circumstances. A first object of the invention is to provide a terminal apparatus and a guide voice reproducing method that eliminate the need to prepare pitch and intonation information for a guide voice, and a storage medium storing a program for executing the guide voice reproducing method.

A second object of the invention is to provide a terminal apparatus and a guide voice reproducing method that allow karaoke data to be transmitted in a short time even at a low data transmission rate and that do not require a dedicated voice synthesis device for reproducing the guide voice, and a storage medium storing a program for executing the guide voice reproducing method.

Disclosure of the Invention

To attain the first object, according to a first aspect of the present invention, there is provided a terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicative of respective syllables of lyrics attached to the performance data. The terminal apparatus according to the first aspect is characterized by comprising a musical tone synthesis section that reproduces musical tones from the performance data, a voice synthesis section that synthesizes a guide voice based on the voice symbol data, and a voice synthesis control section that controls the voice synthesis section by pre-reading the performance data, so that characteristics of the synthesized guide voice are changed according to the performance data.

With this terminal apparatus, the voice synthesis section is controlled by pre-reading the performance data, so that characteristics of the synthesized guide voice are changed according to the performance data. It is therefore no longer necessary to prepare sounding timing information for the guide voice, or its pitch and intonation information. This makes it possible to dispense with the process of analyzing each piece of music and preparing accent information, intonation information, pitch (voice quality) information, and reading timing information for the voice to be synthesized. Further, because the distributed data need not contain pitch and intonation information for the guide voice, the amount of data to be distributed can be reduced. Moreover, because the sounding timing of the guide voice can be controlled by pre-reading and analyzing the performance data, the amount of data to be distributed can be reduced further.

Preferably, the performance data is MIDI-format performance data, and the voice symbol data is inserted into the performance data as exclusive messages.

Preferably, the terminal apparatus further comprises an analysis section that analyzes performance data of a vocal line contained in the performance data, and the voice synthesis control section controls the voice synthesis section based on the analysis results of the analysis section so that the pitch and intonation of the guide voice synthesized by the voice synthesis section change according to the vocal line.

More preferably, the voice synthesis control section controls the timing of synthesis by the voice synthesis section based on the analysis results of the analysis section, so that the guide voice synthesized by the voice synthesis section is sounded before the corresponding vocal line.

Still more preferably, the terminal apparatus comprises a voice database storing voice parameters, and the voice synthesis control section supplies the voice synthesis section with voice parameters read from the voice database based on the voice symbol data and the analysis results of the analysis section, so that the guide voice synthesized by the voice synthesis section has syllables each corresponding to the voice symbol data and a pitch and intonation that change according to the vocal line.

To attain the second object, according to a second aspect of the present invention, there is provided a terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicative of respective syllables of lyrics attached to the performance data. The terminal apparatus according to the second aspect is characterized by comprising a telephone function section that enables telephone calls, a musical tone synthesis section that reproduces musical tones from the performance data, and a voice synthesis section that synthesizes a guide voice based on the voice symbol data and also decodes voice data of telephone speech.

With this terminal apparatus, the guide voice can be synthesized using the voice synthesis section for decoding voice data that is already provided in, for example, a cellular phone handset of a digital cellular system, so the handset does not need an additional voice synthesis section. Consequently, even if the handset is configured to output a guide voice, no new housing space is required, which makes it possible to reduce the size of the handset. Further, because the existing voice synthesis section is shared for this purpose, an increase in manufacturing cost can be prevented.

Preferably, the terminal apparatus further comprises a voice synthesis control section that controls the voice synthesis section by pre-reading the performance data, so that characteristics of the synthesized guide voice are changed according to the performance data.

Preferably, the terminal apparatus further comprises an analysis section that analyzes performance data of a vocal line contained in the performance data, and a voice synthesis control section that controls the voice synthesis section based on the analysis results of the analysis section so as to change the pitch and intonation of the guide voice synthesized by the voice synthesis section according to the vocal line.

To attain the first object, according to a third aspect of the present invention, there is provided a method of reproducing a guide voice using a terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicative of respective syllables of lyrics attached to the performance data. The method is characterized by comprising the steps of reproducing musical tones from the performance data, synthesizing a guide voice based on the voice symbol data, and pre-reading the performance data so that characteristics of the synthesized guide voice are changed according to the performance data.

To attain the first object, according to a fourth aspect of the present invention, there is provided a storage medium storing a program for causing a computer to execute the above method of reproducing a guide voice.

Brief Description of the Drawings

FIG. 1 is a diagram showing an example of the configuration of a cellular phone to which the first embodiment of the present invention is applied, together with a base station;

FIG. 2 is a diagram showing in more detail the configuration of the voice compression-synthesis section of the telephone function section of the cellular phone in FIG. 1;

FIG. 3 is a diagram showing the flow of performance data processing, in the form of a functional block diagram representing functions of the processing section of the telephone function section appearing in FIG. 1;

FIG. 4 is a diagram showing the format of karaoke data used by the cellular phone in FIG. 1;

FIG. 5 is a conceptual representation of the procedure for downloading karaoke data to the cellular phone in FIG. 1;

FIG. 6 is a diagram showing an example of the configuration of a karaoke apparatus to which the second embodiment of the present invention is applied, together with a distribution center; and

FIG. 7 is a diagram showing in more detail the configuration of the voice synthesis section of the karaoke apparatus in FIG. 6.

Best Mode for Carrying Out the Invention

The present invention will now be described in detail with reference to the drawings showing embodiments thereof.

FIG. 1 shows an example of the configuration of a cellular phone according to the first embodiment of the present invention, together with a base station. In FIG. 1, reference numeral 1 designates the cellular phone according to the first embodiment, and reference numeral 2 designates a base station that manages the corresponding radio zone. In general, a digital cellular system employs a small-zone system in which the service area is divided into a number of radio zones, each managed by a corresponding base station 2. When a call is made to an ordinary telephone, the cellular phone 1 is connected via the base station 2 to an exchange, and then through that exchange to the ordinary telephone network, as described in detail below.

The cellular phone 1 is provided with an antenna 10, which is usually of a retractable type and is connected to a transmitter/receiver section 11. The transmitter/receiver section 11 demodulates signals received by the antenna 10, and modulates signals to be transmitted and supplies the modulated signals to the antenna 10. A telephone function section 12 includes a processing section that enables the cellular phone 1 to function as a telephone for communicating with other telephones, and a voice compression-synthesis section 22 that has a CELP encoder function and a CELP decoder function, both adapted to high-efficiency voice compression. A voice can be synthesized by supplying voice parameters read from a database 24 to the voice compression-synthesis section 22. In short, the voice compression-synthesis section 22 can be made to function as a voice synthesis device. The database 24 stores voice parameters for the sounds "a" to "n" and for imitative sounds.

During a telephone call, a voice signal input via a microphone 21 is subjected to high-efficiency compression encoding by the CELP encoder function of the voice compression-synthesis section 22, modulated by the transmitter/receiver section 11, and then transmitted via the antenna 10. Conversely, high-efficiency compression-encoded voice data received via the antenna 10 is demodulated by the transmitter/receiver section 11, decoded into the original voice signal by the CELP decoder function of the voice compression-synthesis section 22 of the telephone function section 12, and output from an output section 20 that includes a loudspeaker. Thus, during a telephone call, signals are transmitted and received via the transmitter/receiver section 11 and the telephone function section 12.

A storage device 13 is a memory that temporarily stores distributed karaoke data, as described hereinafter. The karaoke data is composed of performance data for performing the piece of music requested by the user, and voice symbol data formed of voice symbols indicative of respective syllables of lyrics attached to the performance data. The karaoke data may further include guide lyrics data for displaying lyrics on a display. As shown in FIG. 1, the karaoke data is written in MIDI format, and the voice symbol data of the lyrics is embedded in the MIDI data as exclusive messages. The amount of karaoke data for a single piece of music can therefore be kept small, so that even a digital cellular system with a low bit-rate data transmission rate can transmit the karaoke data for a piece of music in a short time.

A data separating section 14 incorporates a MIDI decoder. It interprets the MIDI data read from the storage device 13 and separates it into performance data and voice symbol data. The separated performance data is supplied, via a buffer memory (Buff) 15 serving as a delay circuit, to a musical tone synthesis section 16 composed of a sequencer and a MIDI sound source. The separated voice symbol data, on the other hand, is supplied together with the performance data to the telephone function section 12. In the telephone function section 12, a guide voice is synthesized by the voice compression-synthesis section 22 based on the voice symbol data and output. The guide voice gives the singer an audible cue to the lyrics of the karaoke song, taking the place of guide lyrics shown on a display. The guide voice is synthesized in step with the progress of the karaoke musical tones reproduced by the musical tone synthesis section 16, and is then output via the output section 20. Each guide voice, corresponding to a lyric phrase of a predetermined length, is synthesized and output earlier than the time at which the phrase is to be sung. Further, with accent and intonation added to it, the guide voice is synthesized at a fast tempo in accordance with the tempo of the performance data.

To control the timing of outputting the guide voice and to add tempo, accent, and intonation to it, the performance data of the vocal line (the vocal part) contained in the performance data is analyzed by the processing section of the telephone function section 12. For example, the manner in which the pitch of the guide voice varies is controlled by analyzing the key changes (melody) of the vocal line performance data, and the intonation and accent of the guide voice are controlled by analyzing the velocity information and sounding duration (gate time) information of the vocal line, which reflect musical score symbols such as slur and staccato. Further, when the karaoke piece is a duet, the key changes of the vocal line performance data can be analyzed to determine whether a phrase belongs to the male part or the female part, and the pitch of the guide voice for that phrase can be set according to the determination, so that the guide voice sounds like a female or male voice.
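The separation and vocal-line analysis described above can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the `Event` record, the function names, and the velocity and gate-time thresholds are all hypothetical, and the sketch assumes that every note event belongs to the vocal line and that each lyric syllable shares its tick with one note.

```python
from dataclasses import dataclass


@dataclass
class Event:
    tick: int            # position of the event in sequencer ticks
    kind: str            # "note" = performance event, "lyric" = voice symbol
    note: int = 0        # MIDI note number (note events only)
    velocity: int = 0    # MIDI velocity (note events only)
    gate: int = 0        # sounding duration in ticks (note events only)
    syllable: str = ""   # voice symbol (lyric events only)


def separate(events):
    """Split one content stream into performance data and voice symbol data."""
    performance = [e for e in events if e.kind == "note"]
    symbols = [e for e in events if e.kind == "lyric"]
    return performance, symbols


def analyse_vocal_line(performance, symbols,
                       accent_velocity=100, staccato_gate=60):
    """Derive per-syllable hints from the vocal-line note at the same tick.

    The two thresholds are made-up values chosen for illustration only.
    """
    notes_by_tick = {n.tick: n for n in performance}
    hints = []
    for s in symbols:
        n = notes_by_tick.get(s.tick)
        if n is None:
            continue  # no note to sing this syllable on
        hints.append({
            "syllable": s.syllable,
            "note": n.note,                             # pitch target
            "accented": n.velocity >= accent_velocity,  # accent from velocity
            "short": n.gate <= staccato_gate,           # staccato from gate time
        })
    return hints
```

A control section could then use such hints to choose the pitch and length of each synthesized syllable; how the hints map onto actual synthesis parameters is left open here.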

The voice symbol data supplied to the telephone function section 12 is sent to the database 24, and voice parameters are read from the database 24 and supplied to the voice compression-synthesis section 22, so that the voice represented by the voice symbol data is synthesized syllable by syllable by the voice compression-synthesis section 22. The voice parameters read from the database 24 are modified under control based on the results of the vocal line analysis described above, so that the parameters reflect the melody, velocity, and sounding duration of the vocal line. This enables the voice compression-synthesis section 22 to synthesize a guide voice whose pitch, accent, and intonation change according to the vocal line.

As described above, the part of the performance data corresponding to a guide voice is pre-read and analyzed, and the guide voice is output before musical tones are reproduced from that part of the performance data. In other words, the musical tones based on the performance data are delayed until after reproduction of the guide voice. This delay is produced by the buffer memory 15. The performance data delayed by a predetermined time period by the buffer memory 15 is supplied to the musical tone synthesis section 16 to reproduce musical tones. As a result, the guide voice synthesized by the voice compression-synthesis section 22 is output via the output section 20 before the corresponding musical tones are reproduced by the musical tone synthesis section 16. The musical tone synthesis section 16 contains a sequencer and a MIDI sound source.

The musical tones reproduced by the musical tone synthesis section 16 are sent to an effect section 17, where effects are added to them. The musical tones with effects added are mixed with the synthesized guide voice by a mixing section 18. Before being mixed with the musical tones, the guide voice has effects added to it by an effect section 23. The musical tones and guide voice mixed by the mixing section 18 are amplified by an amplifier section 19 and output via the output section 20. The effect sections 17 and 23 perform control such as localization depending on the number of loudspeakers in the output section 20. Effects such as reverberation and chorus may also be added. Still further, the database stores only voice parameters for synthesizing a typical mechanical-sounding voice, so the synthesized guide voice may be corrected with an equalizer. The volume of the guide voice may also be reduced according to the skill of the karaoke singer.

Next, FIG. 2 shows in more detail the configuration of the voice compression-synthesis section 22 of the telephone function section 12 of the cellular phone 1 in FIG. 1, together with the configuration of the database 24.

The voice compression-synthesis section 22 shown in FIG. 2 includes a typical CELP decoder for decoding high-efficiency compression-encoded voice data. Note that the voice compression-synthesis section 22 also includes a CELP encoder, not shown, which performs high-efficiency compression encoding of voice signals.

The basic principle of voice synthesis will now be explained. The characteristics of a voice can be described by the pitch L and the noise component of the source sound produced by the vocal cords (referred to as "source sound characteristic parameters"), together with the vocal tract transmission characteristics imparted to the sound as it passes through the speaker's or singer's throat and mouth and the characteristics imparted to the sound as it passes the lips (referred to as "vocal tract characteristic parameters"). In short, the voice synthesis model can be represented by a vocal cord model and a vocal tract model.
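The pre-read and delay arrangement described above, in which the guide voice is produced as soon as a phrase is read while the same performance data reaches the tone generator only after a fixed delay, can be illustrated with a small simulation. This is a minimal sketch under assumed names: `DelayBuffer` merely stands in for the buffer memory 15, and `run`, the tick-based clock, and the tuple layout are hypothetical choices made for the example.

```python
from collections import deque


class DelayBuffer:
    """Holds items back by a fixed number of ticks (stand-in for buffer memory 15)."""

    def __init__(self, delay_ticks):
        self.delay_ticks = delay_ticks
        self._queue = deque()

    def push(self, tick, item):
        # Store the item together with the tick at which it may be released.
        self._queue.append((tick + self.delay_ticks, item))

    def pop_due(self, now):
        """Return every item whose release tick has been reached."""
        due = []
        while self._queue and self._queue[0][0] <= now:
            due.append(self._queue.popleft()[1])
        return due


def run(phrases, delay_ticks, end_tick):
    """Simulate playback; phrases is a list of (tick, text) pairs.

    Returns a timeline of (tick, "guide" | "music", text) tuples.
    """
    buf = DelayBuffer(delay_ticks)
    pending = deque(sorted(phrases))
    timeline = []
    for now in range(end_tick + 1):
        # Pre-read path: the guide voice is produced immediately.
        while pending and pending[0][0] <= now:
            tick, text = pending.popleft()
            timeline.append((now, "guide", text))
            buf.push(tick, text)
        # Delayed path: the musical tones for the same phrase follow later.
        for text in buf.pop_due(now):
            timeline.append((now, "music", text))
    return timeline
```

With a delay of two ticks, each phrase appears in the timeline first as a guide entry and two ticks later as a music entry, which mirrors the ordering the description asks for.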

The CELP decoder of the voice compression-synthesis section 22 in FIG. 2 decodes compression-encoded speech data into the original sound by performing sound synthesis based on the speech synthesis model described above. As shown in FIG. 2, each data frame input to the voice compression-synthesis section 22 is separated by the data processing section 30 into sound parameters indicating an index I, a pitch L, and a reflection coefficient r, respectively.
The parameter indicating the pitch L is supplied to the short term oscillating section 32, the parameter indicating the index I is supplied to the code book 31, and the parameter indicating the reflection coefficient r is supplied to the throat approximation filter 34. The code book 31 has the same contents as the code book of the encoder used to encode the original sound. The contents of the code book 31 are stored in a ROM (Read Only Memory). The short term oscillating section 32 generates a decoded signal representing a sound having the pitch L, based on the parameter indicating the pitch L, and sends the generated decoded signal to the source waveform-reproducing section 33. Code vector data designated by the index I and read from the code book 31 is also supplied to the source waveform-reproducing section 33, which combines it with the decoded signal representing the sound of pitch L to reproduce a synthesized source waveform. The synthesized source waveform output from the source waveform-reproducing section 33 resembles the waveform produced by vibration of the human vocal cords, and the throat approximation filter 34, whose filter coefficients are controlled by the parameter indicating the reflection coefficient, filters the synthesized source waveform into synthesized speech. The throat approximation filter 34, which reproduces the transfer function of the human throat and mouth, stores the reflection coefficient r supplied in advance from the data processing section 30 and uses it as its filter coefficient.
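The decoding path described so far (an excitation at pitch L shaped by a filter whose coefficients come from reflection coefficients) can be illustrated with a minimal sketch. It is not the decoder of this patent: the function names, the unit-pulse excitation, and the sign convention A(z) = 1 + a1 z^-1 + ... are assumptions chosen for illustration, and the code book contribution is left out entirely.

```python
from typing import List


def pulse_train(pitch_hz: float, sample_rate: int, n_samples: int) -> List[float]:
    """Periodic excitation: one unit pulse per pitch period.

    Hypothetical stand-in for the output of the short term oscillating section.
    """
    period = max(1, round(sample_rate / pitch_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def reflection_to_lpc(reflection: List[float]) -> List[float]:
    """Step-up recursion from reflection coefficients k_1..k_p to the
    polynomial A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (assumed sign convention)."""
    a = [1.0]
    for m, k in enumerate(reflection, start=1):
        prev = a + [0.0]  # order m-1 coefficients padded to length m+1
        a = [prev[i] + k * prev[m - i] for i in range(m + 1)]
    return a


def all_pole_filter(excitation: List[float], lpc: List[float]) -> List[float]:
    """Apply 1/A(z): y[n] = x[n] - sum_i a_i * y[n-i]."""
    out: List[float] = []
    for n, x in enumerate(excitation):
        y = x
        for i in range(1, len(lpc)):
            if n - i >= 0:
                y -= lpc[i] * out[n - i]
        out.append(y)
    return out


def synthesize(pitch_hz: float, reflection: List[float],
               sample_rate: int = 8000, n_samples: int = 160) -> List[float]:
    """Excitation at the given pitch, shaped by the all-pole 'throat' filter."""
    excitation = pulse_train(pitch_hz, sample_rate, n_samples)
    return all_pole_filter(excitation, reflection_to_lpc(reflection))
```

With every reflection coefficient strictly between -1 and 1 the all-pole filter is stable, which is one common reason for transmitting reflection coefficients rather than direct filter coefficients.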
The synthesized speech output from the throat approximation filter 34 is sent to the spectral filter 35, which removes unnatural portions of the synthesized sound. In this way, compressed speech data, such as a speech signal encoded by a high-efficiency compression coding method, is decoded and output by the voice compression-synthesis section 22.

Meanwhile, the voice symbol data separated by the data separating section 14 is supplied to the voice database 40, and a pitch parameter, a waveform selection parameter, and a reflection coefficient parameter for synthesizing the guide voice indicated by the voice symbol data are output from the voice database 40. The output pitch parameter, indicating a pitch Lg, is supplied to the short term oscillating section 32, which generates a decoded signal representing a sound having the pitch Lg and supplies it to the source waveform-reproducing section 33. The waveform selection parameter is supplied to the waveform database 41, and a waveform defining the voice type is read from the waveform database 41 and sent to the source waveform-reproducing section 33. The source waveform-reproducing section 33 combines the decoded signal having the pitch Lg with the waveform defining the voice type to reproduce a synthesized source waveform. The synthesized source waveform output from the source waveform-reproducing section 33 is filtered by the throat approximation filter 34, whose filter coefficients are controlled by a parameter indicating a reflection coefficient rg. This parameter is read from the reflection coefficient modifying database 42, which is supplied with the reflection coefficient parameter from the voice database 40. The guide voice is thereby synthesized. The synthesized voice output from the throat approximation filter 34 is sent to the spectral filter 35, where portions of the synthesized voice that are unnatural for a human voice are removed, and is then output as the guide voice.
In the above process, the database 24 is supplied with a control signal. This control signal controls the pitch Lg of the guide voice and the manner in which the pitch Lg changes, as well as the intonation and accent of the guide voice. The control signal represents the result of the analysis performed on the vocal line performance data contained in the performance data by the processing section incorporated in the telephone function section 12. By changing the oscillation frequency of the short term oscillating section 32 through control of the pitch parameter Lg using the control signal, it is possible to selectively synthesize the guide voice as a male voice or a female voice. Further, by changing the waveform data read from the waveform database 41, it is possible to change the voice type of the guide voice. And by changing the parameter indicating the reflection coefficient rg read from the reflection coefficient modifying database 42, it is possible to change the intonation and accent of the guide voice. In the present embodiment, since the control signal is generated based on the analysis of the vocal line performance data described above, the pitch, intonation, and accent of the guide voice can be changed according to the melody of the vocal line. As a result, by listening to the guide voice before singing the corresponding phrase, the user can grasp how, and at what pitch, the phrase should be sung.

Further, the database 24 is supplied with, and stores, time information Time indicating the sounding timing and the tempo of the guide voice. In response to the time information Time, a predetermined waveform is read from the waveform database 41, and a parameter indicating a predetermined reflection coefficient rg is read from the reflection coefficient modifying database 42. The time information Time is set to a timing earlier than the point at which the lyrics presented by the guide voice are sung.
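As a rough illustration of how a control signal derived from the melody might select parameters for one voice symbol, consider the sketch below. Everything in it is hypothetical: the `VOICE_DB` table, its keys and values, and the rule that a male guide voice sounds one octave below the melody note are assumptions made for the example, not details taken from the databases 40 to 42. Only the note-to-frequency conversion is the standard equal-temperament formula.

```python
from typing import Dict, List, Union

# Hypothetical per-symbol entries standing in for the voice database.
VOICE_DB: Dict[str, Dict[str, Union[int, List[float]]]] = {
    "ka": {"waveform_id": 3, "reflection": [0.4, -0.2]},
    "ra": {"waveform_id": 5, "reflection": [0.3, 0.1]},
}


def note_to_hz(midi_note: int) -> float:
    """Equal-temperament frequency of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def guide_parameters(symbol: str, melody_note: int, voice: str = "female") -> dict:
    """Combine the database entry for a voice symbol with a pitch taken from
    the vocal line. The octave drop for a male voice is an assumption."""
    entry = VOICE_DB[symbol]
    pitch_hz = note_to_hz(melody_note)
    if voice == "male":
        pitch_hz /= 2.0  # assumed: male guide voice one octave lower
    return {
        "pitch_hz": pitch_hz,
        "waveform_id": entry["waveform_id"],
        "reflection": list(entry["reflection"]),
    }
```

For example, `guide_parameters("ka", 69, voice="male")` pairs the stored waveform and filter settings for "ka" with a pitch one octave below A4.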
Furthermore, the length of each syllable of the guide voice is controlled based on the time information Time, and the guide voice is output at the corresponding tempo.

The vocal line performance data is analyzed by the processing section of the telephone function section 12 described above. This analysis is carried out by the processing section executing an analysis program. FIG. 3 illustrates the flow of the analysis process, represented by functional blocks in the form of hardware components. In FIG. 3, the MIDI data distributed as karaoke data is supplied from the transmitter/receiver section 11 to the storage means 13. The data separating section 14 reads the MIDI data from the storage means 13 and interprets it using a MIDI decoder function to separate it into performance data and voice symbol data. As shown in FIG. 4, the MIDI data has the voice symbol data embedded in it as an exclusive message, represented by the portion sandwiched between the status bytes "F0" and "F7". The exclusive message is formed of a sequence of voice symbols that is provided for each phrase and indicates the guide voice, and it may include timing data indicating when the guide phrase should be sounded.

The voice symbol data separated by the data separating section 14 is supplied to the guide voice pitch/tempo-determining section 36, where the pitch and tempo of the guide voice are determined. In this case, the user can designate the pitch, thereby selecting whether the guide voice is a female voice or a male voice. Further, pitch information and tempo information, as well as intonation information and accent information, are supplied from the vocal line-analyzing section 38 to the guide voice pitch/tempo-determining section 36 via a switch SW.
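The separation step can be sketched as follows, assuming a raw MIDI byte stream in which each exclusive message runs from the status byte F0 to the status byte F7. How the voice symbols are laid out inside the message is not specified here, so the sketch simply returns the bytes between the two status bytes; the function name and the ASCII payload in the usage note are assumptions for illustration.

```python
from typing import List, Tuple


def split_midi_stream(data: bytes) -> Tuple[bytes, List[bytes]]:
    """Split a raw MIDI byte stream into (performance bytes, exclusive payloads).

    Bytes between 0xF0 and 0xF7 are collected as one payload per message;
    all other bytes are passed through as performance data. An exclusive
    message that is never terminated by 0xF7 is dropped.
    """
    performance = bytearray()
    payloads: List[bytes] = []
    current = None  # bytearray while inside an exclusive message
    for byte in data:
        if current is not None:
            if byte == 0xF7:
                payloads.append(bytes(current))
                current = None
            else:
                current.append(byte)
        elif byte == 0xF0:
            current = bytearray()
        else:
            performance.append(byte)
    return bytes(performance), payloads
```

For a stream consisting of a note-on, an exclusive message carrying the bytes of "ka", and a note-off, the function returns the six note bytes as performance data and `[b"ka"]` as the payload list. A Standard MIDI File stores exclusive events with a length prefix and delta times, so a file parser would need more than this byte scan.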
The guide voice pitch/tempo-determining section 36 outputs a control signal together with the voice symbol data according to the various kinds of information it receives, thereby causing the guide voice to be formed with the designated pitch and in dependence on the vocal line. The control signal controls the pitch, speed (tempo), intonation, and accent of the guide voice. It should be noted that when the switch SW is OFF and no pitch is designated, the guide voice is formed with a default pitch. If the guide voice sounds unnatural when it is output according to the rhythm of the vocal line, the switch SW need only be turned off. In this case, a monotone guide voice is output.

The performance data separated by the data separating section 14 is supplied to the vocal line-analyzing section 38, where the vocal line performance data included in the performance data is analyzed. The separated performance data is also sent to the musical tone synthesis section 16 after being delayed by the buffer memory 15. As a result, the musical tones based on the performance data are reproduced with a delay relative to the guide voice, while the performance data is pre-read and analyzed by the vocal line-analyzing section 38. The vocal line-analyzing section 38 analyzes velocity information and envelope information, which reflect the vocal line and changes in musical expression (melody) such as portamento and staccato. Duration information and gate time information are also analyzed. Pitch control information obtained from the melody information resulting from this analysis is supplied from the vocal line-analyzing section 38 to the guide voice pitch/tempo-determining section 36.
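The kind of analysis described above can be sketched under simplifying assumptions: each vocal line note is given as a start tick, a gate time in ticks, a pitch, and a velocity; the tempo is constant; and the guide voice is sounded a fixed lead time before the note, standing in for the delay introduced by the buffer memory 15. The class and field names and the velocity-to-accent scaling are hypothetical, and envelope analysis is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    start_tick: int   # position of the note in sequencer ticks
    gate_ticks: int   # gate time: how long the note is held, in ticks
    pitch: int        # MIDI note number
    velocity: int     # MIDI velocity, 0..127


@dataclass
class GuideControl:
    sound_at_s: float  # when to sound the guide syllable, in seconds
    length_s: float    # syllable length derived from the gate time
    pitch: int         # pitch taken from the vocal line
    accent: float      # 0.0..1.0, derived from velocity (assumed scaling)


def analyze_vocal_line(notes: List[Note], bpm: float, ppq: int,
                       lead_s: float) -> List[GuideControl]:
    """Derive per-syllable control values from vocal line notes.

    bpm is quarter notes per minute, ppq is ticks per quarter note, and
    lead_s is the assumed lead of the guide voice over the delayed music.
    """
    tick_s = 60.0 / (bpm * ppq)  # seconds per tick at constant tempo
    controls = []
    for note in notes:
        controls.append(GuideControl(
            sound_at_s=max(0.0, note.start_tick * tick_s - lead_s),
            length_s=note.gate_ticks * tick_s,
            pitch=note.pitch,
            accent=note.velocity / 127.0,
        ))
    return controls
```

At 120 beats per minute and 480 ticks per quarter note, a note starting at tick 960 with a gate time of 480 ticks and a lead of 0.25 s yields a guide syllable sounded at about 0.75 s with a length of about 0.5 s.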
Accent control information and intonation control information obtained by analyzing the velocity information and envelope information are also supplied to the guide voice pitch/tempo-determining section 36. Further, the guide voice pitch/tempo-determining section 36 is supplied with guide voice sounding timing information and guide voice tempo information obtained by analyzing the duration information and gate time information.

It should be noted that when the musical piece is a duet, the vocal line-analyzing section 38 analyzes whether the vocal line is a male part or a female part, and sends pitch information reflecting the analysis result to the guide voice pitch/tempo-determining section 36. Thus, the pitch of the guide voice and the length of each syllable are controlled according to the melody of the vocal line. Also, in the case of a duet, a guide voice synthesized as a male voice and a guide voice synthesized as a female voice are output, and the two can be output at the same time. The timing at which each guide voice is output depends on the guide voice sounding timing information supplied from the vocal line-analyzing section 38. The guide voice sounding timing information and tempo information are output from the guide voice pitch/tempo-determining section 36 as the time information Time. When the voice symbol data itself contains timing information for sounding the guide voice, the guide voice is sounded based on that timing information.

The control signal from the guide voice pitch/tempo-determining section 36 is output to the database 24 through the interpolator 37. The interpolator 37 prevents the pitch of the guide voice from changing unnaturally when the database 24 varies the pitch in accordance with the melody of the vocal line. The interpolator dynamically sets the rate of change of the guide voice according to the melody of the vocal line. This causes the guide voice to be output smoothly. It should be noted that the buffer memory 15 is provided to synchronize the timing of reproduction of the vocal line with the timing of sounding of the guide voice, and the guide voice sounding timing information described above is formed taking into account the delay time caused by the buffer memory 15.

The cellular phone 1 to which the terminal device according to the present invention is applied can download karaoke data from outside. FIG. 5 is a conceptual diagram showing how karaoke data is downloaded to a cellular phone 1a and a cellular phone 1b, each constructed similarly to the cellular phone 1 of FIG. 1 to which the terminal device of the first embodiment of the present invention is applied. Cellular systems for cellular phones generally use a small-zone system, in which each service area is divided into a number of radio zones. Each radio zone is managed by a base station provided in it.
When a cellular phone places a call to an ordinary telephone, the cellular phone is connected, via the base station managing the radio zone to which it belongs, to a mobile exchange station and then to the general telephone network. Thus, the cellular phone can connect over a radio link to the base station managing the corresponding radio zone, and can thereby call other telephones. Further, when the cellular phone calls another cellular phone belonging to another radio zone, the calling cellular phone is connected, via the base station managing its radio zone, to the mobile exchange station, and then through the mobile exchange station to the base station of the called cellular phone.

FIG. 5 illustrates an example of such a cellular system, in which the cellular phone 1a belongs to the radio zone managed by the base station 2c, one of the base stations 2a to 2d, while the cellular phone 1b belongs to the radio zone managed by the base station 2a. The cellular phone 1a is connected to the base station 2c by a radio link. Up-signals sent by the cellular phone 1a for calls or location registration are received and processed by the base station 2c. The cellular phone 1b is managed by the base station 2a, and the relationship between them is similar to that between the cellular phone 1a and the base station 2c. The base stations 2a to 2d manage different radio zones, but the radio zones may have peripheries that overlap each other. The base stations 2a to 2d are connected to the mobile exchange station 3 through multiplexed lines. A plurality of mobile exchange stations 3 are concentrated at a gate exchange station 4, which is connected to an ordinary telephone exchange station 5a.
A plurality of gate exchange stations 4 are interconnected through relay transmission lines. The ordinary telephone exchange stations 5a, 5b, 5c, ... are located in their respective local areas and are also interconnected through relay transmission lines. Each of the ordinary telephone exchange stations 5a, 5b, 5c, ... has multiple ordinary telephones connected to it, and the distribution center 6, for example, is connected to the ordinary telephone exchange station 5b.

The distribution center 6 stores many karaoke data items, and new songs are added to it as needed. According to the present embodiment, karaoke data can be downloaded from the distribution center 6, which is connected to the general telephone network, to the cellular phones 1a and 1b. When the cellular phone 1a downloads karaoke data, it dials the telephone number of the distribution center 6. As a result, the cellular phone 1a is connected to the distribution center 6 via the base station 2c, the mobile exchange station 3, the gate exchange station 4, the ordinary telephone exchange station 5a, and the ordinary telephone exchange station 5b. Then, by operating the numeric keypad, jog dial, or the like of the cellular phone 1a while following the guidance shown on its display, the user can request the karaoke data item with the desired song title and download it. In this case, the karaoke data includes the voice symbol data of the guide voice. Similarly, the cellular phone 1b can request and download a karaoke data item with a desired song title. The distribution center 6 may be connected to the Internet, allowing karaoke data to be downloaded from the distribution center 6 via the Internet.

When karaoke is played using the cellular phone 1 shown in FIG.
1, the singing voice input from the microphone 21 is also output from the output section 20. The cellular phone 1 is capable of hands-free talking, so when karaoke is played in the hands-free state, sound output from the output section 20 can be picked up by the microphone 21 and cause howling. To overcome this problem, an echo canceller circuit is provided to prevent howling when the cellular phone 1 is set for hands-free talking. Further, an FM modulator can be provided to transmit the output signal from the output section 20 by weak radio waves, so that an FM receiver installed in a room or vehicle compartment can receive the radio waves. In this case as well, howling may occur.
Therefore, an echo canceller circuit is provided to prevent howling. Also, while karaoke is being played, the transmitter section of the cellular phone 1 is used only to send requests, so the power supply that feeds the transmitter section can be kept OFF except when a request is made, thereby extending battery life.

FIG. 6 shows an example of the configuration of a karaoke apparatus to which the terminal device according to a second embodiment of the present invention is applied, together with the distribution center. This embodiment basically differs from the first embodiment in its communication and display functions. More specifically, the second embodiment differs from the first in that the modem 111 and the control section 112, which correspond respectively to the transmitter/receiver section 11 and the telephone function section 12 of the first embodiment, are configured differently, and in that a display section 126 is additionally provided. The other components have the same functions as the corresponding components of the first embodiment, so they are designated by the same reference numerals and their detailed description is omitted.

In FIG. 6, reference numeral 100 designates the karaoke apparatus to which the terminal device of the second embodiment is applied. The karaoke apparatus 100 can download karaoke data from the distribution center 6. The karaoke apparatus 100 and the distribution center 6 are connected to each other through a communication line, which may be, for example, a telephone line. The karaoke apparatus 100 includes the modem 111, through which desired karaoke data items can be downloaded from the distribution center 6.
The modem 111 demodulates received signals, and modulates signals to be transmitted and sends the modulated signals to the communication line. The control section 112 includes a display control section 125 and a voice synthesis section 122, and controls the overall operation of the karaoke apparatus 100. When voice is synthesized by the control section 112, the voice parameters read from the database 24 are supplied to the voice synthesis section 122, which synthesizes voice based on them. The database 24 stores voice parameters for the sounds from "a" to "n" and for imitation sounds. Like the storage means 13 of the first embodiment, the storage means 13 is a memory for storing distributed karaoke data.

The karaoke data requested by the user is composed of performance data formed of a sequence of performance events of a musical piece, voice symbol data formed of voice symbols that are attached to the performance data and indicate the respective syllables of the lyrics, and guide lyrics display data for displaying the lyrics on the display 126. The guide lyrics display data is supplied from the modem 111 to the control section 112. As the performance data is played, the guide lyrics display data is sent phrase by phrase from the control section 112 to the display 126, so that the guide lyrics corresponding to each phrase are displayed in sequence on the display 126. At the same time, background image data is read from a mass storage device, not shown, and displayed on the display 126 together with the guide lyrics. Apart from the guide lyrics display data, the karaoke data is in MIDI format, an example of which is shown in FIG. 4 as described above, and the voice symbol data is inserted into the MIDI data as an exclusive message, as shown in FIG. 4.
Therefore, the amount of data for a single karaoke musical piece can be reduced to a very small amount, making it possible to transmit a karaoke musical piece in a short time.

In the present embodiment, the voice symbol data separated by the data separating section 14 is supplied to the control section 112 together with the performance data. The control section 112 synthesizes the guide voice based on the voice symbol data and outputs the synthesized guide voice from the voice synthesis section 122. The guide voice is provided as a vocal prompt that helps the user sing without having to look at the guide lyrics displayed on the display section 126. The guide voice is synthesized in accordance with the progress of reproduction of the musical tones of the karaoke music generated by the musical tone synthesis section 16, and is then output through the output section 20. When karaoke is played by the karaoke apparatus 100, the singing voice input through the microphone 21 is also output from the output section 20.

Next, FIG. 7 shows in more detail the configuration of the voice synthesis section 122 of the control section 112 and of the database 24 of the karaoke apparatus 100 according to the second embodiment. The voice synthesis section 122 shown in FIG. 7 differs from the voice compression-synthesis section 22 of the cellular phone 1, to which the terminal device of the first embodiment is applied, in that no encoder is provided. The rest of the configuration is similar to that of the voice compression-synthesis section 22 of the cellular phone 1, so a detailed description is omitted here.
Further, as in the first embodiment, the performance data of the vocal line is analyzed by the processing section of the control section 112, which carries out this analysis by executing an analysis program. The flow of this process is similar to that of the analysis process performed by the cellular phone 1 of the first embodiment, as shown in FIG. 3, and a detailed description is therefore omitted.
Next, a description is given of how the karaoke apparatus 100 downloads karaoke data. The karaoke apparatus 100 accesses the distribution center 6 through the modem 111 and is thereby connected to the distribution center 6. Then, by operating input means, not shown, in accordance with the guidance displayed on the display 126, the user can request the karaoke data of a desired song and download it. In this case, the karaoke data includes the voice symbol data of the guide voice and the guide lyrics display data attached to it. It should be noted that the distribution center 6 may be connected to the Internet, so that karaoke data can be downloaded from the distribution center 6 through the Internet.
Needless to say, the object of the present invention may also be accomplished by installing software program code that realizes the terminal apparatus functions of the above embodiments, from a storage medium on which the code is recorded, into an electronic device such as a karaoke apparatus, a cellular phone, or a personal computer (PC), and causing the computer (or CPU) of the electronic device to execute the program.
In this case, the program code installed in the electronic device from the storage medium itself realizes the novel functions of the invention, and the storage medium storing the program code constitutes the present invention. The storage medium for recording the code may be, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R (CD-Recordable), a magnetic tape, a nonvolatile memory card, or a ROM. In addition, the code may be supplied from a server computer through a communication network.
It goes without saying that the present invention includes a case in which the functions of the illustrated embodiments are accomplished not only by executing the code read by the computer, but also by causing an operating system (OS) running on the computer to perform part or all of the operations according to the instructions of the code.
Furthermore, the present invention also includes a case in which the code read from the storage medium is written into a memory provided on an expansion board inserted in a karaoke apparatus or a PC, or in an expansion unit connected to it, and a CPU or the like provided on the expansion board or in the expansion unit performs part or all of the operations based on the instructions of the code, thereby accomplishing the functions of the illustrated embodiments.
Industrial Applicability
As described above, the terminal apparatus according to the present invention can be applied to a karaoke apparatus with a communication function and to a mobile phone, so that a cellular phone or a car phone can have a karaoke function.
Furthermore, it is possible to apply the terminal apparatus of the present invention to an electronic device having a karaoke function by giving the device a communication function through connection of a modem, a cellular phone, or the like.
Figure 1
1 Cellular phone
2 Base station
10 Antenna
11 Transmitter/receiver section
12 Telephone function section
13 Storage means
MIDI data
14 Data separating section
Voice symbol
Performance data
15 Buffer memory (BUFF)
16 Musical tone synthesis section
17 Effect section
19 Amplifier section
20 Output section
21 Microphone
22 Voice compression-synthesis section
23 Effect section
24 Database
Figure 2
22 Voice compression-synthesis section
Compressed voice data
24 Database
30 Processing section
31 Code book
32 Short-term oscillating section
33 Source waveform-reproducing section
34 Throat approximation filter
35 Spectral filter
Output
40 Voice database
Control signal
Voice symbol
41 Waveform database
42 Reflection coefficient modifying database
Figure 3
MIDI data
12 Telephone function section
13 Storage means
Voice symbol
Performance data
15 Buffer
To musical tone synthesis section
36 Guide voice pitch/tempo-determining section
Pitch designation
37 Interpolator
Control signal + voice symbol
To database
38 Vocal line-analyzing section
Figure 4
MIDI data
Sequence of voice symbols for voice guide
Exclusive message
Figure 5
2a Base station
2b Base station
2c Base station
2d Base station
3 Mobile exchange station
4 Gate exchange station

5a Ordinary telephone exchange station
5b Ordinary telephone exchange station
5c Ordinary telephone exchange station
6 Distribution center
Figure 6
6 Distribution center

111 Modem
13 Storage means
MIDI data
14 Data separating section
Voice symbol
Performance data
15 Buffer memory (BUFF)
16 Musical tone synthesis section
17 Effect section
19 Amplifier section
20 Output section
21 Microphone
23 Effect section
24 Database
100 Karaoke apparatus
112 Control section
122 Voice synthesis section
125 Display control section
126 Display
Figure 7
122 Voice synthesis section
24 Database

32 Short-term oscillating section
33 Source waveform-reproducing section
34 Throat approximation filter
35 Spectral filter
Output
40 Voice database
Control signal
Voice symbol
41 Waveform database
42 Reflection coefficient modifying database

Claims

1. A terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicating respective syllables of lyrics attached to the performance data, the terminal apparatus comprising: a musical tone synthesis section that reproduces musical tones from the performance data; a voice synthesis section that synthesizes a guide voice based on the voice symbol data; and a voice synthesis control section that controls the voice synthesis section by pre-reading the performance data, such that characteristics of the synthesized guide voice change according to the performance data.
2. The terminal apparatus according to claim 1, wherein the performance data is MIDI-format performance data into which the voice symbol data is inserted as an exclusive message.
3. The terminal apparatus according to claim 1, further comprising an analysis section that analyzes performance data of a vocal line included in the performance data, wherein the voice synthesis control section controls the voice synthesis section based on the analysis result of the analysis section so as to change the pitch and intonation of the guide voice to be synthesized by the voice synthesis section according to the vocal line.
4. The terminal apparatus according to claim 3, wherein the voice synthesis control section controls the synthesis timing of the voice synthesis section based on the analysis result of the analysis section, such that the guide voice synthesized by the voice synthesis section is sounded before the corresponding vocal line.
5. The terminal apparatus according to claim 4, comprising a voice database storing voice parameters, wherein the voice synthesis control section reads voice parameters from the voice database based on the voice symbol data and supplies them to the voice synthesis section, such that the guide voice synthesized by the voice synthesis section has sounds corresponding to the respective syllables of the voice symbol data, with its pitch and intonation changed according to the vocal line.
6. A terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicating respective syllables of lyrics attached to the performance data, the terminal apparatus comprising: a telephone function section that carries out telephone calls; a musical tone synthesis section that reproduces musical tones from the performance data; and a voice synthesis section that synthesizes a guide voice based on the voice symbol data and decodes voice data for telephone calls.
7. The terminal apparatus according to claim 6, further comprising a voice synthesis control section that controls the voice synthesis section by pre-reading the performance data, such that characteristics of the synthesized guide voice change according to the performance data.
8. The terminal apparatus according to claim 6, wherein the performance data is MIDI-format performance data into which the voice symbol data is inserted as an exclusive message.
9. The terminal apparatus according to claim 6, further comprising an analysis section that analyzes performance data of a vocal line included in the performance data, wherein the voice synthesis control section controls the voice synthesis section based on the analysis result of the analysis section so as to change the pitch and intonation of the guide voice to be synthesized by the voice synthesis section according to the vocal line.
10. The terminal apparatus according to claim 9, wherein the voice synthesis control section controls the synthesis timing of the voice synthesis section based on the analysis result of the analysis section, such that the guide voice synthesized by the voice synthesis section is sounded before the corresponding vocal line.
11. The terminal apparatus according to claim 9, comprising a voice database storing voice parameters, wherein the voice synthesis control section reads voice parameters from the voice database based on the voice symbol data and supplies them to the voice synthesis section, such that the guide voice synthesized by the voice synthesis section has sounds corresponding to the respective syllables of the voice symbol data, with its pitch and intonation changed according to the vocal line.
12. A guide voice reproducing method for a terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicating respective syllables of lyrics attached to the performance data, the method comprising the steps of: reproducing musical tones from the performance data; synthesizing a guide voice based on the voice symbol data; and pre-reading the performance data such that characteristics of the synthesized guide voice are changed according to the performance data.
13. A storage medium storing a program for causing a computer to execute a guide voice reproducing method for a terminal apparatus to which is distributed content data composed of performance data formed of a sequence of performance events, and voice symbol data formed of voice symbols indicating respective syllables of lyrics attached to the performance data, the program comprising: a musical tone reproducing module for reproducing musical tones from the performance data; a guide voice synthesis module for synthesizing a guide voice based on the voice symbol data; and a guide voice changing module for pre-reading the performance data such that characteristics of the synthesized guide voice are changed according to the performance data.
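The pre-reading and timing control recited in the claims can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the vocal line is assumed to be available as (tick, MIDI note number) pairs, one per syllable, and the lead interval `lead_ticks` and the names `GuideCue`, `note_to_hz`, and `plan_guide_voice` are hypothetical. The note-to-frequency conversion is the standard equal-temperament formula with A4 (note 69) at 440 Hz.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GuideCue:
    tick: int        # when the guide voice should start sounding
    symbol: str      # syllable to synthesize
    pitch_hz: float  # target pitch taken from the vocal line


def note_to_hz(note: int) -> float:
    """Convert a MIDI note number to Hz (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


def plan_guide_voice(
    vocal_line: List[Tuple[int, int]],
    symbols: List[str],
    lead_ticks: int = 240,  # assumed lead; the patent gives no value
) -> List[GuideCue]:
    """Pre-read the vocal line and schedule each syllable ahead of its note."""
    cues = []
    for (tick, note), symbol in zip(vocal_line, symbols):
        start = max(0, tick - lead_ticks)  # never schedule before the song start
        cues.append(GuideCue(start, symbol, note_to_hz(note)))
    return cues


vocal_line = [(480, 69), (960, 71), (1440, 72)]  # (tick, MIDI note number)
symbols = ["sa", "ku", "ra"]
cues = plan_guide_voice(vocal_line, symbols)
for cue in cues:
    print(cue.tick, cue.symbol, round(cue.pitch_hz, 2))
```

Each cue takes its pitch from the vocal-line note it accompanies and starts a fixed interval earlier, which is one simple way to make the guide voice sound before the corresponding vocal line.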