JP4370811B2 - Voice display output control device and voice display output control processing program

Info

Publication number
JP4370811B2
Authority
JP
Japan
Prior art keywords
accent
pronunciation
image
word
correct
Prior art date
Legal status
Active
Application number
JP2003143499A
Other languages
Japanese (ja)
Other versions
JP2004347786A (en)
Inventor
嘉行 村田
孝 湖城
Original Assignee
カシオ計算機株式会社
Priority date
Filing date
Publication date
Application filed by カシオ計算機株式会社
Priority to JP2003143499A
Publication of JP2004347786A
Application granted
Publication of JP4370811B2
Application status is Active

Description

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a voice display output control device, an image display control device, a voice display output control processing program, and an image display control processing program for outputting data such as voice, text, and images in synchronization.
[0002]
[Prior art]
Conventionally, there has been, for example, a language learning device that outputs the speech of a language and displays the corresponding mouth shape.
[0003]
In this language learning apparatus, voice information and mouth-shape image data of a native speaker are recorded in advance in a sample data memory using a microphone and a camera. The learner's voice information and mouth-shape image data are then recorded with the microphone and camera, and the waveforms of the learner's voice and of the native speaker's voice stored in advance in the sample data memory, together with the image data of the corresponding mouth shapes, are displayed in a chart format.
[0004]
The intention is thereby to analyze and clearly display the difference in pronunciation between the native speaker and the learner (see, for example, Patent Document 1).
[0005]
[Patent Document 1]
JP 2001-318592 A
[0006]
[Problems to be solved by the invention]
With such a conventional language learning device, the learner can hear the model voice of the native speaker and see the corresponding mouth image. However, the accent of a word is expressed mainly in the voice at the accented part, and the mouth image itself shows no clear difference at that point, so the timing of the accent in the language being learned is difficult to grasp.
[0007]
The present invention has been made in view of the above problems, and an object of the present invention is to provide a voice display output control device, an image display control device, a voice display output control processing program, and an image display control processing program that can clearly show the timing of an accent in the display of an image synchronized with voice output.
[0012]
[Means for solving the problem]
  In the voice display output control device according to claim 1 of the invention, word storage means stores a plurality of words, each in association with its phonetic symbols with the correct accent and its phonetic symbols with an erroneous accent. Voice data output means outputs, for a stored word, pronunciation voice data with the correct accent or pronunciation voice data with the erroneous accent. Text synchronous display control means displays the text of the word in synchronization with the pronunciation voice data output for that word. Image display control means displays an image including at least a mouth part, in different display forms depending on whether the voice data output means is outputting the correct-accent voice data or the erroneous-accent voice data. Mouth image display control means displays, for the mouth part included in the displayed image, a mouth-shape image corresponding to the pronunciation voice data, in synchronization with the pronunciation voice data output by the voice data output means. Accent detection means detects the accent of the word from the accented phonetic symbols stored by the word storage means as the word text is synchronously displayed by the text synchronous display control means, and image change display control means changes the image displayed by the image display control means in accordance with the accent detection. Further, correct/incorrect accent display control means displays a stored word together with its correct-accent phonetic symbols and its erroneous-accent phonetic symbols side by side, and correct/incorrect accent selection means selects either the correct-accent phonetic symbols or the erroneous-accent phonetic symbols displayed side by side. The voice data output means then outputs the correct-accent or erroneous-accent pronunciation voice data of the word in accordance with the selection made by the correct/incorrect accent selection means.
[0013]
  According to this configuration, not only can correct-accent and erroneous-accent pronunciation voice data be output for a word stored by the word storage means, but the word text can be displayed in synchronization with the pronunciation voice data, mouth-shape images corresponding to the pronunciation voice data can be displayed for the mouth part of the displayed image, and the displayed image can be changed when the word accent is detected, so that the correct accent and the erroneous accent of the word can be learned easily and clearly. Furthermore, since either the correct-accent phonetic symbols or the erroneous-accent phonetic symbols of the stored word can be selected and the corresponding pronunciation voice data output, with the same synchronized text display, mouth-shape display, and image change at the accent, the correct accent of the word can be learned easily and clearly together with its timing.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described below with reference to the drawings.
[0027]
(First embodiment)
FIG. 1 is a block diagram showing a configuration of an electronic circuit of a portable device 10 according to an embodiment of an audio display output control device (image display control device) of the present invention.
[0028]
The portable device (PDA: personal digital assistant) 10 is configured as a computer that reads a program recorded on various recording media or a program transmitted by communication, and whose operation is controlled by the read program. Its electronic circuit includes a CPU (central processing unit) 11.
[0029]
The CPU 11 controls the operation of each part of the circuit in accordance with a PDA control program stored in advance in the FLASH memory 12A of the memory 12, a PDA control program read into the memory 12 from an external recording medium 13 such as a ROM card via the recording medium reading unit 14, or a PDA control program read into the memory 12 from another computer terminal (30) on a communication network N such as the Internet. The stored PDA control program is activated by an input signal corresponding to a user operation on the input unit 17a, which consists of switches and keys, or on the coordinate input device 17b, which consists of a mouse or a tablet, by a communication signal from another computer terminal (30) on the communication network N received via the power transmission control unit 15, or by a communication signal from an external communication device (PC: personal computer) 20 received via the communication unit 16 over a short-range wireless connection or a wired connection using Bluetooth (R).
[0030]
The CPU 11 is connected to the memory 12, the recording medium reading unit 14, the power transmission control unit 15, the communication unit 16, the input unit 17a, and the coordinate input device 17b, and is also connected to a voice input unit 19a for inputting voice, a stereo voice output unit 19b for outputting voice through left and right channel speakers L and R, and the like.
[0031]
The CPU 11 has a built-in timer for processing time counting.
[0032]
The memory 12 of the portable device 10 includes a FLASH memory (EEP-ROM) 12A and a RAM 12B.
[0033]
The FLASH memory (EEP-ROM) 12A stores a system program that controls the overall operation of the portable device 10, a network communication program for data communication with the computer terminals (30) on the communication network N via the power transmission control unit 15, an external device communication program for data communication with the external communication device (PC) 20 via the communication unit 16, a schedule management program, an address management program, and various PDA control programs such as a dictionary processing program 12a for dictionary search, for synchronized playback of data such as voice, text, and face images (including mouth-shape composite images) corresponding to a searched headword, for setting the type of the face image (character), and for testing headword accents.
[0034]
Further, the FLASH memory (EEP-ROM) 12A stores a dictionary database 12b (see FIG. 2), dictionary voice data 12c, character image data 12d (see FIG. 3), voice-specific mouth image data 12e (see FIG. 4), and a dictionary time code file 12f (see FIGS. 5 and 6).
[0035]
The dictionary database 12b stores data of various dictionaries such as an English-Japanese dictionary, a Japanese-English dictionary, and a Japanese-language dictionary. As shown in FIG. 2, for each headword it stores, linked to one another: the number and storage destination address of the time code file used for synchronized playback of the voice, text, and image; the number and storage destination address of the HTML file that sets up the image playback windows; the number and storage destination address of the text file; the number and storage destination address of the text-mouth synchronization file that associates each text character with its phonetic symbol and mouth number; the number and storage destination address of the sound file containing the pronunciation voice data; and the data number and storage address of the dictionary contents.
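As an illustration of how the link data of FIG. 2 ties these files together, the following minimal Python sketch models one headword record. The field names are assumptions; only the time code file number 23, the HTML file number 3, and the text/sound file number 4222 for the headword "low" appear in the description (paragraphs [0045] and [0070]), and the storage addresses are omitted here.

```python
from dataclasses import dataclass

@dataclass
class HeadwordLink:
    """One synchronized-playback link record of the dictionary database 12b (FIG. 2) - sketch."""
    headword: str
    time_code_file_no: int        # time code file for synchronized playback (No23 for "low")
    html_file_no: int             # HTML file that sets up the playback windows W1/W2
    text_file_no: int             # headword text with phonetic symbols
    text_mouth_sync_file_no: int  # character / phonetic symbol / mouth-number table
    sound_file_no: int            # pronunciation voice data
    dictionary_data_no: int       # meaning contents

# Hypothetical record for "low"; 23, 3, and 4222 follow the description, the rest are placeholders.
low_link = HeadwordLink(
    headword="low",
    time_code_file_no=23,
    html_file_no=3,
    text_file_no=4222,
    text_mouth_sync_file_no=4222,
    sound_file_no=4222,
    dictionary_data_no=1,
)
print(low_link.time_code_file_no)  # -> 23
```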
[0036]
In each embodiment, since formal phonetic symbols are difficult to enter in the text of the specification, similar characters are substituted for them; the formal phonetic symbols are shown in the drawings.
[0037]
FIG. 2 is a diagram showing the synchronized playback link data for one headword, "low", in the dictionary database 12b stored in the memory 12 of the portable device 10: FIG. 2(A) is a table showing the file numbers and storage destination addresses, FIG. 2(B) shows the text data "low" stored in accordance with the text file number, and FIG. 2(C) shows the text characters, phonetic symbols, and mouth numbers stored in accordance with the text-mouth synchronization file number.
[0038]
The dictionary voice data 12c stores the pronunciation voice data for each headword in the dictionary database 12b in association with its sound file number and address.
[0039]
FIG. 3 is a diagram showing the character image data 12d that is stored in the memory 12 of the portable device 10 and is selectively used, according to user settings, for the synchronized display of pronunciation mouth-shape images in a dictionary headword search.
[0040]
In this embodiment, three types of character images (face images) No1 to No3 are prepared as the character image data 12d, and each character image No1, No2, No3 is stored in association with mouth image area data (X1, Y1; X2, Y2) that designates the rectangular area into which its mouth-shape image is composited, as the coordinates of two diagonal corner points.
[0041]
For the three types of character images (face images) No1 to No3, accented face images No1' to No3' for expressing pronunciation emphasis at the accent timing of the headword searched in the dictionary (see (2) in FIG. 12(C) and (2) in FIG. 13(B)) are also stored. Furthermore, American character images No1US to No3US (see FIG. 15) and English character images No1UK to No3UK (see FIG. 16), used when American or British pronunciation is designated, are stored together with their accented face images No1US' to No3US' (see (2) in FIG. 15(B)) and No1UK' to No3UK' (see (2) in FIG. 16(B)).
[0042]
FIG. 4 is a diagram showing the voice-specific mouth image data 12e that is stored in the memory 12 of the portable device 10 and is composited and displayed in the mouth image area (X1, Y1; X2, Y2) of the character image (12d: No1 to No3) for the synchronized display of pronunciation mouth-shape images in a dictionary headword search.
[0043]
In the voice-specific mouth image data 12e, mouth-shape images 12e1, 12e2, ... corresponding to each phonetic symbol required to pronounce all the headwords stored in the dictionary database 12b are stored in association with mouth numbers No. n.
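To make the relationship between a character image, its mouth image area, and the voice-specific mouth images concrete, here is a small Python sketch. The rectangle coordinates and image names are illustrative assumptions; only the mouth numbers 36, 9, and 8 for the phonetic symbols of "low" are taken from the text-mouth synchronization file of FIG. 2(C) as described later.

```python
# Sketch of the character image data 12d and voice-specific mouth image data 12e.
CHARACTER_IMAGES = {
    # character No -> (face image placeholder, mouth image area (X1, Y1, X2, Y2))
    1: ("face_no1.bmp", (40, 70, 80, 95)),
    2: ("face_no2.bmp", (38, 72, 78, 96)),
    3: ("face_no3.bmp", (42, 68, 82, 94)),
}

MOUTH_IMAGES = {           # mouth number -> mouth-shape image placeholder (data 12e)
    36: "mouth_no36.bmp",  # phonetic symbol [l]
    9:  "mouth_no9.bmp",   # accented phonetic symbol of "low"
    8:  "mouth_no8.bmp",   # phonetic symbol [u]
}

def composite_mouth(character_no: int, mouth_no: int) -> str:
    """Describe pasting a mouth-shape image into the character's mouth image area."""
    face, (x1, y1, x2, y2) = CHARACTER_IMAGES[character_no]
    mouth = MOUTH_IMAGES[mouth_no]
    return f"paste {mouth} onto {face} in rectangle ({x1},{y1})-({x2},{y2})"

print(composite_mouth(3, 36))
```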
[0044]
The dictionary time code file 12f stored in the memory 12 of the portable device 10 is used for synchronized playback of the voice, text, and face image (including the mouth-shape composite image) corresponding to a headword searched in the dictionary. This command file (see FIG. 5) is prepared not for every headword individually; one file is shared by headwords that have the same number of characters, number of phonetic symbols, and pronunciation timing, and it is compressed and encrypted with a predetermined algorithm.
[0045]
FIG. 5 is a diagram showing the time code file 12f23 (12i) of file No. 23, which is associated with the headword "low", in the dictionary time code file 12f stored in the memory 12 of the portable device 10.
[0046]
The time code file 12fn describes an array of time codes for performing the command processing that synchronously plays back the various data (voice, text, and images) at a reference processing unit time (for example, 25 ms) set in advance in the header information H. Each time code consists of a combination of a command code, which indicates an instruction, and parameter data, which is a reference number or specified numerical value associating the command with the data contents (text file, sound file, image file, etc.) to which it applies.
[0047]
For example, with a preset reference processing unit time of 25 ms, the playback time of the time code file 12f23 for the headword "low" shown in FIG. 5, which consists of 40 time code steps, is 40 × 25 ms = 1 second.
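The following Python sketch illustrates the header/time-code layout just described and the playback-time calculation of paragraph [0047]. The command sequence shown is only a plausible abbreviation of the 40-step file 12f23, since the full listing appears only in FIG. 5.

```python
# Sketch of a time code file: header (processing unit time) plus a list of
# (command code, parameter) pairs, one pair processed per unit time.
HEADER_UNIT_MS = 25          # reference processing unit time from the header H

# Abbreviated, assumed stand-in for the 40-step file 12f23 of the headword "low".
TIME_CODES = [
    ("CS", 0), ("DH", 0), ("DI", 0), ("PS", 0), ("LT", 0),
    ("VD", 0), ("BL", 0), ("HL", 1), ("NP", 0),
    # ... further HL / NP steps ...
    ("FN", 0),
]

steps = 40                         # total steps of file 12f23
playback_ms = steps * HEADER_UNIT_MS
print(playback_ms)                 # -> 1000 ms, i.e. 1 second
```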
[0048]
FIG. 6 is a diagram in which the command codes of the various commands described in the dictionary time code file 12fn (see FIG. 5) of the portable device 10 are associated with the instruction contents analyzed from their parameter data.
[0049]
Commands used in the time code file 12fn include standard commands and extended commands. The standard commands include LT (load i-th text), VD (display i-th text phrase), BL (reset character counter / designate i-th phrase block), HN (no highlight, count up character counter), HL (highlight up to i-th character, count up character counter), LS (scroll one line / count up character counter), DH (display i-th HTML file), DI (display i-th image file), PS (play i-th sound file), CS (clear all files), PP (pause for i times the basic time in seconds), FN (end of processing), and NP (invalid; no operation).
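A compact way to hold the standard commands of FIG. 6 is a dispatch table mapping each command code to a handler. In the sketch below the handler bodies only print what the real device does, so they are illustrative placeholders rather than the actual processing of step A10.

```python
# Sketch of a dispatch table for the standard commands of the time code file.
# Each handler receives the parameter data i; the print statements stand in for
# the real display / sound processing performed in step A10.
def lt(i): print(f"LT: load text file {i}")
def vd(i): print(f"VD: display phrase {i} of the loaded text")
def bl(i): print(f"BL: reset character counter, designate phrase block {i}")
def hn(i): print("HN: no highlight, count up character counter")
def hl(i): print(f"HL: highlight up to character {i}, count up character counter")
def ls(i): print("LS: scroll one line, count up character counter")
def dh(i): print(f"DH: display HTML file {i} (set windows W1/W2)")
def di(i): print(f"DI: display image file {i} (character image)")
def ps(i): print(f"PS: play sound file {i} (pronunciation voice)")
def cs(i): print("CS: clear all file output")
def pp(i): print(f"PP: pause for {i} x basic time")
def fn(i): print("FN: end of processing")
def np(i): pass  # NP: invalid / no operation, keep the current output state

COMMANDS = {"LT": lt, "VD": vd, "BL": bl, "HN": hn, "HL": hl, "LS": ls,
            "DH": dh, "DI": di, "PS": ps, "CS": cs, "PP": pp, "FN": fn, "NP": np}

COMMANDS["HL"](1)   # e.g. highlight up to the first character
```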
[0050]
The RAM 12B of the memory 12 contains: a search headword memory 12g into which a headword found by the search processing of the dictionary database 12b is read and stored according to its headword number; a headword-corresponding dictionary data memory 12h into which dictionary data such as the meaning contents corresponding to the searched headword are read from the dictionary database 12b according to their dictionary data number and stored; and a playback time code file memory 12i into which the time code file 12fn (see FIG. 5) for synchronized playback of the voice, text, and image corresponding to the searched headword is read from the dictionary time code file 12f according to the time code file number in the dictionary database 12b, decompressed, decrypted, and stored.
[0051]
The RAM 12B of the memory 12 further contains: a synchronization HTML file memory 12j into which the HTML file for setting up the text/image synchronized playback windows W1 and W2 (see FIGS. 12 and 13) on the headword search screen G2 is read from the dictionary database 12b according to its HTML file number and stored; a synchronization text file memory 12k into which the text data of the searched headword is read from the dictionary database 12b according to its text file number and stored; a synchronization sound file memory 12m into which the pronunciation voice data of the searched headword is read from the dictionary voice data 12c according to the sound file number in the dictionary database 12b and stored; a synchronization image file memory 12n into which the character image set by the user for displaying the pronunciation images of the searched headword is read from the character image data 12d (see FIG. 3) and stored; a mouth image area memory 12p in which the mouth image area data (X1, Y1; X2, Y2) indicating the composition area of the mouth image within the character image stored in the synchronization image file memory 12n is stored; and an image expansion buffer 12q in which the character image and the mouth image to be synchronized with the voice and text according to the time code file 12fn corresponding to the searched headword stored in the time code file memory 12i are expanded, composited, and stored.
[0052]
That is, suppose the headword searched for by activating the dictionary processing program 12a stored in the FLASH memory 12A of the portable device (PDA) 10 is "low", and the time code file read from the dictionary time code file 12f and stored in the playback time code file memory 12i is, for example, the time code file 12f23 shown in FIG. 5. When the third command code "DI" and its parameter data "00" are read, this command "DI" is the display command for the i-th image file, so the character image 12dn stored in the synchronization image file memory 12n linked from the parameter data i = 00 is read out and displayed.
[0053]
When the fourth command code "PS" and parameter data "00" are read in accordance with the command processing for each set processing unit time, this command "PS" is the playback command for the i-th sound file, so the voice data 12cn stored in the synchronization sound file memory 12m linked from the parameter data i = 00 is read out and output.
[0054]
When the sixth command code "VD" and parameter data "00" are read in accordance with the command processing for each set processing unit time, this command "VD" is the display command for the i-th text phrase, so, according to the parameter data i = 00, the 0th phrase of the text (in this case the text "low" of the searched headword stored in the synchronization text file memory 12k) is displayed.
[0055]
Further, when the ninth command code "NP" and parameter data "00" are read in accordance with the command processing for each set processing unit time, this command "NP" is an invalid (no-operation) instruction, so the current file output state is maintained.
[0056]
The detailed operation of the synchronized playback of the pronunciation voice, text, and image (mouth-shape image) corresponding to the searched headword based on the time code file 12f23 (12i) with the file contents shown in FIG. 5 will be described again later.
[0057]
Next, various operations performed by the mobile device 10 having the above-described configuration will be described.
[0058]
FIG. 7 is a flowchart showing main processing according to the dictionary processing program 12a of the portable device 10.
[0059]
FIG. 8 is a flowchart showing a headword synchronized reproduction process accompanying the main process of the portable device 10.
[0060]
FIG. 9 is a flowchart showing a text corresponding mouth display process executed by interruption in accordance with the highlight display of each entry word character accompanying the entry synchronized playback process of the portable device 10.
[0061]
FIG. 10 is a diagram showing a setting display state of the character image for synchronous reproduction accompanying the character setting process in the main process of the portable device 10.
[0062]
When the mode is switched to the character image setting mode by operating the "setting" key 17a1 and the cursor key 17a2 of the input unit 17a (steps S1 → S2), the three types of character image data 12d1 (No1), 12d2 (No2), and 12d3 (No3) stored in the FLASH memory 12A [see FIG. 3] are read out and displayed on the display unit 18 as a character image list selection screen G1, as shown in FIG. 10 (step S3).
[0063]
On this character image list selection screen G1, the selection frame X is moved over the character images by operating the cursor key 17a3 to select the character image the user desires (for example, 12d3 (No3)). When the selection of the character image is confirmed by a decision operation with the "translation/decision (voice)" key 17a4 and detected (step S4), the selected character image 12dn is read out and stored in the synchronization image file memory 12n in the RAM 12B (step S5). The mouth image area data (X1, Y1; X2, Y2) indicating the composition area of the mouth-shape image of the selected character image 12dn is also read out and transferred to the mouth image area memory 12p in the RAM 12B (step S6).
[0064]
In this way, the character image into which the mouth-shape images displayed in synchronization with the pronunciation voice of a headword are composited is selected and set in preparation for headword searches.
[0065]
FIG. 11 is a diagram showing a search word entry display screen G2 accompanying the word search process in the main process of the portable device 10.
[0066]
For example, to perform a headword search on the English-Japanese dictionary data stored in the dictionary database 12b, the English-Japanese dictionary search mode is set by operating the "English-Japanese" key 17a5 of the input unit 17a, and the headword to be searched for (for example, "low") is then input (steps S7 → S8). A plurality of headwords that match the input, including those whose leading characters match, are retrieved from the English-Japanese dictionary data, read out, and displayed on the display unit 18 as a list of searched headwords (not shown) (step S9).
[0067]
On this searched headword list screen, when the headword that matches the search target entered by the user (in this case "low") is selected and designated with the cursor keys and the "translation/decision (voice)" key 17a4 is operated (step S10), the selected headword "low" is stored in the headword memory 12g in the RAM 12B, the dictionary data such as the pronunciation, part of speech, and meaning contents corresponding to the headword "low" are read out and stored in the headword-corresponding dictionary data memory 12h in the RAM 12B, and they are displayed on the display unit 18 as the searched headword display screen G2 shown in FIG. 11 (step S11).
[0068]
Here, to output the pronunciation voice for the searched headword "low" and, at the same time, display the headword characters, the phonetic symbols, and the pronunciation mouth-shape images in synchronization, the "translation/decision (voice)" key 17a4 is operated (step S12), and the process proceeds to the synchronized playback process of FIG. 8 (step SA).
[0069]
FIG. 12 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2 displayed on the searched headword display screen G2 when character image No3 is set, in accordance with the synchronized playback process within the headword search process of the portable device 10: FIG. 12(A) shows the initial display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the searched headword display screen G2, FIG. 12(B) shows the changing states of the headword character display window W1, synchronized with the pronunciation voice output, and of the pronunciation mouth-shape display window W2 at positions not corresponding to the accent, and FIG. 12(C) shows the changing states of the headword character display window W1 and the pronunciation mouth-shape display window W2 at the position corresponding to the accent.
[0070]
When the synchronized playback process of FIG. 8 (step SA) is started by operating the "translation/decision (voice)" key 17a4 while the searched headword display screen G2 is displayed, initialization processing such as clearing each work area is performed first (step A1). Then, based on the synchronized playback link data (see FIG. 2) for the current searched headword "low" stored in the dictionary database 12b, the HTML file for setting up the text/image synchronized playback windows W1 and W2 (see FIG. 12) on the headword search screen G2 is read according to HTML file No3 and written into the synchronization HTML file memory 12j. The text data "low (with phonetic symbols)" of the searched headword is read according to text file No4222 and written into the synchronization text file memory 12k, and the pronunciation voice data of the searched headword is read according to sound file No4222 and written into the synchronization sound file memory 12m (step A2).
[0071]
Note that the character image set by the user for displaying the pronunciation images of the searched headword (in this case 12d3 (No3)) has already been read from the character image data 12d (see FIG. 3) and written into the synchronization image file memory 12n in step S5 of the character setting process, and the mouth image area data (X1, Y1; X2, Y2), which is the composition area for pronunciation mouth-shape images in the character image 12d3 (No3), has likewise already been written into the mouth image area memory 12p in step S6 of the character setting process.
[0072]
Then, from among the time code files 12fn for synchronized playback of the encrypted voice/text/image data corresponding to the various headwords stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12f23 (see FIG. 5) corresponding to the current searched headword "low" is read and decrypted according to time code file No23 described in the synchronized playback link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).
[0073]
When the settings for reading the various files for synchronized playback of the pronunciation voice, text, and pronunciation mouth-shape images corresponding to the searched headword "low" into the RAM 12B, and for transferring the time code file 12f23 for their synchronized playback into the RAM 12B, are thus completed, the processing unit time (for example, 25 ms) at which the CPU 11 processes the time code file (CAS file) 12f23 (see FIG. 5) stored in the time code file memory 12i is set from the header information H of the time code file 12f23 (step A4).
[0074]
A read pointer is set at the head of the time code file 12f23 stored in the time code file memory 12i and at the head of each of the files written into the synchronization file memories 12j, 12k, 12m, and 12n (step A5), and a timer for timing the playback processing of each synchronization file is started (step A6).
[0075]
When the processing timer is started in step A6, the command code and parameter data of the time code file 12f23 (see FIG. 5) at the position of the read pointer set in step A5 are read at every processing unit time (25 ms) set in step A4 for the current time code file 12f23 (step A7).
[0076]
It is then determined whether or not the command code read from the time code file 12f23 (see FIG. 5) is "FN" (step A8). If it is determined to be "FN", the synchronized playback process is instructed to stop at that point (steps A8 → A9).
[0077]
On the other hand, when it is determined that the command code read from the time code file 12f23 (see FIG. 5) is not "FN", processing corresponding to the content of that command code (see FIG. 6) is executed (step A10).
[0078]
When it is determined that the time measured by the timer has reached the next processing unit time (25 ms), the read pointer into the time code file 12f23 (see FIG. 5) stored in the RAM 12B is moved to the next position (steps A11 → A12), and the processing from step A7, reading the command code and its parameter data at the read pointer position in the time code file 12f23 (see FIG. 5), is repeated (steps A12 → A7 to A10).
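Steps A4 to A12 amount to a simple interpreter loop: set the unit time from the header, point at the first time code, and then, once per unit time, read a code, stop on FN, and otherwise execute it and advance. The Python sketch below mirrors that flow under the assumption of the dispatch-table sketch shown earlier; real timing would use the device timer of the CPU 11 rather than `time.sleep`.

```python
import time

def run_time_code_file(time_codes, unit_ms, execute, realtime=False):
    """Interpreter loop corresponding to steps A4-A12 (sketch).

    time_codes: list of (command code, parameter) pairs read via the read pointer
    unit_ms:    processing unit time taken from the header H (e.g. 25 ms)
    execute:    callback performing the step-A10 processing for one command
    """
    pointer = 0                              # step A5: read pointer at the head
    while pointer < len(time_codes):
        code, param = time_codes[pointer]    # step A7: read code and parameter
        if code == "FN":                     # steps A8 -> A9: stop synchronized playback
            break
        execute(code, param)                 # step A10: command-dependent processing
        if realtime:
            time.sleep(unit_ms / 1000.0)     # step A11: wait for the next unit time
        pointer += 1                         # step A12: advance the read pointer

# Example with a trivial executor:
run_time_code_file([("CS", 0), ("HL", 1), ("NP", 0), ("FN", 0)], 25,
                   lambda c, p: print(c, p))
```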
[0079]
Here, the synchronized playback output operation of the pronunciation voice / text / speech mouth image file based on the time code file 12f23 of the search headword “low” shown in FIG. 5 will be described in detail.
[0080]
That is, the time code file 12f23 is processed as command processing executed at every (reference) processing unit time (for example, 25 ms) preset in the header H. First, when the first command code "CS" (clear all files) of the time code file 12f23 (see FIG. 5) and its parameter data "00" are read, clearing of the output of all files is instructed, and the output of the text, voice, and image files is cleared (step A10).
[0081]
When the second command code "DH" (display i-th HTML file) and its parameter data "00" are read, the headword text/image frame data of the HTML data is read from the synchronization HTML file memory 12j in the RAM 12B according to the parameter data (i = 0) read together with the command code DH, and the text/image synchronized playback windows W1 and W2 are set up on the headword search screen G2 of the display unit 18, as shown in FIG. 12(A) (step A10).
[0082]
When the third command code "DI" (display i-th image file) and its parameter data "00" are read, the character image 12d (in this case No3) set and stored in the character setting process (steps S2 to S6) is read from the synchronization image file memory 12n in the RAM 12B according to the parameter data (i = 0) read together with the command code DI, and is displayed in the image synchronized playback window W2 set by the HTML file on the searched headword display screen G2, as shown in FIG. 12(A) (step A10).
[0083]
When the fourth command code "PS" (play i-th sound file) and its parameter data "00" are read, the pronunciation voice data corresponding to the searched headword "low" set and stored in step A2 is read from the synchronization sound file memory 12m in the RAM 12B according to the parameter data (i = 0) read together with the command code PS, and voice output from the stereo voice output unit 19b is started (step A10).
[0084]
When the fifth command code "LT" (load i-th text) and its parameter data "00" are read, the text data "l", "o", "w" (including the phonetic symbols) corresponding to the searched headword "low" set and stored in step A2 is designated in the synchronization text file memory 12k in the RAM 12B according to the parameter data (i = 0) read together with the command code LT (step A10).
[0085]
When the sixth command code "VD" (display i-th text phrase) and its parameter data "00" are read, the text data "l", "o", "w" (including the phonetic symbols) designated by the fifth command code "LT" is read out according to the parameter data (i = 0) read together with the command code VD, and is displayed in the text synchronized playback window W1, as shown in FIG. 12(A) (step A10).
[0086]
When the seventh command code “BL” (character counter reset / i-th clause block designation) and its parameter data “00” are read, the character of the search term “low” displayed in the text synchronous playback window W1 The counter is reset (step A10).
[0087]
When the eighth command code "HL" (highlight up to i-th character / count up character counter) and its parameter data "01" are read, then, according to the parameter data (i = 1) read together with the command code HL, the first character "l" of the searched headword "low" (including the phonetic symbols) displayed in the text synchronized playback window W1 and its corresponding phonetic symbol are given a highlight (identification) display HL by color change, reverse video, underlining, or the like, as shown in FIG. 12(A), and the character counter counts up to the second character and its corresponding phonetic symbol (step A10).
[0088]
Each time a character of the searched headword "low" and its corresponding phonetic symbol are highlighted (identified) by the time code file 12f23, the text-corresponding mouth display process of FIG. 9 is executed as an interrupt.
[0089]
That is, when the character "l" of the searched headword "low" that is currently highlighted (identified) HL is detected (step B1), the pronunciation mouth-shape image corresponding to the detected character "l" is read out from the voice-specific mouth image data 12e (see FIG. 4) as the pronunciation mouth-shape image 12e2 (No36), according to the mouth number "36" associated with the text "l" in the text-mouth synchronization file (see FIG. 2(C)) in the dictionary database 12b (step B2). Then, as shown in FIG. 12(A) (FIG. 12(B)(1)), the pronunciation mouth-shape image 12e2 (No36) for the highlighted (identified) character "l" of the searched headword "low" is composited and displayed in accordance with the mouth image area (X1, Y1; X2, Y2) stored in the mouth image area memory 12p in the RAM 12B, which is the mouth image composition area of the character image 12d (No3) displayed in the image synchronized playback window W2 on the headword search screen G2 (step B3).
[0090]
Here, it is determined whether or not the phonetic symbol of the currently highlighted (identified) text "l", as indicated by the text-mouth synchronization file (see FIG. 2(C)), carries an accent mark (step B4). Since the phonetic symbol [l] of the highlighted (identified) text "l" is determined to have no accent mark, the character image 12d (No3) is left displayed as its normal face image (steps B4 → B5).
[0091]
If it is determined that there is an accent mark, the character image 12d (No3) is changed to the accented face image No3' (see (2) in FIG. 12(C)) for pronunciation emphasis expression and displayed (steps B4 → B6).
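The text-corresponding mouth display process of FIG. 9 can thus be summarized as: find the currently highlighted character (B1), look up its mouth number and read the mouth-shape image (B2), composite it into the mouth image area (B3), and, if the phonetic symbol carries an accent mark, swap the face for the accented face image (B4 → B6), otherwise keep the normal face (B4 → B5). The sketch below assumes a simple table keyed by character; the mouth numbers 36/9/8 and the accent on "o" follow the description, while the phonetic-symbol strings are placeholders.

```python
# Sketch of the text-corresponding mouth display process (steps B1-B6) for "low".
TEXT_MOUTH_SYNC = {        # character -> (phonetic symbol, mouth No, accent mark?)
    "l": ("l", 36, False),
    "o": ("ou", 9, True),  # accented part of "low"
    "w": ("u", 8, False),
}

def on_highlight(char, face_no=3):
    symbol, mouth_no, accented = TEXT_MOUTH_SYNC[char]                  # steps B1-B2
    print(f"composite mouth image No{mouth_no} into face No{face_no}")  # step B3
    if accented:                                                        # step B4
        print(f"switch to accented face image No{face_no}'")            # step B6
    else:
        print(f"keep normal face image No{face_no}")                    # step B5

for ch in "low":
    on_highlight(ch)
```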
[0092]
Since the time code file 12f23 is created so that the output timing of the pronunciation voice data for the searched headword "low", whose output from the stereo voice output unit 19b was started by the fourth command code "PS", is associated with the identification display timing of each character of the searched headword "low" at each processing unit time (25 ms) of this time code file 12f23, when the first character "l" of the searched headword "low" is identified and displayed and the pronunciation mouth-shape image 12e (No36) is composited and displayed in synchronization, the pronunciation voice corresponding to the associated phonetic symbol is output in synchronization.
[0093]
In this way, the identification display of the first character "l" of the searched headword "low", the composite display of the pronunciation mouth-shape image 12e (No36) onto the set character image 12d (No3), and the output of the pronunciation voice are performed in synchronization.
[0094]
When the ninth command code "NP" is read, the synchronized display of the character image and text data and the synchronized output of the pronunciation voice data corresponding to the current searched headword "low" are maintained.
[0095]
Thereafter, in accordance with the twelfth command code "HL" and the thirty-fifth command code "HL", the text data "low" of the searched headword and its phonetic symbols are highlighted (identified) HL in sequence, the second character "o" with its phonetic symbol [o] and then the third character "w" with its phonetic symbol [u], as shown in FIG. 12(C)(2) and FIG. 12(C)(3) (step A10). At the same time, in the image synchronized playback window W2, in accordance with the text-corresponding mouth display process of FIG. 9, the pronunciation mouth-shape images 12e (No9) and 12e (No8) corresponding to the mouth numbers indicated by the text-mouth synchronization file (see FIG. 2(C)) are successively read out from the voice-specific mouth image data 12e and are composited and displayed in synchronization in the mouth image area (X1, Y1; X2, Y2) of the set character image 12d (No3) (steps B1 to B3).
[0096]
Furthermore, in the pronunciation voice data of the searched headword "low" output from the stereo voice output unit 19b in accordance with the fourth command code "PS", the voice reading out the highlighted (identified) portion of the text "low" and its phonetic symbols is output in synchronization, portion by portion.
[0097]
Note that, in the composite switching display (steps B1 to B5) of the pronunciation mouth-shape images 12e (No36) → 12e (No9) → 12e (No8) onto the set character image 12d (No3) by the text-corresponding mouth display process synchronized with the highlight (identification) display HL of each character "l", "o", and "w" of the searched headword "low", when the pronunciation mouth-shape image 12e (No9) is composited together with the highlight (identification) display HL of the second character "o" and its phonetic symbol, it is determined that the phonetic symbol of the highlighted (identified) text "o" carries an accent mark. Therefore, as shown in FIG. 12(C)(2), the character image 12d (No3) at this time is changed to the accented face image No3' for pronunciation emphasis expression and displayed (steps B4 → B6).
[0098]
That is, when the highlight (identification) display HL synchronized with the pronunciation voice output for the accent character "o" of the searched headword "low" shown in FIG. 12 and the pronunciation mouth-shape image 12e (No9) are composited and displayed, the normal set character (face) image 12d (No3) shown in FIG. 12(B)(2), which is the composition destination of the mouth-shape image 12e (No9), is changed to the accent-corresponding face image 12d (No3') shown in FIG. 12(C)(2), which represents a state of forceful pronunciation with, for example, sweat on the head. The user can therefore easily learn, through their synchronized playback, the pronunciation voice of the searched headword "low", its utterance timing, the corresponding portions of the characters "l", "o", "w" and their phonetic symbols, and the respective pronunciation mouth-shape images 12e (No36 → No9 → No8), and can moreover learn realistically the timing at which the voice is emphasized according to the accent.
[0099]
FIG. 13 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2 displayed on the searched headword display screen G2 when character image No1 is set, in accordance with the synchronized playback process within the headword search process of the portable device 10: FIG. 13(A) shows the initial display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the searched headword display screen G2, and FIG. 13(B) shows the changing states of the headword character display window W1 and the pronunciation mouth-shape display window W2 synchronized with the output of the pronunciation voice.
[0100]
That is, when the animation-style character image 12d (No1) has been selected and set from the three types of character image data 12d (No1), 12d (No2), 12d (No3) [see FIG. 3] stored in advance in the character setting process of steps S1 to S6 in FIG. 7, and the headword search process and synchronized playback process for the search target headword "low" are performed as in steps S7 to SA together with the text-corresponding mouth display process of FIG. 9, then, as shown in FIGS. 13(A) and 13(B), the searched headword "low" and its phonetic symbols in the headword character display window W1 on the searched headword display screen G2 are given the highlight (identification) display HL in sequence, synchronized with the voice output. Accordingly, in the pronunciation mouth-shape display window W2, the animation-style character image 12d (No1) set in the character setting process (steps S1 to S6) is used as the basic face image, and the pronunciation mouth-shape images 12e (No36 → No9 → No8) synchronized with the voice output and the text highlight display HL (including the phonetic symbols) are switched and composited onto it in sequence.
[0101]
Then, as shown in (2) in FIG. 13(B), when the pronunciation mouth-shape image 12e (No9) is composited and displayed together with the highlight (identification) display HL of the second character "o" of the searched headword "low" and its phonetic symbol, it is determined that the phonetic symbol of the highlighted (identified) text "o" carries an accent mark, so the animation-style character image 12d (No1) at this time is changed to the accented face image No1' for pronunciation emphasis expression and displayed (steps B4 → B6).
[0102]
That is, when the animation-style character image 12d (No1) shown in FIG. 13 is selected and set, then at the switching display of the highlight (identification) display HL synchronized with the pronunciation voice output for the accent character "o" of the searched headword "low" and of the pronunciation mouth-shape image 12e (No9), the normal animation-style character (face) image 12d (No1), which is the composition destination of the mouth-shape image 12e (No9), is changed to the accent-corresponding face image 12d (No1'), which expresses a state of forceful pronunciation with, for example, sweat on the head and shaking of the body. The user can therefore easily learn, through their synchronized playback, the pronunciation voice of the searched headword "low", its utterance timing, the corresponding portions of the characters "l", "o", "w" and their phonetic symbols, and the respective pronunciation mouth-shape images 12e (No36 → No9 → No8), and can moreover learn realistically the timing at which the voice is emphasized according to the accent.
[0103]
In the synchronized playback process of text, pronunciation voice, and pronunciation mouth-shape images accompanying the headword search described with reference to FIGS. 11 to 13, the English-Japanese dictionary data stored in advance as the dictionary database 12b was described as containing pronunciation contents for American English only. However, as described with reference to FIGS. 14 to 16, when the English-Japanese dictionary data stored in advance as the dictionary database 12b contains pronunciation contents for the two countries, American and British English, the synchronized playback process of text, pronunciation voice, and pronunciation mouth-shape images may be performed with the pronunciation form of either American or British English designated.
[0104]
FIG. 14 is a diagram showing the searched headword display screen G2 when an English-Japanese dictionary containing the pronunciation forms of the two countries, the US and the UK, is used in the headword search process within the main process of the portable device 10.
[0105]
To search for a headword using the English-Japanese dictionary data containing, for example, the pronunciation forms of the US and the UK stored in the dictionary database 12b, the "English-Japanese" key 17a5 of the input unit 17a is operated to set the English-Japanese dictionary search mode, and the search target headword (for example, "laugh") is then input (steps S7 → S8). A plurality of headwords that match the input, including those whose leading characters match, are retrieved from the English-Japanese dictionary data, read out, and displayed on the display unit 18 as a list of searched headwords (not shown) (step S9).
[0106]
On this searched headword list screen, when the headword that matches the search target entered by the user (in this case "laugh") is selected and designated with the cursor keys and the "translation/decision (voice)" key 17a4 is operated (step S10), the selected headword "laugh" is stored in the headword memory 12g in the RAM 12B, the dictionary data such as the US and UK pronunciations, parts of speech, and meaning contents corresponding to the headword "laugh" are read out and stored in the headword-corresponding dictionary data memory 12h in the RAM 12B, and they are displayed on the display unit 18 as the searched headword display screen G2 shown in FIG. 14 (step S11).
[0107]
Here, to selectively output, for the searched headword "laugh", either the American pronunciation [laef] or the British pronunciation [la:f] and, at the same time, display the headword characters, phonetic symbols, and pronunciation mouth-shape images in synchronization, either the American-English identifier [US] or the British-English identifier [English] displayed in the dictionary data on the searched headword display screen G2 is designated (step S11a), and when the "translation/decision (voice)" key 17a4 is operated (step S12), the process proceeds to the synchronized playback process of FIG. 8 (step SA).
[0108]
FIG. 15 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2 displayed on the searched headword display screen G2 when the American pronunciation [US] is designated, in accordance with the synchronized playback process within the headword search process of the portable device 10: FIG. 15(A) shows the initial display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the searched headword display screen G2, and FIG. 15(B) shows the changing states of the headword character display window W1 and the pronunciation mouth-shape display window W2 synchronized with the output of the American pronunciation voice.
[0109]
That is, when either the American-English identifier [US] or the British-English identifier [English] displayed in the dictionary data on the searched headword display screen G2 is designated and the process proceeds to the synchronized playback process of FIG. 8, then in step A2 of the synchronized playback process, if the American-English identifier [US] has been designated, for example, the American character image 12d (No1US) corresponding to the animation-style character image 12d (No1) set in advance in the character setting process (steps S2 to S6) is read out and transferred to the synchronization image file memory 12n in the RAM 12B. At the same time, based on the synchronized playback link data (see FIG. 2) for the current searched headword "laugh" stored in the dictionary database 12b, the HTML file for setting up the text/image synchronized playback windows W1 and W2 (see FIG. 15) on the headword search screen G2 is read according to its HTML file number and written into the synchronization HTML file memory 12j, the text data "laugh (with American phonetic symbols)" of the searched headword is read according to its text file number and written into the synchronization text file memory 12k, and the American pronunciation voice data of the searched headword is read according to its sound file number and written into the synchronization sound file memory 12m (step A2).
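In other words, designating [US] or [English] in step S11a simply selects which variant of the character image and which set of phonetic symbols and voice data are loaded in step A2. The Python sketch below illustrates that selection; the suffix convention and file names are assumptions, with only the character image variants No1US / No1UK and the per-dialect text and voice labels taken from the description.

```python
# Sketch of dialect-dependent file selection for step A2 (searched headword "laugh").
VARIANTS = {
    "US":      {"character": "No1US", "text": "laugh (US phonetic symbols)",
                "sound": "laugh_us.pcm"},
    "English": {"character": "No1UK", "text": "laugh (UK phonetic symbols)",
                "sound": "laugh_uk.pcm"},
}

def load_for_dialect(identifier: str) -> dict:
    """Return the files to write into memories 12n / 12k / 12m for the chosen dialect."""
    return VARIANTS[identifier]

print(load_for_dialect("US")["character"])       # -> No1US
print(load_for_dialect("English")["character"])  # -> No1UK
```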
[0110]
Then, from among the time code files 12fn for synchronized playback of the encrypted voice/text/image data corresponding to the various headwords stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12fn (see FIG. 5) corresponding to the current searched headword "laugh" is read and decrypted according to the time code file number described in the synchronized playback link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).
[0111]
When the synchronized playback process of the pronunciation voice, headword characters, and pronunciation mouth-shape images according to the time code file 12fn corresponding to the searched headword "laugh" is started, with the playback processing for each command code in steps A7 to A12 and the text-corresponding mouth display process of FIG. 9 performed in the same manner as for the searched headword "low" already described, the American phonetic symbols are displayed together with the searched headword "laugh" in the text synchronized playback window W1 on the searched headword display screen G2, and in the image synchronized playback window W2 the American character image 12d (No1US), designed, for example, with the US flag F, is displayed as the target image for mouth-shape image composition.
[0112]
As a result, in synchronization with the American pronunciation voice output of the searched headword "laugh", the searched headword "laugh" and its phonetic symbols in the text synchronized playback window W1 are given the highlight (identification) display HL in sequence starting from the first character, as shown in (1) to (3) in FIG. 15(B), and in the image synchronized playback window W2 the pronunciation mouth-shape images 12e (Non1 → Non2 → Non3) corresponding to the mouth numbers of the respective phonetic symbols are read out from the voice-specific mouth image data 12e and are switched, composited, and displayed in sequence in the mouth image area (X1, Y1; X2, Y2) of the base American character image 12d (No1US).
[0113]
Also in this case, by the same text-corresponding mouth display process, at the switching composite display of the highlight (identification) display HL synchronized with the pronunciation voice output for the accent character "au" of the searched headword "laugh" and of the pronunciation mouth-shape image 12e (Non2), the American character (face) image 12d (No1US), which is the composition destination of the mouth-shape image 12e (Non2), is changed to the accent-corresponding face image 12d (No1US'), which represents a state of forceful pronunciation with, for example, sweat on the head or shaking of the body. The user can therefore easily learn, through their synchronized playback, the American pronunciation voice of the searched headword "laugh", its utterance timing, the corresponding portions of the characters "l", "au", "gh" and their phonetic symbols, and the respective pronunciation mouth-shape images 12e (Non1 → Non2 → Non3), and can moreover learn realistically the timing at which the voice is emphasized according to the American accent.
[0114]
FIG. 16 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2 displayed on the searched headword display screen G2 when the British pronunciation [English] is designated, in accordance with the synchronized playback process within the headword search process of the portable device 10: FIG. 16(A) shows the initial display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the searched headword display screen G2, and FIG. 16(B) shows the changing states of the headword character display window W1 and the pronunciation mouth-shape display window W2 synchronized with the output of the British pronunciation voice.
[0115]
That is, when, of the dialect identifiers [US] and [English] displayed in the dictionary data on the searched headword display screen G2 shown in FIG. 14, the British-English identifier [English], for example, is designated and the process proceeds to the synchronized playback process of FIG. 8 (step SA) (step S11a), then in step A2 of the synchronized playback process the English character image 12d (No1UK) corresponding to the animation-style character image 12d (No1) set in advance in the character setting process (steps S2 to S6) is read out and transferred to the synchronization image file memory 12n in the RAM 12B. At the same time, based on the synchronized playback link data (see FIG. 2) for the current searched headword "laugh" stored in the dictionary database 12b, the HTML file for setting up the text/image synchronized playback windows W1 and W2 (see FIG. 16) on the headword search screen G2 is read according to its HTML file number and written into the synchronization HTML file memory 12j, the text data "laugh (with British phonetic symbols)" of the searched headword is read according to its text file number and written into the synchronization text file memory 12k, and the British pronunciation voice data of the searched headword is read according to its sound file number and written into the synchronization sound file memory 12m (step A2).
[0116]
Then, from among the time code files 12fn for synchronized playback of the encrypted voice/text/image data corresponding to the various headwords stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12fn (see FIG. 5) corresponding to the current searched headword "laugh" is read and decrypted according to the time code file number described in the synchronized playback link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).
[0117]
When the synchronized playback process of the pronunciation voice, headword characters, and pronunciation mouth-shape images according to the time code file 12fn corresponding to the searched headword "laugh" is started, with the playback processing for each command code in steps A7 to A12 and the text-corresponding mouth display process of FIG. 9 performed in the same manner as for the searched headword "low" already described, the British phonetic symbols are displayed together with the searched headword "laugh" in the text synchronized playback window W1 on the searched headword display screen G2, and in the image synchronized playback window W2 the English character image 12d (No1UK), designed, for example, with a British hat M1 and a walking stick M2 added to the set animation-style character image, is displayed as the target image for mouth-shape image composition.
[0118]
As a result, in synchronization with the British pronunciation voice output of the searched headword "laugh", the searched headword "laugh" and its phonetic symbols in the text synchronized playback window W1 are given the highlight (identification) display HL in sequence starting from the first character, as shown in (1) to (3) in FIG. 16(B), and in the image synchronized playback window W2 the pronunciation mouth-shape images 12e (Non1 → Non2 → Non3) corresponding to the mouth numbers of the respective phonetic symbols are read out from the voice-specific mouth image data 12e and are switched, composited, and displayed in sequence in the mouth image area (X1, Y1; X2, Y2) of the base English character image 12d (No1UK).
[0119]
Also in this case, by the same text-corresponding mouth display process, at the switching composite display of the highlight (identification) display HL synchronized with the pronunciation voice output for the accent character "au" of the searched headword "laugh" and of the pronunciation mouth-shape image 12e (Non2), the English character (face) image 12d (No1UK), which is the composition destination of the mouth-shape image 12e (Non2), is changed to the accent-corresponding face image 12d (No1UK'), which represents a state of forceful pronunciation with, for example, sweat on the head or shaking of the body. The user can therefore easily learn, through their synchronized playback, the British pronunciation voice of the searched headword "laugh", its utterance timing, the corresponding portions of the characters "l", "au", "gh" and their phonetic symbols, and the respective pronunciation mouth-shape images 12e (Non1 → Non2 → Non3), and can moreover learn realistically the timing at which the voice is emphasized according to the British accent.
[0120]
Next, with reference to the main process of the mobile device 10 having the above-described configuration, an accent test process that tests whether the user can choose the correct accent of an English word will be described.
[0121]
FIG. 17 is a diagram showing the operation display state when an incorrect answer is selected in the accent test process of the mobile device 10: FIG. 17(A) shows the accent test question display screen G3, FIG. 17(B) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the headword display screen G2 of the question word, and FIG. 17(C) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice of the erroneous accent.
[0122]
FIG. 18 is a diagram showing the operation display state when a correct answer is selected in the accent test process of the mobile device 10: FIG. 18(A) shows the accent test question display screen G3, FIG. 18(B) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the headword display screen G2 of the question word, and FIG. 18(C) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice of the correct accent.
[0123]
That is, when the “accent test” key 17a6 in the input unit 17a is operated to set the accent test mode (step S13), a headword is randomly selected from the dictionary data stored in advance in the dictionary database 12b (step S14). As shown in FIG. 17(A), for the randomly selected word “low”, an accent test question display screen G3 that presents, as the selection items Et/Ef, the correct-accent phonetic symbol with the accent on the “o” part and the erroneous-accent phonetic symbol with the accent on the “u” part is displayed on the display unit 18 (step S15).
[0124]
On this accent test question display screen G3, when the selection frame X is moved by operating the cursor key 17a2 and the selection item Ef with the erroneous-accent phonetic symbol is detected as selected (step S16), the character image selected and set in advance in the character setting process (steps S2 to S6) as the synthesis destination of the pronunciation mouth-type image, together with its related image (in this case, the animation-like character image 12d (No1) and its accent-corresponding image (No1')), is changed, for example, from the normal yellow color to the blue character images (No1BL) (No1BL') (steps S17 → S18).
[0125]
At the same time, the pronunciation voice data read out from the dictionary voice data 12c in correspondence with the question word “low” is corrected to voice data that matches the erroneous-accent phonetic symbol selected by the user (step S19).
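The question generation and answer handling of steps S14 to S19 can be sketched as follows; the dictionary layout, asset names, and audio labels are assumptions made only to show the control flow.

    import random
    from dataclasses import dataclass

    @dataclass
    class Question:
        word: str
        correct_symbol: str    # phonetic symbol with the accent in the right place
        wrong_symbol: str      # phonetic symbol with the accent misplaced

    def make_question(dictionary: dict[str, tuple[str, str]]) -> Question:
        word = random.choice(list(dictionary))                 # step S14: random headword
        correct, wrong = dictionary[word]
        return Question(word, correct, wrong)

    def prepare_playback(q: Question, chose_wrong: bool) -> dict:
        if chose_wrong:                                        # steps S17 -> S18, S19
            return {"face": "No1BL", "accent_face": "No1BL'",  # recolour the character blue
                    "symbol": q.wrong_symbol,
                    "audio": q.word + "-wrong-accent"}         # voice corrected to the wrong accent
        return {"face": "No1", "accent_face": "No1'",          # step S17 -> SA: keep the defaults
                "symbol": q.correct_symbol,
                "audio": q.word + "-correct-accent"}

    q = make_question({"low": ("l'ou", "lo'u")})               # illustrative symbols only
    settings = prepare_playback(q, chose_wrong=True)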
[0126]
Then, the question word “low” is stored in the headword memory 12g in the RAM 12B, and dictionary data such as the pronunciation, part of speech, and semantic content corresponding to the headword “low” is read out, stored in the headword-corresponding dictionary data memory 12h in the RAM 12B, and displayed on the display unit 18 as the search headword display screen G2 corresponding to the question word, as shown in FIG. 17(B) (step S20).
[0127]
Here, when the “translation/decision (voice)” key 17a4 is operated (step S21) in order to output the pronunciation voice of the accent selected by the user for the question word “low” and, in synchronization with it, to display the headword characters, the phonetic symbols, and the pronunciation mouth-type image, the process proceeds to the synchronized playback process in FIG. 8 (step SA).
[0128]
Then, in step A2 of the synchronized playback process, the animation character image 12d (No1BL), changed to blue in accordance with the user's selection of the erroneous accent, is read out and transferred to the synchronization image file memory 12n in the RAM 12B. At the same time, based on the synchronous reproduction link data (see FIG. 2) for the current question word “low” stored in the dictionary database 12b, the HTML file for setting the text/image synchronized playback windows W1 and W2 (see FIG. 17(B)) on the search headword display screen G2 is read according to its HTML file No. and written into the synchronization HTML file memory 12j. The text data “low (with erroneous-accent phonetic symbol)” of the question word is read out and written into the synchronization text file memory 12k, and the pronunciation voice data corrected according to the erroneous accent of the question word is read out and written into the synchronization sound file memory 12m (step A2).
[0129]
Then, from the time code files 12fn for synchronized playback of the encrypted voice, text, and image corresponding to the various headwords stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12fn (see FIG. 5) corresponding to the current question word “low” is decoded and read according to the time code file No. described in the synchronous reproduction link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).
[0130]
Then, the synchronized playback process of the erroneous-accent pronunciation voice, the headword characters, and the pronunciation mouth-type image according to the time code file 12fn corresponding to the question word “low” is started, in the same manner as for the search headword “low” already described, by the reproduction process corresponding to each command code in steps A7 to A12 and the text-corresponding mouth display process in FIG. 9. Then, as shown in FIG. 17(B), in the text synchronized playback window W1 (Ef) on the search headword display screen G2, the erroneous-accent phonetic symbol chosen by the user is displayed together with the question word “low”, and in the image synchronized playback window W2 the animation-like character image 12d (No1BL), whose color was changed to blue by the user's selection of the erroneous accent, is displayed as the target image for mouth-shaped image synthesis.
[0131]
Thus, in synchronization with the output of the erroneous-accent pronunciation voice corresponding to the question word “low”, as shown in (1) to (3) in FIG. 17(C), the question word “low” and the erroneous phonetic symbol are sequentially highlighted (identification display HL) from the first character in the text synchronized playback window W1 (Ef), and in the image synchronized playback window W2, with the blue-changed animation-like character image 12d (No1BL) as the base, the pronunciation mouth-type images 12e (No36 → No9 → No8) corresponding to the mouth number of each phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, synthesized, and displayed in the mouth image area (X1, Y1; X2, Y2).
[0132]
In this case as well, according to the same text-corresponding mouth display process, when the highlight (identification) display HL synchronized with the pronunciation voice output reaches the erroneous-accent character “u” of the question word “Low” and the pronunciation mouth-type image 12e (No8) is switched and synthesized, the blue-changed animation character (face) image 12d (No1BL) that is the synthesis destination of the mouth image 12e (No8) is changed to and displayed as the accent-corresponding blue face image 12d (No1BL') representing a state of strong pronunciation by, for example, sweating of the head or shaking of the body. The user can therefore clearly learn that the erroneous-accent pronunciation voice of the question word “Low”, its wrong utterance timing, and each corresponding pronunciation mouth-type image 12e (No36 → No9 → No8) result from an incorrect accent.
[0133]
On the other hand, as shown in FIG. 18(A), when the selection frame X is moved by operating the cursor key 17a2 on the accent test question display screen G3 and the selection item Et with the correct-accent phonetic symbol is detected as selected (step S16), the process proceeds to the synchronized playback process in FIG. 8 without performing the blue color change process (step S18) of the character image 12d (No1) or the pronunciation voice correction process (step S19) according to the erroneous accent (steps S17 → SA).
[0134]
Then, in the same manner as the synchronized playback process of the pronunciation voice, text, and pronunciation mouth-type image corresponding to the search headword “low” described above with reference to FIG. 13, with the animation character image 12d (No1) set, as shown in FIG. 18(B), the correct-accent phonetic symbol chosen by the user is displayed together with the question word “low” in the text synchronized playback window W1 (Et) on the search headword display screen G2, and the normal-color animation-like character image 12d (No1) as set in advance is displayed as the target image for mouth-shaped image synthesis in the image synchronized playback window W2.
[0135]
As a result, in synchronization with the output of the correct-accent pronunciation voice corresponding to the question word “low”, as shown in (1) to (3) in FIG. 18(C), the question word “low” and its correct phonetic symbol are sequentially highlighted (identification display HL) from the first character in the text synchronized playback window W1 (Et), and in the image synchronized playback window W2, with the normal-color animation character image 12d (No1) as set in advance as the base, the pronunciation mouth-type images 12e (No36 → No9 → No8) corresponding to the mouth number of each phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, synthesized, and displayed in the mouth image area (X1, Y1; X2, Y2).
[0136]
In this case as well, according to the same text-corresponding mouth display process, when the highlight (identification) display HL synchronized with the pronunciation voice output reaches the correct-accent character “o” of the question word “Low” and the pronunciation mouth-type image 12e (No9) is switched and synthesized, the animation character (face) image 12d (No1) that is the synthesis destination of the mouth-shaped image 12e (No9) is changed to and displayed as the accent-corresponding face image 12d (No1') representing a state of strong pronunciation by, for example, sweating of the head or shaking of the body. The user can therefore clearly learn the correct-accent pronunciation voice of the question word “Low”, its correct utterance timing, and each corresponding pronunciation mouth-type image 12e (No36 → No9 → No8).
[0137]
Therefore, according to the synchronized playback function of the pronunciation voice, text, and pronunciation mouth-type image accompanying a headword search in the portable device 10 of the first embodiment having the above-described configuration, when the search headword “low” is input, the corresponding dictionary data is retrieved, and the search headword display screen G2 is displayed, operating the “translation/decision (voice)” key 17a4 causes, in synchronization with the pronunciation voice output from the stereo voice output unit 19b according to the time code file 12f23, the search headword “low” and its phonetic symbols to be sequentially highlighted (identification display HL) in the text synchronized playback window W1, while in the image synchronized playback window W2, with the preset character image 12d (No3) as the base, the pronunciation mouth-type images 12e (No36 → No9 → No8) corresponding to the mouth number of each phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, synthesized, and displayed in the mouth image area (X1, Y1; X2, Y2).
[0138]
In addition, when the highlight (identification) display HL and the pronunciation mouth-type image 12e (No9) are switched and displayed in synchronization with the pronunciation voice output for the accent character “o” of the search headword “Low”, the character (face) image 12d (No3) that is the synthesis destination of the mouth-shaped image 12e (No9) is changed to and displayed as the accent-corresponding face image 12d (No3') representing a state of strong pronunciation by, for example, sweating of the head or shaking of the body. The user can therefore not only easily learn the pronunciation voice of the search headword “Low”, its utterance timing, the corresponding parts of the characters “L”, “o”, “w” and their phonetic symbols, and each pronunciation mouth-type image 12e (No36 → No9 → No8) from their synchronized playback, but can also realistically learn when to emphasize the utterance according to the accent.
[0139]
Furthermore, according to the synchronized playback function of the pronunciation voice, text, and pronunciation mouth-type image accompanying a headword search in the mobile device 10 of the first embodiment having the above-described configuration, when a headword search is performed on a dictionary database 12b containing phonetic symbols of both the US dialect and the English dialect, designating the US pronunciation [US] or the English pronunciation [English] and operating the “translation/decision (voice)” key 17a4, as shown in FIG. 15 or FIG. 16, causes the search headword “laugh” and the US or English phonetic symbols to be sequentially highlighted (identification display HL) in the text synchronized playback window W1 in synchronization with the designated US or English pronunciation voice, while the preset character image 12d (No1) is displayed in the image synchronized playback window W2 as its US expression (No1US) or English expression (No1UK), and the pronunciation mouth-type images 12e (Non1 → Non2 → Non3) corresponding to the mouth number of each US or English phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, synthesized, and displayed in the mouth image area (X1, Y1; X2, Y2). It therefore becomes possible to learn clearly, for the same search headword, both the US dialect pronunciation voice with its phonetic symbols and pronunciation mouth shapes and the English dialect pronunciation voice with its phonetic symbols and pronunciation mouth shapes.
[0140]
Further, according to the synchronized playback function of the pronunciation voice, text, and pronunciation mouth-type image accompanying a headword search in the portable device 10 of the first embodiment having the above-described configuration, each headword recorded in the dictionary database 12b has both a correct-accent phonetic symbol and an erroneous-accent phonetic symbol, and when the “accent test” key 17a6 is operated, the correct-accent phonetic symbol and the erroneous-accent phonetic symbol of a randomly selected headword “low” are displayed as the accent test question display screen G3, as shown in FIGS. 17 and 18. When the correct-accent phonetic symbol is selected, each pronunciation mouth-type image 12e (No36 → No9 → No8) is synthesized and displayed on the normally set character image 12d (No1) in synchronization with the correct pronunciation voice output; when the erroneous-accent phonetic symbol is selected, each pronunciation mouth-type image 12e (No36 → No9 → No8) is synthesized and displayed on the character image 12d (No1BL) changed to blue, in synchronization with the erroneous pronunciation voice output. In both cases, at the playback of the correct or erroneous accent part, the character image 12d (No1) (No1BL) serving as the synthesis base of the mouth-shaped image is changed to and displayed as the accent-corresponding character image 12d (No1') (No1BL'). It therefore becomes possible to learn clearly the correct accent and erroneous accent of various words through the synchronized playback of the voice, text, and image corresponding to each.
[0141]
In the first embodiment, the synchronized playback of the pronunciation voice, the text (with phonetic symbols), and the pronunciation mouth-type image corresponding to the search headword is performed by the synchronized playback process according to the time code file 12f, in which the text characters are sequentially identified and displayed in synchronization with the voice output and the pronunciation mouth-type image corresponding to the phonetic symbol of the identified character is displayed by the text-corresponding mouth display process executed by interruption in accordance with that sequential identification display. Alternatively, as described in the following second embodiment, a plurality of sets of various phonetic symbols, including phonetic symbols with accent marks, together with their respective pronunciation voice data and pronunciation face images, may be stored in advance in association with one another, and as the headword characters to be played back are highlighted in order from the top, the pronunciation voice data and the face image data associated with the phonetic symbol of the highlighted character may be output and displayed in sequence.
[0142]
(Second Embodiment)
FIG. 19 is a flowchart showing a headword synchronized reproduction process of the mobile device 10 according to the second embodiment.
[0143]
That is, in the mobile device 10 according to the second embodiment, a plurality of sets of various phonetic symbols, including accented phonetic symbols, their respective pronunciation voice data, and pronunciation face images composed of mouth portions and facial expressions of different forms corresponding to that voice data, are stored in the memory 12 in advance.
[0144]
For example, when an arbitrary headword “low” is input and searched for in the English-Japanese dictionary stored in advance as the dictionary database 12b and displayed as the search headword display screen G2 as shown in FIG. 11, and the “translation/decision (voice)” key 17a4 is operated in order to perform the synchronized playback of the pronunciation voice and the pronunciation face image, the synchronized playback process of the second embodiment shown in FIG. 19 is started.
[0145]
When the synchronized playback process of the second embodiment is started, the text synchronized playback window W1 is first opened on the search headword display screen G2 as shown in FIG. 12 or FIG. 13, and the characters “low” of the search headword and their phonetic symbols are highlighted (identification display HL) from the top in the order of their pronunciation (step C1). Then, the phonetic symbol of the highlighted headword character is read (step C2), and it is determined whether or not it carries an accent mark (step C3).
[0146]
Here, as shown in FIG. 12(B)(1) or FIG. 13(B)(1), when the phonetic symbol of the character “l” in the highlighted headword “low” has no accent mark, the unaccented pronunciation voice data corresponding to that phonetic symbol, stored in advance in the memory 12, is read out and output from the stereo voice output unit 19b (steps C3 → C4), and the unaccented pronunciation face image is read out and displayed in the image synchronized playback window W2 (step C5).
[0147]
Then, the character “o” following the currently output character of the search headword “low” is read (steps C6 → C7), and the process returns to step C1, where that character is highlighted (identification display HL) together with its phonetic symbol, as shown by (2) in FIG. 13(B) (step C1).
[0148]
If it is determined that the phonetic symbol of the character “o” in the highlighted headword “low” carries an accent mark (steps C2 and C3), the accented pronunciation voice data corresponding to that phonetic symbol, stored in advance in the memory 12, is read out and output from the stereo voice output unit 19b (steps C3 → C8), and, as shown by (2) in FIG. 12(C) or (2) in FIG. 13(B), a pronunciation face image expressing the accent by, for example, sweating of the head or shaking of the body is read out and displayed in the image synchronized playback window W2 (step C9).
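The per-character loop of steps C1 to C9 can be illustrated with the following Python sketch; the accent-mark test, the file names, and the way voices and faces are keyed are assumptions chosen for illustration only.

    def play_headword(chars: list[str], symbol_of: dict[str, str],
                      plain_voice: dict[str, str], accent_voice: dict[str, str],
                      plain_face: str, accent_face: str) -> list[tuple[str, str, str]]:
        timeline = []
        for ch in chars:                                   # C1: highlight in pronunciation order
            sym = symbol_of[ch]                            # C2: phonetic symbol of the character
            if "ˈ" in sym:                                 # C3: does it carry an accent mark?
                timeline.append((ch, accent_voice[sym], accent_face))   # C8, C9
            else:
                timeline.append((ch, plain_voice[sym], plain_face))     # C4, C5
        return timeline                                    # C6, C7: continue to the next character

    # e.g. "low" with the accent on "o":
    timeline = play_headword(
        ["l", "o", "w"],
        symbol_of={"l": "l", "o": "ˈo", "w": "w"},
        plain_voice={"l": "l.wav", "w": "w.wav"},
        accent_voice={"ˈo": "o-stressed.wav"},
        plain_face="face-normal.png",
        accent_face="face-sweating.png",
    )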
[0149]
Therefore, also in the mobile device 10 of the second embodiment, when the pronunciation voice and the pronunciation face image associated with the highlight (identification) display HL of the accent character “o” of the search headword “Low” are output and displayed, the pronunciation face image is displayed as an accent-corresponding face image expressing a state of strong pronunciation by sweating of the head or shaking of the body, based on the accented phonetic symbol. The user can therefore not only easily learn each character “L”, “o”, “w” of the search headword “Low”, its pronunciation voice, and each pronunciation face image from their associated output, but can also realistically learn the part of the utterance to be emphasized according to the accent.
[0150]
In the second embodiment, for the various phonetic symbols including accented phonetic symbols stored in advance in the memory 12, their respective pronunciation voice data, and the pronunciation face images composed of mouth portions and facial expressions of different forms corresponding to that voice data, the output volume of the pronunciation voice associated with an accented phonetic symbol is set larger than that of the pronunciation voice associated with an unaccented phonetic symbol, and the opening degree of the mouth portion of the pronunciation face image associated with an accented phonetic symbol is set larger than that of the pronunciation face image associated with an unaccented phonetic symbol. Further, the facial expression of the pronunciation face image associated with an accented phonetic symbol is set to be more emphasized than that of the pronunciation face image associated with an unaccented phonetic symbol.
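Purely as an illustration of the relations just described, the stored accented and unaccented variants might differ as in the following settings (the numeric values are arbitrary assumptions):

    PRONUNCIATION_STYLE = {
        # unaccented symbol: quieter voice, smaller mouth opening, neutral expression
        "unaccented": {"volume": 0.6, "mouth_opening": 0.4, "expression": "neutral"},
        # accented symbol: louder voice, wider mouth opening, emphasized expression
        "accented":   {"volume": 1.0, "mouth_opening": 0.8, "expression": "emphasized"},
    }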
[0151]
In the second embodiment, various phonetic symbols including accented phonetic symbols, their respective pronunciation voice data, and pronunciation face images composed of mouth portions and facial expressions of different forms corresponding to that voice data are stored in advance, each character of the search headword is highlighted in the order of its pronunciation, the pronunciation voice associated with its phonetic symbol is read out and output, and the corresponding pronunciation face image is read out and displayed. Alternatively, as described in the following third embodiment, the pronunciation voice and the pronunciation face image of a headword may be stored in combination in correspondence with each headword in the dictionary database 12b, the pronunciation voice and the pronunciation face image may be read out and output along with the character display of the search headword, and the accent portion may be determined by detecting the peak level of the pronunciation voice signal at that time, so that the mouth portion and facial expression of the pronunciation face image are changed under control to a different display form.
[0152]
(Third embodiment)
FIG. 20 is a flowchart showing a headword synchronized reproduction process of the mobile device 10 according to the third embodiment.
[0153]
That is, in the mobile device 10 according to the third embodiment, the pronunciation sound and the pronunciation face image of the entry word are stored in combination in advance corresponding to each entry word in each dictionary data of the dictionary database 12b.
[0154]
For example, when an arbitrary headword “low” is input and searched for in the English-Japanese dictionary stored in advance as the dictionary database 12b and displayed as the search headword display screen G2 as shown in FIG. 11, and the “translation/decision (voice)” key 17a4 is operated in order to perform the synchronized playback of the pronunciation voice and the pronunciation face image, the synchronized playback process of the third embodiment shown in FIG. 20 is started.
[0155]
When the synchronized playback process of the third embodiment is started, the text synchronized playback window W1 is first opened on the search headword display screen G2 as shown in FIG. 12 or FIG. 13, and each character “low” of the search headword is highlighted (identification display HL) from the top in the order of pronunciation (step D1). Then, the pronunciation voice data of the portion corresponding to the highlighted headword character is read (step D2) and output from the stereo voice output unit 19b (step D3).
[0156]
Here, it is determined whether the signal (waveform) level of the pronunciation voice data of the portion corresponding to the highlighted character “l” in the headword “low” is at or above a certain voice signal level, that is, whether it is an accent portion (step D4). When it is determined that the level is not above the threshold, that is, the portion is not an accent portion, the pronunciation face image stored in association with the search headword is read out and displayed as it is in the image synchronized playback window W2 (step D5).
[0157]
Then, the character “o” following the currently output character of the search headword “low” is read (steps D6 → D7), the process returns to step D1, and that character is highlighted (identification display HL) (step D1).
[0158]
Then, the pronunciation voice data of the portion corresponding to the newly highlighted headword character “o” is read (step D2) and output from the stereo voice output unit 19b (step D3), and it is determined whether the signal (waveform) level of the pronunciation voice data of the portion corresponding to the highlighted headword character “o” is at or above the certain voice signal level, that is, whether it is an accent portion (step D4).
[0159]
Here, when it is determined that the level is at or above the certain voice signal level, that is, the portion is an accent portion, the pronunciation face image stored in association with the search headword is read out, changed under control to a face image with a larger mouth opening and a stronger expression (for example, FIG. 12(B)(2) → FIG. 12(C)(2)), and displayed in the image synchronized playback window W2 (steps D4 → D8).
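The accent detection of steps D1 to D8 can be pictured with the following Python sketch; the threshold value, the sample lists, and the image names are illustrative assumptions rather than parameters of the embodiment.

    THRESHOLD = 0.6   # the "certain value" of the voice signal level

    def face_frames(segments: list[tuple[str, list[float]]],
                    normal_face: str, accent_face: str) -> list[tuple[str, str]]:
        frames = []
        for ch, samples in segments:                       # D1-D3: highlight and output the voice
            peak = max(abs(s) for s in samples)            # D4: signal (waveform) level of the segment
            face = accent_face if peak >= THRESHOLD else normal_face   # D8 / D5
            frames.append((ch, face))
        return frames

    frames = face_frames(
        [("l", [0.10, 0.20]), ("o", [0.55, 0.90]), ("w", [0.20, 0.10])],
        normal_face="face-normal.png",
        accent_face="face-wide-mouth.png",
    )
    # only the "o" segment exceeds the threshold, so only it uses the emphasized face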
[0160]
When the voice signal waveform level of the pronunciation voice is determined to be at or above the certain value, that is, determined to be an accent portion, the display may further be changed to indicate that the character belongs to the accent portion, for example by changing or adding to the display color of the highlighted headword character or by changing its character font.
[0161]
Therefore, also in the portable device 10 of the third embodiment, when the pronunciation voice and the pronunciation face image associated with the highlight (identification) display HL of the accent character “o” of the search headword “Low” are output and displayed, the pronunciation face image is changed and displayed, based on the voice signal level at that time being at or above the certain value, as an accent-corresponding face image with, for example, a large mouth opening and a strong expression. The user can therefore not only easily learn each character “L”, “o”, “w” of the search headword “Low”, its pronunciation voice, and the pronunciation face image from their associated output, but can also realistically learn the part of the utterance to be emphasized according to the accent.
[0162]
In the description of the synchronized playback function for the characters (text), pronunciation voice, and pronunciation face image (including the pronunciation mouth-type image) of the search headword in each of the embodiments, the accent of the headword was assumed to exist in one place. When the accent of the search headword exists in two places, a first accent and a second accent, the accent-corresponding pronunciation face image (or corresponding pronunciation mouth-type image) displayed for each accent portion may be displayed in a different form, for example with a different mouth opening size or expression strength, depending on whether it is the first accent or the second accent.
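If a first and a second accent are to be distinguished as suggested above, the display parameters could be graded rather than binary, for example as in the following illustrative mapping (the values are arbitrary assumptions):

    ACCENT_DISPLAY = {
        "primary":   {"mouth_opening": 0.9, "expression_strength": 1.0},
        "secondary": {"mouth_opening": 0.6, "expression_strength": 0.6},
        "none":      {"mouth_opening": 0.3, "expression_strength": 0.0},
    }

    def face_params(accent_rank: str) -> dict:
        # Fall back to the unaccented form for characters without any accent.
        return ACCENT_DISPLAY.get(accent_rank, ACCENT_DISPLAY["none"])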
[0163]
Note that each processing method of the portable device 10 described in the embodiments, namely the main process according to the dictionary processing program 12a of the first embodiment shown in the flowchart of FIG. 7, the headword synchronized playback process accompanying the main process shown in the flowchart of FIG. 8, the text-corresponding mouth display process executed by interruption in accordance with the highlight display of each headword character accompanying the headword synchronized playback process shown in the flowchart of FIG. 9, the headword synchronized playback process of the second embodiment shown in the flowchart of FIG. 19, and the headword synchronized playback process of the third embodiment shown in the flowchart of FIG. 20, can be stored and distributed, as a program executable by a computer, on an external recording medium 13 such as a memory card (ROM card, RAM card, DATA CARD, etc.), a magnetic disk (floppy disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory. Various computer terminals having a communication function with the communication network (Internet) N can read the program stored in the external recording medium 13 into the memory 12 by means of the recording medium reading unit 14 and, by having their operation controlled by the read program, realize the synchronized playback function of the characters (text), pronunciation voice, and pronunciation face image (including the pronunciation mouth-type image) corresponding to the search headword described in the embodiments and execute the same processes by the methods described above.
[0164]
The program data for realizing each of the above methods can also be transmitted over the communication network (Internet) N in the form of program code, and the program data can be captured from a computer terminal connected to the communication network (Internet) N to realize the above-described synchronized playback function of the characters (text), pronunciation voice, and pronunciation face image (including the pronunciation mouth-type image) corresponding to the search headword.
[0165]
Note that the present invention is not limited to the above-described embodiments, and various modifications can be made at the implementation stage without departing from the scope of the invention. Further, each embodiment includes inventions at various stages, and various inventions can be extracted by appropriately combining the plurality of disclosed constituent elements. For example, even if some constituent elements are deleted from all the constituent elements shown in each embodiment, or some constituent elements are combined, a configuration in which those constituent elements are deleted or combined can be extracted as an invention, as long as the problems described in the column of the problem to be solved by the invention can be solved and the effects described in the column of the effects of the invention can be obtained.
[0168]
[The invention's effect]
  As described above, according to the voice display output control device of claim 1 of the invention (the voice display output control processing program of claim 2), correct-accent pronunciation voice data or erroneous-accent pronunciation voice data of a word stored by the word storage means can be output, the text of the word can be displayed in synchronization with the pronunciation voice data, a mouth-shaped image corresponding to the pronunciation voice data can be displayed for the mouth portion included in the display image, and the display image can be changed in response to detection of the word accent, so that correct and erroneous accents can be learned easily and clearly. In addition, either the correct-accent phonetic symbol or the erroneous-accent phonetic symbol of the word stored by the word storage means can be selected and the corresponding pronunciation voice data output, with the word text displayed in synchronization with that pronunciation voice data, a mouth-shaped image corresponding to the pronunciation voice data displayed for the mouth portion included in the display image, and the display image changed in response to detection of the word accent, so that the correct accent and erroneous accent of the word, and their timing, can be learned even more easily and clearly.
[0175]
Therefore, according to the present invention, it is possible to provide a voice display output control device, an image display control device, a voice display output control processing program, and an image display control processing program that can clearly show the timing of accents in the display of an image synchronized with voice output.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an electronic circuit of a mobile device 10 according to an embodiment of an audio display output control device (image display control device) of the present invention.
FIG. 2 is a diagram relating to one headword “low” in the dictionary database 12b stored in the memory 12 of the mobile device 10: (A) shows its synchronous playback link data, (B) shows the text data “low” stored according to the text file No., and (C) shows the text characters, phonetic symbols, and mouth-shape numbers stored according to the text-mouth synchronization file No.
FIG. 3 is a diagram showing the character image data 12d stored in the memory 12 of the portable device 10 and selectively used by user setting for the synchronized display of pronunciation mouth-type images in a dictionary headword search.
FIG. 4 is a diagram showing the voice-specific mouth image data 12e that is stored in the memory 12 of the portable device 10 and is synthesized and displayed in the mouth image area (X1, Y1; X2, Y2) of the character images (12d: No1 to No3) for the synchronized display of pronunciation mouth-type images in a dictionary headword search.
FIG. 5 is a view showing the time code file 12f23 (12i) of file No. 23 associated with the headword “low” in the dictionary time code file 12f stored in the memory 12 of the portable device 10.
FIG. 6 is a diagram showing the command codes of the various commands described in the dictionary time code file 12fn (see FIG. 5) of the mobile device 10 in association with the command contents analyzed on the basis of their parameter data.
FIG. 7 is a flowchart showing main processing according to the dictionary processing program 12a of the mobile device 10;
FIG. 8 is a flowchart showing a headword synchronized reproduction process accompanying a main process of the mobile device 10;
FIG. 9 is a flowchart showing a text corresponding mouth display process executed by interruption in accordance with the highlight display of each headword character accompanying the headword synchronized playback process of the mobile device 10;
FIG. 10 is a view showing the setting display state of a character image for synchronized playback accompanying the character setting process in the main process of the portable device 10.
FIG. 11 is a diagram showing a search headword display screen G2 associated with the headword search process in the main process of the mobile device 10;
FIG. 12 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-type display window W2 displayed on the search headword display screen G2, with character image No3 set, in accordance with the synchronized playback process in the headword search process of the portable device 10: (A) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the search headword display screen G2, (B) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 (not accent-corresponding) synchronized with the output of the pronunciation voice, and (C) shows the change state of the headword character display window W1 and the accent-corresponding pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice.
FIG. 13 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-type display window W2 displayed on the search headword display screen G2, with character image No1 set, in accordance with the synchronized playback process in the headword search process of the portable device 10: (A) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the search headword display screen G2, and (B) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice.
FIG. 14 is a diagram showing the search headword display screen G2 when an English-Japanese dictionary containing the pronunciation forms of the two countries, the US and the UK, is used in the headword search process in the main process of the mobile device 10.
FIG. 15 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-type display window W2 displayed on the search headword display screen G2 when the American pronunciation [US] is designated, in accordance with the synchronized playback process in the headword search process of the portable device 10: (A) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the search headword display screen G2, and (B) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the American pronunciation voice.
FIG. 16 is a diagram showing the display states of the headword character display window W1 and the pronunciation mouth-type display window W2 displayed on the search headword display screen G2 when the English pronunciation [English] is designated, in accordance with the synchronized playback process in the headword search process of the portable device 10: (A) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the search headword display screen G2, and (B) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the English pronunciation voice.
FIG. 17 is a diagram showing the operation display state when an incorrect answer is selected in the accent test process of the mobile device 10: (A) shows the accent test question display screen G3, (B) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the headword display screen G2 of the question word, and (C) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice of the erroneous accent.
FIG. 18 is a diagram showing the operation display state when a correct answer is selected in the accent test process of the portable device 10: (A) shows the accent test question display screen G3, (B) shows the setting display state of the headword character display window W1 and the pronunciation mouth-type display window W2 on the headword display screen G2 of the question word, and (C) shows the change state of the headword character display window W1 and the pronunciation mouth-type display window W2 synchronized with the output of the pronunciation voice of the correct accent.
FIG. 19 is a flowchart showing a headword synchronized playback process of the second embodiment of the mobile device 10;
FIG. 20 is a flowchart showing a headword synchronized reproduction process of the third embodiment of the mobile device 10;
[Explanation of symbols]
10 ... Mobile device
11 ... CPU
12 ... Memory
12A ... FLASH memory
12B ... RAM
12a ... Dictionary processing program
12b ... Dictionary database
12c ... Dictionary voice data
12d: Character image data
12d (No. n) ... set character image
12d (No. n ') ... face image corresponding to accent
12d (No. nUS) ... set character image for American language
12d (No. nUS ') ... Face image for American accent
12d (No. nUK) ... English set character image
12d (No. nUK ') ... Face image for English accents
12d (No. nBL) ... Blue change setting character image
12d (No. nBL ') ... Blue face image corresponding to accent
12e ... Voice-specific mouth image data
12f ... Dictionary time code file
12g ... Headword memory
12h ... Dictionary data memory for headwords
12i ... Time code file memory (time code file No. 23)
12j ... HTML file memory for synchronization
12k ... Text file memory for synchronization
12m ... Sound file memory for synchronization
12n ... Image file memory for synchronization
12p ... Mouth image area memory
12q ... Image development buffer
13: External recording medium
14 ... Recording medium reader
15 ... Transmission control unit
16 ... Communication unit
17a ... Input section
17b ... Coordinate input device
18 ... Display section
19a ... Voice input part
19b ... Stereo audio output unit
20 ... Communication equipment (home PC)
30: Web server
N ... Communication network (Internet)
X ... Selection frame
H ... Header information of time code table
G1 ... Character image list selection screen
G2 ... Search headword display screen
G3 ... Accent test questions display screen
W1 ... Headword character display window (text synchronous playback window)
W2 ... Pronunciation mouth-type display window (image synchronized playback window)
HL ... Highlight (identification) display
Et ... Correct accent selection item
Ef ... Error accent selection item

Claims (2)

  1. A word storage means for storing a plurality of words and phonetic symbols with correct accents and phonetic symbols with error accents for each of the words,
    Voice data output means for outputting correct accent pronunciation voice data or erroneous accent pronunciation voice data of the word stored by the word storage means;
    Text synchronous display control means for displaying the text of the word in synchronism with pronunciation voice data of the word output by the voice data output means;
    Image display control means for displaying an image including at least a mouth portion in different display forms when the correct-accent pronunciation voice data is output by the voice data output means and when the erroneous-accent pronunciation voice data is output;
    A mouth image that displays a mouth-shaped image corresponding to the pronunciation sound data in synchronism with the pronunciation sound data output by the sound data output means for the mouth portion included in the image displayed by the image display control means Display control means;
    Accent detection means for detecting the accent of the word from the accented phonetic symbol stored by the word storage means in accordance with the synchronous display of the word text by the text synchronization display control means,
    An image change display control means for changing an image displayed by the image display control means in response to detection of an accent by the accent detection means;
    Correct / incorrect accent display control means for displaying a word stored by the word storage means and a correct accented pronunciation symbol and an error accented pronunciation symbol associated with the word side by side;
    A correct / incorrect accent selection means for selecting either a correct accented phonetic symbol or an error accented phonetic symbol of the word displayed by the correct / incorrect accent display control means,
    The voice data output means outputs the correct-accent pronunciation voice data or the erroneous-accent pronunciation voice data of the corresponding word in accordance with the correct/incorrect word accent selection by the correct/incorrect accent selection means.
    An audio display output control device characterized by the above.
  2. An audio display output control processing program for controlling a computer of an electronic device to synchronously reproduce audio data, text, and an image,
    The computer,
    Word storage means for storing a plurality of words and phonetic symbols with correct accents and phonetic symbols with error accents associated with each of the words,
    Voice data output means for outputting correct accent pronunciation voice data or error accent pronunciation voice data of the word stored by the word storage means;
    Text synchronized display control means for displaying the text of the word in synchronization with the pronunciation voice data of the word output by the voice data output means;
    Image display control means for displaying an image including at least a mouth portion in different display forms when the correct-accent pronunciation voice data is output by the voice data output means and when the erroneous-accent pronunciation voice data is output,
    A mouth image that displays a mouth-shaped image corresponding to the pronunciation sound data in synchronism with the pronunciation sound data output by the sound data output means for the mouth portion included in the image displayed by the image display control means Display control means,
    Accent detection means for detecting the accent of the word from the accented phonetic symbol of the corresponding word stored by the word storage means in accordance with the synchronous display of the word text by the text synchronous display control means,
    Image change display control means for changing the image displayed by the image display control means in accordance with the detection of accents by the accent detection means;
    Correct / incorrect accent display control means for displaying the word stored by the word storage means and the correct accented pronunciation symbol and the error accented pronunciation symbol associated with the word side by side;
    Correct / incorrect accent selection means for selecting either correct accented phonetic symbols or error accented phonetic symbols of the words displayed by the correct / incorrect accent display control means,
    Function as
    The voice data output means functions to output correct accent pronunciation voice data or erroneous accent pronunciation voice data of the corresponding word according to correct / incorrect word accent selection by the correct / incorrect accent selection means,
    A computer-readable audio display output control processing program.
JP2003143499A 2003-05-21 2003-05-21 Voice display output control device and voice display output control processing program Active JP4370811B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2003143499A JP4370811B2 (en) 2003-05-21 2003-05-21 Voice display output control device and voice display output control processing program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2003143499A JP4370811B2 (en) 2003-05-21 2003-05-21 Voice display output control device and voice display output control processing program

Publications (2)

Publication Number Publication Date
JP2004347786A JP2004347786A (en) 2004-12-09
JP4370811B2 true JP4370811B2 (en) 2009-11-25

Family

ID=33531274

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2003143499A Active JP4370811B2 (en) 2003-05-21 2003-05-21 Voice display output control device and voice display output control processing program

Country Status (1)

Country Link
JP (1) JP4370811B2 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4626310B2 (en) * 2005-01-12 2011-02-09 ヤマハ株式会社 Pronunciation evaluation device
KR100593757B1 (en) 2005-02-14 2006-06-20 유철민 Foreign language studying device for improving foreign language studying efficiency, and on-line foreign language studying system using the same
WO2006085719A1 (en) * 2005-02-14 2006-08-17 Hay Kyung Yoo Foreign language studying device for improving foreign language studying efficiency, and on-line foreign language studying system using the same
JP4678672B2 (en) * 2005-03-09 2011-04-27 誠 後藤 Pronunciation learning device and pronunciation learning program
JP2006301063A (en) * 2005-04-18 2006-11-02 Yamaha Corp Content provision system, content provision device, and terminal device
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
JP4840052B2 (en) * 2006-09-28 2011-12-21 カシオ計算機株式会社 Pronunciation learning support device and pronunciation learning support program
KR100816378B1 (en) 2006-11-15 2008-03-25 주식회사 에듀왕 Method for studying english pronunciation using basic word pronunciation
WO2009066963A2 (en) * 2007-11-22 2009-05-28 Intelab Co., Ltd. Apparatus and method for indicating a pronunciation information
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
CN101383102A (en) * 2008-10-24 2009-03-11 无敌科技(西安)有限公司 Simulation video and audio synchronous display apparatus and method
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
WO2011152575A1 (en) 2010-05-31 2011-12-08 주식회사 클루소프트 Apparatus and method for generating vocal organ animation
US9483461B2 (en) * 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9578173B2 (en) 2015-06-05 2017-02-21 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
DK179309B1 (en) 2016-06-09 2018-04-23 Apple Inc Intelligent automated assistant in a home environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc Intelligent task discovery
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
CN109147430A (en) * 2018-10-19 2019-01-04 渭南师范学院 A kind of teleeducation system based on cloud platform

Also Published As

Publication number Publication date
JP2004347786A (en) 2004-12-09

Similar Documents

Publication Publication Date Title
US5822720A (en) System amd method for linking streams of multimedia data for reference material for display
US6397183B1 (en) Document reading system, read control method, and recording medium
US6963841B2 (en) Speech training method with alternative proper pronunciation database
US7693717B2 (en) Session file modification with annotation using speech recognition or text to speech
KR900009170B1 (en) Synthesis-by-rule type synthesis system
US7292980B1 (en) Graphical user interface and method for modifying pronunciations in text-to-speech and speech recognition systems
US5697789A (en) Method and system for aiding foreign language instruction
US7107533B2 (en) Electronic book with multimode I/O
CN1206620C (en) Transcription and display input speech
US6115686A (en) Hyper text mark up language document to speech converter
Bucholtz Variation in transcription
JP5026452B2 (en) Text processor
US7149690B2 (en) Method and apparatus for interactive language instruction
US6377925B1 (en) Electronic translator for assisting communications
AU2016202974B2 (en) Automatically creating a mapping between text data and audio data
US6181351B1 (en) Synchronizing the moveable mouths of animated characters with recorded speech
TWI488174B (en) Automatically creating a mapping between text data and audio data
US6985864B2 (en) Electronic document processing apparatus and method for forming summary text and speech read-out
US8392186B2 (en) Audio synchronization for document narration with user-selected playback
US20080005656A1 (en) Apparatus, method, and file format for text with synchronized audio
US20030191645A1 (en) Statistical pronunciation model for text to speech
US20100036664A1 (en) Subtitle generation and retrieval combining document processing with voice processing
JP2007206317A (en) Authoring method and apparatus, and program
US8209169B2 (en) Synchronization of an input text of a speech with a recording of the speech
US20020054073A1 (en) Electronic book with indexed text-to-audio switching capabilities

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20060502

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20090402

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20090421

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20090612

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20090811

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20090824

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120911

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130911

Year of fee payment: 4