WO2006001204A1 - Automatic translation device and automatic translation method - Google Patents

Automatic translation device and automatic translation method

Info

Publication number
WO2006001204A1
Authority
WO
WIPO (PCT)
Prior art keywords
language
speech
automatic translation
intermediate language
module
Prior art date
Application number
PCT/JP2005/010946
Other languages
French (fr)
Japanese (ja)
Inventor
Hiromichi Ishibashi
Original Assignee
Matsushita Electric Industrial Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co., Ltd.
Publication of WO2006001204A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/40 Processing or translation of natural language
    • G06F 40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Definitions

  • the present invention relates to an automatic translation apparatus and an automatic translation method for executing interpretation processing between two languages in real time and in a user-friendly manner.
  • Patent Document 1 describes an apparatus that recognizes the speech of a speaker of a first language, translates it into a second language, synthesizes second-language speech, and conveys it to the listener, and vice versa. Patent Document 2 describes an invention in which the translated second language is output from a dedicated speaker provided on a mobile phone. Patent Document 3 describes a technique for performing translation between two languages by on-screen display rather than by voice. Patent Documents 4 to 6 describe configurations suited to a portable type. Patent Document 7 describes a technique in which both voice and display are used and a confirmation of the voice input is shown on the display. In particular, Patent Document 7 states that, because today's technology still cannot fully recognize spoken language when performing automatic translation of speech input, the first and second languages are displayed simultaneously in order to tolerate a certain amount of "mishearing".
  • Patent Document 1: JP-A-9-292971
  • Patent Document 2: JP-A-8-265445
  • Patent Document 3: JP-A-6-119372
  • Patent Document 4: JP-A-6-325072
  • Patent Document 5: JP-A-8-77176
  • Patent Document 6: JP-A-9-160881
  • Patent Document 7: JP-A-8-278972
  • An object of the present invention is to provide an automatic translation apparatus that executes interpretation processing between two languages in real time and in a user-friendly manner.
  • An automatic translation apparatus comprises: a first speech analysis module that analyzes speech spoken in a first language and generates a first intermediate language;
  • a first translation module for converting the first intermediate language into a second language; and
  • a character synthesis module for displaying the converted second language on a display.
  • the apparatus may further comprise a first speech synthesis module that reconverts the first intermediate language into the first language and generates a speech signal of the reconverted first language. It may also comprise a second speech synthesis module that generates a speech signal of the converted second language.
  • a confirmation button for confirming the first intermediate language may further be provided.
  • a correction button may further be provided that marks the first intermediate language as indeterminate and prompts the user to re-input the first-language speech.
  • a second speech analysis module that analyzes speech spoken in the second language and generates a second intermediate language may be further provided.
  • a second translation module for converting the second intermediate language into the first language may be further provided.
  • the character synthesis module may re-convert the second intermediate language into a second language and display it on the display.
  • a confirmation button for confirming the second intermediate language may be further provided.
  • a correction button may be further provided that marks the second intermediate language as indeterminate and prompts the user to re-input the second-language speech.
  • a first selection unit that switches an input source of the character synthesis module between the first translation module and the second speech analysis module may be further provided.
  • a second selection unit that switches an input source of the first speech synthesis module between the first speech analysis module and the second translation module may be further provided.
  • An automatic translation method comprises the steps of analyzing speech spoken in a first language and generating a first intermediate language; converting the first intermediate language into a second language; and displaying the converted second language on a display.
  • the method may further include a step of reconverting the first intermediate language into the first language and generating a speech signal of the reconverted first language. Furthermore, the method may include a step of generating a speech signal of the converted second language.
  • the method may further include a step of confirming the first intermediate language upon hearing the first-language speech signal.
  • the method may further include a step of, upon hearing the first-language speech signal, marking the first intermediate language as indeterminate and prompting re-input of the first-language speech.
  • the method may further include a step of analyzing speech spoken in the second language and generating a second intermediate language. Furthermore, the method may include a step of converting the second intermediate language into the first language, and a step of reconverting the second intermediate language into the second language and displaying it on the display.
  • the method may further include a step of confirming the second intermediate language by viewing the second language shown on the display.
  • the method may further include a step of viewing the second language shown on the display, marking the second intermediate language as indeterminate, and prompting re-input of the second-language speech.
  • the automatic translation method may be configured as a computer-executable program. Furthermore, the above program may be stored in a computer-readable recording medium.
  • with this configuration, conversation information is conveyed aurally to the first-language (e.g., Japanese) speaker wearing the device and visually to the second-language (e.g., English) speaker.
  • moreover, in each form of conveyance, each speaker can confirm the content of his or her own utterance.
  • FIG. 1 is a block diagram of an automatic translation apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a partial external view of the automatic translation apparatus according to Embodiment 1 of the present invention.
  • FIG. 3 is a conceptual diagram for explaining the operation of the automatic translation apparatus according to Embodiment 1 of the present invention.
  • FIG. 4 is a conceptual diagram for explaining the operation of the automatic translation apparatus according to Embodiment 1 of the present invention.
  • FIG. 5 is a block diagram showing the internal configuration of the first speech analysis module.
  • FIG. 6 is a flowchart of an automatic translation method according to the first embodiment of the present invention.
  • FIG. 7 is a flowchart of an automatic translation method according to the first embodiment of the present invention.
  • FIG. 8 is a block diagram of an automatic translation apparatus according to Embodiment 2 of the present invention.
  • FIG. 9 is a partial external view of an automatic translation apparatus according to Embodiment 2 of the present invention.
  • FIG. 1 is a block diagram showing the configuration of the automatic translation apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a partial external view.
  • This automatic translation apparatus includes a first microphone 1, a first speech analysis module 2, a first speech synthesis module 3, a first translation module 4, a character synthesis module 5, a display 6, a second microphone 7, a second speech analysis module 8, a second translation module 10, a speaker 11, a first selector 9, a second selector 12, a first dictionary 20, and a second dictionary 80.
  • the first microphone 1 converts the first language spoken by the first language speaker into an electrical signal.
  • the first speech analysis module 2 converts the first-language speech, now an electrical signal, into the first intermediate language.
  • the first speech synthesis module 3 generates a first-language speech signal from the first intermediate language and outputs it through the speaker 11.
  • the first translation module 4 converts the first intermediate language into the second language.
  • the character synthesis module 5 displays the second language on the display 6.
  • the second microphone 7 converts the second language spoken by the second language speaker into an electrical signal.
  • the second speech analysis module 8 converts the second-language speech, now an electrical signal, into the second intermediate language.
  • the second translation module 10 converts the second intermediate language into the first language.
  • the first selector 9 switches the input source to the character synthesis module 5 between the first translation module 4 and the second speech analysis module 8.
  • the second selector 12 switches the input source to the first speech synthesis module 3 between the first speech analysis module 2 and the second translation module 10.
  • the first dictionary 20 is used in the first speech analysis module 2 and the first speech synthesis module 3.
  • the second dictionary 80 is used by the character synthesis module 5 and the second speech analysis module 8.
  • not all of the above members are essential; in its most basic configuration, this automatic translation apparatus need only include, for example, the first microphone 1, the first speech analysis module 2, the first translation module 4, and the character synthesis module 5.
  • the display 6 for display may be a display of a mobile phone or another mobile terminal.
  • the first language spoken by the first language speaker is converted into the second language and displayed on the display, so that the second language speaker can easily understand visually.
  • the first speech synthesis module 3 and the speaker 11 may be provided in addition to the basic configuration described above.
  • the first language spoken by the first-language speaker is first converted into the first intermediate language; the speaker can then listen to the reconverted first-language speech and confirm whether the speech analysis of the first language was performed correctly.
  • the second speech analysis module 8 and the second translation module 10 can be further provided. With this extended configuration, a two-way conversation between the first-language speaker and the second-language speaker can be realized with only the first-language speaker wearing the device. In this case, the first-language speaker recognizes the translation results aurally, and the second-language speaker recognizes them visually.
  • FIG. 5 is a block diagram showing a detailed configuration of the first speech analysis module 2.
  • the first speech analysis module 2 includes a word analysis unit 21, a grammar analysis unit 22, and a semantic analysis unit 23.
  • the word analysis unit 21 recognizes a word from the voice signal.
  • the grammar analysis unit 22 analyzes the grammatical relationships among the word strings, and the semantic analysis unit 23 extracts the meaning of each word and outputs the first intermediate language corresponding to the extracted meanings.
  • "intermediate language" means a data string in which the meanings of words and their syntactic attributes (i.e., the classification and connection relationships of verbs, nouns, and adjectives) are made explicit.
  • the second speech analysis module 8 is different from the first speech analysis module 2 in that it converts the second language speech signal into the second intermediate language, but can be realized by a configuration similar to the configuration shown in FIG.
  • FIG. 2 is a schematic diagram showing a partial appearance of an automatic translation apparatus used in a conversation between a first language speaker and a second language speaker.
  • from the right, it shows a front view and a side view of the first-language speaker and a side view of the second-language speaker.
  • This automatic translation apparatus is worn by a first language speaker.
  • FIG. 2 shows a microphone 1, a speaker 11, a display 6, and a microphone 7.
  • Fig. 2 shows an example in which "Ookini", spoken by the first-language speaker, is converted into the second language "Thank you" and shown on the display 6.
  • FIG. 3 conceptually shows the process in which speech spoken by the first-language speaker (Japanese "Ookini") is converted into an intermediate-language code, reconverted into Japanese "Arigatou" for confirmation, and converted into the second language (English "Thank you") for display on the display 6.
  • FIG. 4 conceptually shows the process in which speech spoken by the second-language speaker (English "You are welcome") is converted into an intermediate-language code, reconverted into English "You are welcome" for confirmation, and converted into the first language (Japanese "Dou itashimashite") for speech output.
  • the voice of a first language (hereinafter, Japanese) speaker is converted into an electrical signal by the microphone 1 and supplied to the first voice analysis module 2.
  • the first speech analysis module 2 recognizes the speech signal and generates the first intermediate language C1. For example, as shown in Fig. 2, when the Japanese speaker says "Ookini" into the microphone 1, the electrical signal string is analyzed by the speech analysis module 2 against the (Japanese) dictionary 20 and, as shown in Fig. 3, a code ("CF5A88Dh" in Fig. 3) meaning thanks is generated as the first intermediate language.
  • the first intermediate language C1 is then converted into the second language (English) by the first translation module 4 and, at the same time, supplied to the first speech synthesis module 3 and converted back into Japanese speech.
  • a Japanese speaker can hear the reconverted speech from the speaker 11 attached close to the ear.
  • the first speech synthesis module 3 converts the first intermediate language (code) having the meaning of gratitude into Japanese using the dictionary 20.
  • spoken words and the intermediate language need not correspond 1:1.
  • "Arigatou", "Ookini", and "Dandan" (if registered in the dictionary 20) may all be converted into the same intermediate language "CF5A88Dh" expressing gratitude.
  • conversely, when converting from the common first intermediate language C1 back into Japanese, the question is which Japanese expression to select; if the standard word "Arigatou" is selected, at least the Japanese speaker will understand the meaning.
  • the second language (English) produced by the first translation module 4 is composed into text by the character synthesis module 5 and shown on the display 6.
  • processing is started (S01).
  • the instruction to start the process is not particularly shown, but there is a method of pressing a button provided on the apparatus main body or the microphone, for example.
  • the input speech is converted to the first intermediate language C1 by the first speech analysis module 2 (S03).
  • for speech analysis (recognition), speaker-dependent recognition technology can be used.
  • for example, the first-language speaker can dictate predetermined basic sentences to the apparatus in advance so that it learns the speaker's habits and peculiar inflections.
  • when speech recognition is completed, the first intermediate language C1 is generated.
  • the first intermediate language C1 is an information stream whose meaning is fixed including attributes such as nouns and verbs for the first language.
  • the first intermediate language C1 is supplied to the first speech synthesis module 3, converted into the first language, and output through the speaker 11 (S04).
  • the speaker 11 broadly includes a device that converts an electrical signal into sound, for example, a headphone and an earphone.
  • the first-language speaker listens to the speaker output and confirms whether the content of his or her utterance has been recognized accurately (S05). If it is judged to have been recognized incorrectly, the first-language speaker presses a correction button (not shown); if it is judged to have been recognized correctly, the speaker presses a confirmation button (not shown).
  • the correction button is not particularly shown in the figure, but for example, a touch sensor provided on the display 6, a push button provided on the apparatus housing, and the like are conceivable.
  • when a headphone-type speaker is used, an acceleration sensor can be provided in it, and the same processing as pressing the correction button can be performed when the head is shaken sideways, that is, when horizontal acceleration is detected.
  • the confirmation button that is pressed when it is determined that the dictation content has been correctly recognized can have the same mode as the correction button.
  • in particular, with the acceleration sensor, the same processing as pressing the confirmation button can be performed when the head is nodded vertically, that is, when vertical acceleration is detected.
  • the apparatus determines whether the correction button or the confirmation button has been pressed (S06). If the correction button was pressed, the process returns to step S02; if the confirmation button was pressed, the first intermediate language C1 is confirmed, translated into the second language by the first translation module 4 (S07), and shown on the display 6 by the character synthesis module 5 (S08).
  • in this way, speech spoken by the first-language speaker is converted into the second language and shown on the display 6; because the second-language speaker perceives it visually, the translation result can be understood without being affected by ambient noise.
  • the first intermediate language C1 is translated by the first translation module 4 into the second intermediate language C2 of the second language (English). Because Japanese and English have different grammatical systems, the structures of the intermediate language (modification, connection relationships, and so on) must be converted based on prescribed rules.
  • the translated second intermediate language C2 is composed into text by the character synthesis module 5 based on the dictionary 80 and shown on the display 6. In this embodiment, it is translated into "Thank you", expressing gratitude. As shown in Fig. 2, the display 6 is preferably worn around the chest of the first-language (Japanese) speaker so that the second-language (English) speaker can read it from a natural line of sight. In the arrangement of Fig. 2, the Japanese speaker cannot see the display 6, but has already confirmed from the speaker 11 output that his or her words were correctly understood, and need only trust that the intermediate language is mechanically translated into English.
  • as a specific configuration of the display 6, a low-power reflective liquid-crystal element, an electrophoretic element, or the like may be used; alternatively, the liquid-crystal display of a mobile information terminal such as a mobile phone or PDA, or of a mobile AV device such as a video camera or digital still camera, may be used.
  • processing is started (S11).
  • the instruction to start the process is not particularly shown, but there is a method of pressing a button provided on the apparatus main body or the microphone, for example.
  • the acceleration sensor described above may also be used: the process may be started when the head is nodded vertically, that is, when vertical acceleration is detected. Since this start operation serves the same purpose as step S08 in Fig. 6 in that it advances the process, the first-language speaker should not be confused.
  • the second-language speaker (English speaker) speaks into the microphone 7 in the second language, and the speech is converted into a speech signal by the microphone 7 (S12). In the example in Figure 4, it is "You are welcome".
  • the second-language speech signal is analyzed by the second speech analysis module 8 based on the dictionary 80, as shown in FIG. 4, and the second intermediate language C2 ("FAC653BOh") is generated (S13).
  • for speech analysis (recognition), unlike the case of the first language, speaker-dependent recognition technology cannot be used to improve recognition efficiency, because the second-language speaker often changes from one conversation to the next.
  • however, since the vocabulary spoken by the second-language speaker is often a "reply" to the first-language speaker, speaker-independent recognition is considered sufficient.
  • the second intermediate language C2 is an information stream whose meaning is fixed, including attributes such as nouns and verbs, for the second language.
  • the second intermediate language C2 is reconverted into the second language, composed into text by the character synthesis module 5, and shown on the display 6 (S14).
  • by viewing the display, the second-language speaker can confirm whether his or her words were correctly recognized (S15). If it is judged that the recognition was incorrect, the second-language speaker presses a correction button (not shown), or instructs the first-language speaker by a gesture or the like to press it. If it is judged that the recognition was correct, the confirmation button (not shown) is pressed.
  • this correction button is not particularly shown, but can be realized by a touch sensor provided on the display 6, a push button provided on the apparatus housing, or the like.
  • when a headphone-type speaker is used, an acceleration sensor may be provided in it, and the same processing as pressing the correction button may be performed when the first-language speaker shakes his or her head sideways, that is, when horizontal acceleration is detected.
  • the above confirmation button can be realized in the same form as the above correction button.
  • likewise, the same processing as pressing the confirmation button may be performed when the head is nodded vertically, that is, when vertical acceleration is detected.
  • the apparatus determines whether the correction button or the confirmation button has been pressed (S16). As described above, when it is judged that the second-language speaker's dictated content was recognized incorrectly, the correction button is pressed (or a corresponding means is used), the process returns to step S12, and the above processing is executed again.
  • the speech spoken by the second-language speaker is converted into the first language and output from the speaker 11 provided near the ear of the first-language speaker, so the translation result can be recognized aurally without being affected by ambient noise.
  • automatic translation between two languages can thus be executed in a more natural manner with a configuration suited to mobile use.
  • a Japanese speaker can hear, from the speaker 11 in real time, his or her own recognized speech and the English speaker's speech translated into Japanese.
  • the “speaker” may be a so-called earphone.
  • the English speaker can see, on the display 6, his or her own recognized words and the Japanese speaker's words translated into English. This is possible even outdoors.
  • the selectors 9 and 12 switch between passing one's own speech information and passing the other party's speech information. Specifically, they are controlled by a microprocessor (not shown); for example, the input source may be switched between the audio input from the microphone 1 and the audio input from the microphone 7 according to the amplitude of each audio signal (a minimal sketch of this amplitude comparison appears after this section).
  • FIG. 8 is a block diagram showing the configuration of the automatic translation apparatus according to Embodiment 2 of the present invention.
  • Fig. 9 is a partial external view.
  • This automatic translation apparatus is different from the automatic translation apparatus according to Embodiment 1 in that it includes a second speech synthesis module 51 and a second speaker 61.
  • the second speech synthesis module 51 converts the English produced by the first translation module 4 into a speech signal.
  • this audio signal is converted into audio by a second speaker 61 provided near the display 6.
  • with this configuration, the English speaker can visually confirm the translation of the Japanese speaker's speech on the display 6 and, in a relatively quiet environment, can also listen to it as speech from the second speaker 61, realizing conversation with higher accuracy.
  • the second speaker 61 is provided separately in the vicinity of the display 6, as shown in FIG. 9.
  • however, the present invention is not limited to this; for example, a transparent piezoelectric speaker may be attached to the display surface of the display 6.
  • the automatic translation apparatus conveys conversation information aurally to the Japanese speaker and visually to the English speaker; it is particularly useful as a mobile electronic interpreter, for example during overseas travel, and can also be used for international conferences.
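As a minimal sketch of the amplitude-based source switching described above for the selectors 9 and 12: the frame representation, the hysteresis margin, and the function names below are assumptions for illustration, not the patent's implementation.

```python
import math
from typing import Sequence

def rms(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of one (non-empty) audio frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def select_source(mic1_frame: Sequence[float], mic7_frame: Sequence[float],
                  previous: str, margin: float = 1.5) -> str:
    """Choose which microphone feeds the pipeline by comparing amplitudes,
    as suggested for selectors 9 and 12. The 1.5x margin is an assumed
    hysteresis so the selection does not flip on near-silence."""
    a1, a7 = rms(mic1_frame), rms(mic7_frame)
    if a1 > margin * a7:
        return "mic1"   # first-language speaker is talking
    if a7 > margin * a1:
        return "mic7"   # second-language speaker is talking
    return previous     # ambiguous: keep the previous routing
```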

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

An automatic translation device comprises a first speech analysis module for analyzing speech spoken in a first language and generating a first intermediate language, a speech synthesis module for reconverting the first intermediate language into speech in the first language and generating a speech signal, a first translation module for converting the first intermediate language into a second language, and a character synthesis module for displaying the second language on a display. Conversation information is conveyed aurally to the person who speaks the first language and wears the device, and visually to the person who speaks the second language; in each form of conveyance, each speaker can confirm the content of his or her own utterance.

Description

Specification
Automatic translation apparatus and automatic translation method
Technical Field
[0001] The present invention relates to an automatic translation apparatus and an automatic translation method for executing interpretation processing between two languages in real time and in a user-friendly manner.
Background Art
[0002] Conventional automatic translation apparatuses include those listed below. Patent Document 1 describes an apparatus that recognizes the speech of a speaker of a first language, translates it into a second language, synthesizes second-language speech, and conveys it to the listener, and vice versa. Patent Document 2 describes an invention in which the translated second language is output from a dedicated speaker provided on a mobile phone. Patent Document 3 describes a technique for performing translation between two languages by on-screen display rather than by voice. Patent Documents 4 to 6 describe configurations suited to a portable type. Patent Document 7 describes a technique in which both voice and display are used and a confirmation of the voice input is shown on the display. In particular, Patent Document 7 states that, because today's technology still cannot fully recognize spoken language when performing automatic translation of speech input, the first and second languages are displayed simultaneously in order to tolerate a certain amount of "mishearing".
[0003]
Patent Document 1: JP-A-9-292971
Patent Document 2: JP-A-8-265445
Patent Document 3: JP-A-6-119372
Patent Document 4: JP-A-6-325072
Patent Document 5: JP-A-8-77176
Patent Document 6: JP-A-9-160881
Patent Document 7: JP-A-8-278972
Disclosure of the Invention
Problems to Be Solved by the Invention
[0004] However, the conventional configurations listed above pose problems in practical use. First, consider the speech-to-speech translation method described in Patent Document 1. At first glance this method is excellent in that conversation between two languages can be realized as if through a simultaneous interpreter, but it is problematic for mobile use, especially outdoors. To convey information to the other party by voice, the voice must be emitted at sufficient volume; outdoors in particular, it must be loud enough to overcome ambient noise. One approach is to use a speaker with sufficiently large output, but then the device becomes large and its power consumption increases, which are fatal problems for a mobile device. If, instead, the mobile-phone format described in Patent Document 2 is adopted, the ear and the speaker can be brought close together and the size and power problems do not arise; however, not only the user but also the other party must wear an equivalent device, which is cumbersome.
[0005] Rather than voice dialogue, conversing with the other party by display, that is, by text (see, for example, Patent Documents 3 to 7), seems better suited to mobile use. However, "conversation by writing" is not a natural form of everyday conversation: during the conversation, attention goes to the translator's display rather than to the other party, making communication difficult, and safety problems can also arise (for example, being pickpocketed while distracted by the written exchange).
[0006] An object of the present invention is to provide an automatic translation apparatus that executes interpretation processing between two languages in real time and in a user-friendly manner.
Means for Solving the Problems
[0007] An automatic translation apparatus according to the present invention comprises: a first speech analysis module that analyzes speech spoken in a first language and generates a first intermediate language; a first translation module that converts the first intermediate language into a second language; and a character synthesis module that displays the converted second language on a display.
[0008] The apparatus may further comprise a first speech synthesis module that reconverts the first intermediate language into the first language and generates a speech signal of the reconverted first language, and may also comprise a second speech synthesis module that generates a speech signal of the converted second language.
[0009] The apparatus may further comprise a confirmation button for confirming the first intermediate language, and a correction button that marks the first intermediate language as indeterminate and prompts re-input of the first-language speech.
[0010] The apparatus may further comprise a second speech analysis module that analyzes speech spoken in the second language and generates a second intermediate language, and a second translation module that converts the second intermediate language into the first language.
[0011] The character synthesis module may reconvert the second intermediate language into the second language and display it on the display.
[0012] The apparatus may further comprise a confirmation button for confirming the second intermediate language, and a correction button that marks the second intermediate language as indeterminate and prompts re-input of the second-language speech.
[0013] The apparatus may further comprise a first selector that switches the input source of the character synthesis module between the first translation module and the second speech analysis module, and a second selector that switches the input source of the first speech synthesis module between the first speech analysis module and the second translation module.
[0014] An automatic translation method according to the present invention comprises the steps of analyzing speech spoken in a first language and generating a first intermediate language; converting the first intermediate language into a second language; and displaying the converted second language on a display.
[0015] The method may further include reconverting the first intermediate language into the first language and generating a speech signal of the reconverted first language, and generating a speech signal of the converted second language.
[0016] The method may further include confirming the first intermediate language upon hearing the first-language speech signal, or marking the first intermediate language as indeterminate upon hearing it and prompting re-input of the first-language speech.
[0017] The method may further include analyzing speech spoken in the second language and generating a second intermediate language; converting the second intermediate language into the first language; and reconverting the second intermediate language into the second language and displaying it on the display.
[0018] The method may further include confirming the second intermediate language by viewing the second language shown on the display, or marking it as indeterminate and prompting re-input of the second-language speech.
[0019] The automatic translation method may be implemented as a computer-executable program, and the program may be stored on a computer-readable recording medium.
Effects of the Invention
[0020] With this configuration, conversation information is conveyed aurally to the first-language (e.g., Japanese) speaker wearing the device and visually to the second-language (e.g., English) speaker; moreover, in each form of conveyance, each speaker can confirm for himself or herself the content of what he or she has spoken.
[0021] The automatic translation apparatus according to the present invention thus realizes conversation between two languages in a form suited to mobile use and closer to natural conversation.
Brief Description of the Drawings
[0022]
FIG. 1 is a block diagram of an automatic translation apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a partial external view of the automatic translation apparatus according to Embodiment 1 of the present invention.
FIG. 3 is a conceptual diagram for explaining the operation of the automatic translation apparatus according to Embodiment 1 of the present invention.
FIG. 4 is a conceptual diagram for explaining the operation of the automatic translation apparatus according to Embodiment 1 of the present invention.
FIG. 5 is a block diagram showing the internal configuration of the first speech analysis module.
FIG. 6 is a flowchart of an automatic translation method according to Embodiment 1 of the present invention.
FIG. 7 is a flowchart of an automatic translation method according to Embodiment 1 of the present invention.
FIG. 8 is a block diagram of an automatic translation apparatus according to Embodiment 2 of the present invention.
FIG. 9 is a partial external view of an automatic translation apparatus according to Embodiment 2 of the present invention.
Explanation of Reference Numerals
[0023]
1, 7: microphone
2, 8: speech analysis module
3, 51: speech synthesis module
4, 10: translation module
5: character synthesis module
6: display
9, 12: selector
11, 61: speaker
20, 80: dictionary
21: word analysis unit
22: grammar analysis unit
23: semantic analysis unit
Best Mode for Carrying Out the Invention
[0024] Hereinafter, an automatic translation apparatus and an automatic translation method according to embodiments of the present invention are described with reference to the drawings. In the drawings, substantially identical members are given the same reference numerals.
[0025] (Embodiment 1)
FIG. 1 is a block diagram showing the configuration of the automatic translation apparatus according to Embodiment 1 of the present invention, and FIG. 2 is a partial external view. This automatic translation apparatus comprises a first microphone 1, a first speech analysis module 2, a first speech synthesis module 3, a first translation module 4, a character synthesis module 5, a display 6, a second microphone 7, a second speech analysis module 8, a second translation module 10, a speaker 11, a first selector 9, a second selector 12, a first dictionary 20, and a second dictionary 80. The first microphone 1 converts the first language spoken by the first-language speaker into an electrical signal. The first speech analysis module 2 converts the first-language speech, now an electrical signal, into the first intermediate language. The first speech synthesis module 3 generates a first-language speech signal from the first intermediate language and outputs it through the speaker 11. The first translation module 4 converts the first intermediate language into the second language. The character synthesis module 5 displays the second language on the display 6. The second microphone 7 converts the second language spoken by the second-language speaker into an electrical signal. The second speech analysis module 8 converts the second-language speech, now an electrical signal, into the second intermediate language. The second translation module 10 converts the second intermediate language into the first language. The first selector 9 switches the input source of the character synthesis module 5 between the first translation module 4 and the second speech analysis module 8. The second selector 12 switches the input source of the first speech synthesis module 3 between the first speech analysis module 2 and the second translation module 10. The first dictionary 20 is used by the first speech analysis module 2 and the first speech synthesis module 3; the second dictionary 80 is used by the character synthesis module 5 and the second speech analysis module 8.
[0026] Not all of the above members are essential. In its most basic configuration, the apparatus need only include, for example, the first microphone 1, the first speech analysis module 2, the first translation module 4, and the character synthesis module 5; the display 6 may then be the display of a mobile phone or other mobile terminal. With this basic configuration, the first language spoken by the first-language speaker is converted into the second language and shown on the display, so the second-language speaker can easily understand it visually. As a first extension, the first speech synthesis module 3 and the speaker 11 may be added: the first language spoken by the first-language speaker is first converted into the first intermediate language, and the speaker can then listen to the reconverted first-language speech and confirm whether the speech analysis was performed correctly. As a further extension, the second speech analysis module 8 and the second translation module 10 may be added, enabling two-way conversation between the first-language and second-language speakers with only the first-language speaker wearing the device: the first-language speaker recognizes the translation results aurally, and the second-language speaker recognizes them visually.
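As a rough illustration of this most basic configuration, the four essential members can be viewed as a simple pipeline from microphone input to display output. The following Python sketch uses toy dictionary lookups; the function names and dictionary entries are assumptions for illustration and are not taken from the patent.

```python
# Toy sketch of the basic configuration: microphone 1 -> first speech
# analysis module 2 -> first translation module 4 -> character synthesis
# module 5 -> display 6. Each stage is a stub standing in for a real module.

JA_TO_CODE = {"ookini": "CF5A88Dh"}     # role of the first dictionary 20 (assumed entries)
CODE_TO_EN = {"CF5A88Dh": "Thank you"}  # role of the second dictionary 80 (assumed entries)

def speech_analysis(ja_utterance: str) -> str:
    """First speech analysis module (2): utterance -> intermediate-language code."""
    return JA_TO_CODE[ja_utterance.lower()]

def translate(code: str) -> str:
    """First translation module (4): intermediate-language code -> second language."""
    return CODE_TO_EN[code]

def character_synthesis(text: str) -> None:
    """Character synthesis module (5): render the text on the display (6)."""
    print(f"[display 6] {text}")

character_synthesis(translate(speech_analysis("Ookini")))  # prints: [display 6] Thank you
```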
[0027] FIG. 5 is a block diagram showing the detailed configuration of the first speech analysis module 2. The first speech analysis module 2 comprises a word analysis unit 21, a grammar analysis unit 22, and a semantic analysis unit 23. The word analysis unit 21 recognizes words from the speech signal; the grammar analysis unit 22 analyzes the grammatical relationships among the word strings; and the semantic analysis unit 23 extracts the meaning of each word and outputs the first intermediate language corresponding to the extracted meanings. Here, "intermediate language" means a data string in which the meanings of words and their syntactic attributes (that is, the classification and connection relationships of verbs, nouns, and adjectives) are made explicit. The second speech analysis module 8 differs from the first speech analysis module 2 in that it converts a second-language speech signal into the second intermediate language, but it can be realized with a configuration similar to that shown in FIG. 5.
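The "data string in which word meanings and syntactic attributes are made explicit" could be modeled, for example, as follows. This is a minimal sketch: the field names and the single hard-coded utterance are assumptions based on FIG. 5, not the patent's actual encoding.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class IntermediateWord:
    """One element of the intermediate language: a language-neutral meaning
    code plus syntactic attributes (cf. the analysis stages of FIG. 5)."""
    meaning_code: str    # e.g. "CF5A88Dh", the code for "thanks" in the text
    part_of_speech: str  # verb / noun / adjective classification
    links: List[int]     # indices of the words this one modifies or connects to

def analyze(utterance: str) -> List[IntermediateWord]:
    """Sketch of the three stages of FIG. 5: word analysis (21), grammar
    analysis (22), and semantic analysis (23), collapsed into one toy lookup."""
    if utterance.lower() == "ookini":
        return [IntermediateWord("CF5A88Dh", "interjection", [])]
    raise ValueError("utterance not in the toy dictionary")
```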
[0028] Next, the operation of this automatic translation apparatus is described with reference to FIGS. 1 to 4. FIG. 2 is a schematic diagram showing part of the appearance of the automatic translation apparatus as used in a conversation between a first-language speaker and a second-language speaker; from the right, it shows a front view and a side view of the first-language speaker and a side view of the second-language speaker. The apparatus is worn by the first-language speaker, and FIG. 2 shows the microphone 1, the speaker 11, the display 6, and the microphone 7. FIG. 2 also shows an example in which "Ookini", spoken by the first-language speaker, is converted into the second language "Thank you" and shown on the display 6. FIG. 3 conceptually shows the process in which speech spoken by the first-language speaker (Japanese "Ookini") is converted into an intermediate-language code, reconverted into Japanese "Arigatou" for confirmation, and converted into the second language (English "Thank you") for display on the display 6. FIG. 4 conceptually shows the process in which speech spoken by the second-language speaker (English "You are welcome") is converted into an intermediate-language code, reconverted into English "You are welcome" for confirmation, and converted into the first language (Japanese "Dou itashimashite") for speech output.
[0029] First, the voice of the first-language (hereinafter, Japanese) speaker is converted into an electrical signal by the microphone 1 and supplied to the first speech analysis module 2, which recognizes the speech signal and generates the first intermediate language C1. For example, as shown in FIG. 2, when the Japanese speaker says "Ookini" into the microphone 1, the electrical signal string is analyzed by the speech analysis module 2 against the (Japanese) dictionary 20 and, as shown in FIG. 3, a code ("CF5A88Dh" in FIG. 3) meaning thanks is generated as the first intermediate language.
[0030] The first intermediate language C1 is then converted into the second language (English) by the first translation module 4 and, at the same time, supplied to the first speech synthesis module 3 and converted back into Japanese speech. As shown in FIG. 2, the Japanese speaker can hear this reconverted speech from the speaker 11 worn close to the ear. The first speech synthesis module 3 converts the first intermediate language (the code meaning thanks) into Japanese using the dictionary 20. Spoken words and the intermediate language need not correspond 1:1; for example, "Arigatou", "Ookini", and "Dandan" (if registered in the dictionary 20) may all be converted into the same intermediate language "CF5A88Dh" expressing gratitude. Conversely, when converting from the common first intermediate language C1 back into Japanese, the question is which Japanese expression to select; if the standard word "Arigatou" is selected, at least the Japanese speaker will understand the meaning.
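The many-to-one relation between spoken expressions and intermediate-language codes, and the choice of a standard expression for the reverse conversion, might look like the following sketch. Only the code "CF5A88Dh" comes from the text; the dictionary layout is an assumption.

```python
# Several Japanese expressions of thanks map to one intermediate code
# (many-to-one), as the text notes for "arigatou", "ookini", and "dandan".
JA_TO_CODE = {
    "arigatou": "CF5A88Dh",
    "ookini":   "CF5A88Dh",
    "dandan":   "CF5A88Dh",
}

# For the reverse direction, pick one canonical (standard-language) surface
# form per code so the reconverted speech is always intelligible to the speaker.
CODE_TO_STANDARD_JA = {"CF5A88Dh": "arigatou"}

def reconvert(code: str) -> str:
    """First speech synthesis module (3): code -> standard Japanese word,
    which the speaker hears through the speaker 11 for confirmation."""
    return CODE_TO_STANDARD_JA[code]

assert reconvert(JA_TO_CODE["ookini"]) == "arigatou"
```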
[0031] Meanwhile, the second language (English) produced by the first translation module 4 is composed into text by the character synthesis module 5 and shown on the display 6.
[0032] The above automatic translation method, in which speech spoken by the first-language speaker is converted into the second language and displayed, is now described using the flowchart of FIG. 6.
(a) First, processing is started (S01). Although the start instruction is not specifically illustrated, it may, for example, be given by pressing a button provided on the apparatus body or on the microphone.
(b) Next, the operator of the apparatus (the first-language speaker) speaks the first language into the microphone 1 (S02).
(c) The input speech is then converted into the first intermediate language C1 by the first speech analysis module 2 (S03). For speech analysis (recognition), speaker-dependent recognition technology can be used: for example, the first-language speaker can dictate predetermined basic sentences to the apparatus in advance so that it learns the speaker's habits and peculiar inflections. When speech recognition is completed, the first intermediate language C1 is generated. Here, the first intermediate language C1 is an information stream whose meaning has been fixed, including attributes such as nouns and verbs, for the first language.
(d) The first intermediate language C1 is supplied to the first speech synthesis module 3, converted into the first language, and output as speech through the speaker 11 (S04). Here, "speaker" is used broadly to include any device that converts an electrical signal into sound, such as headphones and earphones.
[0033] (e) The first-language speaker listens to this speaker output and confirms whether the content of his or her utterance has been recognized accurately (S05). If the speaker judges that it was recognized incorrectly, he or she presses a correction button (not shown); if it was recognized correctly, he or she presses a confirmation button (not shown). In this way, by "listening" to "Arigatou" — the first intermediate language C1, once generated by speech recognition and then reconverted into Japanese — the Japanese speaker can confirm whether the words he or she uttered were recognized correctly.
[0034] The correction button is not specifically illustrated, but it may be, for example, a touch sensor provided on the display 6 or a push button provided on the apparatus housing. When a headphone-type speaker is used, an acceleration sensor may be provided in it, and the same processing as pressing the correction button may be performed when the head is shaken sideways, that is, when horizontal acceleration is detected. The confirmation button, pressed when the dictated content is judged to have been recognized correctly, can take the same forms as the correction button; in particular, with the acceleration sensor, the same processing as pressing the confirmation button can be performed when the head is nodded vertically, that is, when vertical acceleration is detected.
[0035] (f)そこで、装置では、訂正ボタン Z確定ボタンが押されたか否力判断する(S06)。  (F) Therefore, the apparatus determines whether or not the correction button Z confirmation button has been pressed (S06).
訂正ボタンが押された場合には、口述内容が誤って認識されたと判断された場合で あるので、ステップ S02に戻る。  If the correction button is pressed, it is determined that the dictation content has been erroneously recognized, and the process returns to step S02.
(g) If, on the other hand, the confirm button was pressed, the first intermediate language C1 is finalized and translated into the second language by the first translation module 4 (S07).
(h) When the translation is complete, the second-language translation result is rendered by the character synthesis module 5 and displayed on the display 6 (S08).
Through the above steps, the speech spoken by the first-language speaker is converted into the second language and shown on the display 6. Because the second-language speaker perceives the result visually, he or she can understand the translation without being affected by ambient noise.
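Steps S02–S08 thus form a recognize–confirm–translate loop. Schematically, under the assumption that each numbered component of Fig. 1 exposes the obvious interface (all method names below are stand-ins, not defined in the patent):

```python
def first_speaker_turn(mic1, analysis2, synth3, translate4, charsynth5,
                       display6, wait_for_button):
    """Schematic of the Fig. 6 flow, steps S02-S08."""
    while True:
        audio = mic1.record()                  # S02: first-language speech
        c1 = analysis2.to_intermediate(audio)  # S03: generate C1
        synth3.speak(c1)                       # S04: read C1 back aloud
        if wait_for_button() == "confirm":     # S05/S06: speaker verifies
            break                              # C1 is finalized
        # correction pressed: loop and re-record (back to S02)
    c2 = translate4.translate(c1)              # S07: C1 -> second language
    display6.show(charsynth5.render(c2))       # S08: display the result
```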
[0036] Here, the first intermediate language C1 is translated by the first translation module 4 into the second intermediate language C2 of the second language (English). Since Japanese and English have different grammatical systems, the structures of the intermediate language (modification and connection relations, etc.) must be converted according to predetermined rules. The translated second intermediate language C2 is rendered into text by the character synthesis module 5 on the basis of the dictionary 80 and displayed on the display 6; in this embodiment, it is translated into "Thank you", expressing gratitude. As shown in Fig. 2, the display 6 is preferably worn around the chest of the first-language (Japanese) speaker so that the second-language (English) speaker can read it from a natural line of sight. In the arrangement of Fig. 2 the Japanese speaker cannot see the display 6 in any natural posture, but since he or she has already confirmed, from the speech output of speaker 11, that the spoken words were correctly understood, it is enough to trust that the intermediate language is translated mechanically into English. Note that even if the Japanese speaker could read the display 6, it would be of no use if the second language were one he or she cannot understand at all. As a concrete implementation, the display 6 may use a low-power reflective liquid-crystal element or an electrophoretic element, or it may use the liquid-crystal display of a mobile information terminal such as a mobile phone or PDA, or of a mobile AV device such as a video movie camera or a digital still camera.
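The "predetermined rules" themselves are not spelled out in the specification. One conventional approach is rule-based reordering of constituents, sketched below purely for illustration (tokens here are plain dicts with an "attr" key, a hypothetical format; this is a textbook-style sketch, not the patent's actual rule set):

```python
def reorder_sov_to_svo(tokens):
    """Illustrative structural-transfer rule: Japanese is head-final
    (subject-object-verb), English is subject-verb-object, so one
    rule moves the verb ahead of the object."""
    subj = [t for t in tokens if t["attr"] == "subject"]
    verb = [t for t in tokens if t["attr"] == "verb"]
    obj  = [t for t in tokens if t["attr"] == "object"]
    rest = [t for t in tokens if t["attr"] not in ("subject", "verb", "object")]
    return subj + verb + obj + rest  # SOV -> SVO
```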
This completes the series of operations described above.
Next, the case in which the second-language speaker responds will be described with reference to the flowchart shown in Fig. 7.
(a) First, processing is started (S11). Although the instruction to start processing is not specifically illustrated, it may be given, for example, by pressing a button provided on the device body or on the microphone. The acceleration sensor described above may also be used: processing may be started when the user nods, that is, when vertical acceleration is detected. This start operation is similar to step 8 in Fig. 6, and in the sense of moving the process forward the two share a common purpose, so the first-language speaker is unlikely to be confused by it.
(b) Next, the second-language speaker (the English speaker) speaks into microphone 7 in the second language, and microphone 7 converts the speech into an audio signal (S12). In the example of Fig. 4, the utterance is "You are welcome".
(c) As shown in Fig. 4, the second-language audio signal is analyzed by the second speech analysis module 8 on the basis of the dictionary 80, and the second intermediate language C2 ("FAC653B0h") is generated (S13). For speech analysis (speech recognition), unlike the first-language case, speaker-dependent recognition technology — which would improve recognition accuracy — cannot be used, because the second-language speaker often changes from one occasion to the next. However, what the second-language speaker says is usually a "reply" to the first-language speaker, so the vocabulary to be recognized is limited, and speaker-independent recognition is considered sufficient. Here, as with the first intermediate language C1 described above, the second intermediate language C2 is an information stream in which the meaning of the utterance has been fixed, including attributes such as nouns and verbs of the second language.
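Because the vocabulary of replies is small, recognition can be cast as choosing the best match from a fixed candidate list. A minimal sketch, assuming an acoustic-scoring callback (the phrase list and the score function are illustrative assumptions):

```python
REPLY_PHRASES = ["you are welcome", "yes", "no", "thank you",
                 "please say that again"]  # illustrative reply vocabulary

def recognize_reply(audio, score):
    """Speaker-independent recognition restricted to a small vocabulary.

    `score(audio, phrase)` is an assumed acoustic-model callback that
    returns a likelihood; the best-scoring candidate is returned."""
    return max(REPLY_PHRASES, key=lambda phrase: score(audio, phrase))
```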
(d) The second intermediate language C2 is converted into the second language, rendered into text by the character synthesis module 5, and displayed on the display 6 (S14).
[0038] (e) The second-language speaker can check from the display whether his or her spoken words were recognized correctly (S15). If the speaker judges that they were recognized incorrectly, he or she presses a correction button (not shown), or instructs the first-language speaker by gesture or the like to press it. If, on the other hand, the words were recognized correctly, the speaker presses a confirm button (not shown).
In this way, by "seeing" the display of "You are welcome" — the second intermediate language C2, produced once through speech recognition and intermediate-language generation and converted back into English — the English speaker can confirm whether the words he or she uttered were recognized correctly.
[0039] This correction button, though not specifically illustrated, can be realized as a touch sensor provided on the display 6, a push button on the device housing, or the like. When a headphone-type speaker is used, an acceleration sensor may be built into it, and when the first-language speaker shakes his or her head sideways — that is, when horizontal acceleration is detected — the same processing as pressing the correction button may be performed. Alternatively, if the device recognizes a specific word expressing negation (English "No!", German "Nein!", Chinese "Bu!", etc.), it may perform the same processing as pressing the correction switch. The confirm button can likewise be realized in the same forms as the correction button; when the acceleration sensor is used, nodding — that is, detecting vertical acceleration — may trigger the same processing as pressing the confirm button.
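The negation-word shortcut might look like the following; the word list follows the examples in the text, while the normalization and matching details are assumptions:

```python
NEGATION_WORDS = {"no", "nein", "bu"}  # English, German, Chinese examples

def maybe_trigger_correction(recognized_word: str, press_button) -> bool:
    """Treat a recognized negation word as a correction-button press."""
    if recognized_word.strip("!").lower() in NEGATION_WORDS:
        press_button("correct")  # same processing as the correction switch
        return True
    return False
```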
[0040] (f) The device then determines whether the correction button or the confirm button was pressed (S16). As described above, when the dictated content of the second-language speaker is judged to have been recognized incorrectly, the correction button is pressed (or an equivalent action is taken), and the process returns to step S12 and is executed again.
(g) If, on the other hand, the confirm button was pressed, the second intermediate language C2 is finalized and translated into the first intermediate language C1 by the second translation module 10 (S17).
(h) When the translation is complete, the translation result is converted into first-language speech by the first speech synthesis module 3, and the speaker 11 outputs, in the first language, the utterance 「どういたしまして」 ("You're welcome") (S18).
Through the above steps, the speech spoken by the second-language speaker is converted into the first language and output as speech from the speaker 11 provided near the ear of the first-language speaker, so the first-language speaker can perceive the translation result aurally without being affected by ambient noise.
[0041] As described above, according to the present embodiment, automatic translation between two languages can be carried out in a natural manner with a configuration suited to mobile use. That is, the Japanese speaker can hear, in real time from the speaker 11, both the recognized version of his or her own words and the English speaker's words translated into Japanese. Here the "speaker" may be something like an earphone. Likewise, the English speaker can see on the display 6 both the recognized version of his or her own words and the Japanese speaker's words translated into English. This remains possible even under outdoor noise.
[0042] In Fig. 1, the selectors 9 and 12 switch between passing the user's own speech information and passing the other speaker's speech information. Specifically, they are controlled by a microprocessor (not shown); for example, the input source may be switched between the speech input from microphone 1 and the speech input from microphone 7 according to the relative amplitudes of the two audio signals.
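A minimal sketch of such amplitude-driven selection follows; comparing short-frame RMS levels is an assumed criterion, as the specification says only that switching follows the amplitudes of the signals.

```python
def rms(frame):
    """Root-mean-square level of one short audio frame (list of samples)."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def select_source(mic1_frame, mic7_frame):
    """Route whichever microphone currently carries the stronger signal."""
    return "mic1" if rms(mic1_frame) >= rms(mic7_frame) else "mic7"
```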
[0043] (Embodiment 2)
Fig. 8 is a block diagram showing the configuration of an automatic translation device according to Embodiment 2 of the present invention, and Fig. 9 is a partial external view of it. Compared with the automatic translation device of Embodiment 1, this device differs in having a second speech synthesis module 51 and a second speaker 61. In Fig. 8, components identical to those in Fig. 1 carry the same reference numerals, and their description is omitted. In Fig. 8, the second speech synthesis module 51 converts only the English produced by the first translation module 4 into an audio signal. As shown in Fig. 9, this audio signal is rendered as sound by the second speaker 61 provided near the display 6. With this operation, the English speaker can not only confirm the translation of the Japanese speaker's words visually on the display 6 but also, in a relatively quiet environment, hear it as speech from the second speaker 61, making it possible to achieve a conversation of higher accuracy.
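The added fan-out can be summarized in a short sketch; the module objects and method names stand in for the numbered components of Figs. 8 and 9 and are assumptions, not interfaces defined in the patent:

```python
def output_translation_embodiment2(c2, charsynth5, display6, synth51, speaker61):
    """Embodiment 2: the translated second language is delivered both
    visually (robust to noise) and audibly (useful in quiet settings)."""
    display6.show(charsynth5.render(c2))    # via character synthesis module 5
    speaker61.play(synth51.synthesize(c2))  # via second speech synthesis module 51
```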
[0044] In the present embodiment, the second speaker 61 is provided separately near the display 6 as shown in Fig. 9, but the invention is not limited to this arrangement; for example, a transparent piezoelectric speaker may be attached to the display surface of the display 6.
[0045] As described above, the present invention has been explained in detail by way of preferred embodiments. The invention, however, is not limited to these, and it will be obvious to those skilled in the art that many preferred variations and modifications are possible within the technical scope of the invention set out in the claims.
Industrial Applicability
[0046] The automatic translation device according to the present invention has the function of conveying conversation information aurally to a Japanese speaker and visually to the English speaker he or she is talking with, and is useful as a mobile electronic interpreter, particularly during overseas travel. It can also be applied to uses such as international conferences.

Claims

[1] An automatic translation device comprising:
a first speech analysis module that analyzes speech spoken in a first language and generates a first intermediate language;
a first translation module that converts the first intermediate language into a second language; and
a character synthesis module that displays the converted second language on a display.
[2] The automatic translation device according to claim 1, further comprising a first speech synthesis module that reconverts the first intermediate language into the first language and generates an audio signal of the reconverted first language.
[3] The automatic translation device according to claim 1 or 2, further comprising a second speech synthesis module that generates an audio signal of the converted second language.
[4] The automatic translation device according to any one of claims 1 to 3, further comprising a confirm button for finalizing the first intermediate language.
[5] The automatic translation device according to any one of claims 1 to 3, further comprising a correction button that marks the first intermediate language as unconfirmed and prompts re-input of first-language speech.
[6] The automatic translation device according to any one of claims 1 to 5, further comprising a second speech analysis module that analyzes speech spoken in the second language and generates a second intermediate language.
[7] The automatic translation device according to claim 6, further comprising a second translation module that converts the second intermediate language into the first language.
[8] The automatic translation device according to claim 6 or 7, wherein the character synthesis module reconverts the second intermediate language into the second language and displays it on the display.
[9] The automatic translation device according to claim 8, further comprising a confirm button for finalizing the second intermediate language.
[10] The automatic translation device according to claim 8, further comprising a correction button that marks the second intermediate language as unconfirmed and prompts re-input of second-language speech.
[11] The automatic translation device according to claim 6, further comprising a first selection unit that switches the input source of the character synthesis module between the first translation module and the second speech analysis module.
[12] The automatic translation device according to claim 7, further comprising a second selection unit that switches the input source of the first speech synthesis module between the first speech analysis module and the second translation module.
[13] An automatic translation method comprising the steps of:
analyzing speech spoken in a first language and generating a first intermediate language;
converting the first intermediate language into a second language; and
displaying the converted second language on a display.
[14] The automatic translation method according to claim 13, further comprising the step of reconverting the first intermediate language into the first language and generating an audio signal of the reconverted first language.
[15] The automatic translation method according to claim 13 or 14, further comprising the step of generating an audio signal of the converted second language.
[16] The automatic translation method according to claim 14, further comprising the step of listening to the first-language audio signal and finalizing the first intermediate language.
[17] The automatic translation method according to claim 14, further comprising the step of listening to the first-language audio signal, marking the first intermediate language as unconfirmed, and prompting re-input of first-language speech.
[18] The automatic translation method according to any one of claims 13 to 17, further comprising the step of analyzing speech spoken in the second language and generating a second intermediate language.
[19] The automatic translation method according to claim 18, further comprising the step of converting the second intermediate language into the first language.
[20] The automatic translation method according to claim 18, further comprising the step of reconverting the second intermediate language into the second language and displaying it on a display.
[21] The automatic translation method according to claim 20, further comprising the step of looking at the second language displayed on the display and finalizing the second intermediate language.
[22] The automatic translation method according to claim 20, further comprising the step of looking at the second language displayed on the display, marking the second intermediate language as unconfirmed, and prompting re-input of second-language speech.
PCT/JP2005/010946 2004-06-23 2005-06-15 Automatic translation device and automatic translation method WO2006001204A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2004-184713 2004-06-23
JP2004184713A JP2007272260A (en) 2004-06-23 2004-06-23 Automatic translation device

Publications (1)

Publication Number Publication Date
WO2006001204A1 WO2006001204A1 (en)

Family

ID=35781706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2005/010946 WO2006001204A1 (en) 2004-06-23 2005-06-15 Automatic translation device and automatic translation method

Country Status (2)

Country Link
JP (1) JP2007272260A (en)
WO (1) WO2006001204A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344411A (en) * 2018-09-19 2019-02-15 深圳市合言信息科技有限公司 A kind of interpretation method for listening to formula simultaneous interpretation automatically
WO2022214991A1 (en) * 2020-04-08 2022-10-13 Rajiv Trehan Multilingual concierge systems and method thereof

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10152476B2 (en) 2015-03-19 2018-12-11 Panasonic Intellectual Property Management Co., Ltd. Wearable device and translation system
WO2018008227A1 (en) 2016-07-08 2018-01-11 パナソニックIpマネジメント株式会社 Translation device and translation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04319769A (en) * 1991-04-18 1992-11-10 Toshiba Corp Interpretation system
JPH06348751A (en) * 1992-11-09 1994-12-22 Ricoh Co Ltd Language translating device
JPH1021241A (en) * 1996-07-03 1998-01-23 Sony Corp Method and system for intention communication
JP2003288339A (en) * 2001-01-24 2003-10-10 Matsushita Electric Ind Co Ltd Device and method for voice conversion, program, and medium


Also Published As

Publication number Publication date
JP2007272260A (en) 2007-10-18

Similar Documents

Publication Publication Date Title
US9111545B2 (en) Hand-held communication aid for individuals with auditory, speech and visual impairments
KR102108500B1 (en) Supporting Method And System For communication Service, and Electronic Device supporting the same
US20140171036A1 (en) Method of communication
US8082152B2 (en) Device for communication for persons with speech and/or hearing handicap
US20050192811A1 (en) Portable translation device
US20200012724A1 (en) Bidirectional speech translation system, bidirectional speech translation method and program
JP2005513619A (en) Real-time translator and method for real-time translation of multiple spoken languages
WO2003079328A1 (en) Audio video conversion apparatus and method, and audio video conversion program
JP2019175426A (en) Translation system, translation method, translation device, and voice input/output device
JP2010026220A (en) Voice translation device and voice translation method
WO2006001204A1 (en) Automatic translation device and automatic translation method
JP6457706B1 (en) Translation system, translation method, and translation apparatus
JP2015041101A (en) Foreign language learning system using smart spectacles and its method
JP6832503B2 (en) Information presentation method, information presentation program and information presentation system
US20110116608A1 (en) Method of providing two-way communication between a deaf person and a hearing person
CN105913841A (en) Voice recognition method, voice recognition device and terminal
KR20210158369A (en) Voice recognition device
JP2004129174A (en) Information communication instrument, information communication program, and recording medium
JP2011150657A (en) Translation voice reproduction apparatus and reproduction method thereof
JPH10224520A (en) Multi-media public telephone system
US20030171912A1 (en) Translation apparatus
JP2020119043A (en) Voice translation system and voice translation method
KR20200003529A (en) Digital device for recognizing voice and method for controlling the same
JPH0863185A (en) Speech recognition device
KR102496398B1 (en) A voice-to-text conversion device paired with a user device and method therefor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP