WO2009104613A1 - Text conversion device, method, and program - Google Patents

Text conversion device, method, and program

Info

Publication number
WO2009104613A1
WO2009104613A1 PCT/JP2009/052716 JP2009052716W WO2009104613A1 WO 2009104613 A1 WO2009104613 A1 WO 2009104613A1 JP 2009052716 W JP2009052716 W JP 2009052716W WO 2009104613 A1 WO2009104613 A1 WO 2009104613A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
text conversion
conversion
user
score
Prior art date
Application number
PCT/JP2009/052716
Other languages
French (fr)
Japanese (ja)
Inventor
玲史 近藤
康行 三井
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2009554331A (patent JP5521554B2)
Publication of WO2009104613A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/55: Rule-based translation
    • G06F40/56: Natural language generation

Definitions

  • The present invention relates to a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot, and in particular to a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot that convert text so that its content is conveyed more easily.
  • Patent Document 1 discloses a sentence conversion technique that converts a character string written in a natural language into a character string in another expression of the same natural language, using conversion rules prepared for each conversion purpose.
  • Patent Document 2 discloses a technique that includes a driving load determination unit for determining the driving load of a driver who is driving a vehicle, and that controls the voice output unit for the driver according to that driving load, changing the pause period of reading and the speech rate.
  • Patent Document 3 discloses a technique for controlling the sound quality of speech synthesis according to user attribute information in a voice response service device.
  • Patent Document 4 discloses a technique for estimating a speaker's emotion based on features such as sound pressure, pitch frequency, and duration of input speech.
  • An object of the present invention is to provide a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot that take the listener's psychological situation into account and convert text so that its semantic content is conveyed more easily.
  • Patent Document 1 gives, as specific application examples, (A) a question answering system, (B) an intra-sentence compression system, (C) a text revision system, and (D) a difficult sentence conversion system, together with examples of the conversions into different expressions that would be performed in each case. However, there is no guarantee that the text after these conversions is easy to convey to the listener.
  • Patent Document 1 also suggests application to conversion and inverse conversion between written and spoken language (paragraph 0064), but it discloses neither the conversion purpose of generating text that is easy to convey to the listener nor specific conversion rules for that purpose.
  • The "driver's driving load" referred to in Patent Document 2 is specifically the vehicle speed. The document only discloses that, while the vehicle is being driven, speech is output at a reading speed higher than normal (the reading speed for other passengers).
  • The technique described in Patent Document 3 does not perform text conversion of the kind stated as the object above in the first place, and it also cannot cope with the psychological situation of the listener, which changes from moment to moment.
  • Patent Document 4 describes a technique for estimating a speaker's emotion from input speech. Specifically, it only discloses application to a nursing-care robot that conducts an interview and measures changes in emotion from the feature amounts of the speech.
  • According to a first aspect of the present invention, there is provided a text conversion device that receives a parameter representing the psychological state of the user and a text as inputs, and converts the input text in a manner suited to the input parameter, within a range that does not change the meaning of the input text.
  • According to a second aspect of the present invention, there is provided a text conversion method performed by a text conversion device, including a step of inputting a parameter representing a user's psychological state and a text, and a step of converting the input text, based on the input parameter, within a range that does not change its meaning.
  • According to a third aspect of the present invention, there is provided a text conversion program that causes a computer to execute a process of inputting a parameter representing the psychological state of the user and a text, and a process of converting the input text within a range that does not change its meaning.
  • This text conversion program can be recorded on a computer-readable storage medium.
  • According to the present invention, it is possible to convert arbitrary text into text whose meaning is easily conveyed to the listener.
  • The reason is that a configuration is adopted in which text conversion is performed with a parameter representing the psychological situation of the listener as input. As a derived effect, for example, the burden of considering the listener's situation when creating a text to be read aloud is reduced, which makes such texts easier to create.
  • The present invention comprises text conversion means (the text conversion unit in FIG. 1) that transforms the text itself, according to a parameter representing the psychological situation of the listener (psychological situation parameter), so that the target text becomes text that is easy to understand when spoken as speech.
  • FIG. 2 is a block diagram showing the configuration of the text conversion apparatus according to the first embodiment of the present invention.
  • The text conversion apparatus includes a voice input unit 11 such as a microphone, a pitch frequency analysis unit 20, a psychological situation estimation unit 21, and a text conversion unit 22.
  • The processing means of this text conversion apparatus can be realized by a program that causes a computer constituting the text conversion apparatus to execute each process described later.
  • The pitch frequency analysis unit 20 analyzes the voice uttered by the user and input from the voice input unit 11, and obtains its pitch frequency.
  • The psychological situation estimation unit 21 obtains a parameter x1 representing the user's urgency level from the average value of the pitch frequency of the user's voice, and outputs it to the text conversion unit 22.
  • FIG. 3 is a diagram showing a map for obtaining the user's urgency level x1 from the average value of the pitch frequency.
  • The map is a monotonically increasing curve (strictly speaking, an S-shaped curve that rises gradually once the average value of the pitch frequency exceeds the first threshold th1). This is because, when the pitch frequency of the uttered voice is high, the urgency level of the user can be estimated to be high.
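The mapping from average pitch frequency to urgency level x1 described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the source specifies a monotonically increasing, S-shaped relationship above a first threshold th1, but not its functional form. The logistic curve, the function name `urgency_from_pitch`, and every numeric default here are hypothetical choices, not values from the publication.

```python
import math


def urgency_from_pitch(mean_pitch_hz, th1=150.0, midpoint=220.0, steepness=0.05):
    """Map the mean pitch frequency (Hz) to an urgency level x1 in [0, 1).

    Assumptions (not from the source): a logistic curve, and the default
    values of th1, midpoint and steepness.
    """
    if mean_pitch_hz <= th1:
        # At or below the first threshold th1 the urgency is taken to be zero.
        return 0.0

    def logistic(f):
        return 1.0 / (1.0 + math.exp(-steepness * (f - midpoint)))

    # Rescale so the curve starts at 0 when the pitch equals th1 and
    # approaches 1 for very high pitch values.
    base = logistic(th1)
    return (logistic(mean_pitch_hz) - base) / (1.0 - base)
```

With these hypothetical defaults the function returns 0 up to th1 and then rises smoothly; any other monotonically increasing S-shaped map would serve the same purpose.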
  • The text conversion unit 22 converts the input text, based on a score that depends on the user's urgency level x1 obtained as described above, and generates the output text.
  • The score, or total score, in the present embodiment is an index representing how easy the text is to understand when heard as speech.
  • FIG. 4 is a diagram showing a detailed configuration of the text conversion unit 22.
  • The text conversion unit 22 includes a word conversion unit 31, a text division unit 32, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
  • Using the word pairs registered in the word conversion DB 34, the word conversion unit 31 outputs a word conversion candidate group, that is, the set of all text candidates obtainable by replacing words contained in the input text of length L, together with the conversion score of each candidate.
  • The word conversion DB 34 records one or more pairs of words that have substantially the same meaning (replacing one with the other does not change the meaning of the sentence), together with a conversion score S1(i) for the case where a word is converted using word pair i.
  • FIG. 5 is an example of word pairs and conversion scores registered in the word conversion DB 34.
  • The conversion score can be set low for an input word that has a homonym and high for an output word that has no homonym. This is because an input word that has a homonym (in FIG. 5, "personal computer" and "patker" each have a homonym, "PC") is then replaced by an output word that has no homonym (in FIG. 5, "PC" and "Pokécon").
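The word conversion step described above can be sketched as follows: every subset of the applicable word pairs is applied, and each resulting candidate carries a score S1 equal to the sum of the conversion scores S1(i) of the pairs used. This is only a sketch under assumptions. The database contents, the scores, and the function name `word_conversion_candidates` are hypothetical and are not the entries of FIG. 5, and plain substring replacement stands in for proper word segmentation.

```python
from itertools import product

# Hypothetical word conversion DB: source word -> (replacement, score S1(i)).
# These pairs and scores are illustrative only, not the entries of FIG. 5.
WORD_CONVERSION_DB = {
    "purchase": ("buy", 1.0),
    "automobile": ("car", 2.0),
}


def word_conversion_candidates(text, db):
    """Return (candidate_text, S1) for every subset of applicable word pairs."""
    applicable = [src for src in db if src in text]
    candidates = []
    # Each applicable pair is either used or not: 2**n candidates in total.
    for choices in product([False, True], repeat=len(applicable)):
        converted, s1 = text, 0.0
        for src, use in zip(applicable, choices):
            if use:
                dst, score = db[src]
                converted = converted.replace(src, dst)
                s1 += score  # S1 is the sum of the scores of the pairs used
        candidates.append((converted, s1))
    return candidates
```

The unconverted text is kept as a candidate with S1 = 0, so that a later selection stage can still choose it.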
  • The text dividing unit 32 outputs a text conversion candidate group consisting of all combinations obtained by dividing each text of the input word conversion candidate group into N division units of lengths L(1), L(2), ..., L(N).
  • The text lengths L and L(1) to L(N) are described here as character counts, which are easy to obtain and correlate with pronunciation duration. If it can be obtained, the number of mora, which correlates more strongly with pronunciation duration, can be used instead.
  • ⁇ 1 and ⁇ 2 are predetermined constants.
  • S1 calculated for each word conversion candidate is the sum of the conversion scores S1 (i) of the words used for conversion.
  • The S2 obtained in this way becomes smaller as the number of divisions becomes smaller and as the phrase lengths after conversion and division become more uniform.
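The enumeration of division candidates described above can be sketched as follows. The sketch lists every way of cutting a text into N contiguous units and measures lengths in characters, as the text does. Two points are assumptions: cuts are allowed at any character position (a real system would presumably cut only at phrase boundaries), and `length_spread` is a hypothetical uniformity measure introduced for illustration. It is not the S2 formula of the publication, which is not reproduced in this excerpt.

```python
from itertools import combinations


def all_divisions(text, n):
    """Return every way to cut `text` into n contiguous, non-empty units."""
    cut_positions = range(1, len(text))
    divisions = []
    # Choosing n - 1 cut positions yields n units of lengths L(1) ... L(N).
    for cuts in combinations(cut_positions, n - 1):
        bounds = (0, *cuts, len(text))
        divisions.append([text[a:b] for a, b in zip(bounds, bounds[1:])])
    return divisions


def length_spread(units):
    """Hypothetical uniformity measure: longest minus shortest unit length.

    A value of 0 means all units have the same number of characters.
    """
    lengths = [len(u) for u in units]
    return max(lengths) - min(lengths)
```

For example, `all_divisions("abcd", 2)` yields the three two-unit divisions of a four-character text, of which only the middle one has a spread of 0.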
  • FIG. 6 shows the result of calculating the score (total score) for the input text “I bought a personal computer” using S1 and S2.
  • The total score increases every time the text is divided.
  • The text conversion candidates with candidate numbers 4 to 12 in FIG. 6 have more divisions than the text conversion candidates with candidate numbers 1 to 3 that underwent the same word conversion, so their total scores are also higher.
  • For a user with high urgency, an output text is therefore generated that contains as few homonyms as possible (easy to understand) and is finely divided (easy to hear).
  • Alternatively, the user's urgency level x1 may be input to the word conversion unit 31 and the text division unit 32, so that unnecessary candidates are deleted, or an optimal candidate is selected, at each stage. For example, when the user's urgency level x1 is high, only the text conversion candidates that have undergone conversion with a high score S1 are output, which makes it possible to reduce the load and processing time of the text division unit 32 and the candidate selection unit 33.
  • FIG. 7 is a block diagram showing the configuration of the text conversion apparatus according to the second embodiment of the present invention.
  • The text conversion apparatus includes a voice input unit 11 such as a microphone, a voice recognition unit 12, a response message generation unit 13, a speech rate measurement unit 23, a psychological situation estimation unit 21, a text conversion unit 22, a text-to-speech synthesis unit 14, and a speaker 15.
  • The processing means of the voice recognition unit 12, the response message generation unit 13, the speech rate measurement unit 23, the psychological situation estimation unit 21, the text conversion unit 22, and the text-to-speech synthesis unit 14 can each be realized by a program that causes a computer constituting the text conversion apparatus to execute the processes described later.
  • The voice recognition unit 12 recognizes the voice input from the microphone 11 and outputs the result to the response message generation unit 13.
  • The response message generation unit 13 generates a message that responds to the content of the user's utterance recognized by the voice recognition unit 12, and outputs it to the text conversion unit 22 as input text.
  • The speech rate measurement unit 23 measures the speech rate of the voice uttered by the user.
  • The sound input from the microphone 11 is also input to the speech rate measurement unit 23, which measures the speech rate of the voice uttered by the user.
  • The psychological situation estimation unit 21 outputs a numerical value x2 representing the degree of urgency, based on the speech rate measured by the speech rate measurement unit 23.
  • The numerical value x2 representing the degree of urgency can be obtained from a relationship given in advance so that it increases monotonically with the speech rate (unit: mora per second).
  • The text conversion unit 22 converts the input text, based on a score that depends on the user's degree of urgency x2 obtained as described above, and generates an output text.
  • The score, or total score, in the present embodiment is an index representing how easy the text is to understand when heard as speech.
  • FIG. 8 is a diagram showing a detailed configuration of the text conversion unit 22.
  • The text conversion unit 22 includes a word conversion unit 31, a text summarization unit 36, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
  • Using the word pairs registered in the word conversion DB 34, the word conversion unit 31 outputs a word conversion candidate group, that is, the set of all text candidates obtainable by replacing words contained in the input text of length L.
  • As in the first embodiment, the word conversion DB 34 records one or more pairs of words having substantially the same meaning.
  • The text summarization unit 36 summarizes each text of the input word conversion candidate group, which has a length L21, and outputs a summary text of length L22. When the text summarization unit 36 can generate a plurality of summary candidates, all of them are included in the text conversion candidate group.
  • FIG. 9 shows the result of calculating the score (total score) for the input text “My personal computer has been broken” using S1 and S2.
  • When the user's urgency level x2 is not remarkably high but is at or above a certain value (frequency 7 or more and less than 9), the text conversion candidate with candidate number 3 is selected.
  • When the user's urgency level x2 is low (frequency less than 7), the text conversion candidate with candidate number 1 is selected. That is, when the user's urgency level x2 is high, a candidate that has been paraphrased more briefly by word conversion is selected: candidate number 3, "My PC has been broken."
  • The text conversion candidates with candidate numbers 4 to 9 in FIG. 6 correspond to the text conversion candidates with candidate numbers 1 to 3 that have been subjected to the same word conversion.
  • Alternatively, the user's urgency level x2 may be input to the word conversion unit 31 and the text summarization unit 36, so that unnecessary candidates are deleted, or an optimal candidate is selected, at each stage. For example, when the user's urgency level x2 is high, only the text conversion candidates that have undergone conversion with a high score S1 are output, which makes it possible to reduce the load and processing time of the text summarization unit 36 and the candidate selection unit 33.
  • The numerical value x2 representing the user's degree of urgency is not limited to the above and can also be obtained by the following methods.
  • The numerical value x2 representing the degree of urgency can be obtained by inputting the moving speed (vehicle speed or driving-wheel rotational speed) of a vehicle, such as an automobile, that the user is riding in (driving), as a value that increases monotonically with that speed.
  • The numerical value x2 representing the degree of urgency can also be obtained by inputting the brake operation of the automobile: the value of x2 is increased when the user driving the automobile depresses the brake, and is increased further when the acceleration of the brake pedal movement is large.
  • FIG. 10 is a block diagram showing the configuration of the text conversion apparatus according to the third embodiment of the present invention.
  • The text conversion device includes a voice input unit 11 such as a microphone, a voice recognition unit 12, a response message generation unit 13, a speech rate measurement unit 23, a psychological situation estimation unit 21, a text conversion unit 22, a text-to-speech synthesis unit 14, and a speaker 15.
  • The processing means of the text conversion apparatus of this embodiment can be realized by a program that causes a computer constituting the text conversion apparatus to execute each process described later.
  • The psychological situation estimation unit 21 outputs a value x3 indicating the user's concentration level from the user's speech rate, and the text conversion unit 22 performs text conversion using this value x3.
  • The other elements are the same as those in the second embodiment described above, so the differences will mainly be described.
  • The psychological situation estimation unit 21 determines that the user is highly likely to be distracted by things other than the dialogue, and outputs a numerical value x3 representing the degree of concentration.
  • The numerical value x3 representing the degree of concentration can be obtained from a relationship given in advance so that it decreases monotonically with the value Vdiff of the temporal variation component of the speech rate.
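The relationship between the temporal variation Vdiff of the speech rate and the concentration level x3 can be sketched as follows. The source only states that x3 decreases monotonically with Vdiff. How Vdiff is computed and which decreasing function is used are assumptions made here for illustration, as are the function names `speech_rate_variation` and `concentration_from_variation` and the `scale` parameter.

```python
def speech_rate_variation(rates):
    """Hypothetical Vdiff: mean absolute change between consecutive
    speech-rate measurements (for example, in mora per second)."""
    if len(rates) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(rates, rates[1:])]
    return sum(diffs) / len(diffs)


def concentration_from_variation(vdiff, scale=1.0):
    """Map Vdiff to a concentration level x3 in (0, 1].

    The form 1 / (1 + scale * Vdiff) is an assumed example of a
    monotonically decreasing relationship; `scale` is a made-up constant.
    """
    return 1.0 / (1.0 + scale * vdiff)
```

A perfectly steady speech rate gives Vdiff = 0 and therefore the maximum concentration value of 1.0 in this sketch; larger fluctuations give smaller values.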
  • The text conversion unit 22 converts the input text, based on a score that depends on the user's concentration level x3 obtained as described above, and generates the output text.
  • FIG. 11 is a diagram showing a detailed configuration of the text conversion unit 22.
  • The text conversion unit 22 includes a word conversion unit 31, a text emphasizing unit 37, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
  • Using the word pairs registered in the word conversion DB 34, the word conversion unit 31 outputs a word conversion candidate group, that is, the set of all text candidates obtainable by replacing words contained in the input text of length L.
  • As in the first embodiment, the word conversion DB 34 records one or more pairs of words having substantially the same meaning, together with a conversion score S1(i) for the case where a word is converted using word pair i.
  • S1 calculated for each word conversion candidate is the sum of the conversion scores S1(i) of the words used for conversion.
  • The text emphasizing unit 37 extracts an important word from each text of the input word conversion candidate group, which has a length L21, and generates an output text of length L22 in which the important word is repeated an arbitrary number of times (phrase repetition processing). For example, for the input text "Please press the B button next", the text emphasizing unit 37 extracts "B button" as the important word and outputs a text in which the important word appears twice, separated by a punctuation mark: "B button, please press B button". When the input text includes a plurality of important words, the text emphasizing unit 37 outputs, as the text conversion candidate group, all combinations of the patterns obtained by repeating each important word.
  • Important word candidates may be defined in the text emphasizing unit 37 in advance, or an object or the like may be extracted as an important word according to a certain rule.
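The phrase repetition processing described above can be sketched as follows. The sketch assumes that the important words are supplied in advance and that emphasis is produced by placing each selected word, followed by a punctuation mark, in front of the sentence so that the word is heard twice. The function name `emphasis_candidates`, the `separator` default, and the exact placement of the repeated word are assumptions, not details fixed by the source.

```python
from itertools import combinations


def emphasis_candidates(text, important_words, separator=", "):
    """Return one candidate per subset of the important words found in `text`.

    Each selected word is prepended once, followed by `separator`, so that it
    occurs twice in the candidate. The empty subset yields the unchanged text.
    """
    present = [w for w in important_words if w in text]
    candidates = []
    for k in range(len(present) + 1):
        for subset in combinations(present, k):
            prefix = "".join(word + separator for word in subset)
            candidates.append(prefix + text)
    return candidates
```

Calling `emphasis_candidates("Please press the B button next", ["B button"])` gives the unchanged sentence and one emphasized variant beginning with "B button, ", which a later selection stage could choose between according to the concentration level x3.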
  • A text conversion candidate to which both the word conversion and the text emphasis have been applied is selected.
  • A text conversion candidate to which neither the word conversion nor the text emphasis has been applied is selected.
  • The selection is made after all possible candidates have been listed in advance.
  • Alternatively, the user's concentration level x3 may be input to the word conversion unit 31 and the text emphasizing unit 37, so that unnecessary candidates are deleted, or an optimal candidate is selected, at each stage. For example, when the user's concentration level x3 is low, only the text conversion candidates that have undergone conversion with a high score S1 are output, which makes it possible to reduce the load and processing time of the text emphasizing unit 37 and the candidate selection unit 33.
  • The numerical value x3 representing the user's concentration level is not limited to the above and can also be obtained by the following methods.
  • The numerical value x3 representing the concentration level can be obtained by measuring and inputting the electrical resistance of the user's skin, estimating the user's amount of sweating as a value that decreases monotonically with the electrical resistance, and applying a relationship between the estimated amount of sweating and the degree of concentration.
  • The numerical value x3 representing the degree of concentration can be obtained by measuring and inputting the user's respiration, using the relationship that the degree of concentration is high when the number of breaths per hour is small.
  • The numerical value x3 representing the degree of concentration can be obtained by measuring and inputting the user's pulse, using the relationship that the degree of concentration is high when the number of beats per hour is large.
  • Text conversion based on each of the user's urgency level (x1), degree of urgency (x2), and concentration level (x3) has been described above. It is also possible to perform the text conversions of the first to third embodiments individually and then select, from the converted texts, the one best suited to the user's psychological situation and utter it.
  • The parameters representing the various psychological situations, such as the user's urgency level, degree of urgency, and concentration level, as well as the input and output text and the text conversion program, may be anything that the computer can handle as a physical or electrical signal.
  • The text conversion program causes a computer to which the parameters representing the psychological situation and the text are input to function as physical means for outputting the converted text.
  • The present invention can be combined with a text-to-speech synthesizer and used in various applications that change the utterance text, such as a speech synthesizer, a speech dialogue system, an automatic voice response device, and an intelligent robot. The embodiments and examples can be changed and adjusted within the scope of the entire disclosure (including the claims) of the present invention and based on its basic technical concept. Various combinations and selections of the disclosed elements are possible within the scope of the claims. That is, the present invention naturally includes the variations and modifications that those skilled in the art could make according to the entire disclosure, including the claims, and the technical idea.

Abstract

It is possible to convert a sentence of an inputted text according to a parameter expressing a psychological state of a listener without changing the content. That is, a parameter reflecting a psychological state of a listener (tension, urgency, concentration) is used to deform the text itself to be transferred so that the listener at the moment can easily understand the content of the text.

Description

[Name of invention determined by ISA based on Rule 37.2] Text conversion device, method, and program
(Description of related applications)
This application claims priority from the earlier Japanese Patent Application No. 2008-037603 (filed on Feb. 19, 2008), and the entire contents of that earlier application are deemed to be incorporated herein by reference.
The present invention relates to a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot, and in particular to a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot that convert text so that its content is conveyed more easily.
Patent Document 1 discloses a sentence conversion technique that converts a character string written in a natural language into a character string in another expression of the same natural language, using conversion rules prepared for each conversion purpose.
Patent Document 2 discloses a technique that includes a driving load determination unit for determining the driving load of a driver who is driving a vehicle, and that controls the voice output unit for the driver according to that driving load, changing the pause period of reading and the speech rate.
Patent Document 3 discloses a technique for controlling the sound quality of speech synthesis according to user attribute information in a voice response service device.
Patent Document 4 discloses a technique for estimating a speaker's emotion based on feature amounts of the input speech such as sound pressure, pitch frequency, and duration.
Patent Document 1: Japanese Patent No. 3932350
Patent Document 2: JP-A-2005-070703
Patent Document 3: Japanese Patent No. 3936351
Patent Document 4: JP-A-2003-228391
The entire disclosures of Patent Documents 1 to 4 above are incorporated herein by reference. An analysis of the related art from the standpoint of the present invention is given below.
A listener does not always devote his or her full ability to listening to the content and, depending on the psychological situation at each moment, may miss or misunderstand what is said. An object of the present invention is to provide a text conversion device, a text conversion method, a text conversion program, a speech synthesizer, and a robot that take the listener's psychological situation into account and convert text so that its semantic content is conveyed more easily.
In this regard, Patent Document 1 gives, as specific application examples, (A) a question answering system, (B) an intra-sentence compression system, (C) a text revision system, and (D) a difficult sentence conversion system, together with examples of the conversions into different expressions that would be performed in each case. However, there is no guarantee that the text after these conversions is easy to convey to the listener. Patent Document 1 also suggests application to conversion and inverse conversion between written and spoken language (paragraph 0064), but it discloses neither the conversion purpose of generating text that is easy to convey to the listener nor specific conversion rules for that purpose.
The "driver's driving load" referred to in Patent Document 2 is specifically the vehicle speed. The document only discloses that, while the vehicle is being driven, speech is output at a reading speed higher than normal (the reading speed for other passengers).
The technique described in Patent Document 3 does not perform text conversion of the kind stated as the object above in the first place, and it also cannot cope with the psychological situation of the listener, which changes from moment to moment.
Patent Document 4 describes a technique for estimating a speaker's emotion from input speech. Specifically, it only discloses application to a nursing-care robot that conducts an interview and measures changes in emotion from the feature amounts of the speech.
According to a first aspect of the present invention, there is provided a text conversion device that receives a parameter representing the psychological state of the user and a text as inputs, and converts the input text in a manner suited to the input parameter, within a range that does not change the meaning of the input text.
According to a second aspect of the present invention, there is provided a text conversion method performed by a text conversion device, including a step of inputting a parameter representing a user's psychological state and a text, and a step of converting the input text, based on the input parameter, within a range that does not change its meaning.
According to a third aspect of the present invention, there is provided a text conversion program that causes a computer to execute a process of inputting a parameter representing the psychological state of the user and a text, and a process of converting the input text within a range that does not change its meaning. This text conversion program can be recorded on a computer-readable storage medium.
According to the present invention, it is possible to convert arbitrary text into text whose meaning is easily conveyed to the listener. The reason is that a configuration is adopted in which text conversion is performed with a parameter representing the psychological situation of the listener as input. As a derived effect, for example, the burden of considering the listener's situation when creating a text to be read aloud is reduced, which makes such texts easier to create.
FIG. 1 is a diagram for explaining the outline of the present invention.
FIG. 2 is a block diagram showing the configuration of the text conversion device according to the first embodiment of the present invention.
FIG. 3 is a diagram for explaining the relationship between the pitch frequency of speech and the user's urgency.
FIG. 4 is a block diagram showing the detailed configuration of the text conversion unit of the text conversion device according to the first embodiment of the present invention.
FIG. 5 is a diagram for explaining the structure of the word conversion database of the text conversion device according to the first embodiment of the present invention.
FIG. 6 is a diagram for explaining the operation of the text conversion device according to the first embodiment of the present invention.
FIG. 7 is a block diagram showing the configuration of the text conversion device according to the second embodiment of the present invention.
FIG. 8 is a block diagram showing the detailed configuration of the text conversion unit of the text conversion device according to the second embodiment of the present invention.
FIG. 9 is a diagram for explaining the operation of the text conversion device according to the second embodiment of the present invention.
FIG. 10 is a block diagram showing the configuration of the text conversion device according to the third embodiment of the present invention.
FIG. 11 is a block diagram showing the detailed configuration of the text conversion unit of the text conversion device according to the third embodiment of the present invention.
Explanation of symbols
11 Microphone (voice input unit)
12 Speech recognition unit
13 Response message generation unit
14 Text-to-speech synthesis unit
15 Speaker
20 Pitch frequency analysis unit
21 Psychological state estimation unit
22 Text conversion unit
23 Speech rate measurement unit
31 Word conversion unit
32 Text division unit
33 Candidate selection unit
34 Word conversion database (word conversion DB)
36 Text summarization unit
37 Text emphasis unit
Next, the present invention will be described by way of first to third embodiments, which are preferred embodiments. As abstracted in FIG. 1, each of these embodiments includes text conversion means (the text conversion unit in FIG. 1) that transforms the text itself, in accordance with a parameter representing the listener's psychological state (psychological state parameter), so that the target text becomes "text that is easy to understand when uttered as speech".
[First Embodiment]
First, a first embodiment of the present invention will be described, in which text conversion is performed using the user's urgency (the degree to which the user is in a hurry) as the parameter representing the user's psychological state. FIG. 2 is a block diagram showing the configuration of the text conversion device according to the first embodiment of the present invention.
Referring to FIG. 2, the text conversion device according to the first embodiment of the present invention includes a voice input unit 11 such as a microphone, a pitch frequency analysis unit 20, a psychological state estimation unit 21, and a text conversion unit 22. These processing means can be realized by a program that causes a computer constituting the text conversion device to execute the processes described below.
The pitch frequency analysis unit 20 analyzes the user's speech input from the voice input unit 11 and obtains its pitch frequency.
The psychological state estimation unit 21 obtains a parameter x1 representing the user's urgency from the average pitch frequency of the user's speech and outputs it to the text conversion unit 22.
FIG. 3 shows a map for obtaining the user's urgency x1 from the average pitch frequency. In the example of FIG. 3, the map is a monotonically increasing curve in which the higher the average pitch frequency, the higher the user's urgency x1 (more precisely, an S-shaped curve in which x1 rises sharply once the average pitch frequency exceeds a first threshold th1 and rises only gradually again once it exceeds a second threshold th2). This is because, when the pitch frequency is high, the user is speaking in a high-strung voice, so the user's urgency can be estimated to be high.
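A map of this kind can be sketched as follows. This is a minimal illustration that assumes a logistic function for the S-shape; the function name, the threshold values th1 and th2 (in Hz), and the output range x1_max are hypothetical, since the description only states that the curve is monotonically increasing and S-shaped.

```python
import math


def urgency_from_pitch(mean_pitch_hz, th1=180.0, th2=260.0, x1_max=1000.0):
    """Map the average pitch frequency to an urgency value x1.

    th1, th2 and x1_max are hypothetical values, not taken from the source.
    The logistic curve is centred midway between th1 and th2, so x1 rises
    steeply between the two thresholds and only slowly outside them.
    """
    centre = (th1 + th2) / 2.0
    steepness = 8.0 / (th2 - th1)  # most of the rise happens between th1 and th2
    return x1_max / (1.0 + math.exp(-steepness * (mean_pitch_hz - centre)))
```

With these assumed defaults, an average pitch midway between the thresholds (220 Hz) maps to half of x1_max, and the value stays close to 0 below th1 and close to x1_max above th2.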
The text conversion unit 22 converts the input text on the basis of scores, in accordance with the user's urgency x1 obtained as described above, and generates an output text. The score, or total score, in this embodiment is an index of how easy the text is to understand as speech.
FIG. 4 shows the detailed configuration of the text conversion unit 22. The text conversion unit 22 includes a word conversion unit 31, a text division unit 32, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
The word conversion unit 31 uses the word pairs registered in the word conversion DB 34 to output a word conversion candidate group, that is, the set of all text candidates obtainable by replacing the replaceable words in the input text of length L, together with the conversion score of each candidate.
The word conversion DB 34 stores one or more word pairs whose members have substantially the same meaning (replacing one with the other does not change the meaning of the sentence), together with a conversion score S1(i) that applies when the wording is converted using word pair i. FIG. 5 shows an example of the word pairs and conversion scores registered in the word conversion DB 34. The conversion score can be set low for an input word that has homonyms and high for an output word that has none. This promotes the replacement of input words that have homonyms (in FIG. 5, "パーソナルコンピュータ" (personal computer) and "パトカー" (patrol car) each have the homonym "PC") with output words that have none (in FIG. 5, "パソコン" and "ポケコン" have no homonyms).
The text division unit 32 outputs a text conversion candidate group consisting of all combinations obtained by dividing each text in the input word conversion candidate group, at commas or predetermined pause symbols, into N division units of lengths L(1), L(2), ..., L(N).
For example, when the input text is divided at commas and there are M positions at which a comma can be inserted, 2^M different divisions are possible.
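The 2^M count can be illustrated with a short enumeration. This is a sketch under stated assumptions: the function name is hypothetical, the division points are taken to be the commas already present in the text, and a comma used as a division point is dropped from the resulting phrases (the description does not say how the mark itself is counted).

```python
from itertools import product


def enumerate_divisions(text, mark="、"):
    """Return every way of dividing `text` at occurrences of `mark`.

    Each of the M marks is either used as a division point or not,
    which gives 2**M divisions. A division is a list of phrases.
    """
    pieces = text.split(mark)  # M marks give M + 1 pieces
    m = len(pieces) - 1
    divisions = []
    for cuts in product([False, True], repeat=m):
        phrases = [pieces[0]]
        for piece, cut in zip(pieces[1:], cuts):
            if cut:
                phrases.append(piece)  # start a new phrase at this mark
            else:
                phrases[-1] += mark + piece  # keep the mark inside the phrase
        divisions.append(phrases)
    return divisions
```

For a sentence with two commas, such as "私は、パーソナルコンピュータを、買った。", the function returns four divisions, ranging from the undivided sentence to three separate phrases.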
The text lengths L and L(1) to L(N) are described here as character counts, which are easy to obtain and correlate with utterance duration; however, if the number of morae can be obtained from the input text, the mora count, which correlates more strongly with utterance duration, may be used instead.
The candidate selection unit 33 selects, from among all combinations of text conversion candidates output by the text division unit 32, the candidate for which y1 = x1 - (α1*S1 + α2*S2) is positive and smallest, and outputs the corresponding converted text. Here, α1 and α2 are predetermined constants.
S1, calculated for each word conversion candidate, is the sum of the conversion scores S1(i) of the word pairs used in the conversion.
S2, calculated for each text conversion candidate, is given by S2 = L^2 - Σ(L(i)*L(i)). The value of S2 obtained in this way is smaller the fewer the divisions; for a given number of divisions, it is larger the more uniform the phrase lengths after conversion and division, since the sum of squares Σ(L(i)*L(i)) is then smaller.
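The formula for S2 can be written directly as a small function. This is a sketch; the function and parameter names are illustrative and not taken from the description.

```python
def division_score(original_length, phrase_lengths):
    """S2 = L^2 - sum of L(i)^2.

    original_length is L, the length of the input text before conversion.
    phrase_lengths holds L(1)..L(N), the lengths of the phrases after
    conversion and division.
    """
    return original_length ** 2 - sum(n * n for n in phrase_lengths)
```

For a 20-character input, an unconverted and undivided candidate gives division_score(20, [20]) = 0, and a candidate shortened to 13 characters without division gives division_score(20, [13]) = 231. Splitting 20 characters into two phrases gives 200 for lengths (10, 10) and 150 for lengths (15, 5).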
FIG. 6 shows the scores (total scores) calculated using S1 and S2 above for the input text "私は、パーソナルコンピュータを、買った。" ("I bought a personal computer."). In the example of FIG. 6, the constants are set to α1 = 10 and α2 = 1.
The score calculation is described with reference to FIG. 6. For example, for candidate number 1, with word conversion "none" and text division "none", S1 = 0 (no conversion) and S2 = 20^2 - 20^2 = 0, so the total score is α1 × 0 + α2 × 0 = 0.
Similarly, for candidate number 2, with word conversion "a" (converting "パーソナルコンピュータ" in FIG. 5 to "パソコン") and text division "none", S1 = 50 and S2 = 20^2 - 13^2 = 231, so the total score is α1 × 50 + α2 × 231 = 731.
Similarly, for candidate number 3, with word conversion "b" (converting "パーソナルコンピュータ" in FIG. 5 to "PC") and text division "none", S1 = 20 and S2 = 20^2 - 11^2 = 279, so the total score is α1 × 20 + α2 × 279 = 479.
Applying these total scores to the criterion of the candidate selection unit 33 described above, namely selecting the candidate for which y1 = x1 - (α1*S1 + α2*S2) is positive and smallest, the text conversion candidate of candidate number 2 is selected when the user's urgency x1 is very high. When x1 is not very high but is at or above a certain value, candidate number 3 is selected. When x1 is low, candidate number 1 is selected. In other words, when the user's urgency x1 is high, candidate number 2, "私は、パソコンを、買った。", which has few homonyms and is paraphrased more briefly, is selected.
The total score increases each time the text is divided; for example, each of the text conversion candidates numbered 4 to 12 in FIG. 6 has a higher total score than the candidate among numbers 1 to 3 that underwent the same word conversion. With all text conversion candidates available, when the user's urgency x1 is 200, candidate number 7 (total score = 159, y1 = 41), in which the input text is divided once, is selected. Similarly, when x1 rises to 500, candidate number 3 (total score = 479, y1 = 21), in which word conversion shortens the text to be read aloud, is selected. When x1 rises further to 900, candidate number 11 (total score = 855, y1 = 45), which is shorter still and more finely divided, is selected.
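The selection rule can be checked against the figures quoted above with a short sketch. The function name and the dictionary layout are illustrative assumptions; only the total scores that the text states explicitly for FIG. 6 are included, so the remaining candidates of the figure are not represented.

```python
def select_candidate(x, total_scores):
    """Return the candidate whose y = x - total score is positive and smallest.

    total_scores maps candidate number -> total score (alpha1*S1 + alpha2*S2).
    Returns None when no candidate gives a positive y.
    """
    positive = {n: x - s for n, s in total_scores.items() if x - s > 0}
    if not positive:
        return None
    return min(positive, key=positive.get)


# Total scores quoted in the text for FIG. 6 (other candidates omitted).
FIG6_TOTALS = {1: 0, 2: 731, 3: 479, 7: 159, 11: 855}
```

With these values, select_candidate(200, FIG6_TOTALS) returns 7, select_candidate(500, FIG6_TOTALS) returns 3, and select_candidate(900, FIG6_TOTALS) returns 11, matching the three cases described in the text.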
As described above, for a user whose urgency is high, an output text is generated that contains as few homonyms as possible (easy to understand) and is finely divided (easy to hear).
In this embodiment, the description assumed that all selectable candidates are enumerated in advance before the selection is made; however, the user's urgency x1 may be input to the word conversion unit 31 and the text division unit 32 so that unnecessary candidates are discarded, or the optimum candidate is selected, at each stage. For example, when the user's urgency x1 is high, outputting only the text conversion candidates that underwent conversions with a high score S1 reduces the load and processing time of the text division unit 32 and the candidate selection unit 33.
[Second Embodiment]
Next, a second embodiment of the present invention will be described, in which text conversion is performed using the user's degree of imminence (how pressing the user's situation is) as the parameter representing the user's psychological state. FIG. 7 is a block diagram showing the configuration of the text conversion device according to the second embodiment of the present invention.
Referring to FIG. 7, the text conversion device according to the second embodiment of the present invention includes a voice input unit 11 such as a microphone, a speech recognition unit 12, a response message generation unit 13, a speech rate measurement unit 23, a psychological state estimation unit 21, a text conversion unit 22, a text-to-speech synthesis unit 14, and a speaker 15. The processing means, namely the speech recognition unit 12, the response message generation unit 13, the speech rate measurement unit 23, the psychological state estimation unit 21, the text conversion unit 22, and the text-to-speech synthesis unit 14, can be realized by a program that causes a computer constituting the text conversion device to execute the processes described below.
The speech recognition unit 12 recognizes the speech input from the microphone 11 and outputs the result to the response message generation unit 13.
The response message generation unit 13 generates wording that responds to the content of the user's utterance recognized by the speech recognition unit 12, and outputs it to the text conversion unit 22 as the input text.
The speech rate measurement unit 23 measures the speech rate of the user's speech. The speech input from the microphone 11 is also fed to the speech rate measurement unit 23, where the speech rate of the user's speech is measured.
The psychological state estimation unit 21 outputs a value x2 representing the degree of imminence on the basis of the speech rate measured by the speech rate measurement unit 23.
Here, the value x2 representing the degree of imminence can be obtained from a predetermined relationship that increases monotonically with the speech rate (in morae per second).
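One possible form of such a relationship is sketched below. The description only requires that x2 increase monotonically with the speech rate, so the clamped linear shape, the function names, and the parameter values (slow, fast, x2_max) are all hypothetical choices made for illustration. Note that the clamped map is flat outside the [slow, fast] range, so it is non-decreasing rather than strictly increasing.

```python
def speech_rate(mora_count, duration_seconds):
    """Speech rate in morae per second."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return mora_count / duration_seconds


def imminence_from_speech_rate(rate, slow=4.0, fast=12.0, x2_max=20.0):
    """Map a speech rate (morae per second) to a degree of imminence x2.

    slow, fast and x2_max are hypothetical values, not taken from the source.
    Rates at or below `slow` map to 0, rates at or above `fast` map to x2_max,
    and rates in between are interpolated linearly.
    """
    clamped = min(max(rate, slow), fast)
    return x2_max * (clamped - slow) / (fast - slow)
```

For example, 24 morae uttered in 3 seconds give a rate of 8 morae per second, which these assumed defaults map to x2 = 10.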
The text conversion unit 22 converts the input text on the basis of scores, in accordance with the user's degree of imminence x2 obtained as described above, and generates an output text. The score, or total score, in this embodiment is an index of how easy the text is to understand as speech.
FIG. 8 shows the detailed configuration of the text conversion unit 22. The text conversion unit 22 includes a word conversion unit 31, a text summarization unit 36, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
The word conversion unit 31 uses the word pairs registered in the word conversion DB 34 to output a word conversion candidate group, that is, the set of all text candidates obtainable by replacing the replaceable words in the input text of length L.
As in the first embodiment, the word conversion DB 34 stores one or more word pairs whose members have substantially the same meaning.
In this embodiment, the score S1 calculated for each word conversion candidate is obtained as S1 = L11 - L12, where L11 is the number of characters in the text before conversion by the word conversion unit 31 and L12 is the number of characters in the text after conversion.
The text summarization unit 36 summarizes each text of length L21 in the input word conversion candidate group and outputs a summary text of length L22. If the text summarization unit 36 can generate multiple summary candidates, all of them are included in the text conversion candidate group.
The score S2 calculated for each text conversion candidate is obtained as S2 = L21 - L22.
The candidate selection unit 33 selects and outputs, from among all combinations of text conversion candidates output by the word conversion unit 31 and the text summarization unit 36, the candidate for which y2 = x2 - (α1*S1 + α2*S2) is positive and smallest. However, if y2 is negative for every text conversion candidate, the candidate selection unit 33 selects and outputs the candidate for which S1 + S2 is largest.
FIG. 9 shows the scores (total scores) calculated using S1 and S2 above for the input text "私が持っているパーソナルコンピュータが壊れてしまった。" ("The personal computer I own has broken down."). In the example of FIG. 9, the constants in the formula for y2 are set to α1 = 1 and α2 = 1.
The score calculation is described with reference to FIG. 9. For example, for candidate number 1, with word conversion "none" and text summarization "none", S1 = 0 (no shortening by conversion) and S2 = 0 (no shortening by summarization), so the total score is 0.
Similarly, for candidate number 2, with word conversion "a" (converting "パーソナルコンピュータ" in FIG. 5 to "パソコン") and text summarization "none", S1 = 7 and S2 = 0, so the total score is 7.
Similarly, for candidate number 3, with word conversion "b" (converting "パーソナルコンピュータ" in FIG. 5 to "PC") and text summarization "none", S1 = 9 and S2 = 0, so the total score is 9.
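The S1 values of 7 and 9 follow directly from character counts, as the sketch below shows. The function name is illustrative, and "PC" is assumed here to be written with two half-width characters.

```python
def shortening_score(before, after):
    """Score = (characters before) - (characters after).

    The same difference is used for S1 (word conversion, L11 - L12)
    and for S2 (summarization, L21 - L22) in this embodiment.
    """
    return len(before) - len(after)


source = "私が持っているパーソナルコンピュータが壊れてしまった。"
# Word conversion "a": personal computer -> pasokon
candidate_a = source.replace("パーソナルコンピュータ", "パソコン")
# Word conversion "b": personal computer -> PC (assumed half-width)
candidate_b = source.replace("パーソナルコンピュータ", "PC")

s1_a = shortening_score(source, candidate_a)
s1_b = shortening_score(source, candidate_b)
```

Here s1_a is 7 (11 characters replaced by 4) and s1_b is 9 (11 characters replaced by 2), in agreement with candidate numbers 2 and 3.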
Applying these total scores to the criterion of the candidate selection unit 33 described above, namely selecting the candidate for which y2 = x2 - (α1*S1 + α2*S2) is positive and smallest, the text conversion candidate of candidate number 3, the shortest of candidate numbers 1 to 3, is selected when the user's degree of imminence x2 is very high (9 or more). When x2 is not very high but is at or above a certain value (7 or more and less than 9), candidate number 2 is selected. When x2 is low (less than 7), candidate number 1 is selected. In other words, when the user's degree of imminence x2 is high, candidate number 3, "私が持っているPCが壊れてしまった。", which is paraphrased more briefly by word conversion, is selected.
The total score becomes higher still as the effect of text summarization increases; for example, each of the text conversion candidates numbered 4 to 9 in FIG. 9 has a higher total score than the candidate among numbers 1 to 3 that underwent the same word conversion. With all text conversion candidates available, when the user's degree of imminence x2 is 20, candidate number 9 (total score = 18, y2 = 2), in which the input text is paraphrased more briefly by both word conversion and summarization, is selected.
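The selection rule of this embodiment, including its fallback, can be sketched as follows. The function name and data layout are illustrative. The sketch treats y2 = 0 the same way as a negative value, which is an assumption, since the description covers only the positive and negative cases. Only candidate numbers 1 to 3 are listed, because the text does not give the split of candidate number 9's total score into S1 and S2.

```python
def select_with_fallback(x, candidates, alpha1=1, alpha2=1):
    """Pick the candidate whose y = x - (alpha1*S1 + alpha2*S2) is positive and smallest.

    candidates maps candidate number -> (S1, S2).
    If no candidate has a positive y, fall back to the candidate with the
    largest S1 + S2.
    """
    y = {n: x - (alpha1 * s1 + alpha2 * s2) for n, (s1, s2) in candidates.items()}
    positive = {n: v for n, v in y.items() if v > 0}
    if positive:
        return min(positive, key=positive.get)
    # Fallback: every y is zero or negative, so take the largest S1 + S2.
    return max(candidates, key=lambda n: sum(candidates[n]))


# (S1, S2) for candidate numbers 1 to 3 as stated in the text for FIG. 9.
FIG9_CANDIDATES = {1: (0, 0), 2: (7, 0), 3: (9, 0)}
```

With these three candidates, select_with_fallback(5, FIG9_CANDIDATES) returns 1, select_with_fallback(8, FIG9_CANDIDATES) returns 2, and select_with_fallback(10, FIG9_CANDIDATES) returns 3.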
As described above, in this embodiment, when the degree of imminence is judged to be high, a shorter sentence that is more likely to be conveyed in a short time can be generated.
In this embodiment as well, the description assumed that all selectable candidates are enumerated in advance before the selection is made; however, the user's degree of imminence x2 may be input to the word conversion unit 31 and the text summarization unit 36 so that unnecessary candidates are discarded, or the optimum candidate is selected, at each stage. For example, when x2 is high, outputting only the text conversion candidates that underwent conversions with a high score S1 reduces the load and processing time of the text summarization unit 36 and the candidate selection unit 33.
The value x2 representing the user's degree of imminence is not limited to the above and can also be obtained by the following methods.
For example, x2 can be obtained by taking as input the travel speed (vehicle speed or drive wheel rotation speed) of a vehicle such as an automobile that the user is riding in (driving), as a value that increases monotonically with that speed.
As another example, x2 can be obtained by taking the operation of the automobile's brake as input, making x2 large when the user driving the automobile steps on the brake and larger still when the acceleration of the brake pedal movement is large.
[Third Embodiment]
Next, a third embodiment of the present invention will be described, in which text conversion is performed using the user's degree of concentration (how concentrated the user is) as the parameter representing the user's psychological state. FIG. 10 is a block diagram showing the configuration of the text conversion device according to the third embodiment of the present invention.
Referring to FIG. 10, the text conversion device according to the third embodiment of the present invention includes a voice input unit 11 such as a microphone, a speech recognition unit 12, a response message generation unit 13, a speech rate measurement unit 23, a psychological state estimation unit 21, a text conversion unit 22, a text-to-speech synthesis unit 14, and a speaker 15. As in the second embodiment, the processing means of the text conversion device of this embodiment can be realized by a program that causes a computer constituting the text conversion device to execute the processes described below.
In this embodiment, the psychological state estimation unit 21 outputs a value x3 indicating the user's degree of concentration from the user's speech rate, and the text conversion unit 22 performs text conversion using x3. The other elements are the same as in the second embodiment, so the description below focuses on the differences.
The psychological state estimation unit 21 obtains the user's degree of concentration x3 from the temporal variation of the speech rate. Specifically, it calculates the variation component Vdiff = Vmax - Vmin from the maximum value Vmax and the minimum value Vmin of the speech rate.
When the temporal variation component Vdiff of the speech rate is large, the psychological state estimation unit 21 determines that the user is likely to be distracted by something other than the dialogue, and outputs a correspondingly low value x3 representing the degree of concentration.
Accordingly, the value x3 representing the degree of concentration can be obtained from a predetermined relationship that decreases monotonically with the temporal variation component Vdiff of the speech rate.
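The two steps can be sketched as follows. Vdiff = Vmax - Vmin comes from the description; the reciprocal form of the decreasing map, the function names, and the scale parameter are hypothetical choices made only for illustration, since the description requires just a monotonically decreasing relationship.

```python
def speech_rate_variation(rates):
    """Vdiff = Vmax - Vmin over the observed speech-rate samples."""
    if not rates:
        raise ValueError("rates must not be empty")
    return max(rates) - min(rates)


def concentration_from_variation(v_diff, scale=1.0):
    """Map Vdiff to a degree of concentration x3 that decreases with Vdiff.

    The form x3 = 1 / (1 + scale * Vdiff) and the value of `scale` are
    assumptions, not taken from the source. x3 is 1 when the speech rate
    does not vary and approaches 0 as the variation grows.
    """
    if v_diff < 0:
        raise ValueError("v_diff must not be negative")
    return 1.0 / (1.0 + scale * v_diff)
```

For speech-rate samples of 6.0, 9.0, and 7.5 morae per second, Vdiff is 3.0, and the assumed map gives x3 = 0.25.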
The text conversion unit 22 converts the input text on the basis of scores, in accordance with the user's degree of concentration x3 obtained as described above, and generates an output text.
FIG. 11 shows the detailed configuration of the text conversion unit 22. The text conversion unit 22 includes a word conversion unit 31, a text emphasis unit 37, a candidate selection unit 33, and a word conversion database (word conversion DB) 34.
The word conversion unit 31 uses the word pairs registered in the word conversion DB 34 to output a word conversion candidate group, that is, the set of all text candidates obtainable by replacing the replaceable words in the input text of length L.
As in the first embodiment, the word conversion DB 34 stores one or more word pairs whose members have substantially the same meaning, together with a conversion score S1(i) that applies when the wording is converted using word pair i.
S1, calculated for each word conversion candidate, is the sum of the conversion scores S1(i) of the word pairs used in the conversion.
The text emphasis unit 37 extracts a key word from each text of length L21 in the input word conversion candidate group and generates an output text of length L22 in which the key word is repeated an arbitrary number of times (phrase repetition processing). For example, for the input text "次はBボタンを押してください" ("Next, please press the B button"), the text emphasis unit 37 extracts "Bボタン" (B button) as the key word and repeats it twice, separated by a comma, to generate the text "次はBボタン、Bボタンを押してください" ("Next, the B button, please press the B button"). If the input text contains multiple key words, the text emphasis unit 37 outputs, as the text conversion candidate group, all combinations of the patterns in which each key word is repeated.
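The phrase repetition processing can be sketched as a simple string operation. This is a minimal illustration: the function name and parameters are hypothetical, the key word is assumed to be given rather than extracted, and only its first occurrence is repeated.

```python
def emphasize(text, key_word, repeats=2, separator="、"):
    """Repeat the first occurrence of key_word `repeats` times in total.

    The repetitions are joined with `separator` (a comma by default).
    The text is returned unchanged if key_word does not occur in it.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if key_word not in text:
        return text
    repeated = separator.join([key_word] * repeats)
    return text.replace(key_word, repeated, 1)
```

Calling emphasize("次はBボタンを押してください", "Bボタン") yields "次はBボタン、Bボタンを押してください", the emphasized sentence given in the example above.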
 重要語の候補は、予めテキスト強調部37内に定義しておいてもよいし、目的語等を一定の規則で重要語として抽出するようにしてもよい。 Important word candidates may be defined in the text emphasizing unit 37 in advance, or an object or the like may be extracted as an important word according to a certain rule.
 各テキスト変換候補毎に算出されるスコアS2は、S2=L22-L21で求めるものとする。 Suppose that the score S2 calculated for each text conversion candidate is obtained by S2 = L22−L21.
 From all combinations of the text conversion candidates output by the word conversion unit 31 and the text emphasizing unit 37, the candidate selection unit 33 selects and outputs the candidate for which y3 = (1/x3) - (β1*S1 + β2*S2) is positive and smallest. If y3 is negative for every text conversion candidate, the candidate selection unit 33 selects and outputs the candidate with the largest S1 + S2.
 The method of calculating the score in this embodiment is as follows. For example, when neither word conversion nor text emphasis is applied, S1 = 0 (no conversion) and S2 = 0 (no emphasis), so the total score is 0.
 When a word conversion such as the one in FIG. 5, replacing "personal computer" (パーソナルコンピュータ) with its abbreviation "PC" (パソコン), is applied without text emphasis, S1 = 50 and S2 = 0; with β1 and β2 each set to 1, the total score is 50.
 When the same word conversion is applied and, in addition, the important word "B button" is repeated twice as text emphasis, S1 = 50 and S2 = 4; with β1 and β2 each set to 1, the total score is 54.
 Applying these total scores to the criterion of the candidate selection unit 33 described above, namely selecting the candidate for which y3 = (1/x3) - (β1*S1 + β2*S2) is positive and smallest, gives the following behavior. When the user's concentration x3 is low, 1/x3 is large, so y3 stays positive even for the highest-scoring candidate, and the text conversion candidate to which both word conversion and text emphasis have been applied is selected. Conversely, when the user's concentration x3 is judged to be high, 1/x3 is small, so only the candidate with total score 0 keeps y3 positive, and the text conversion candidate with neither word conversion nor text emphasis is selected.
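 The selection rule can be checked numerically with a short sketch. The function name, the candidate labels, and the x3 values are illustrative assumptions; the scores (0, 50, 54) and β1 = β2 = 1 are the ones used in the worked example of this embodiment.

```python
def select_candidate(candidates, x3, beta1=1.0, beta2=1.0):
    """Pick one candidate from a list of (text, S1, S2); x3 must be > 0.

    Chooses the candidate whose y3 = 1/x3 - (beta1*S1 + beta2*S2) is
    positive and smallest. If no y3 is positive, falls back to the
    candidate with the largest S1 + S2.
    """
    scored = [
        (1.0 / x3 - (beta1 * s1 + beta2 * s2), text, s1, s2)
        for text, s1, s2 in candidates
    ]
    positive = [c for c in scored if c[0] > 0]
    if positive:
        return min(positive, key=lambda c: c[0])[1]
    return max(scored, key=lambda c: c[2] + c[3])[1]


# Candidates from the worked example: (label, S1, S2).
EXAMPLE = [
    ("no conversion", 0, 0),
    ("word conversion only", 50, 0),
    ("word conversion + emphasis", 50, 4),
]
```

 With `x3 = 0.01` (low concentration, 1/x3 = 100) the y3 values are 100, 50 and 46, so the candidate with both conversions is chosen; with `x3 = 0.5` (1/x3 = 2) only the unconverted candidate has a positive y3 and is chosen.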
 As described above, in this embodiment, when the user's concentration is judged to be low, a sentence that is more redundant but easier to understand can be generated.
 Although this embodiment has also been described as enumerating all selectable candidates in advance and then selecting among them, the user's concentration x3 may instead be input to the word conversion unit 31 and the text emphasizing unit 37 so that unnecessary candidates are deleted, or the optimal candidate is selected, at each stage. For example, when the user's concentration x3 is low, outputting only the text conversion candidates produced by conversions with a high score S1 reduces the load and processing time of the text emphasizing unit 37 and the candidate selection unit 33.
 The numerical value x3 representing the user's concentration is not limited to the above and can also be obtained by the following methods.
 For example, x3 can be obtained by measuring the electrical resistance of the user's skin, estimating the user's amount of perspiration as a value that decreases monotonically as the electrical resistance increases, and using the relationship that concentration is high when the amount of perspiration is large.
 As another example, x3 can be obtained by measuring the user's respiration and using the relationship that concentration is high when the number of breaths per unit time is small.
 As a further example, x3 can be obtained by measuring the user's pulse and using the relationship that concentration is high when the number of beats per unit time is large.
 Preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments, and further modifications, substitutions, and adjustments can be made without departing from its basic technical idea. For example, the second embodiment was described as not using a score based on whether the converted word has a synonym; however, by correcting this score as appropriate and adding it to the total score, it becomes possible to output speech that is more readily understood by a user who is under pressure.
 Furthermore, the first to third embodiments described text conversion based on the user's urgency, pressure, and concentration, respectively. It is also possible to perform the text conversions of the first to third embodiments individually and then select and utter the result that best suits the user's psychological state. Text conversion is of course not limited to the user's urgency, pressure, and concentration, and may be performed using parameters representing other psychological states.
 The parameters representing psychological states such as the user's urgency, pressure, and concentration, the input and output texts, and the text conversion program need only be in a form that a computer can handle as physical or electrical signals. The text conversion program causes a computer, to which these parameters and the text have been input, to function as physical means for outputting the converted text.
 In combination with a text-to-speech synthesizer, the present invention can be used in various applications that change the utterance text in consideration of the user's psychological state, such as speech synthesizers, spoken dialogue systems, automatic voice response devices, and intelligent robots.
 Within the framework of the entire disclosure of the present invention (including the claims), the embodiments and examples can be changed and adjusted on the basis of its basic technical idea. Various combinations and selections of the disclosed elements are also possible within the scope of the claims. That is, the present invention naturally includes the various variations and modifications that a person skilled in the art could make in accordance with the entire disclosure, including the claims, and the technical idea.

Claims (29)

  1.  A text conversion device which takes as input a parameter representing a user's psychological state and a text, and converts the input text in a manner suited to the input parameter within a range that does not change the meaning of the input text.
  2.  The text conversion device according to claim 1, which creates a plurality of text conversion candidates, obtains for each text conversion candidate a score indicating how easy the output text is to understand when heard as speech, and selects the text conversion candidate whose score is commensurate with the input parameter.
  3.  The text conversion device according to claim 1 or 2, wherein the user's urgency is used as the parameter representing the user's psychological state.
  4.  The text conversion device according to claim 1 or 2, wherein the user's pressure is used as the parameter representing the user's psychological state.
  5.  The text conversion device according to claim 1 or 2, wherein the user's concentration is used as the parameter representing the user's psychological state.
  6.  The text conversion device according to claim 3, wherein the average pitch frequency of speech uttered by the user is used in place of the user's urgency.
  7.  The text conversion device according to claim 4, wherein the rate of speech uttered by the user is used in place of the user's pressure.
  8.  The text conversion device according to claim 5, wherein the temporal variation component of the rate of speech uttered by the user is used in place of the user's concentration.
  9.  The text conversion device according to any one of claims 2 to 8, which creates a plurality of text conversion candidates by replacing a word in the input text with another word, and gives each text conversion candidate a score according to the number of homophones that the replaced word has.
  10.  The text conversion device according to any one of claims 2 to 8, which creates a plurality of text conversion candidates by replacing a word in the input text with another word, and gives each text conversion candidate a score according to the duration of the candidate when read aloud or the sentence length of the candidate.
  11.  The text conversion device according to any one of claims 2 to 8, which creates a plurality of text conversion candidates by dividing the input text into a plurality of division units, and gives each text conversion candidate a score according to the number of division units it contains or the length of each division unit.
  12.  The text conversion device according to any one of claims 2 to 8, which creates a plurality of text conversion candidates by extracting one or more important words from the input text and performing a conversion operation that repeats each important word two or more times, and gives each text conversion candidate a score according to the number of conversion operations.
  13.  A text conversion device comprising: a psychological state estimation unit that outputs a parameter estimating a user's psychological state on the basis of the user's speech or the user's operations; and a text conversion unit that, on the basis of the parameter estimating the user's psychological state, performs one or more of the following text conversion processes: replacing a word in the input text with a synonym, inserting a pause point into the input text, and repeating a phrase in the input text.
  14.  The text conversion device according to claim 13, wherein the text conversion unit generates a text conversion candidate group according to predetermined text conversion rules, assigns a score to each text conversion candidate in the group according to a predetermined score calculation formula, and selects the text conversion candidate whose score matches the parameter estimating the user's psychological state.
  15.  A speech output device comprising: the text conversion device according to any one of claims 1 to 14; and text-to-speech synthesis means for reading aloud the text output from the text conversion device.
  16.  A robot comprising the speech output device according to claim 15, which produces speech output suited to the listener's psychological state.
  17.  A text conversion method performed by a text conversion device, comprising: a step of inputting a parameter representing a user's psychological state and a text; and a step of converting the input text, on the basis of the input parameter, within a range that does not change its meaning.
  18.  The text conversion method according to claim 17, wherein a plurality of text conversion candidates are generated from the text using predetermined text conversion candidate creation rules, a score indicating how easy the output text is to understand when heard as speech is obtained for each text conversion candidate, and the text conversion candidate whose score is commensurate with the input parameter is selected.
  19.  The text conversion method according to claim 18, wherein a plurality of text conversion candidates are created by replacing a word in the input text with another word, and each text conversion candidate is given a score according to the number of homophones that the replaced word has.
  20.  The text conversion method according to claim 18, wherein a plurality of text conversion candidates are created by replacing a word in the input text with another word, and each text conversion candidate is given a score according to the duration of the candidate when read aloud or the sentence length of the candidate.
  21.  The text conversion method according to claim 18, wherein a plurality of text conversion candidates are created by dividing the input text into a plurality of division units, and each text conversion candidate is given a score according to the number of division units it contains or the length of each division unit.
  22.  The text conversion method according to claim 18, wherein a plurality of text conversion candidates are created by extracting one or more important words from the input text and performing a conversion operation that repeats each important word two or more times, and each text conversion candidate is given a score according to the number of conversion operations.
  23.  A text conversion program that causes a computer to execute: a process of inputting a parameter representing a user's psychological state and a text; and a process of converting the input text within a range that does not change its meaning.
  24.  The text conversion program according to claim 23, which causes the computer to execute: a process of generating a plurality of text conversion candidates from the text using predetermined text conversion candidate creation rules; and a process of obtaining, for each text conversion candidate, a score indicating how easy the output text is to understand when heard as speech, and selecting the text conversion candidate whose score is commensurate with the input parameter.
  25.  The text conversion program according to claim 24, wherein a plurality of text conversion candidates are created by replacing a word in the input text with another word, and each text conversion candidate is given a score according to the number of homophones that the replaced word has.
  26.  The text conversion program according to claim 24, wherein a plurality of text conversion candidates are created by replacing a word in the input text with another word, and each text conversion candidate is given a score according to the duration of the candidate when read aloud or the sentence length of the candidate.
  27.  The text conversion program according to claim 24, wherein a plurality of text conversion candidates are created by dividing the input text into a plurality of division units, and each text conversion candidate is given a score according to the number of division units it contains or the length of each division unit.
  28.  The text conversion program according to claim 24, wherein a plurality of text conversion candidates are created by extracting one or more important words from the input text and performing a conversion operation that repeats each important word two or more times, and each text conversion candidate is given a score according to the number of conversion operations.
  29.  A speech output program that further causes the computer to execute a process of converting the text output by the text conversion program according to any one of claims 23 to 28 into speech by text-to-speech synthesis and outputting the speech.
PCT/JP2009/052716 2008-02-19 2009-02-17 Text conversion device, method, and program WO2009104613A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009554331A JP5521554B2 (en) 2008-02-19 2009-02-17 Text conversion device, method, and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-037603 2008-02-19
JP2008037603 2008-02-19

Publications (1)

Publication Number Publication Date
WO2009104613A1 true WO2009104613A1 (en) 2009-08-27

Family

ID=40985492

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/052716 WO2009104613A1 (en) 2008-02-19 2009-02-17 Text conversion device, method, and program

Country Status (2)

Country Link
JP (1) JP5521554B2 (en)
WO (1) WO2009104613A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH045694A (en) * 1990-04-23 1992-01-09 Oki Electric Ind Co Ltd Rule synthesizing device
JPH0997094A (en) * 1995-09-29 1997-04-08 Matsushita Electric Ind Co Ltd Onboard voice synthesizer
JPH10288532A (en) * 1997-04-15 1998-10-27 Toyota Motor Corp Voice guide device for vehicle
JP2000069390A (en) * 1998-08-25 2000-03-03 Fujitsu General Ltd Television receiver for the aged

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1083769B1 (en) * 1999-02-16 2010-06-09 Yugen Kaisha GM & M Speech converting device and method
JP4085926B2 (en) * 2003-08-14 2008-05-14 ソニー株式会社 Information processing terminal and communication system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3244408A1 (en) * 2016-05-09 2017-11-15 Sony Mobile Communications, Inc Method and electronic unit for adjusting playback speed of media files
JP2019121139A (en) * 2017-12-29 2019-07-22 Airev株式会社 Summarizing device, summarizing method, and summarizing program
JP7142435B2 (en) 2017-12-29 2022-09-27 Airev株式会社 Summarization device, summarization method, and summarization program

Also Published As

Publication number Publication date
JP5521554B2 (en) 2014-06-18
JPWO2009104613A1 (en) 2011-06-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09712904

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2009554331

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09712904

Country of ref document: EP

Kind code of ref document: A1