JP2016218848A

JP2016218848A - Language expression rewriting apparatus, method and program

Info

Publication number: JP2016218848A
Application number: JP2015104613A
Authority: JP
Inventors: Chiaki Miyazaki (宮崎千明); Taichi Katayama (片山太一); Toru Hirano (平野徹); Ryuichiro Higashinaka (東中竜一郎); Toshiaki Makino (牧野俊朗); Yoshihiro Matsuo (松尾義博)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-05-22
Filing date: 2015-05-22
Publication date: 2016-12-22
Anticipated expiration: 2035-05-22
Also published as: JP6161656B2

Abstract

PROBLEM TO BE SOLVED: To achieve rewriting into a wide variety of language expressions. SOLUTION: A setting section 11 stores, in a setting value database 22, setting values for rewriting language expressions for each of a plurality of linguistic features that reflect the traits of a character, including literary style, predicate functional expressions, and personal pronouns. Based on the stored setting values, a rewriting processing section 13, composed of processing sections 13a to 13l including a literary style conversion section 13a, a predicate functional expression/character word ending conversion section 13e, and a personal pronoun conversion section 13h, applies at least one type of rewriting processing to an input sentence 23. The available types of processing include converting the type of literary style, converting a predicate functional expression according to the character, and converting a personal pronoun according to the character. SELECTED DRAWING: Figure 1

Description

The present invention relates to a language expression rewriting apparatus, a learning apparatus, a method, and a program, and more particularly to a language expression rewriting apparatus, method, and program for rewriting the language expressions of input text.

Conventionally, there are methods that give a language expression the flavor of a particular character by rewriting only the sequence of function words at the end of a sentence (the sentence-end expression). For example, a technique has been proposed that uses text data annotated with author attributes to extract the sentence-end expressions used disproportionately often for each attribute value, and uses them to characterize utterances (Non-Patent Document 1).

Non-Patent Document 1: Chiaki Miyazaki, Toru Hirano, Ryuichiro Higashinaka, Toshiro Makino, Yoshihiro Matsuo, "Conversion of sentence-end expressions to give utterances character" (発話にキャラクタ性を与えるための文末表現の変換), Japanese Society for Artificial Intelligence SIG material (SIG-SLUD-68), pp. 41-46, 2013.

However, because the technique of Non-Patent Document 1 rewrites only sentence-end expressions, it is insufficient for expressing, in diverse variations, the linguistic features of highly individual characters such as those appearing in TV animation and manga.

The present invention has been made to solve the above problem, and its object is to provide a language expression rewriting apparatus, method, and program that can rewrite text into a wide variety of language expressions.

To achieve the above object, a language expression rewriting apparatus according to a first invention can be configured to include a setting unit and a rewriting processing unit. The setting unit sets, for each of a plurality of types of linguistic features in which character-dependent traits appear, including style, predicate functional expressions, and personal pronouns, a setting value for rewriting language expressions based on that linguistic feature. The rewriting processing unit applies to an input sentence, based on the setting values set by the setting unit, at least one of a plurality of types of rewriting processing based on those linguistic features, including processing that converts the type of style, processing that converts a predicate functional expression according to the character, and processing that converts a morpheme that is a personal pronoun into a personal pronoun appropriate to the character.

According to the language expression rewriting apparatus of the first invention, the setting unit sets, for each of a plurality of types of linguistic features in which character-dependent traits appear, including style, predicate functional expressions, and personal pronouns, a setting value for rewriting language expressions based on that linguistic feature. The rewriting processing unit then applies to the input sentence, based on the setting values set by the setting unit, at least one of a plurality of types of rewriting processing based on those linguistic features, including processing that converts the type of style, processing that converts a predicate functional expression according to the character, and processing that converts a morpheme that is a personal pronoun into a personal pronoun appropriate to the character.

A language expression rewriting method according to a second invention is a method performed in a language expression rewriting apparatus that includes a setting unit and a rewriting processing unit. In this method, the setting unit sets, for each of a plurality of types of linguistic features in which character-dependent traits appear, including style, predicate functional expressions, and personal pronouns, a setting value for rewriting language expressions based on that linguistic feature. The rewriting processing unit then applies to an input sentence, based on the setting values set by the setting unit, at least one of a plurality of types of rewriting processing based on those linguistic features, including processing that converts the type of style, processing that converts a predicate functional expression according to the character, and processing that converts a morpheme that is a personal pronoun into a personal pronoun appropriate to the character.

Because rewriting processes for diverse language expressions, not only sentence-end expressions, are carried out in any desired combination, rewriting into a wide variety of language expressions can be achieved.

In the first and second inventions, the setting unit may set the setting values for each of the plurality of types of linguistic features further including sentence structure, conjugated forms, hesitation, dialect or special vocabulary, specific phonemes, and discriminative meaningless expressions, which can distinguish a character but carry no meaning. The rewriting processing unit may then apply to the input sentence, based on the setting values set by the setting unit, at least one of the plurality of types of rewriting processing further including processing that converts sentence structure, processing that converts conjugated forms, processing that converts text into hesitant expressions, processing that converts specific vocabulary into dialect or special vocabulary, processing that converts specific phonemes into phonemes appropriate to the character, and processing that inserts discriminative meaningless expressions appropriate to the character. This enables rewriting into an even wider variety of language expressions.

In the first and second inventions, the setting unit may set the setting values for each of the plurality of types of linguistic features further including character type, segmented writing (wakachi-gaki), and symbols. The rewriting processing unit may then apply to the input sentence, based on the setting values set by the setting unit, at least one of the plurality of types of rewriting processing further including processing that converts the character type, processing that converts text to segmented writing, and processing that inserts symbols. This enables rewriting into an even wider variety of language expressions.

A language expression rewriting program according to a third invention is a program for causing a computer to function as each unit of the above language expression rewriting apparatus.

As described above, the language expression rewriting apparatus, method, and program of the present invention carry out rewriting processes for diverse language expressions, not only sentence-end expressions, in any desired combination, and can therefore rewrite text into a wide variety of language expressions.

FIG. 1 is a functional block diagram showing the schematic configuration of the language expression rewriting apparatus according to the present embodiment.
FIG. 2 is a diagram showing an example of a correspondence table between functional expressions and their meanings.
FIG. 3 is a diagram showing an example of a style-specific functional expression list.
FIG. 4 is a diagram showing an example of a conjugation table.
FIG. 5 is a diagram showing an example of replacement rules between morpheme sequences.
FIG. 6 is a diagram showing an example of a character-specific connection expression list.
FIG. 7 is a diagram showing an example of a character-specific functional expression list.
FIG. 8 is a diagram showing an example of collapsed conjugation rules.
FIG. 9 is a diagram showing an example of a character-specific personal pronoun list.
FIG. 10 is a diagram showing an example of character-specific vocabulary replacement rules.
FIG. 11 is a diagram showing an example of character-specific phoneme replacement rules.
FIG. 12 is a flowchart showing an example of the language expression rewriting processing routine in the present embodiment.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Configuration of the language expression rewriting apparatus>
The language expression rewriting apparatus 10 according to the present embodiment can be implemented as a computer including a CPU, a RAM, and a ROM that stores various data and a language expression rewriting program for executing the language expression rewriting processing routine described later. Functionally, as shown in FIG. 1, the language expression rewriting apparatus 10 includes a setting unit 11, a basic analysis unit 12, and a rewriting processing unit 13.

The language expression rewriting apparatus 10 receives a Japanese input sentence 23 (text data) and outputs a rewritten sentence 32 in which the language expressions contained in the input sentence 23 have been rewritten according to the specified settings. In the present embodiment, rewrite items for the following 12 types of linguistic features, which are frequently observed in the utterances of strongly individual characters such as those appearing in TV animation and manga, can be specified in any combination.

Rewrite items
(a) style, (b) character type, (c) segmented writing (wakachi-gaki), (d) sentence structure, (e) predicate functional expressions and character sentence endings, (f) conjugated forms, (g) hesitation, (h) personal pronouns, (i) dialect and special vocabulary, (j) phoneme replacement, (k) discriminative meaningless expressions, (l) symbols

The setting unit 11 reads a setting file 21 that contains the setting values for the rewriting of language expressions performed by the rewriting processing unit 13 described later. For each of the rewrite items above, the setting file 21 contains a setting value (a value that selects a setting, a file name, or a character string) as follows.

(a) Style conversion setting value (0 = no conversion, 1 = plain style (da style), 2 = polite style (desu/masu style), 3 = polite style (de gozaimasu style))
(b) Character type conversion setting value (0 = no conversion, 1 = convert to hiragana)
(c) Segmented writing conversion setting value (0 = no conversion, 1 = delimit with commas, 2 = delimit with spaces)
(d) File name of the character-specific connection expression list 26 for sentence structure conversion (described later); if no file name is specified, no conversion is performed
(e) File name of the character-specific functional expression list 27 for predicate functional expression and character sentence-ending conversion (described later); if no file name is specified, no conversion is performed
(f) Conjugated form conversion setting value (0 = no conversion, 1 = convert to collapsed conjugated forms)
(g) Hesitation conversion setting value (0 = no conversion, 1 = convert to hesitant expressions)
(h) File name of the character-specific personal pronoun list 29 for personal pronoun replacement (described later); if no file name is specified, no conversion is performed
(i) File name of the character-specific vocabulary replacement rules 30 for dialect and special vocabulary replacement (described later); if no file name is specified, no conversion is performed
(j) File name of the character-specific phoneme replacement rules 31 for phoneme replacement (described later); if no rule file name is specified, no conversion is performed
(k) Character string giving the discriminative meaningless expression to use; if none is specified, no conversion is performed
(l) Character string giving the symbols to use; if none is specified, no conversion is performed
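The description does not specify the syntax of the setting file 21. As a minimal sketch, assuming a hypothetical format with one `item=value` line per rewrite item, the setting values could be read as follows. The function name `parse_settings` and the key letters are illustrative assumptions, not part of the disclosure.

```python
from typing import Dict, Union

Setting = Union[int, str, None]

# Items whose setting is a numeric mode, and items whose setting is a
# file name or a character string (letters follow the list of rewrite items).
NUMERIC_ITEMS = {"a", "b", "c", "f", "g"}
STRING_ITEMS = {"d", "e", "h", "i", "j", "k", "l"}


def parse_settings(text: str) -> Dict[str, Setting]:
    """Parse hypothetical 'item=value' lines into setting values for items a to l.

    Items that are absent default to "no conversion": 0 for numeric items,
    None for file-name and string items.
    """
    settings: Dict[str, Setting] = {key: 0 for key in NUMERIC_ITEMS}
    settings.update({key: None for key in STRING_ITEMS})
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in NUMERIC_ITEMS:
            settings[key] = int(value)
        elif key in STRING_ITEMS:
            settings[key] = value or None  # empty string means "not specified"
        else:
            raise ValueError(f"unknown rewrite item: {key!r}")
    return settings
```

A dictionary keyed by item letter keeps the sketch close to the list above; a real implementation might instead validate the numeric ranges or check that the named files exist.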

The setting unit 11 stores the setting value of each item described in the setting file 21 it has read in a setting value database (DB) 22.

The details of each item are given together with the description of the corresponding processing unit of the rewriting processing unit 13 and are omitted here.

The basic analysis unit 12 reads the input sentence 23. The input sentence 23 is text data written in Japanese. Any Japanese text in written form can serve as the input sentence 23 of the present embodiment, for example posts to blogs or social networking services (SNS), speech recognition results for utterances, and text chat.

The basic analysis unit 12 passes the input sentence 23 to a dependency parser and obtains from its output the morpheme boundaries, the reading of each morpheme, the part of speech of each morpheme, the conjugation type and conjugated form of each conjugating word, the bunsetsu (phrase) boundaries, and the bunsetsu heads. The basic analysis unit 12 need not use a dependency parser, as long as this information can be obtained. For example, a morphological analyzer may supply the morpheme boundaries, readings, parts of speech, and conjugation types and forms, while the bunsetsu boundaries and heads are determined by separate algorithms.

The basic analysis unit 12 also obtains semantic labels for the functional expressions contained in the input sentence 23 by string matching against a correspondence table between functional expressions and their meanings, such as the one shown in FIG. 2 (see Reference 1). Alternatively, the semantic labels may be obtained with a technique that estimates appropriate labels by machine learning, such as the one described in Reference 2.

Reference 1: Suguru Matsuyoshi, Satoshi Sato, Takehito Utsuro, "Compilation of a dictionary of Japanese functional expressions" (日本語機能表現辞書の編纂), Natural Language Processing, Vol. 14, No. 5, 2007.
Reference 2: Kenji Imamura, Tomoko Izumi, Genichiro Kikui, Satoshi Sato, "Semantic label tagger for predicate functional expressions" (述部機能表現の意味ラベルタガー), Proceedings of the 17th Annual Meeting of the Association for Natural Language Processing, 2011.
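The string-matching step can be sketched as a table lookup over the morphemes of the input sentence. The contents of FIG. 2 are not reproduced in this text, so the table entries below are illustrative assumptions, and `label_functional_expressions` is a hypothetical name. The sketch matches single morphemes only; a functional expression that spans several morphemes would need a longest-match search instead.

```python
from typing import Dict, List

# Hypothetical excerpt of a functional-expression/meaning table.
# "*" is returned for morphemes that receive no semantic label.
MEANING_TABLE: Dict[str, str] = {
    "ので": "reason",
    "た": "completion",
    "だ": "completion",
}


def label_functional_expressions(
    morphemes: List[str], table: Dict[str, str] = MEANING_TABLE
) -> List[str]:
    """Return one semantic label per morpheme by exact string match."""
    return [table.get(morpheme, "*") for morpheme in morphemes]
```

Because the lookup ignores context, it would also label the copula だ as "completion". This is one reason the description mentions a machine-learned tagger as an alternative.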

For example, when the basic analysis unit 12 reads the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 ("I am sensitive to the cold, so I chose warm clothes."), it obtains the following information: (1) the morpheme boundaries, bunsetsu boundaries, and bunsetsu heads; (2) the reading of each morpheme; (3) the part of speech of each morpheme and the conjugation type and conjugated form of each conjugating word; and (4) the semantic labels of the functional expressions. This information obtained by the basic analysis unit 12 is hereinafter collectively called the basic analysis result.

(1) {私}_は/{寒がり}_な_ので_、/{暖か}_い/{服装}_を/{選}_ん_だ_。
(2) {ワタシ}_ハ/{サムガリ}_ナ_ノデ_、/{アタタカ}_イ/{フクソウ}_ヲ/{エラ}_ン_ダ_。
(3) {pronoun}_binding particle/{noun}_auxiliary verb_conjunctive particle_comma/{adjective stem}_conjugational ending: attributive form/{noun}_case particle/{verb stem: ba-row godan}_conjugational ending: continuative form (euphonic)_auxiliary verb_period
(4) {*}_*/{*}_*_reason_*/{*}_*/{*}_*/{*}_*_completion_*

In the basic analysis result above, morpheme boundaries are marked with "_", bunsetsu boundaries with "/", and bunsetsu heads with "{}". Wherever a bunsetsu ends a morpheme also ends, so every bunsetsu boundary is also a morpheme boundary. The conjugational endings of conjugating words (verbs, adjectives, and auxiliary verbs) are separated from their stems (for example, 選_ん_だ). In (4), "*" means that the morpheme is not a functional expression, or that it is a functional expression but was not given a semantic label that is processed in the present embodiment.
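To make the notation concrete, the following sketch parses one line of a basic analysis result into bunsetsu and morphemes. The `Bunsetsu` class and `parse_analysis_line` function are hypothetical names introduced for illustration. The parser accepts both the full-width delimiters of the Japanese original and their ASCII equivalents.

```python
from dataclasses import dataclass
from typing import List

# Map the full-width delimiters to their ASCII equivalents.
_FULLWIDTH_TO_ASCII = str.maketrans("｛｝＿／", "{}_/")


@dataclass
class Bunsetsu:
    morphemes: List[str]  # morphemes in surface order, braces removed
    head_index: int       # position of the bunsetsu head in `morphemes`


def parse_analysis_line(line: str) -> List[Bunsetsu]:
    """Split on '/' (bunsetsu) and '_' (morpheme); '{...}' marks the head."""
    result: List[Bunsetsu] = []
    for chunk in line.translate(_FULLWIDTH_TO_ASCII).split("/"):
        morphemes = chunk.split("_")
        head_index = -1
        for i, morpheme in enumerate(morphemes):
            if morpheme.startswith("{") and morpheme.endswith("}"):
                head_index = i
                morphemes[i] = morpheme[1:-1]
        if head_index < 0:
            raise ValueError(f"bunsetsu without a marked head: {chunk!r}")
        result.append(Bunsetsu(morphemes, head_index))
    return result
```

Applying the same function to lines (1) to (4) yields four parallel structures, so the surface form, reading, part of speech, and semantic label of a morpheme can be looked up by the same pair of indices.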

As shown in FIG. 1, the rewriting processing unit 13 includes a style conversion unit 13a, a character type conversion unit 13b, a segmented writing conversion unit 13c, a sentence structure conversion unit 13d, a predicate functional expression and character sentence-ending conversion unit 13e, a conjugated form conversion unit 13f, a hesitation conversion unit 13g, a personal pronoun replacement unit 13h, a dialect and special vocabulary replacement unit 13i, a phoneme replacement unit 13j, a discriminative meaningless expression insertion unit 13k, and a symbol insertion unit 13l.

Each processing unit of the rewriting processing unit 13 receives the basic analysis result produced by the basic analysis unit 12 for the input sentence 23. However, when the setting value stored in the setting value DB 22 for the item corresponding to a processing unit indicates that its rewriting processing is not to be applied, the basic analysis result is not input to that processing unit. For example, when the setting value of item (a), style conversion, stored in the setting value DB 22 is 0 = no conversion, the basic analysis result is not input to the style conversion unit 13a.
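The gating rule in this paragraph can be sketched as a filter over the setting values. The dictionary layout (item letter mapped to a value) and the function name `units_receiving_input` are assumptions made for illustration; the sketch only selects which units receive the basic analysis result and says nothing about the order in which they run.

```python
from typing import Dict, List, Union

Setting = Union[int, str, None]

# Rewrite item letter -> reference numeral of the processing unit in FIG. 1.
UNIT_FOR_ITEM: Dict[str, str] = {letter: "13" + letter for letter in "abcdefghijkl"}


def units_receiving_input(settings: Dict[str, Setting]) -> List[str]:
    """Return the units whose item is set to something other than no conversion.

    A value of 0, None, or an empty string is treated as "no conversion".
    """
    return [
        unit
        for item, unit in UNIT_FOR_ITEM.items()
        if settings.get(item) not in (0, None, "")
    ]
```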

The rewriting processing of each processing unit that receives the basic analysis result is described in detail below.

The style conversion unit 13a obtains from the setting value DB 22 the setting value of item (a), style conversion (one of 1 = plain style (da style), 2 = polite style (desu/masu style), or 3 = polite style (de gozaimasu style)). It outputs a sentence whose style has been converted by replacing the functional expressions of the predicate of the input sentence 23 to match the style indicated by the obtained setting value.

Specifically, the style conversion unit 13a refers to a style-specific functional expression list 24, such as the one shown in FIG. 3, and obtains the replacement notation for the functional expression (the morpheme sequence after the head) contained in the final bunsetsu of the input sentence 23 (the basic analysis result). In the example of FIG. 3, the style-specific functional expression list 24 defines, for each style (plain style (da style), polite style (desu/masu style), and polite style (de gozaimasu style)), the replacement "functional expression notation" for each combination of "part of speech of the bunsetsu head" and "meaning of the functional expression".

In the style-specific functional expression list 24, when the functional expression to be replaced is a verb, the replacement "functional expression notation" is written with a tag that denotes the conjugated form of the verb ending required by the replacement functional expression (hereinafter written as [conjugated form]; for example, ［連用形］, the continuative form). In this case, the style conversion unit 13a refers to, for example, the conjugation table 25 shown in FIG. 4 and converts the [conjugated form] tag into the corresponding ending. In the conjugation table 25 of FIG. 4, an empty cell indicates that the stem connects to the following morpheme with no conjugational ending in between. For example, in the irrealis form (a-row) of an ichidan verb, the stem 見 connects directly to the auxiliary verb ない, giving 見ない. A cell containing "-" indicates a combination that does not exist in Japanese and need not be considered.

The processing of the style conversion unit 13a is described in more detail below, using as an example the style conversion of the basic analysis result for the input sentence 23 above, 「私は寒がりなので、暖かい服装を選んだ。」.

From the basic analysis result, the style conversion unit 13a obtains the part of speech of the head of the predicate bunsetsu, {verb stem: ba-row godan}, and the semantic label of the functional expression of that bunsetsu, "completion". It then refers to the style-specific functional expression list 24 shown in FIG. 3 and obtains the "functional expression notation" associated with the "part of speech of the bunsetsu head" and "meaning of the functional expression" that match this information. When the setting value obtained from the setting value DB 22 is 2 = polite style (desu/masu style) or 3 = polite style (de gozaimasu style), intermediate results such as the following are obtained. In what follows, replaced portions are enclosed in << >>.

Example for the polite style (desu/masu style):
私は寒がりなので、暖かい服装を選<<［連用形］ました>>。
Example for the polite style (de gozaimasu style):
私は寒がりなので、暖かい服装を選<<［連用形−音便］たのでございます>>。

When, as in the intermediate results above, the bunsetsu whose functional expression has been replaced contains a [conjugated form] tag, the style conversion unit 13a obtains from the conjugation table 25 the ending that matches both the conjugated form indicated by the tag and the conjugation type of the verb that is the head of the bunsetsu. The style conversion unit 13a then replaces the [conjugated form] tag in the intermediate result with the ending obtained from the conjugation table 25, as shown below.

Example for the polite style (desu/masu style):
私は寒がりなので、暖かい服装を選<<［連用形］ました>>。
⇒ 私は寒がりなので、暖かい服装を選<<びました>>。
In the bunsetsu 選［連用形］ました, the part of speech of the head is "verb stem: ba-row godan" and the conjugated form indicated by the tag is the continuative form (連用形), so the ending び that matches these conditions is obtained.

Example for the polite style (de gozaimasu style):
私は寒がりなので、暖かい服装を選<<［連用形−音便］たのでございます>>。
⇒ 私は寒がりなので、暖かい服装を選<<んだのでございます>>。
In the bunsetsu 選［連用形−音便］たのでございます, the part of speech of the head is "verb stem: ba-row godan" and the conjugated form indicated by the tag is the euphonic continuative form (連用形−音便), so the ending ん that matches these conditions is obtained.

When the conjugation type of the bunsetsu head is ga-row, ba-row, ma-row, or na-row godan, the first character of the replacement functional expression (the character that follows the conjugational ending), て or た, is replaced with で or だ, respectively. In the example above, the conjugation type of the head is ba-row godan, so 選［連用形−音便］たのでございます is replaced with 選んだのでございます.
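The tag resolution and the て/た voicing rule can be sketched together as follows. The table is a small hypothetical excerpt, not the contents of FIG. 4, and the English keys stand in for the tags ［連用形］ ("continuative") and ［連用形−音便］ ("continuative-euphonic"). As a further assumption of this sketch, voicing is applied only after the euphonic form, so that non-euphonic combinations such as 選び + たい are left unchanged.

```python
from typing import Dict, Tuple

# Hypothetical excerpt of a conjugation table:
# (conjugation type of the head, conjugated form) -> conjugational ending.
# An empty string means the stem connects directly to what follows.
CONJUGATION_TABLE: Dict[Tuple[str, str], str] = {
    ("ba-godan", "continuative"): "び",
    ("ba-godan", "continuative-euphonic"): "ん",
    ("ka-godan", "continuative"): "き",
    ("ka-godan", "continuative-euphonic"): "い",
    ("ichidan", "continuative"): "",
    ("ichidan", "continuative-euphonic"): "",
}

# Conjugation types after which a leading て/た is voiced to で/だ.
VOICING_TYPES = {"ga-godan", "ba-godan", "ma-godan", "na-godan"}
VOICED = {"て": "で", "た": "だ"}


def attach(stem: str, conj_type: str, form: str, suffix: str) -> str:
    """Join a verb stem, the ending selected by the tag, and the new expression."""
    ending = CONJUGATION_TABLE[(conj_type, form)]
    if (
        form == "continuative-euphonic"  # assumption: voice only after euphonic forms
        and conj_type in VOICING_TYPES
        and suffix[:1] in VOICED
    ):
        suffix = VOICED[suffix[0]] + suffix[1:]
    return stem + ending + suffix
```

For the running example, `attach("選", "ba-godan", "continuative", "ました")` gives 選びました and `attach("選", "ba-godan", "continuative-euphonic", "たのでございます")` gives 選んだのでございます.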

In the above, tags denoting the conjugated form required by the replacement functional expression were used to insert the appropriate conjugational ending, but the method is not limited to this. For example, which conjugational ending should be inserted when two morphemes are joined may be learned in advance by some machine learning technique and used by the style conversion unit 13a to insert the ending. For instance, a model is trained to estimate which ending belongs between 選 and た. Then, to rewrite a verb whose stem is 選 into the form "stem + conjugational ending + auxiliary verb た", the ending is taken from the output of the trained model.

また、別の方法として、置換対象の形態素（列）の前後にどのような形態素が共起しているかを考慮した形態素列同士の置換ルールを用いて文体を変換することも可能である。この場合、例えば、図５に示すような形態素列同士の置換ルールを用いて、文体を変換することができる。 As another method, it is also possible to convert a style using a replacement rule between morpheme sequences that considers what morphemes co-occur before and after the morpheme (sequence) to be replaced. In this case, for example, the style can be converted using a replacement rule between morpheme strings as shown in FIG.

In this embodiment, the setting value takes one of four values: 0 = no conversion, 1 = plain style (da style), 2 = polite style (desu/masu style), and 3 = polite style (de gozaimasu style). Setting values for converting to other styles (for example, the plain "de aru" style) may be added.

文字種変換部１３ｂは、設定値ＤＢ２２に記憶された項目（ｂ）文字種変換の設定値（１＝ひらがな化）にしたがって、漢字をひらがなに変換した文を出力する。具体的には、文字種変換部１３ｂは、下記に示すように、基本解析結果に含まれる各形態素の読み（カタカナで書かれた部分）を全てひらがなに置換する。 The character type conversion unit 13b outputs a sentence in which kanji characters are converted into hiragana according to the item (b) character type conversion setting value (1 = Hiragana conversion) stored in the setting value DB 22. Specifically, as shown below, the character type conversion unit 13b replaces all morpheme readings (portions written in katakana) included in the basic analysis result with hiragana characters.

Example of hiragana conversion:
わたしはさむがりなので、あたたかいふくそうをえらんだ。
(I am sensitive to the cold, so I chose warm clothes.)
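The hiragana conversion described above can be sketched as below. The sketch assumes the basic analysis result is available as a list of katakana readings, one per morpheme. The function names are hypothetical. The conversion relies on the Unicode layout in which katakana ァ to ヶ sit exactly 0x60 code points above hiragana ぁ to ゖ.

```python
def katakana_to_hiragana(text: str) -> str:
    """Shift katakana letters to hiragana; leave every other character unchanged."""
    return "".join(
        chr(ord(ch) - 0x60) if "ァ" <= ch <= "ヶ" else ch
        for ch in text
    )


def to_hiragana_sentence(readings: list[str]) -> str:
    """Replace every morpheme by its reading, converted to hiragana."""
    return "".join(katakana_to_hiragana(r) for r in readings)


readings = ["ワタシ", "ハ", "サムガリ", "ナ", "ノデ", "、",
            "アタタカ", "イ", "フクソウ", "ヲ", "エラ", "ン", "ダ", "。"]
print(to_hiragana_sentence(readings))
# わたしはさむがりなので、あたたかいふくそうをえらんだ。
```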

In this embodiment, the setting value takes one of two values, 0 = no conversion or 1 = hiragana conversion, but a "katakana conversion" option that converts all characters to katakana may also be provided. The rate at which character type conversion is applied may also be made configurable, for example "convert to hiragana 50% of the time (once every two times)" or "convert to katakana 20% of the time (once every five times)". The parts of speech subject to character type conversion may also be specified, for example converting only morphemes whose part of speech is "noun" to hiragana.

The segmentation conversion unit 13c acquires the setting value of item (c) segmentation conversion (1 = comma-delimited, 2 = space-delimited) from the setting value DB 22. It inserts the specified delimiter at the phrase (bunsetsu) boundaries of the input sentence 23 and outputs the sentence with converted segmentation, as shown below. If inserting a delimiter (comma or space) would produce consecutive commas or spaces, the delimiter is not inserted. In addition, no delimiter is inserted between two consecutive phrases in an adnominal modification relationship, such as between 「暖かい」 and 「服装」.

Example of segmentation conversion (comma-delimited):
私は<<、>>寒がりなので、暖かい服装を<<、>>選んだ。
(I am sensitive to the cold, so I chose warm clothes. Commas are inserted after 「私は」 and 「服装を」.)
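The delimiter insertion and its two exceptions can be sketched as follows. The sketch assumes the sentence has already been split into phrases and that adnominal modification links are given as the indices of the modifying phrases. Both input formats, and the name `insert_delimiters`, are assumptions for illustration.

```python
DELIMITERS = ("、", " ", "　")  # comma, half-width space, full-width space


def insert_delimiters(phrases, delimiter, adnominal_links=frozenset()):
    """Insert `delimiter` at phrase boundaries.

    `adnominal_links` holds each index i for which phrases[i] modifies
    phrases[i + 1] adnominally; no delimiter is inserted at that boundary.
    """
    out = []
    for i, phrase in enumerate(phrases):
        out.append(phrase)
        if i == len(phrases) - 1:
            break  # no boundary after the last phrase
        if i in adnominal_links:
            continue
        # Skip the boundary if a comma or space is already adjacent to it.
        if phrase.endswith(DELIMITERS) or phrases[i + 1].startswith(DELIMITERS):
            continue
        out.append(delimiter)
    return "".join(out)


phrases = ["私は", "寒がりなので、", "暖かい", "服装を", "選んだ。"]
print(insert_delimiters(phrases, "、", adnominal_links={2}))
# 私は、寒がりなので、暖かい服装を、選んだ。
```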

文構造変換部１３ｄは、設定値ＤＢ２２から、項目（ｄ）文構造変換用のキャラクタ別接続表現リスト２６のファイル名を取得する。そして、文構造変換部１３ｄは、入力文２３を単文に分割すると共に、取得したファイル名が示すキャラクタ別接続表現リスト２６から得られる接続表現を分割箇所に挿入することで、文構造を変換した文を出力する。 The sentence structure conversion unit 13d acquires the file name of the item-by-character connection expression list 26 for sentence structure conversion from the setting value DB 22. Then, the sentence structure conversion unit 13d divides the input sentence 23 into simple sentences, and converts the sentence structure by inserting connection expressions obtained from the character-specific connection expression list 26 indicated by the acquired file name into the divided portions. Output a sentence.

Specifically, when the input sentence 23 is a complex sentence, the sentence structure conversion unit 13d refers to the character-specific connection expression list 26 indicated by the acquired file name, from among lists such as those shown in FIG. 6. It searches the input sentence 23 for a phrase containing a semantic label listed in the character-specific connection expression list 26, deletes the morpheme to which that semantic label is assigned, and inserts a period (「。」) at the deleted position. If the morpheme immediately before the labeled morpheme is an inflecting word (verb, adjective, or auxiliary verb), the unit converts that word to its terminal form. The terminal form may be taken from the "basic form" that is often output as part of a morphological analysis result, or from the conjugation table 25. The unit then acquires the connection expression corresponding to the semantic label from the character-specific connection expression list 26 and inserts it after the inserted period. A comma may be inserted together with the connection expression.

For example, suppose that the character-specific connection expression list 26 of character A is specified as the file name for item (d) sentence structure conversion stored in the setting value DB 22. In the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.), the semantic label "reason" of the morpheme 「ので」 in the phrase 「寒がりなので、」 matches one of the semantic labels listed in the character-specific connection expression list 26. Therefore, 「ので」 and the comma 「、」 that follows it are deleted, and a period 「。」 is inserted at the deleted position. The auxiliary verb 「な」 before 「ので」 is replaced with its terminal form 「だ」. Finally, the conjunction 「なので」, which is associated with the semantic label "reason" in character A's list, is acquired and inserted after the period together with a comma 「、」. As a result, the sentence structure of the input sentence 23 is converted as shown below.

Example of conversion into simple sentences:
私は寒がり<<だ。なので、>>暖かい服装を選んだ。
(I am sensitive to the cold. So, I chose warm clothes.)
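The splitting procedure can be sketched as below. The `Morpheme` record, its field names, and the one-entry connective table are assumptions for illustration. The table reuses the only mapping the text states for character A (label 理由 to 「なので」). The sketch takes the terminal form from a `base` field, corresponding to the "basic form" option.

```python
from dataclasses import dataclass
from typing import Optional

INFLECTING = {"動詞", "形容詞", "助動詞"}  # verb, adjective, auxiliary verb


@dataclass
class Morpheme:
    surface: str                 # form as written in the sentence
    base: str                    # dictionary ("basic") form, used as the terminal form
    pos: str                     # part of speech
    label: Optional[str] = None  # semantic label, e.g. "理由" (reason)


def to_simple_sentences(morphemes, connectives):
    """Split at labeled morphemes and insert the character's connection expression."""
    pieces = []
    last = None  # morpheme whose surface was appended most recently
    i = 0
    while i < len(morphemes):
        m = morphemes[i]
        if m.label in connectives:
            if last is not None and last.pos in INFLECTING:
                pieces[-1] = last.base  # convert the preceding word to its terminal form
            pieces.append("。" + connectives[m.label] + "、")
            last = None
            i += 1
            if i < len(morphemes) and morphemes[i].surface == "、":
                i += 1  # drop the comma that followed the deleted morpheme
            continue
        pieces.append(m.surface)
        last = m
        i += 1
    return "".join(pieces)


sentence = [
    Morpheme("私", "私", "代名詞"), Morpheme("は", "は", "係助詞"),
    Morpheme("寒がり", "寒がり", "名詞"), Morpheme("な", "だ", "助動詞"),
    Morpheme("ので", "ので", "接続助詞", label="理由"), Morpheme("、", "、", "読点"),
    Morpheme("暖かい", "暖かい", "形容詞"), Morpheme("服装", "服装", "名詞"),
    Morpheme("を", "を", "格助詞"), Morpheme("選んだ", "選ぶ", "動詞"),
    Morpheme("。", "。", "句点"),
]
print(to_simple_sentences(sentence, {"理由": "なので"}))
# 私は寒がりだ。なので、暖かい服装を選んだ。
```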

なお、上記では、文の境界を示すために句点を挿入することとしたが、文の境界を示すことができさえすればどのような記号を用いてもよい。 In the above description, a punctuation point is inserted to indicate a sentence boundary. However, any symbol may be used as long as it can indicate a sentence boundary.

An input sentence 23 that contains more than one topic-marking particle such as 「は」 or nominative particle 「が」 may be excluded from conversion into simple sentences, that is, from rewriting by the sentence structure conversion unit 13d. This avoids, for example, rewriting the input sentence 23 「私は、彼が寒がりなので、マフラーを貸してあげました。」 (Because he is sensitive to the cold, I lent him a scarf.) as 「私は、彼が寒がりだ。なので、マフラーを貸してあげました。」.

述部機能表現・キャラ語尾変換部１３ｅは、設定値ＤＢ２２から、項目（ｅ）述部機能表現・キャラ語尾変換用のキャラクタ別機能表現リスト２７のファイル名を取得する。キャラクタ別機能表現リスト２７の一例を図７に示す。キャラクタ別機能表現リスト２７の構成は、文体別機能表現リスト２４と同様である。また、述部機能表現・キャラ語尾変換部１３ｅの処理も、文体変換部１３ａの処理と同様である。ただし、キャラクタ別機能表現リスト２７では、キャラクタの個性を表現可能な述部機能表現及び語尾を任意に定めることができる。図７に示すキャラＢのキャラクタ別機能表現リスト２７の例のように、キャラクタ付けのために日本語文法の範囲外の表現（キャラ語尾）を用いることもできる。これにより、述部機能表現・キャラ語尾変換部１３ｅでは、文体変換部１３ａとは異なる言語表現の書き換えを実現することができる。 The predicate function expression / character ending conversion unit 13e acquires the file name of the item-specific function expression list 27 for predicate function expression / character ending conversion from the setting value DB 22. An example of the character-specific function expression list 27 is shown in FIG. The structure of the character-specific function expression list 27 is the same as that of the style-specific function expression list 24. The processing of the predicate function expression / character ending conversion unit 13e is the same as the processing of the style conversion unit 13a. However, in the character-specific function expression list 27, predicate function expressions and endings capable of expressing the character individuality can be arbitrarily determined. As in the example of the character-by-character function expression list 27 of the character B shown in FIG. 7, expressions outside the Japanese grammar range (character endings) can be used for character assignment. Thereby, in the predicate function expression / character ending conversion unit 13e, rewriting of a language expression different from that of the style conversion unit 13a can be realized.

For example, suppose that the character-specific function expression list 27 of character A is specified as the file name for item (e) predicate function expression / character ending conversion stored in the setting value DB 22. In this case, the predicate function expression of the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) is converted as follows.

Example of predicate function expression / character ending conversion:
私は寒がりなので、暖かい服装を選ん<<だの>>。
(I am sensitive to the cold, so I chose warm clothes. The sentence ending 「選んだ」 becomes 「選んだの」.)

The conjugation form conversion unit 13f outputs a sentence with converted conjugation forms by replacing adjectives (stem and inflectional ending) with colloquially collapsed expressions, according to the setting value of item (f) conjugation form conversion (1 = convert to collapsed conjugation forms) stored in the setting value DB 22. Specifically, the unit matches the surface form of the stem and inflectional ending of each adjective in the input sentence 23 against the "input (surface form)" part of the collapsed-conjugation rules 28, such as those shown in FIG. 8, and replaces it with the stem and inflectional ending given in the corresponding "output" part.

For example, in the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.), the 「かい」 part of the adjective 「暖かい」 matches one of the "input (surface form)" entries of the collapsed-conjugation rules 28. Replacing 「かい」 with the corresponding "output (surface form)" 「けぇ」 converts the conjugation form in the input sentence 23 into a collapsed form, as shown below.

Example of conversion to a collapsed conjugation form:
私は寒がりなので、暖<<けぇ>>服装を選んだ。
(I am sensitive to the cold, so I chose warm clothes. 「暖かい」 is rewritten as 「暖けぇ」.)
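A rule-based version of this conversion can be sketched as below. FIG. 8 is not reproduced here, so the rule table contains only the one pair the text states (「かい」 to 「けぇ」). The `(surface, part of speech)` input format and the function names are assumptions for illustration.

```python
# Hypothetical excerpt of the collapsed-conjugation rules: input ending -> output ending.
COLLAPSE_RULES = {"かい": "けぇ"}


def collapse_adjective(surface: str, rules=COLLAPSE_RULES) -> str:
    """Replace a matching stem-final + inflectional-ending string of an adjective."""
    for src, dst in rules.items():
        if surface.endswith(src):
            return surface[: -len(src)] + dst
    return surface


def convert_conjugation(morphemes) -> str:
    """Apply the rules to adjectives only; `morphemes` is a list of (surface, pos)."""
    return "".join(
        collapse_adjective(surface) if pos == "形容詞" else surface
        for surface, pos in morphemes
    )


sentence = [("私", "代名詞"), ("は", "助詞"), ("寒がり", "名詞"), ("な", "助動詞"),
            ("ので", "助詞"), ("、", "読点"), ("暖かい", "形容詞"), ("服装", "名詞"),
            ("を", "助詞"), ("選んだ", "動詞"), ("。", "句点")]
print(convert_conjugation(sentence))  # 私は寒がりなので、暖けぇ服装を選んだ。
```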

In this embodiment, the conversion is performed by a rule-based method using collapsed-conjugation rules 28 such as those shown in FIG. 8, but the method is not limited to this. For example, a model trained by some machine learning technique on a corpus (text data in which collapsed conjugation forms are used) may be used to estimate the collapsed stem and inflectional ending that should replace an uncollapsed stem and inflectional ending.

The hesitation conversion unit 13g converts the input sentence 23 into a sentence containing a hesitation (stammer), according to the setting value of item (g) hesitation conversion (1 = add hesitation) stored in the setting value DB 22. Specifically, the unit takes the first character of the reading of the sentence-initial morpheme of the input sentence 23 (written in katakana in the example basic analysis result of this embodiment), converts it to hiragana, and inserts it before the sentence-initial morpheme. A comma may be inserted after the inserted hiragana.

For example, the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) is converted into a sentence with a hesitation as shown below.

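Example of hesitation conversion:
<<わ、>>私は寒がりなので、暖かい服装を選んだ。
(I am sensitive to the cold, so I chose warm clothes. 「わ」, the first character of the reading of 「私」, is inserted at the head of the sentence.)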

The above describes the case where only the sentence-initial morpheme is subject to hesitation. Alternatively, the first morpheme of each clause may be targeted, giving 「わ、私は寒がりなので、あ、暖かい服装を選んだ。」, or the first morpheme of each phrase may be targeted, giving 「わ、私は、さ、寒がりなので、あ、暖かい服装を、え、選んだ。」. If the surface form of the target morpheme is in katakana, the inserted character may also be katakana. The inserted hiragana or katakana may also be repeated two or more times, as in 「わ、わ、私は・・・」.
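The sentence-initial case, with the optional comma and repetition, can be sketched as follows. The function name `add_hesitation` and its parameters are hypothetical. The sketch covers neither the per-clause and per-phrase variants nor the katakana variant.

```python
def add_hesitation(sentence: str, first_reading: str,
                   comma: bool = True, repeat: int = 1) -> str:
    """Prepend the first character of the sentence-initial reading, in hiragana.

    `first_reading` is the katakana reading of the sentence-initial morpheme.
    """
    first = first_reading[0]
    if "ァ" <= first <= "ヶ":
        first = chr(ord(first) - 0x60)  # katakana -> hiragana
    filler = first + ("、" if comma else "")
    return filler * repeat + sentence


text = "私は寒がりなので、暖かい服装を選んだ。"
print(add_hesitation(text, "ワタシ"))            # わ、私は寒がりなので、暖かい服装を選んだ。
print(add_hesitation(text, "ワタシ", repeat=2))  # わ、わ、私は寒がりなので、暖かい服装を選んだ。
```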

人称代名詞置換部１３ｈは、設定値ＤＢ２２から、項目（ｈ）人称代名詞置換用のキャラクタ別人称代名詞リスト２９のファイル名を取得する。そして、人称代名詞置換部１３ｈは、入力文２３に含まれる人称代名詞の形態素を、取得したファイル名が示すキャラクタ別人称代名詞リスト２９から得られる人称代名詞に置換した文を出力する。 The personal pronoun replacement unit 13h obtains the file name of the character-specific personal pronoun list 29 for item (h) personal pronoun replacement from the setting value DB 22. Then, the personal pronoun replacement unit 13h outputs a sentence in which the morphemes of the personal pronouns included in the input sentence 23 are replaced with personal pronouns obtained from the character-specific personal pronoun list 29 indicated by the acquired file name.

具体的には、人称代名詞置換部１３ｈは、例えば図９に示すようなキャラクタ別人称代名詞リスト２９のうち、取得したファイル名が示すキャラクタ別人称代名詞リスト２９を参照する。そして、人称代名詞置換部１３ｈは、入力文２３において、キャラクタ別人称代名詞リスト２９の「入力」部分と表記が一致する形態素を、対応する「出力」部分の表記と置換する。 Specifically, the personal pronoun replacement unit 13h refers to, for example, the character-specific personal pronoun list 29 indicated by the acquired file name in the character-specific personal pronoun list 29 as shown in FIG. Then, the personal pronoun replacement unit 13h replaces the morpheme whose notation matches the “input” portion of the character-specific personal pronoun list 29 in the input sentence 23 with the corresponding “output” portion notation.

For example, suppose that the character-specific personal pronoun list 29 of character A is specified as the file name for item (h) personal pronoun replacement stored in the setting value DB 22. In this case, the personal pronoun in the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) is replaced as follows.

Example of personal pronoun replacement:
<<あたし>>は寒がりなので、暖かい服装を選んだ。
(I am sensitive to the cold, so I chose warm clothes. The pronoun 「私」 is replaced with 「あたし」.)
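Because the replacement is matched per morpheme rather than per character, a pronoun appearing inside a longer word is left alone. A minimal sketch follows. It assumes the input is a list of morpheme surface forms, and it uses a hypothetical one-entry list for character A containing only the pair shown in the example.

```python
# Hypothetical excerpt of a character-specific pronoun list: input -> output.
PRONOUNS_CHARACTER_A = {"私": "あたし"}


def replace_pronouns(morphemes, pronoun_map):
    """Replace each morpheme whose surface form exactly matches an 'input' entry."""
    return "".join(pronoun_map.get(m, m) for m in morphemes)


sentence = ["私", "は", "寒がり", "な", "ので", "、",
            "暖かい", "服装", "を", "選んだ", "。"]
print(replace_pronouns(sentence, PRONOUNS_CHARACTER_A))
# あたしは寒がりなので、暖かい服装を選んだ。
```

With this design, a morpheme such as 「私立」 is not affected even though it contains the character 「私」.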

方言・特殊語彙置換部１３ｉは、設定値ＤＢ２２から、項目（ｉ）方言・特殊語彙置換用のキャラクタ別語彙置換ルール３０のファイル名を取得する。そして、方言・特殊語彙置換部１３ｉは、入力文２３に含まれる特定の形態素を、取得したファイル名が示すキャラクタ別語彙置換ルール３０にしたがって、方言又は特殊語彙に置換した文を出力する。 The dialect / special vocabulary replacement unit 13i acquires the file name of the character-specific vocabulary replacement rule 30 for item (i) dialect / special vocabulary replacement from the setting value DB 22. Then, the dialect / special vocabulary replacement unit 13i outputs a sentence in which a specific morpheme included in the input sentence 23 is replaced with a dialect or special vocabulary according to the character-specific vocabulary replacement rule 30 indicated by the acquired file name.

具体的には、方言・特殊語彙置換部１３ｉは、例えば図１０に示すようなキャラクタ別語彙置換ルール３０のうち、取得したファイル名が示すキャラクタ別語彙置換ルール３０を参照する。そして、方言・特殊語彙置換部１３ｉは、入力文２３において、キャラクタ別語彙置換ルール３０の「入力」部分と表記が一致する形態素を、対応する「出力」部分の表記と置換する。キャラクタ別語彙置換ルール３０では、特定の置換元の語彙を、目的のキャラクタらしい語彙に変換するルールが定められる。 Specifically, the dialect / special vocabulary replacement unit 13i refers to, for example, the character-specific vocabulary replacement rule 30 indicated by the acquired file name among the character-specific vocabulary replacement rules 30 as shown in FIG. Then, the dialect / special vocabulary replacing unit 13i replaces the morpheme whose notation matches the “input” portion of the character-specific vocabulary replacement rule 30 in the input sentence 23 with the notation of the corresponding “output” portion. In the character-specific vocabulary replacement rule 30, a rule for converting a specific replacement source vocabulary into a vocabulary like a target character is defined.

If the replacement vocabulary item is an inflecting word (verb, adjective, or auxiliary verb), the dialect / special vocabulary replacement unit 13i may adjust its inflectional ending to match the conjugation form of the original vocabulary item (the "input" part). For auxiliary verbs, the endings can be obtained from the verb conjugation table for those that conjugate like verbs, and from the adjective conjugation table for those that conjugate like adjectives.

For example, suppose that the character-specific vocabulary replacement rule 30 of character A is specified as the file name for item (i) dialect / special vocabulary replacement stored in the setting value DB 22. In this case, specific vocabulary in the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) is replaced with dialect or special vocabulary as follows.

Example of dialect / special vocabulary replacement:
私は寒がりなので、<<ぬくとい>>服装を選んだ。
(I am sensitive to the cold, so I chose warm clothes. 「暖かい」 is replaced with 「ぬくとい」.)

なお、方言・特殊語彙置換部１３ｉで置換された表現は、その他の書き換え処理において更なる書き換え処理が加わらないように保護してもよい。 Note that the expression replaced by the dialect / special vocabulary replacing unit 13i may be protected from being subjected to further rewriting processing in other rewriting processing.

音素置換部１３ｊは、設定値ＤＢ２２から、項目（ｊ）音素置換用のキャラクタ別音素置換ルール３１のファイル名を取得する。そして、音素置換部１３ｊは、入力文２３に含まれる特定の文字を、取得したファイル名が示すキャラクタ別音素置換ルール３１にしたがって、置換先の文字に置換した文を出力する。 The phoneme replacement unit 13j acquires the file name of the character-specific phoneme replacement rule 31 for item (j) phoneme replacement from the setting value DB 22. Then, the phoneme replacement unit 13j outputs a sentence in which a specific character included in the input sentence 23 is replaced with a replacement destination character according to the character-specific phoneme replacement rule 31 indicated by the acquired file name.

具体的には、音素置換部１３ｊは、例えば図１１に示すようなキャラクタ別音素置換ルール３１のうち、取得したファイル名が示すキャラクタ別音素置換ルール３１を参照する。そして、音素置換部１３ｊは、入力文２３において、キャラクタ別音素置換ルール３１の「入力」部分と表記が一致する文字を、対応する「出力」部分の表記と置換する。なお、本実施形態では、音素の置換を文字単位で捉えて置換することとする。例えば、「な」から「にゃ」、「の」から「にょ」への文字的な置換は、「ｎａ」から「ｎｙａ」へ、「ｎｏ」から「ｎｙｏ」への音素的な置換（「ｎ」から「ｎｙ」への音素置換）を捉えるためのものである。 Specifically, the phoneme replacement unit 13j refers to, for example, the character-specific phoneme replacement rule 31 indicated by the acquired file name among the character-specific phoneme replacement rules 31 as illustrated in FIG. Then, the phoneme replacement unit 13j replaces the character whose notation matches the “input” portion of the character-specific phoneme replacement rule 31 in the input sentence 23 with the corresponding “output” portion notation. In the present embodiment, the phoneme replacement is recognized and replaced in units of characters. For example, character substitution from “na” to “nya” and “no” to “nyo” is a phonetic substitution from “na” to “nya” and “no” to “nyo” (“n”). Phoneme replacement from “ny” to “ny”).

For example, suppose that the character-specific phoneme replacement rule 31 of character A is specified as the file name for item (j) phoneme replacement stored in the setting value DB 22. In this case, specific characters in the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) are replaced as follows.

Example of character replacement that captures a phonemic replacement:
私は寒がり<<にゃにょ>>で、暖かい服装を選んだ。
(I am sensitive to the cold, so I chose warm clothes. 「なの」 in 「なので」 is rewritten as 「にゃにょ」.)

なお、キャラクタ別音素置換ルール３１の「入力」部分と一致する全ての文字を置換する場合に限定されず、文字の置換を実行するか否かを、その文字が属する形態素の品詞や、形態素内での出現位置などを条件（制約）にして決定してもよい。例えば、「ある文字が属する形態素の品詞が名詞である場合は置換しない」、「ある文字が形態素の先頭に位置する場合は置換しない」、などの制約を設けることができる。他にも、「同じ形態素内で複数回の置換を行ってはいけない」という制約や、「連続した２つ以上の文字を置換してはいけない」という制約を設けてもよい。 In addition, it is not limited to replacing all characters that match the “input” portion of the character-specific phoneme replacement rule 31, and whether or not to perform character replacement depends on the part of speech of the morpheme to which the character belongs, It may be determined by using the condition (constraint) as the appearance position at. For example, restrictions such as “do not replace if the part of speech of a morpheme to which a certain character belongs are nouns”, “do not replace if a certain character is located at the beginning of the morpheme” can be provided. In addition, there may be provided a constraint that “a plurality of substitutions should not be performed within the same morpheme” or a constraint that “two or more consecutive characters should not be replaced”.
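The character replacement and the four example constraints can be sketched as below. FIG. 11 is not reproduced here, so the rule table holds only the two pairs the text states. The keyword parameters and the `(surface, part of speech)` input format are assumptions for illustration.

```python
RULES = {"な": "にゃ", "の": "にょ"}  # character-level stand-in for n -> ny


def substitute(morphemes, rules, skip_pos=frozenset(), skip_initial=False,
               once_per_morpheme=False, no_consecutive=False):
    """Replace characters per `rules`, subject to optional constraints.

    morphemes: list of (surface, part_of_speech) pairs.
    """
    out = []
    prev_replaced = False  # was the immediately preceding character replaced?
    for surface, pos in morphemes:
        replaced_here = 0  # replacements made so far inside this morpheme
        for idx, ch in enumerate(surface):
            ok = ch in rules and pos not in skip_pos
            if ok and skip_initial and idx == 0:
                ok = False
            if ok and once_per_morpheme and replaced_here:
                ok = False
            if ok and no_consecutive and prev_replaced:
                ok = False
            out.append(rules[ch] if ok else ch)
            prev_replaced = ok
            replaced_here += ok
    return "".join(out)


sentence = [("私", "代名詞"), ("は", "係助詞"), ("寒がり", "名詞"), ("な", "助動詞"),
            ("ので", "接続助詞"), ("、", "読点"), ("暖か", "形容詞語幹"), ("い", "活用語尾"),
            ("服装", "名詞"), ("を", "格助詞"), ("選", "動詞語幹"), ("ん", "活用語尾"),
            ("だ", "助動詞"), ("。", "句点")]
print(substitute(sentence, RULES))
# 私は寒がりにゃにょで、暖かい服装を選んだ。
print(substitute(sentence, RULES, no_consecutive=True))
# 私は寒がりにゃので、暖かい服装を選んだ。
```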

ここで問題になるのが、制約の数が多くなると、制約のあらゆる組み合わせを考慮したルールを人手で定義するのが困難になるという点である。そこで、置換元（入力）の文字、置換先（出力）の文字、置換元文字が属する形態素の品詞、置換元文字の出現位置、置換元文字が属する形態素内で既に実施された置換の回数、置換元文字までの連続文字置換回数などを特徴量として、何らかの機械学習によりモデルを学習しておき、このモデルを使用して、文字の置換を実施するようにしてもよい。 The problem here is that as the number of constraints increases, it becomes difficult to manually define rules that take into account all combinations of constraints. Therefore, the replacement source (input) character, the replacement destination (output) character, the part of speech of the morpheme to which the replacement source character belongs, the appearance position of the replacement source character, the number of replacements already performed in the morpheme to which the replacement source character belongs, A model may be learned by some machine learning using the number of consecutive character replacements up to the replacement source character as a feature amount, and character replacement may be performed using this model.

弁別的無意味表現挿入部１３ｋは、設定値ＤＢ２２に記憶された項目（ｋ）弁別的無意味表現を示す文字列を取得し、入力文２３の末尾に挿入することで、弁別的無意味表現が挿入された文を出力する。本実施形態において、弁別的無意味表現とは、日本語としては何の意味も持たないが、キャラクタの弁別を補助する表現のことを指す。 The discriminative meaningless expression insertion unit 13k acquires a character string indicating the item (k) discriminative meaningless expression stored in the setting value DB 22 and inserts it at the end of the input sentence 23, thereby discriminating meaningless expression. The statement with is inserted. In this embodiment, the discriminative meaningless expression refers to an expression that has no meaning as Japanese but assists the discrimination of characters.

For example, when 「ピョン！」 is specified as the character string representing the discriminative meaningless expression, the expression is inserted into the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) as follows.

Example of discriminative meaningless expression insertion:
私は寒がりなので、暖かい服装を選んだ。<<ピョン！>>
(I am sensitive to the cold, so I chose warm clothes. <<Pyon!>>)

In this embodiment, the discriminative meaningless expression is inserted after the sentence-final period. If the sentence has no period, a period is inserted first and the discriminative meaningless expression is placed after it.
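The insertion rule, including the case of a missing period, can be sketched as a small function. The name `append_expression` and its `period` parameter are assumptions for illustration.

```python
def append_expression(sentence: str, expression: str, period: str = "。") -> str:
    """Append `expression` after the sentence-final period, adding the period if absent."""
    if not sentence.endswith(period):
        sentence += period
    return sentence + expression


print(append_expression("私は寒がりなので、暖かい服装を選んだ。", "ピョン！"))
# 私は寒がりなので、暖かい服装を選んだ。ピョン！
print(append_expression("私は寒がりなので、暖かい服装を選んだ", "ピョン！"))
# 私は寒がりなので、暖かい服装を選んだ。ピョン！
```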

例えば、風貌がカエルのようなキャラクタ（カエル）、ボールのように丸いキャラクタ（ボール）、トゲがたくさん生えたキャラクタ（トゲ）という３種のキャラクタが存在するとする。この３者が似通った言語表現を使うため、言語的な差異が伝わりづらい場合でも、カエルの発話に弁別的無意味表現「ピョン！」を挿入することで、ボールやトゲではなく、カエルの発話であることを読み手又は聞き手に対して強く印象付けることができる。 For example, it is assumed that there are three types of characters: a character that looks like a frog (frog), a round character like a ball (ball), and a character that has a lot of thorns (thorn). Because these three use similar language expressions, even if it is difficult to communicate linguistic differences, the frog's utterance, not the ball or thorns, is inserted by inserting the discriminative meaningless expression "Pyon!" It can be impressed strongly to the reader or the listener.

なお、上記では、入力文２３の末尾に弁別的無意味表現を挿入する場合について説明したが、「ピョン！私は寒がりなので、暖かい服装を選んだ。」のように、文頭に挿入するなど、その他の箇所に弁別的無意味表現を挿入してもよい。 In the above, the case where a discriminative meaningless expression is inserted at the end of the input sentence 23 has been described. However, “Pyon! I am cold, so I chose warm clothes.” A discriminative meaningless expression may be inserted in other places.

記号類挿入部１３ｌは、設定値ＤＢ２２に記憶された項目（ｌ）記号類を示す文字列を取得し、入力文２３の末尾に挿入することで、記号類が挿入された文を出力する。記号類挿入部１３ｌの処理は、挿入する文字列が弁別的無意味表現ではなく記号類である点を除いて、弁別的無意味表現挿入部１３ｋと同様である。なお、本実施形態において、記号類とは、★（星）や♪（音符）のような記号や、（＊＾ｏ＾＊）や（＞＿＜；）のような顔文字を指すこととする。 The symbol insertion unit 13l acquires a character string indicating the item (l) symbol stored in the setting value DB 22, and inserts it at the end of the input sentence 23, thereby outputting a sentence in which the symbol is inserted. The processing of the symbol insertion unit 13l is the same as that of the discrimination meaningless expression insertion unit 13k except that the character string to be inserted is not a discrimination meaningless expression but a symbol class. In the present embodiment, the symbols indicate symbols such as ★ (star) and ♪ (note), and emoticons such as (* ^ o ^ *) and (> _ <;). To do.

For example, when 「（＊＾ｏ＾＊）」 is specified as the character string representing the symbols, it is inserted into the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) as follows.

Example of symbol insertion:
私は寒がりなので、暖かい服装を選んだ。<<（＊＾ｏ＾＊）>>
(I am sensitive to the cold, so I chose warm clothes. <<（＊＾ｏ＾＊）>>)

上記のように、書き換え処理を行うことが設定された書き換え処理部１３の各処理部１３ａ〜１３ｌにおいて、入力文２３（基本解析結果）に対する書き換え処理が行われ、各処理部１３ａ〜１３ｌから書き換え文３２が出力される。 As described above, the rewrite processing for the input sentence 23 (basic analysis result) is performed in each of the processing units 13a to 13l of the rewrite processing unit 13 that is set to perform the rewrite processing, and the rewriting is performed from each of the processing units 13a to 13l. A statement 32 is output.

なお、書き換え処理部１３の各処理部１３ａ〜１３ｌの書き換え処理は任意の組み合わせで実行することができる。例えば、ある処理部で書き換えられた結果を別の処理部でさらに書き換える場合は、ある処理部の出力を再度、基本解析部１２に渡し、新規に取得した基本解析結果を次の処理部に渡せばよい。又は、ある処理部での書き換えに基づいて、入力文２３の基本解析結果を書き換えた上で、次の処理部に渡すようにしてもよい。例えば、文体変換部１３ａで書き換えられた結果を、別の処理部に渡す場合、以下のような基本解析結果を次の処理部へ渡すことができる。 Note that the rewriting processes of the processing units 13a to 13l of the rewriting processing unit 13 can be executed in any combination. For example, when the result rewritten by a certain processing unit is further rewritten by another processing unit, the output of a certain processing unit is again passed to the basic analysis unit 12, and the newly acquired basic analysis result can be passed to the next processing unit. That's fine. Alternatively, the basic analysis result of the input sentence 23 may be rewritten based on rewriting by a certain processing unit and then passed to the next processing unit. For example, when the result rewritten by the style conversion unit 13a is passed to another processing unit, the following basic analysis result can be passed to the next processing unit.

（１）｛私｝＿は／｛寒がり｝＿な＿ので＿、／｛暖か｝＿い／｛服装｝＿を／｛選｝＿<<び>>＿<<ました>>＿。
（２）｛ワタシ｝＿ハ／｛サムガリ｝＿ナ＿ノデ＿、／｛アタタカ｝＿イ／｛フクソウ｝＿ヲ／｛エラ｝＿<<ｄｕｍｍｙ>>＿<<ｄｕｍｍｙ>>＿。
（３）｛pronoun｝＿binding particle／｛noun｝＿auxiliary verb＿conjunctive particle＿comma／｛adjective stem｝＿inflectional ending: attributive form／｛noun｝＿case particle／｛verb stem: ba-row godan｝＿<<inflectional ending: continuative form>>＿<<ｄｕｍｍｙ>>＿period
（４）｛＊｝＿＊／｛＊｝＿＊＿reason＿＊／｛＊｝＿＊／｛＊｝＿＊／｛＊｝＿＊＿completion＿＊

The portions enclosed in << >> indicate information about rewritten morphemes. In the example above, information that no other processing unit will use is set to 「ｄｕｍｍｙ」, but the correct information may be assigned instead. The morpheme readings in (2) are used for hiragana conversion by the character type conversion unit 13b, but since 「び＿ました」 is already in hiragana, setting them to 「ｄｕｍｍｙ」 causes no problem. Likewise, because no processing in this embodiment uses the part of speech of a functional expression, the part of speech of 「ました」 may be 「ｄｕｍｍｙ」.

By arbitrarily combining the rewriting processes of the processing units 13a to 13l of the rewrite processing unit 13, the input sentence 23 「私は寒がりなので、暖かい服装を選んだ。」 (I am sensitive to the cold, so I chose warm clothes.) can be rewritten in richly individual ways, for example as (I) 「<<オレ>>は寒がり<<だ。だから>>、暖<<けぇ>>服装を選んだ<<ぜ！>>」 or (II) 「<<あ、あたしは、さむ>>がりなの。だ、だから、あたたかいふくそう>>を<<、えらんだの>>」. When the rewriting processes of multiple processing units 13a to 13l are applied, the output of the last processing unit is output as the rewritten sentence 32. Example (I) combines the rewriting processes for the items (h) personal pronoun, (c) segmentation, (f) conjugation form, and (e) predicate function expression / character ending. Example (II) combines the rewriting processes for the items (b) character type, (g) hesitation, (h) personal pronoun, (c) segmentation, and (e) predicate function expression / character ending.

In this embodiment, when a plurality of rewriting processes are applied, the order in which they are applied can be determined in advance, for example in consideration of the following points.
・(d) Sentence structure conversion should be applied before the other 11 types of rewriting process.
・(g) Conversion to hesitation and (b) character type conversion should be applied after (d) sentence structure, (i) dialect / special vocabulary, (a) style, (e) predicate functional expression / character sentence ending, and (h) personal pronoun. Either (g) conversion to hesitation or (b) character type conversion may be applied first.
・(i) Dialect / special vocabulary replacement should be applied after (d) sentence structure and before the conversions of (a) style and (e) predicate functional expression / character sentence ending.
Based on these three points, this embodiment applies the rewriting processes in the following order: (d) sentence structure, (i) dialect / special vocabulary replacement, (a) style, (e) predicate functional expression / character sentence ending, (h) personal pronoun, (b) character type, (f) conjugation form, (g) hesitation, (j) phoneme substitution, (c) segmented writing, (k) discriminative meaningless expression, (l) symbols.
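The ordering considerations above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation described in the specification: the single-letter item keys and the names `DEFAULT_ORDER`, `satisfies_guidelines`, and `plan` are hypothetical and are introduced here purely for explanation.

```python
# Illustrative sketch only; all names here are assumptions, not from the patent.
# Item keys follow the letters (a) to (l) used in the text.
DEFAULT_ORDER = ["d", "i", "a", "e", "h", "b", "f", "g", "j", "c", "k", "l"]


def satisfies_guidelines(order):
    """Check a full 12-item order against the three ordering points."""
    pos = {item: index for index, item in enumerate(order)}
    # Point 1: (d) sentence structure precedes every other item.
    if any(pos["d"] > pos[item] for item in order if item != "d"):
        return False
    # Point 2: (g) and (b) come after (d), (i), (a), (e), and (h).
    for late in ("g", "b"):
        for early in ("d", "i", "a", "e", "h"):
            if pos[late] < pos[early]:
                return False
    # Point 3: (i) comes after (d) and before both (a) and (e).
    return pos["d"] < pos["i"] < min(pos["a"], pos["e"])


def plan(enabled_items):
    """Return the enabled items in the embodiment's fixed order."""
    return [item for item in DEFAULT_ORDER if item in enabled_items]
```

For instance, the combination used in example (I), namely (h), (c), (f), and (e), would be scheduled by `plan({"h", "c", "f", "e"})` as `["e", "h", "f", "c"]` under this sketch.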

Note that "＊" in each list or rule shown in FIGS. 3 to 11 represents an arbitrary character string. The "Example" and "Remarks" entries in each list or rule shown in FIGS. 3 to 11 are application examples and similar notes that assist in explaining the list or rule; they need not be defined as items of the list or rule.

<Operation of the language expression rewriting device>
Next, the operation of the language expression rewriting device 10 according to this embodiment will be described. When a setting file 21, which describes setting values for rewriting language expressions according to the target character, and an input sentence 23, which is to be rewritten into a language expression according to the character, are input to the language expression rewriting device 10, the device executes the language expression rewriting processing routine shown in FIG. 12.

In step S11, the setting unit 11 reads the setting file 21 and stores the setting value for each item described in the setting file 21 in the setting value DB 22.

Next, in step S12, the basic analysis unit 12 reads the input sentence 23 and analyzes it to obtain the morpheme boundaries, the reading of each morpheme, the part of speech of each morpheme, the conjugation type and conjugation form of each conjugating word, the phrase (bunsetsu) boundaries, the head of each phrase, and the semantic labels of functional expressions. The basic analysis unit 12 outputs the obtained information to the rewriting processing unit 13 as the basic analysis result.

Next, in step S13, the processing units 13a to 13l of the rewriting processing unit 13 perform rewriting on the input sentence 23 (the basic analysis result), as described in detail above. Then, in step S14, the rewriting processing unit 13 outputs the rewritten sentence 32, which is the result of the rewriting in step S13, and the language expression rewriting processing routine ends.
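The flow of steps S11 to S14 can be sketched as below. This is a simplified illustration under several assumptions that do not come from the specification: the setting file is assumed to be JSON keyed by item letter, `basic_analysis` is a placeholder that performs no real morphological analysis, and only a toy stand-in for the (h) personal pronoun replacement is registered. All function and variable names are hypothetical.

```python
import json


def load_settings(settings_text):
    """S11: read the setting file (assumed JSON) into a setting value store."""
    return dict(json.loads(settings_text))


def basic_analysis(sentence):
    """S12: placeholder for analysis; a real system would return morphemes,
    readings, parts of speech, and so on."""
    return {"surface": sentence}


def replace_pronoun(analysis, value):
    """Toy stand-in for (h): replace the pronoun 私 with the configured one."""
    updated = dict(analysis)
    updated["surface"] = updated["surface"].replace("私", value)
    return updated


# Hypothetical registry of processing units, listed in application order.
PROCESSORS = [("h", replace_pronoun)]


def rewrite(settings_text, sentence):
    setting_db = load_settings(settings_text)   # S11
    analysis = basic_analysis(sentence)         # S12
    for key, processor in PROCESSORS:           # S13
        value = setting_db.get(key)
        if value:
            analysis = processor(analysis, value)
    return analysis["surface"]                  # S14
```

Each processing unit receives the result of the previous one, so the value returned at the end corresponds to the output of the last unit applied.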

As described above, the language expression rewriting device according to this embodiment performs rewriting that arbitrarily combines 12 types of rewrite items: (a) style, (b) character type, (c) segmented writing, (d) sentence structure, (e) predicate functional expression / character sentence ending, (f) conjugation form, (g) hesitation, (h) personal pronoun, (i) dialect / special vocabulary, (j) phoneme substitution, (k) discriminative meaningless expression, and (l) symbols. As a result, rewrites of diverse language expressions, not only sentence-final expressions, can be combined freely, and rewriting into a wide variety of language expressions can be realized.

In addition, because the target of rewriting is not limited to sentence-final expressions, conversion into language expressions typical of strongly individual characters, such as those appearing in TV anime or manga, becomes possible; this was difficult to achieve with the technique of Non-Patent Document 1.

For example, when the input sentence is 「私は寒がりなので、暖かい服装を選んだ。」 ("I get cold easily, so I chose warm clothes."), the conventional technique, which converts only the sentence-final expression, yields something like 「私は寒がりなので、暖かい服装を選んだぜ！」. In contrast, by arbitrarily combining rewrites of diverse language expressions as in this embodiment, varied and highly individual rewrites become possible, such as the examples given above: 「（Ｉ）オレは寒がりだ。だから、暖けぇ服装を選んだぜ！」 and 「（ＩＩ）あ、あたしは、さむがりなの。だ、だから、あたたかいふくそうを、えらんだの」.

When this embodiment is applied to a system that converses with people (a dialogue system), the operator of the system (the person who designs the system's characters) can easily create utterances with a variety of character traits simply by preparing a setting file for each character together with several expression lists and conversion rules. This greatly reduces the cost of adding characters to the dialogue system.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the rewriting methods used by the processing units 13a to 13l of the rewriting processing unit 13 are not limited to those described above, and other methods may be applied. The rewrite items are likewise not limited to (a) to (l) above; any rewrite item may be used as long as it relates to a linguistic feature in which a character's traits appear in that character's utterances.

The above embodiment describes the case in which, when a plurality of rewriting processes are applied in combination, they are applied in a predetermined order. However, the order of application may be made changeable by specifying the order of the rewriting processes to be applied. In that case, information specifying the order of the rewriting processes is also described in the setting file, and each processing unit is made to operate based on that information. For example, consider applying to the input sentence 23 「野原の花」 ("flowers of the field") both (j) phoneme substitution, using the character-specific phoneme substitution rules 31 of character A shown in FIG. 11, and (b) character type conversion into hiragana. When the rewriting processes are applied in the order (j) phoneme substitution → (b) character type, the sentence is rewritten as 「のはらにょはな」. When they are applied in the order (b) character type → (j) phoneme substitution, it is rewritten as 「にょはらにょはにゃ」. In this way, changing the order of the rewriting processes also makes it possible to express diverse variations that differ in how strongly the character's individuality comes through.
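The order dependence in the 「野原の花」 example can be reproduced with a small sketch. FIG. 11 is not reproduced here, so the two substitution rules (な to にゃ, の to にょ) are an assumption inferred from the two outputs quoted above, and the reading dictionary is a hypothetical stand-in for the readings that the basic analysis would supply. The function names are likewise illustrative.

```python
# Assumed rules, inferred from the quoted outputs (FIG. 11 is not shown here).
PHONEME_RULES = {"な": "にゃ", "の": "にょ"}
# Hypothetical readings standing in for the basic analysis result.
READINGS = {"野原": "のはら", "花": "はな"}


def phoneme_substitute(text):
    """(j) Replace each matching kana character according to the rules."""
    return "".join(PHONEME_RULES.get(ch, ch) for ch in text)


def to_hiragana(text):
    """(b) Replace each known kanji word with its hiragana reading."""
    for surface, reading in READINGS.items():
        text = text.replace(surface, reading)
    return text


sentence = "野原の花"
j_then_b = to_hiragana(phoneme_substitute(sentence))
b_then_j = phoneme_substitute(to_hiragana(sentence))
print(j_then_b)  # のはらにょはな
print(b_then_j)  # にょはらにょはにゃ
```

In the first order, the kana hidden inside the kanji words are not yet visible when the phoneme rules run, so only the particle の is affected; in the second order, every の and な in the hiragana string is replaced.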

In addition, (b) character type, (c) segmented writing, and (l) symbols are rewrite items that are effective when the output rewritten sentence 32 is text data. That is, when they are applied to the utterances of a dialogue system or the like whose final output is read aloud by speech synthesis, they do not contribute to conveying the character. Therefore, when the rewritten sentence 32 is output only as synthesized speech, the character type conversion unit 13b, the segmented-writing conversion unit 13c, and the symbol insertion unit 13l may be omitted from the configuration of the rewriting processing unit 13. Alternatively, the setting values of the rewrite items (b) character type, (c) segmented writing, and (l) symbols may be changed according to the output form of the rewritten sentence 32. For example, when the rewritten sentence 32 is output only as synthesized speech and the setting file contains setting values indicating that the rewriting processes for (b) character type, (c) segmented writing, and (l) symbols are to be applied, those values are changed to setting values indicating that the rewriting processes are not to be applied before being stored in the setting value DB 22.
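The override described above might look like the following sketch. It assumes, purely for illustration, that setting values are booleans keyed by item letter and that the output form is passed as the string `"speech"` or `"text"`; neither representation is specified in the source, and `store_settings` is a hypothetical name.

```python
# Items that only affect written text: (b) character type,
# (c) segmented writing, (l) symbols.
TEXT_ONLY_ITEMS = ("b", "c", "l")


def store_settings(settings, output_mode):
    """Return the values to store, disabling text-only items for speech output.

    `settings` maps item letters to booleans (an assumed representation).
    """
    stored = dict(settings)  # leave the caller's settings untouched
    if output_mode == "speech":
        for item in TEXT_ONLY_ITEMS:
            if item in stored:
                stored[item] = False
    return stored
```

Applying the override at storage time keeps the processing units themselves unchanged: they simply see the text-only items as disabled.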

Although this specification describes an embodiment in which the program is installed in advance, the program may also be provided stored on a computer-readable recording medium, or provided via a network.

Description of Symbols
10 Language expression rewriting device
11 Setting unit
12 Basic analysis unit
13 Rewriting processing unit
13a Style conversion unit
13b Character type conversion unit
13c Segmented-writing conversion unit
13d Sentence structure conversion unit
13e Predicate functional expression / character sentence ending conversion unit
13f Conjugation form conversion unit
13g Hesitation conversion unit
13h Personal pronoun replacement unit
13i Dialect / special vocabulary replacement unit
13j Phoneme substitution unit
13k Discriminative meaningless expression insertion unit
13l Symbol insertion unit
21 Setting file
22 Setting value database
23 Input sentence
24 Style-specific functional expression list
25 Conjugation table
26 Character-specific connective expression list
27 Character-specific functional expression list
28 Irregular (collapsed) conjugation rules
29 Character-specific personal pronoun list
30 Character-specific vocabulary replacement rules
31 Character-specific phoneme substitution rules
32 Rewritten sentence

Claims

A language expression rewriting device comprising:
a setting unit that sets, for each of a plurality of types of linguistic features including style, predicate functional expression, and personal pronoun, a setting value relating to rewriting of a language expression based on that linguistic feature; and
a rewriting processing unit that, based on the setting values set by the setting unit, applies to an input sentence at least one type of rewriting process from among language expression rewriting processes based on the plurality of types of linguistic features, the rewriting processes including a process of converting the type of style, a process of converting a predicate functional expression according to a character, and a process of converting a morpheme that is a personal pronoun into a personal pronoun according to the character.

The language expression rewriting device according to claim 1, wherein
the setting unit sets the setting value for each of the plurality of types of linguistic features, which further include sentence structure, conjugation form, hesitation, dialect or special vocabulary, specific phonemes, and discriminative meaningless expressions that can distinguish a character but carry no meaning, and
the rewriting processing unit, based on the setting values set by the setting unit, applies to the input sentence at least one type of rewriting process from among the language expression rewriting processes based on the plurality of types of linguistic features, which further include a process of converting sentence structure, a process of converting a conjugation form, a process of converting to a hesitant expression, a process of converting specific vocabulary into dialect or special vocabulary, a process of converting a specific phoneme into a phoneme according to the character, and a process of inserting a discriminative meaningless expression according to the character.

The language expression rewriting device according to claim 1 or 2, wherein
the setting unit sets the setting value for each of the plurality of types of linguistic features, which further include character type, segmented writing, and symbols, and
the rewriting processing unit, based on the setting values set by the setting unit, applies to the input sentence at least one type of rewriting process from among the language expression rewriting processes based on the plurality of types of linguistic features, which further include a process of converting the character type, a process of converting to segmented writing, and a process of inserting symbols.

A language expression rewriting method in a language expression rewriting device including a setting unit and a rewriting processing unit, wherein
the setting unit sets, for each of a plurality of types of linguistic features in which characteristics corresponding to a character appear, the linguistic features including style, predicate functional expression, and personal pronoun, a setting value relating to rewriting of a language expression based on that linguistic feature, and
the rewriting processing unit, based on the setting values set by the setting unit, applies to an input sentence at least one type of rewriting process from among language expression rewriting processes based on the plurality of types of linguistic features, the rewriting processes including a process of converting the type of style, a process of converting a predicate functional expression according to the character, and a process of converting a morpheme that is a personal pronoun into a personal pronoun according to the character.

The language expression rewriting method according to claim 4, wherein
the setting unit sets the setting value for each of the plurality of types of linguistic features, which further include sentence structure, conjugation form, hesitation, dialect or special vocabulary, specific phonemes, and discriminative meaningless expressions that can distinguish a character but carry no meaning, and
the rewriting processing unit, based on the setting values set by the setting unit, applies to the input sentence at least one type of rewriting process from among the language expression rewriting processes based on the plurality of types of linguistic features, which further include a process of converting sentence structure, a process of converting a conjugation form, a process of converting to a hesitant expression, a process of converting specific vocabulary into dialect or special vocabulary, a process of converting a specific phoneme into a phoneme according to the character, and a process of inserting a discriminative meaningless expression according to the character.

The language expression rewriting method according to claim 4, wherein
the setting unit sets the setting value for each of the plurality of types of linguistic features, which further include character type, segmented writing, and symbols, and
the rewriting processing unit, based on the setting values set by the setting unit, applies to the input sentence at least one type of rewriting process from among the language expression rewriting processes based on the plurality of types of linguistic features, which further include a process of converting the character type, a process of converting to segmented writing, and a process of inserting symbols.

A language expression rewriting program for causing a computer to function as each unit of the language expression rewriting device according to any one of claims 1 to 3.