JP7379968B2

JP7379968B2 - Learning support device, learning support method, and program

Info

Publication number: JP7379968B2
Application number: JP2019164749A
Authority: JP
Inventors: 誠北地
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2019-09-10
Filing date: 2019-09-10
Publication date: 2023-11-15
Anticipated expiration: 2039-09-10
Also published as: JP2021043306A

Description

The present invention relates to a learning support device, a learning support method, and a program.

For example, learning devices such as language learning machines have a function of playing back audio data of native-speaker pronunciation in the language being learned. A user can practice listening by selecting any text from among various texts such as words, idioms, and sentences, and listening to the audio data corresponding to that text.

In conventional learning devices, the playback speed of the entire audio data (from beginning to end) corresponding to the selected text can be changed, so that the user can learn by listening to the audio data slowly or quickly.

In addition, to practice listening to the pronunciation of "L" and "R", phonemes that Japanese speakers generally find difficult, a learning device for the English phonemes "L" and "R" has been proposed that allows listening practice using dedicated audio data in which the playback speed is changed in the pronunciation sections containing those phonemes (for example, see Patent Document 1).

Patent Document 1: Japanese Patent Application Publication No. 2004-334164

Conventionally, a learning device for practicing listening to the pronunciation of difficult phonemes needs to use dedicated audio data containing those phonemes (such as "L" and "R").

For this reason, with conventional learning devices, a user cannot practice listening to the pronunciation portions containing phonemes the user is weak at while, for example, playing back and listening to the audio data of an arbitrary text.

The present invention has been made in view of these problems, and its object is to provide a learning support device, a learning support method, and a program that allow a user to practice, in a natural way, listening to pronunciation portions containing phonemes the user is weak at, without hindering the user's listening to the text as a whole.

A learning support device according to a first aspect of the present invention comprises: determination means for determining whether a word to be learned includes a phoneme having a predetermined pronunciation; and setting means for setting, when the determination means determines that the word includes the phoneme having the predetermined pronunciation, the playback speed used when the word is played back as audio such that the playback speed at the phoneme having the predetermined pronunciation is slower than the playback speed at other phonemes.
A learning support device according to a second aspect of the present invention comprises: selection means for selecting, as learning targets, a first word that includes a phoneme having one pronunciation of a pair of pronunciations associated with each other in advance and a second word that includes a phoneme having the other pronunciation; and setting means for setting the playback speed used when the first word and the second word selected by the selection means are played back as audio such that the playback speed at the phoneme having the one pronunciation and the playback speed at the phoneme having the other pronunciation are slower than the playback speed at other phonemes, wherein the selection means selects the first word and the second word as the learning targets such that, apart from the phoneme having the one pronunciation or the phoneme having the other pronunciation, the phonemes at mutually corresponding positions have matching pronunciations.
A learning support method according to the first aspect of the present invention is a learning support method executed by a learning support device, and includes: a determination process of determining whether a word to be learned includes a phoneme having a predetermined pronunciation; and a setting process of setting, when the determination process determines that the word includes the phoneme having the predetermined pronunciation, the playback speed used when the word is played back as audio such that the playback speed at the phoneme having the predetermined pronunciation is slower than the playback speed at other phonemes.
A learning support method according to the second aspect of the present invention is a learning support method executed by a learning support device, and includes: a selection process of selecting, as learning targets, a first word that includes a phoneme having one pronunciation of a pair of pronunciations associated with each other in advance and a second word that includes a phoneme having the other pronunciation; and a setting process of setting the playback speed used when the first word and the second word selected by the selection process are played back as audio such that the playback speed at the phoneme having the one pronunciation and the playback speed at the phoneme having the other pronunciation are slower than the playback speed at other phonemes, wherein the selection process selects the first word and the second word as the learning targets such that, apart from the phoneme having the one pronunciation or the phoneme having the other pronunciation, the phonemes at mutually corresponding positions have matching pronunciations.
A program according to the first aspect of the present invention causes a computer to function as: determination means for determining whether a word to be learned includes a phoneme having a predetermined pronunciation; and setting means for setting, when the determination means determines that the word includes the phoneme having the predetermined pronunciation, the playback speed used when the word is played back as audio such that the playback speed at the phoneme having the predetermined pronunciation is slower than the playback speed at other phonemes.
A program according to the second aspect of the present invention causes a computer to function as: selection means for selecting, as learning targets, a first word that includes a phoneme having one pronunciation of a pair of pronunciations associated with each other in advance and a second word that includes a phoneme having the other pronunciation; and setting means for setting the playback speed used when the first word and the second word selected by the selection means are played back as audio such that the playback speed at the phoneme having the one pronunciation and the playback speed at the phoneme having the other pronunciation are slower than the playback speed at other phonemes, wherein the selection means selects the first word and the second word as the learning targets such that, apart from the phoneme having the one pronunciation or the phoneme having the other pronunciation, the phonemes at mutually corresponding positions have matching pronunciations.
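The selection condition of the second aspect (two words whose phonemes match at every corresponding position except where the paired pronunciations occur) can be illustrated with a minimal Python sketch. The lexicon, its phoneme transcriptions, and the function names `paired_positions` and `select_word_pairs` are assumptions made for illustration only; they do not come from the specification.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lexicon: word -> phoneme sequence (illustrative transcriptions).
LEXICON: Dict[str, List[str]] = {
    "light": ["l", "aɪ", "t"],
    "right": ["r", "aɪ", "t"],
    "lead": ["l", "iː", "d"],
    "read": ["r", "iː", "d"],
    "late": ["l", "eɪ", "t"],
}


def paired_positions(first: List[str], second: List[str],
                     pair: Tuple[str, str]) -> Optional[List[int]]:
    """Return the positions where `first` has pair[0] and `second` has pair[1],
    provided all other corresponding positions match; otherwise return None."""
    if len(first) != len(second):
        return None
    positions = []
    for i, (a, b) in enumerate(zip(first, second)):
        if a == b:
            continue                      # matching phonemes are fine
        if (a, b) != pair:
            return None                   # a mismatch outside the pair
        positions.append(i)
    return positions or None              # at least one paired position needed


def select_word_pairs(lexicon: Dict[str, List[str]],
                      pair: Tuple[str, str]) -> List[Tuple[str, str, List[int]]]:
    """List (first word, second word, paired positions) for every qualifying pair."""
    result = []
    for w1, p1 in lexicon.items():
        for w2, p2 in lexicon.items():
            positions = paired_positions(p1, p2, pair)
            if positions is not None:
                result.append((w1, w2, positions))
    return result


pairs = select_word_pairs(LEXICON, ("l", "r"))
```

With this toy lexicon, "light"/"right" and "lead"/"read" qualify, while "late" has no partner. The returned positions mark where the slower playback speed would apply in each word.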

FIG. 1 is a diagram showing the external configuration of a learning support device 10 according to an embodiment of an electronic device of the present invention.
FIG. 2 is a block diagram showing the configuration of the electronic circuit of the learning support device 10.
FIG. 3 is a diagram showing an example of the phonetic symbols of a plurality of phonemes the user is weak at, described in the weak pronunciation table (22g) and classified into three language levels.
FIG. 4 is a flowchart showing audio playback process (1) of the first embodiment of the learning support device 10.
FIG. 5 is a flowchart showing the audio selection process (S1) included in audio playback process (1).
FIG. 6 is a flowchart showing the weak pronunciation element identification process (S3) included in audio playback process (1).
FIG. 7 is a flowchart showing the pronunciation timing identification process (S4) included in audio playback process (1).
FIG. 8 is a flowchart showing the speech-rate conversion section setting method identification process (S5) included in audio playback process (1).
FIG. 9 is a diagram comparing, for audio playback process (1), the normal playback timing of the audio data to be played back with the playback timing when the playback speed is changed and speech-rate conversion is applied at the pronunciation portions of weak pronunciation elements.
FIG. 10 is a flowchart showing audio playback process (2) of the second embodiment of the learning support device 10.
FIG. 11 is a flowchart showing the speech-rate conversion pronunciation section identification process (A4) included in audio playback process (2).
FIG. 12 is a diagram comparing, for audio playback process (2), the normal playback timing of the audio data of two words each containing one of two similar phonemes with the playback timing when the playback speed is changed and speech-rate conversion is applied at the pronunciation portions of the similar phonemes.

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 is a diagram showing the external configuration of a learning support device 10 according to an embodiment of an electronic device of the present invention.

The electronic device is configured either as the learning support device 10 dedicated to learning support described below (an electronic dictionary in the embodiment), or as a tablet-type PDA (personal digital assistant), a PC (personal computer), a mobile phone, an electronic book reader, a portable game console, or the like equipped with a learning support function.

The learning support device 10 has a folding case in which a main body case 11 and a lid case 12 can be opened and closed via a hinge 13. The surface of the main body case 11, exposed when the folding case is opened, is provided with a key input unit (keyboard) 14 including a [Home] key 14a, function designation keys 14b, character input keys 14c, a [Translate/Enter] key 14d, a [Back/List] key 14e, cursor keys 14f, a [Shift] key 14g, a [Voice] key 14S, and the like; an audio output unit (including a speaker) 15; and an audio input unit (including a microphone) 16.

A touch panel display unit (display) 17 is provided on the surface of the lid case 12. The touch panel display unit 17 integrates a display device with a touch position detection device that detects the position touched by the user with a pen, finger, or the like, and is constructed by overlaying a transparent touch panel on a backlit color liquid crystal display screen.

At the right end of the touch panel display unit 17, a touch key area 17A is provided, on which the labels of keys and functions ([Home], [Voice], [Translate/Enter], etc.) are permanently printed so that pressing some keys of the key input unit 14 and designating some functions of the learning support device 10 can be performed by touch operation.

The function designation keys 14b of the key input unit 14 are keys for directly designating the dictionary content marked on each key ([Great Dictionary], etc.), dictionary content categories ([Japanese], [Classical Japanese], [Kanji-Japanese], [English-Japanese], etc.), learning content categories ([Learning 1], [Learning 2]), [Content List], and [Study Notebook], which is one category of tools.

In addition, when a key of the key input unit 14 is operated immediately after the [Shift] key 14g, it functions not as the key function written on its key top without a frame but as the key function written inside a frame. For example, when the [Translate/Enter] key 14d is operated after the [Shift] key 14g (hereinafter written as the [Shift] + [Enter] key), it serves as a [Register] key that starts the function of registering the data designated for registration. The [Shift] + [Delete] key serves as a [Settings] key.

The [Voice] key 14S of the key input unit 14 and the [Voice] touch key BS of the touch key area 17A are both keys that start the audio playback function, which outputs audio data corresponding to the text or item displayed on the touch panel display unit 17.

For example, as shown in FIG. 1, with the headword explanation screen GE for the headword "establish" displayed on the touch panel display unit 17 following a headword search in an English-Japanese dictionary, the audio playback function is started by operating the [Voice] key 14S (or the [Voice] touch key BS). Then, when the [Translate/Enter] key 14d is operated with the headword "establish" on the headword explanation screen GE selected as the playback target and shown in reverse video (identification display) h, the audio data corresponding to the selected headword "establish" (for example, audio data of a native speaker reading the headword aloud) is played back and output from the audio output unit 15.

When the learning support device 10 of this embodiment plays back audio data using the audio playback function and the audio data contains the pronunciation of a phoneme the user is weak at (for example, the pronunciation corresponding to "sh" [ʃ] in "establish"), it identifies the playback section of the audio data corresponding to the pronunciation portion containing that phoneme, changes (for example, slows) the playback speed of the audio data in the identified section, and plays it back with speech-rate conversion.

As a result, in the course of playing back the whole of the audio data that the user selected for playback, the learning support device 10 of this embodiment can play back the audio data of pronunciation portions containing phonemes the user is weak at with the playback speed changed so that the user can hear them more easily.
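The determine-then-set behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phoneme transcription of "establish", the function name `playback_rates`, and the slow rate of 0.5 are hypothetical, and the specification does not prescribe a particular rate value.

```python
from typing import List, Set


def playback_rates(phonemes: List[str], weak: Set[str],
                   slow_rate: float = 0.5) -> List[float]:
    """Return one playback rate per phoneme (1.0 = normal speed).

    Determination step: check whether the word contains any weak phoneme.
    Setting step: if so, assign the slower rate only at those phonemes.
    """
    contains_weak = any(p in weak for p in phonemes)
    if not contains_weak:
        return [1.0] * len(phonemes)      # nothing to slow down
    return [slow_rate if p in weak else 1.0 for p in phonemes]


# Illustrative transcription of "establish"; [ʃ] is treated as the weak phoneme.
establish = ["ɪ", "s", "t", "æ", "b", "l", "ɪ", "ʃ"]
rates = playback_rates(establish, weak={"ʃ"})
```

Only the final [ʃ] receives the reduced rate, so the rest of the word is heard at normal speed.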

FIG. 2 is a block diagram showing the configuration of the electronic circuit of the learning support device 10.

The electronic circuit of the learning support device 10 includes a CPU (processor) 21, which is a computer.

The CPU 21 controls the operation of each part of the circuit in accordance with a program (including a learning support processing program 22a and an audio playback processing program 22b) stored in advance in a storage unit (storage) 22 such as a flash ROM, a program read from an external recording medium 23 such as a memory card by a recording medium reading unit 24 and stored in the storage unit 22, or a program downloaded from a Web server (here, a program server) 30 on a communication network N via a communication unit 25 and stored in the storage unit 22.

The storage unit 22, the recording medium reading unit 24, and the communication unit 25 are connected to the CPU 21 via a data and control bus, as are the key input unit 14, the audio output unit 15, the audio input unit 16, and the display unit 17.

The audio output unit 15 includes a main body speaker 15S that outputs sound based on audio data stored in the storage unit 22 or on recorded audio data.

The audio input unit 16 includes a main body microphone 16M for inputting the voice of the user or others.

The audio output unit 15 and the audio input unit 16 share an external connection terminal (EX) 26, to which the user connects an earphone microphone 27 as needed.

The earphone microphone 27 has earphones and a remote control unit 27R equipped with a microphone 27m.

In addition to a program storage unit that stores the programs (including the learning support processing program 22a and the audio playback processing program 22b), the storage unit 22 includes a learning content storage unit 22c, a dictionary data storage unit 22d, an other content storage unit 22e, a language level data storage unit 22f, a weak pronunciation table storage unit 22g, a pronunciation change idiom table storage unit 22h, an audio playback mode data storage unit 22i, a speech-rate conversion section setting data storage unit 22j, and a speech-rate conversion playback section data storage unit 22k.

The learning support processing program 22a includes a system program that governs the overall operation of the learning support device 10, a program for establishing communication with external electronic devices via the communication unit 25, and programs that, together with the audio playback processing program 22b, execute learning functions corresponding to the various content data stored in the learning content storage unit 22c, the dictionary data storage unit 22d, and the other content storage unit 22e.

The audio playback processing program 22b includes a program for playing back the audio data selected for playback in response to a user operation, and a program that, when the audio data to be played back contains the pronunciation of a phoneme the user is weak at, identifies the playback section of the audio data corresponding to the pronunciation portion containing that phoneme, changes the playback speed of the audio data in the identified section, and plays it back with speech-rate conversion.

The learning content storage unit 22c stores learning content data such as listening lesson data 22c1 and speaking lesson data 22c2.

The listening lesson data 22c1 has, for example, text data corresponding to model words and sentences for a listening lesson (phonetic symbols are attached to the text data) and audio data corresponding to the text data, and has a function of displaying the text data of a word or sentence on the display unit 17 and outputting the audio data from the audio output unit 15.

The speaking lesson data 22c2 has, for example, model text data for a speaking lesson (phonetic symbols are attached to the text data) and audio data corresponding to the text data, and has a function of displaying the text data on the display unit 17, outputting the audio data from the audio output unit 15, then analyzing the user's audio data input from the audio input unit 16 and outputting a determination result, such as correct or incorrect, by display or audio.

The dictionary data storage unit 22d stores various kinds of dictionary content data, such as an English-Japanese dictionary, a Japanese-English dictionary, an English-English dictionary, and a Japanese dictionary. The dictionary content data has a function of, for example, looking up the explanatory information corresponding to a headword that is entered by key input or voice input in response to a user operation as the target of a dictionary search, and outputting that information by display or audio.

In each of the dictionaries, the dictionary content data has text data such as headwords, explanatory information including the meaning and content of the headwords, and example sentences containing the headwords, together with audio data corresponding to the text data. Phonetic symbols are attached to, for example, the text data of the headwords and example sentences.

The other content storage unit 22e stores content data other than the learning content data (22c) and the dictionary content data (22d), such as books, newspapers, and magazines. The other content data has the text data of each content item and audio data corresponding to that text data.

The language level data storage unit 22f stores data on the user's language level as, for example, beginner: 1, intermediate: 2, advanced: 3. The language level is either entered by the user through a user operation and stored, or, when a learning function based on the learning content data (22c) or the dictionary content data (22d) is executed, automatically updated with the user's language level as determined within that learning function and stored.

The weak pronunciation table storage unit 22g stores the phonetic symbols of phonemes the user is weak at, and the audio data corresponding to those phonetic symbols, as a table in which they are classified by, for example, three language levels (beginner: 1, intermediate: 2, advanced: 3) (see FIG. 3).

FIG. 3 is a diagram showing an example of the phonetic symbols of a plurality of phonemes the user is weak at, described in the weak pronunciation table (22g) and classified into three language levels.

In the weak pronunciation table (22g) shown in FIG. 3, the phonetic symbols of phonemes the user is weak at are described as multiple pairs of phonetic symbols, each pair consisting of two similar phonemes that the user is weak at and finds hard to tell apart. For a user at language level 1 (beginner), all of the phoneme pairs described in the table (16 pairs) are similar phonemes that are hard to tell apart; for a user at language level 2 (intermediate), the bottom 9 pairs are; and for a user at language level 3 (advanced), the bottom 4 pairs are.
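The level-dependent lookup described for the weak pronunciation table can be sketched as follows. The text states only the row counts (16, 9, and 4 pairs, counted from the bottom), so the table contents here are placeholders rather than the actual phoneme pairs of FIG. 3, and the names `WEAK_PAIR_TABLE`, `BOTTOM_ROWS`, and `weak_pairs_for_level` are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Placeholder rows standing in for the 16 phoneme pairs of FIG. 3
# (top row first, bottom row last). The real symbols are not reproduced here.
WEAK_PAIR_TABLE: List[Tuple[str, str]] = [
    (f"a{i}", f"b{i}") for i in range(1, 17)
]

# Number of rows, counted from the bottom, that apply at each language level.
BOTTOM_ROWS: Dict[int, int] = {1: 16, 2: 9, 3: 4}


def weak_pairs_for_level(level: int,
                         table: List[Tuple[str, str]] = WEAK_PAIR_TABLE
                         ) -> List[Tuple[str, str]]:
    """Return the similar-phoneme pairs treated as weak at the given level."""
    if level not in BOTTOM_ROWS:
        raise ValueError(f"unknown language level: {level}")
    return table[-BOTTOM_ROWS[level]:]


advanced_pairs = weak_pairs_for_level(3)
```

Because each level takes a suffix of the same table, the pairs for a higher level are always a subset of the pairs for a lower level.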

The pronunciation change idiom table storage unit 22h stores, as a table, data on a plurality of pronunciation change phrases: phrases formed by joining multiple words, such as idioms and set phrases, whose pronunciation changes compared with pronouncing each word on its own (for example, "there is": the separately pronounced ゼァ・イズ changes to the linked ゼァリズ). The data for a pronunciation change phrase has the text data of the phrase and audio data corresponding to the text data, and phonetic symbols are attached to the text data for the range of text whose pronunciation changes (attached as phonetic symbols of phonemes the user is weak at).

The audio playback mode data storage unit 22i stores data indicating the playback mode for audio data (such as a normal playback mode, a weak pronunciation listening (practice) mode, or a similar phoneme discrimination (practice) mode). The playback mode is selected, for example, in response to a user operation.

The speech-rate conversion section setting data storage unit 22j stores data (speech-rate conversion section setting data) indicating how to set the playback section that, within the audio data to be played back, is played back with changed playback speed and speech-rate conversion as a pronunciation portion containing a phoneme the user is weak at: per phoneme, per word, or per sentence. The speech-rate conversion section setting data (phoneme unit / word unit / sentence unit) is either specified freely in response to a user operation or specified according to the user's language level.
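How the three setting units change the extent of the converted section can be sketched in Python. The timing data for the two-word example, the type aliases, and the function name `conversion_sections` are assumptions for illustration; the specification does not define these data structures.

```python
from typing import List, Set, Tuple

Phoneme = Tuple[str, int, int]   # (symbol, start_ms, end_ms)
Word = List[Phoneme]
Sentence = List[Word]


def conversion_sections(sentence: Sentence, weak: Set[str],
                        unit: str) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) sections to convert, at the chosen unit.

    unit is "phoneme", "word", or "sentence": the section covers the weak
    phoneme itself, the word containing it, or the whole sentence.
    """
    if unit not in ("phoneme", "word", "sentence"):
        raise ValueError(f"unknown unit: {unit}")
    sections: List[Tuple[int, int]] = []
    for word in sentence:
        for symbol, start, end in word:
            if symbol not in weak:
                continue
            if unit == "phoneme":
                section = (start, end)
            elif unit == "word":
                section = (word[0][1], word[-1][2])
            else:
                section = (sentence[0][0][1], sentence[-1][-1][2])
            if section not in sections:   # avoid duplicate sections
                sections.append(section)
    return sections


# Hypothetical timings for "she sells".
she_sells: Sentence = [
    [("ʃ", 0, 120), ("iː", 120, 300)],
    [("s", 350, 450), ("e", 450, 560), ("l", 560, 650), ("z", 650, 760)],
]
by_phoneme = conversion_sections(she_sells, {"ʃ"}, "phoneme")
by_word = conversion_sections(she_sells, {"ʃ"}, "word")
by_sentence = conversion_sections(she_sells, {"ʃ"}, "sentence")
```

The same weak phoneme yields a progressively wider section as the unit moves from phoneme to word to sentence.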

The speech-rate conversion playback section data storage unit 22k stores data on the playback timing (for example, N msec to M msec) corresponding to the speech-rate conversion playback section identified on the basis of the speech-rate conversion section setting data (22j), within the playback timing from the beginning (start time) to the end (end time) of the audio data to be played back (managed, for example, as the time up to the end (end time) with the beginning (start time) taken as 0 msec; see FIG. 9).
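The relationship between the normal playback timing and the converted playback timing can be illustrated with a short Python sketch. It assumes that the stored sections do not overlap and lie within the audio data, and it uses a hypothetical slow rate of 0.5; the function name `converted_timing` and the tuple layout are also assumptions, not part of the specification.

```python
from typing import List, Tuple

# (orig_start_ms, orig_end_ms, new_start_ms, new_end_ms, rate)
Segment = Tuple[int, int, float, float, float]


def converted_timing(duration_ms: int, sections: List[Tuple[int, int]],
                     slow_rate: float = 0.5) -> List[Segment]:
    """Map the original timeline (0..duration_ms) onto the converted one.

    Each section (N, M) is played at slow_rate, so it occupies
    (M - N) / slow_rate msec in the converted timeline; everything else
    keeps its original length.
    """
    timeline: List[Segment] = []
    cursor = 0        # position in the original audio
    out = 0.0         # position in the converted playback
    for start, end in sorted(sections):
        if start > cursor:                        # normal-speed gap
            length = float(start - cursor)
            timeline.append((cursor, start, out, out + length, 1.0))
            out += length
        length = (end - start) / slow_rate        # slowed section
        timeline.append((start, end, out, out + length, slow_rate))
        out += length
        cursor = end
    if cursor < duration_ms:                      # normal-speed tail
        length = float(duration_ms - cursor)
        timeline.append((cursor, duration_ms, out, out + length, 1.0))
    return timeline


timeline = converted_timing(1000, [(400, 600)], slow_rate=0.5)
```

In this example a 200 msec section played at half speed takes 400 msec, so the 1000 msec recording plays back in 1200 msec overall.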

このように構成された学習支援装置１０は、ＣＰＵ２１が学習支援処理プログラム２２ａおよび音声再生処理プログラム２２ｂに記述された命令に従い回路各部の動作を制御し、ソフトウエアとハードウエアとが協働して動作することにより、以下の動作説明で述べるような、音声再生機能を実現する。 In the learning support device 10 configured as described above, the CPU 21 controls the operation of each part of the circuit according to instructions written in the learning support processing program 22a and the audio reproduction processing program 22b, and the software and hardware cooperate. By operating, an audio playback function is realized as described in the operation description below.

次に、実施形態の学習支援装置（電子辞書）１０の動作について説明する。 Next, the operation of the learning support device (electronic dictionary) 10 of the embodiment will be explained.

(First embodiment)
FIG. 4 is a flowchart showing the audio reproduction process (1) of the first embodiment of the learning support device 10.

図５は、音声再生処理（１）に含まれる音声選択処理（Ｓ１）を示すフローチャートである。 FIG. 5 is a flowchart showing the audio selection process (S1) included in the audio reproduction process (1).

図６は、音声再生処理（１）に含まれる苦手発音要素特定処理（Ｓ３）を示すフローチャートである。 FIG. 6 is a flowchart showing the weak pronunciation element identification process (S3) included in the audio reproduction process (1).

図７は、音声再生処理（１）に含まれる発音タイミング特定処理（Ｓ４）を示すフローチャートである。 FIG. 7 is a flowchart showing the pronunciation timing specifying process (S4) included in the audio reproduction process (1).

図８は、音声再生処理（１）に含まれる話速変換区間設定方法特定処理（Ｓ５）を示すフローチャートである。 FIG. 8 is a flowchart showing the speech speed conversion section setting method specifying process (S5) included in the audio reproduction process (1).

FIG. 9 is a diagram contrasting the normal playback timing of the audio data to be played according to the audio reproduction process (1) with the playback timing when the pronunciation portion of a weak pronunciation element is played with its playback speed changed by speech speed conversion.

再生対象の音声データを選択するための音声選択処理（Ｓ１）（図５参照）において、例えばユーザによる機能指定キー１４ｂの操作に応じて辞書が選択されると（ステップＳ１０１（Ｙｅｓ））、ＣＰＵ２１は、選択された辞書データに対応して検索対象の見出し語を入力するための見出し語入力画面（図示せず）を表示部１７に表示させる（ステップＳ１０２）。 In the audio selection process (S1) for selecting audio data to be played (see FIG. 5), for example, when a dictionary is selected in response to the user's operation of the function designation key 14b (step S101 (Yes)), the CPU 21 displays on the display unit 17 a headword input screen (not shown) for inputting a headword to be searched for in accordance with the selected dictionary data (step S102).

When a headword to be searched is input on the headword input screen in response to a user operation, the CPU 21 searches the selected dictionary data for the input headword (step S103) and displays on the display unit 17 a headword explanation screen GE (see FIG. 1) showing the retrieved headword together with its meaning and content (step S104).

When text is selected on the headword explanation screen GE, for example by a user's touch operation on the screen GE, and the selected text is shown in reverse video (identification display) h (step S105 (Yes)), the CPU 21 sets the audio data corresponding to the selected text as the playback target (step S106).

On the other hand, when learning content of the listening lesson data 22c1 stored in the learning content storage unit 22c is selected in response to a user operation (step S107 (Yes)), the CPU 21 displays on the display unit 17 a list of the items contained in the selected learning content, for example the words and sentences targeted for listening practice (step S108).

When an item is selected from the displayed list, for example by a user's touch operation (step S109 (Yes)), the CPU 21 sets the audio data for the text of the word or sentence corresponding to the selected item as the playback target (step S110).

Further, when other content stored in the other content storage unit 22e is selected in response to a user operation (step S111 (Yes)), the CPU 21 displays the text data of the selected content on the display unit 17 (step S112).

When arbitrary text such as a word or sentence is selected from the displayed text data of the other content, for example by a user's touch operation (step S113 (Yes)), the CPU 21 sets the audio data corresponding to the selected text as the playback target (step S114).

When the audio data to be played has been selected in this way according to the audio selection process (S1), the CPU 21 determines, based on the playback mode data stored in the audio reproduction mode data storage unit 22i, whether the mode is the weak pronunciation listening mode or the normal reproduction mode (step S2).

If the normal reproduction mode is determined (step S2 (No)), the CPU 21 plays the audio data to be played from its beginning to its end at the normal playback speed, following the normal playback timing (step S11).

On the other hand, if the weak pronunciation listening mode is determined (step S2 (Yes)), the CPU 21 moves to the weak pronunciation element identification process (S3) (see FIG. 6), which identifies the pronunciation elements (phonemes) the user is weak at.

On entering the weak pronunciation element identification process, the CPU 21 displays on the display unit 17 an item selection screen that lets the user choose how the weak pronunciation elements are identified: specified freely by the user, identified automatically according to the user's language level, or identified automatically based on the pronunciation change idiom table (22h).

When the item for having the user freely specify the weak pronunciation elements is selected on the item selection screen (step S31 (Yes)), the CPU 21 displays on the display unit 17 a list of phonetic symbols read, for example, from English dictionary data (step S32).

When the phonetic symbols of one or more phonemes the user is weak at are selected from the list of phonetic symbols in response to a user operation (step S33 (Yes)), the CPU 21 identifies the pronunciation elements of the selected phonetic symbols as the weak pronunciation elements (step S34).

On the other hand, when the item for automatic identification according to the user's language level is selected on the item selection screen (step S35 (Yes)), the CPU 21 acquires the user's language level (beginner: 1, intermediate: 2, or advanced: 3) from the language level data storage unit 22f (step S36) and identifies, from the weak pronunciation table (22g; see FIG. 3), the pronunciation elements of the phonetic symbols associated with that language level as the weak pronunciation elements (step S37).

Further, when the item for automatic identification based on the pronunciation change idiom table (22h) is selected on the item selection screen (step S35 (No)), the CPU 21 identifies as the weak pronunciation elements the pronunciation elements of the phonetic symbols that are associated, in the pronunciation change idiom table (22h), with the text data of each pronunciation-changing phrase and that correspond to the range of text whose pronunciation changes (step S38).
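The three ways of identifying weak pronunciation elements (steps S31 to S38) can be sketched as a single dispatch. This is only an illustrative sketch: the names `Method`, `WEAK_TABLE`, `IDIOM_TABLE`, and `identify_weak_elements` are hypothetical, and the table contents are placeholders, since the actual entries of the weak pronunciation table (22g) and the pronunciation change idiom table (22h) are not reproduced in this passage.

```python
from enum import Enum


class Method(Enum):
    USER = "user"    # S31 to S34: the user picks phonetic symbols from a list
    LEVEL = "level"  # S35 (Yes) to S37: looked up from the language level
    IDIOM = "idiom"  # S35 (No), S38: taken from the idiom table


# Placeholder tables (assumptions): the real contents of 22g and 22h are not given here.
WEAK_TABLE = {1: ["θ", "ð", "l", "r"], 2: ["θ", "ð"], 3: ["θ"]}
IDIOM_TABLE = {"want to": ["n", "ə"], "got to": ["ɾ", "ə"]}


def identify_weak_elements(method, user_choice=(), level=1):
    """Return the phonetic symbols treated as weak pronunciation elements."""
    if method is Method.USER:
        return list(user_choice)
    if method is Method.LEVEL:
        return list(WEAK_TABLE[level])
    # IDIOM: collect the symbols of every registered phrase, without duplicates.
    symbols = []
    for phrase_symbols in IDIOM_TABLE.values():
        for symbol in phrase_symbols:
            if symbol not in symbols:
                symbols.append(symbol)
    return symbols
```

For instance, `identify_weak_elements(Method.LEVEL, level=2)` returns whatever the placeholder table lists for the intermediate level.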

When the pronunciation elements (phonemes) the user is weak at have been identified in this way according to the weak pronunciation element identification process (S3), the CPU 21 moves to the pronunciation timing specifying process (S4) (see FIG. 7), which specifies, within the audio data to be played that was selected in the audio selection process (S1), the pronunciation timing corresponding to the identified weak pronunciation elements.

発音タイミング特定処理に移行されると、ＣＰＵ２１は、再生対象の音声データに対応するテキストデータに発音記号が付加されているか否かを判定する（ステップＳ４１）。 When proceeding to the pronunciation timing specifying process, the CPU 21 determines whether or not a phonetic symbol is added to the text data corresponding to the audio data to be reproduced (step S41).

When the audio data to be played selected in the audio selection process (S1) is, for example, audio data selected from the dictionary data (22d) or the learning content data (22c), and it is determined that phonetic symbols are attached to the text data corresponding to the audio data (step S41 (Yes)), the CPU 21 determines whether the weak pronunciation elements identified in the weak pronunciation element identification process (S3) were identified using the pronunciation change idiom table (22h) (step S42).

If the weak pronunciation elements identified in the weak pronunciation element identification process (S3) were not identified using the pronunciation change idiom table (22h), that is, if they are pronunciation elements of phonetic symbols specified freely by the user or taken from the weak pronunciation table (22g) according to the user's language level (step S42 (No)), the CPU 21 specifies the pronunciation timing, within the audio data to be played, that corresponds to the phonetic symbols of those weak pronunciation elements, based on the positions of the phonetic symbols attached to the text data corresponding to the audio data (step S44).

If the weak pronunciation elements identified in the weak pronunciation element identification process (S3) were identified using the pronunciation change idiom table (22h) (step S42 (Yes)), the CPU 21 determines whether the text data corresponding to the audio data to be played contains a portion that matches the text data of a pronunciation change idiom (pronunciation-changing phrase) in the pronunciation change idiom table (22h), that is, whether the audio data to be played contains a portion whose pronunciation changes in the same way as the pronunciation-changing phrase (step S43).

If the text data corresponding to the audio data to be played is determined to contain a portion that matches the text data of a pronunciation change idiom (pronunciation-changing phrase) (step S43 (Yes)), the CPU 21 specifies the pronunciation timing, within the audio data to be played, that corresponds to the phonetic symbols of the weak pronunciation elements identified using the pronunciation change idiom table (22h), based on the positions of the phonetic symbols attached to the text data corresponding to the audio data (step S44).
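The text-matching check in step S43 can be illustrated with a simple substring search. This is a sketch under assumptions: the description does not state how matching is performed, so the case-insensitive search and the function name `find_idiom_matches` are hypothetical, as is the example phrase.

```python
def find_idiom_matches(text, idiom_phrases):
    """Return sorted (start, end) character ranges where text contains a phrase."""
    lowered = text.lower()
    matches = []
    for phrase in idiom_phrases:
        needle = phrase.lower()
        start = lowered.find(needle)
        while start != -1:
            matches.append((start, start + len(needle)))
            # Keep searching after this hit so repeated phrases are all found.
            start = lowered.find(needle, start + 1)
    return sorted(matches)
```

An empty result corresponds to the step S43 (No) branch, in which no section is subject to speech speed conversion.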

On the other hand, when the audio data to be played selected in the audio selection process (S1) is, for example, audio data selected from the other content data (22e), and it is determined that no phonetic symbols are attached to the text data corresponding to the audio data (step S41 (No)), the CPU 21 performs speech recognition on the audio data to be played and decomposes it, from its beginning (start time) to its end (end time), into phoneme sections, one for each phoneme it contains (step S45).

そして、ＣＰＵ２１は、苦手発音要素特定処理（Ｓ３）に従い特定されたユーザが苦手な発音要素が、発音変化イディオムテーブル（２２ｈ）を利用して特定されたものか否かを判定する（ステップＳ４６）。 Then, the CPU 21 determines whether the pronunciation element that the user is not good at, which has been identified according to the weak pronunciation element identification process (S3), is one that has been identified using the pronunciation change idiom table (22h) (step S46). .

If the weak pronunciation elements identified in the weak pronunciation element identification process (S3) were not identified using the pronunciation change idiom table (22h), that is, if they are pronunciation elements of phonetic symbols specified freely by the user or taken from the weak pronunciation table (22g) according to the user's language level (step S46 (No)), the CPU 21 determines whether the audio data to be played, decomposed into phoneme sections in step S45, contains a phoneme section that matches the audio data of a weak pronunciation element specified freely by the user or taken from the weak pronunciation table (22g) (step S49).

If the audio data to be played is determined to contain a phoneme section that matches the audio data of a weak pronunciation element specified freely by the user or taken from the weak pronunciation table (22g) (step S49 (Yes)), the CPU 21 specifies the pronunciation timing of the matching phoneme section within the audio data to be played (step S48).

If the weak pronunciation elements identified in the weak pronunciation element identification process (S3) were identified using the pronunciation change idiom table (22h) (step S46 (Yes)), the CPU 21 determines whether the audio data to be played, decomposed into phoneme sections in step S45, contains a phoneme section that matches the audio data of a weak pronunciation element identified from the pronunciation change idiom table (22h) (step S47).

再生対象の音声データの中に、発音変化イディオムテーブル（２２ｈ）から特定された苦手な発音要素の音声データに一致する音素区間があると判定されると（ステップＳ４７（Ｙｅｓ））、ＣＰＵ２１は、再生対象の音声データの中の、発音変化イディオムテーブル（２２ｈ）から特定された苦手な発音要素の音声データに一致した音素区間の発音タイミングを特定する（ステップＳ４８）。 When it is determined that there is a phoneme section in the audio data to be played that matches the audio data of the weak pronunciation element identified from the pronunciation change idiom table (22h) (step S47 (Yes)), the CPU 21 The pronunciation timing of the phoneme section that matches the audio data of the weak pronunciation element identified from the pronunciation change idiom table (22h) in the audio data to be reproduced is specified (step S48).
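Once the audio has been decomposed into phoneme sections (step S45), finding the pronunciation timing (steps S47 to S49, S48) amounts to filtering those sections by the weak phonemes. The sketch below assumes each section carries a phonetic symbol with start and end times in msec; the `Segment` type and `weak_timings` function are hypothetical names, and apart from the 0 to 100 msec timing of [θ] in "think" and the 400 msec total, the segment boundaries are invented for illustration.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    symbol: str    # phonetic symbol of the phoneme section
    start_ms: int  # start time, with the beginning of the audio at 0 msec
    end_ms: int    # end time


def weak_timings(segments, weak_symbols):
    """Return (start_ms, end_ms) for every section whose phoneme is weak."""
    weak = set(weak_symbols)
    return [(s.start_ms, s.end_ms) for s in segments if s.symbol in weak]


# "think": only the [θ] timing and the 400 msec total come from the description.
THINK = [
    Segment("θ", 0, 100),
    Segment("ɪ", 100, 200),
    Segment("ŋ", 200, 300),
    Segment("k", 300, 400),
]
```

An empty list from `weak_timings` corresponds to the (No) branches of steps S47 and S49.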

When the pronunciation timing corresponding to the pronunciation element (phoneme) the user is weak at has been specified in this way according to the pronunciation timing specifying process (S4), the CPU 21 moves to the speech speed conversion section setting method specifying process (S5) (see FIG. 8). This process specifies whether the playback section, that is, the pronunciation portion that contains the pronunciation timing of the weak pronunciation element and is played with its playback speed changed by speech speed conversion, is set per phoneme, per word, or per sentence.

On entering the speech speed conversion section setting method specifying process, the CPU 21 displays on the display unit 17 a setting method specific item selection screen that lets the user choose whether the setting method for the speech speed conversion section is specified freely by the user or determined automatically according to the user's language level.

設定方法特定項目選択画面において、ユーザが任意に設定方法を特定する項目が選択されると（ステップＳ５１（Ｙｅｓ））、ＣＰＵ２１は、話速変換して再生する再生区間を、音素単位に設定するか、単語単位に設定するか、文単位に設定するか、の設定方法の一覧を表示部１７に表示させる（ステップＳ５２）。 When the user arbitrarily selects an item for specifying a setting method on the setting method specific item selection screen (step S51 (Yes)), the CPU 21 sets a reproduction section in which speech speed is converted and played back in units of phonemes. The display section 17 displays a list of setting methods, such as setting on a per-word basis, or setting on a per-sentence basis (step S52).

When one of the setting methods (phoneme unit, word unit, or sentence unit) is selected from the list of setting methods in response to a user operation (step S53 (Yes)), the CPU 21 specifies the selected method as the setting method for the speech speed conversion section and stores it in the speech speed conversion section setting data storage unit 22j (step S54).

On the other hand, when the item for automatic determination according to the user's language level is selected on the setting method specific item selection screen (step S51 (No)), the CPU 21 acquires the user's language level (beginner: 1, intermediate: 2, or advanced: 3) from the language level data storage unit 22f (step S56).

The CPU 21 then specifies the setting method for the speech speed conversion section as <sentence unit> if the user's language level is beginner (1) (step S56 → S57a), as <word unit> if it is intermediate (2) (step S56 → S57b), and as <phoneme unit> if it is advanced (3) (step S56 → S57c), and stores it in the speech speed conversion section setting data storage unit 22j.
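The choice of setting method (steps S51 to S57c) reduces to a small lookup: an explicit user choice wins, otherwise the language level decides. The following is a minimal sketch; the names `Unit`, `LEVEL_TO_UNIT`, and `resolve_unit` are hypothetical, while the level-to-unit mapping itself follows the description.

```python
from enum import Enum


class Unit(Enum):
    PHONEME = "phoneme"
    WORD = "word"
    SENTENCE = "sentence"


# Mapping stated in the description: beginner 1, intermediate 2, advanced 3.
LEVEL_TO_UNIT = {1: Unit.SENTENCE, 2: Unit.WORD, 3: Unit.PHONEME}


def resolve_unit(user_choice=None, level=None):
    """Return the unit chosen by the user, or the one implied by the level."""
    if user_choice is not None:  # S51 (Yes) to S54
        return user_choice
    return LEVEL_TO_UNIT[level]  # S51 (No), S56 to S57a/b/c
```

The mapping widens the slowed section for less experienced users: a beginner hears the whole sentence slowed, an advanced user only the phoneme.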

Based on the setting method specified in the speech speed conversion section setting method specifying process (S5), the CPU 21 specifies the playback section, within the audio data to be played, that contains the pronunciation timing of the pronunciation element (phoneme) the user is weak at (step S6).

For example, suppose that, as shown in FIG. 9(A), the audio data to be played selected in the audio selection process (S1) is the audio data for the English word "think" selected from the learning content data (22c) or the dictionary data (22d); that the weak pronunciation element identified in the weak pronunciation element identification process (S3) is the pronunciation element (phoneme) of the phonetic symbol [θ] corresponding to the "th" of "think"; that the pronunciation timing specified in the pronunciation timing specifying process (S4) is 0 to 100 msec of the audio data "think"; and that the setting method specified in the speech speed conversion section setting method specifying process (S5) is <phoneme unit>. In that case, the CPU 21 specifies the playback section containing the weak pronunciation element (phoneme) [θ] in the audio data "think" as 0 to 100 msec, which is the <phoneme unit>, and stores it in the speech speed conversion playback section data storage unit 22k (step S6).

ＣＰＵ２１は、再生対象の音声データ“think”の音声出力部１５からの通常再生を開始すると共に（ステップＳ７）、当該音声データ“think”の再生区間が、話速変換再生区間データ記憶部２２ｋに記憶された話速変換の対象となる再生区間０～１００msecであるか否かを判定する（ステップＳ８）。 The CPU 21 starts normal reproduction of the audio data "think" to be reproduced from the audio output unit 15 (step S7), and also stores the reproduction section of the audio data "think" in the speech speed conversion reproduction section data storage section 22k. It is determined whether or not the stored reproduction section to be subjected to speech speed conversion is from 0 to 100 msec (step S8).

そして、音声データ“think”の再生区間が、話速変換の対象となる再生区間０～１００msecであると判定される状態では（ステップＳ８（Ｙｅｓ））、ＣＰＵ２１は、図９の（Ａ）（Ｂ）に示すように、音声データ“think”の“th”に対応する発音部分について、再生速度を遅く（ここでは２．７倍に遅く）切り換えて変化させると共に話速変換して再生する（ステップＳ９ａ）。 Then, in a state in which it is determined that the reproduction section of the audio data "think" is a reproduction section of 0 to 100 msec that is subject to speech speed conversion (step S8 (Yes)), the CPU 21 performs the processing as shown in (A) in FIG. As shown in B), for the pronunciation part corresponding to "th" of the audio data "think", the playback speed is changed to a slower speed (in this case, 2.7 times slower) and the speech speed is changed and played back ( Step S9a).

また、音声データ“think”の再生区間が、話速変換の対象となる再生区間０～１００msecではない再生区間１００～４００msecと判定される状態では（ステップＳ８（Ｎｏ））、ＣＰＵ２１は、図９の（Ａ）（Ｂ）に示すように、音声データ“think”の“ink”に対応する発音部分について、再生速度を通常の再生速度に切り換えて話速変換せずに再生する（ステップＳ９ｂ）。 Further, in a state in which the reproduction section of the audio data "think" is determined to be a reproduction section of 100 to 400 msec, which is not the reproduction section of 0 to 100 msec that is subject to speech speed conversion (step S8 (No)), the CPU 21 As shown in (A) and (B), the playback speed is switched to the normal playback speed and the pronunciation portion corresponding to "ink" in the audio data "think" is played back without speech speed conversion (step S9b). .

そして、再生対象の音声データ“think”（０～４００msec）の全ての再生が終了したと判定されると（ステップＳ１０（Ｙｅｓ））、ＣＰＵ２１は、一連の音声再生処理（１）を終了する。 Then, when it is determined that the reproduction of all the audio data "think" (0 to 400 msec) to be reproduced has been completed (step S10 (Yes)), the CPU 21 ends the series of audio reproduction processing (1).
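The playback loop of steps S7 to S10 can be pictured as splitting the audio into slowed and normal pieces. The sketch below is illustrative only: `build_schedule` and `played_duration_ms` are hypothetical names, and it assumes that "2.7 times slower" means the slowed section takes 2.7 times as long to play, which the description does not state explicitly.

```python
SLOWDOWN = 2.7  # factor used in the "think" example


def build_schedule(total_ms, slow_sections):
    """Split [0, total_ms) into (start, end, is_slow) pieces (the S8 decision)."""
    pieces = []
    cursor = 0
    for start, end in sorted(slow_sections):
        start, end = max(start, cursor), min(end, total_ms)
        if start >= end:
            continue  # section is empty or already covered
        if cursor < start:
            pieces.append((cursor, start, False))  # S9b: normal speed
        pieces.append((start, end, True))          # S9a: slowed
        cursor = end
    if cursor < total_ms:
        pieces.append((cursor, total_ms, False))
    return pieces


def played_duration_ms(pieces, slowdown=SLOWDOWN):
    """Total playing time when slowed pieces are stretched by the factor."""
    return sum((end - start) * (slowdown if slow else 1.0)
               for start, end, slow in pieces)
```

Under that assumption, "think" (400 msec, slowed over 0 to 100 msec) plays in about 570 msec, and audio with no slow section plays entirely at normal speed.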

As a result, the audio data "think" is played with the playback speed switched to slow, by speech speed conversion, only in the playback section of the pronunciation portion containing the pronunciation element (phoneme) of the phonetic symbol [θ] for "th", which the user is weak at. The user's listening to the audio data "think" as a whole is therefore not disturbed, and the user can practice listening to the pronunciation portion containing the weak phoneme in a natural way.

The example above covers the case where the audio data to be played is a single word and the setting method specified in the speech speed conversion section setting method specifying process (S5) is <phoneme unit>. When the audio data to be played is a sentence made up of multiple words (for example, the sentence "Where do you think she lives?", which contains the English word "think") and the setting method is specified as <word unit>, the playback section of "think" is specified, that is, the word-unit pronunciation portion of the audio data "Where do you think she lives?" that contains the weak pronunciation element (phoneme) [θ] (see FIG. 9(A)). Within the whole of the audio data "Where do you think she lives?", the playback speed is then switched to slow (for example, 2.7 times slower) by speech speed conversion in the playback section of the word "think", while the playback sections of the other words are played at the normal playback speed without speech speed conversion (steps S8, S9a, S10).

これによれば、話速変換区間の設定方法が＜音素単位＞である場合と比較して、ユーザが苦手な発音要素（音素）［θ］を含む英単語“think”の音声データを、再生対象の文に含まれる他の単語よりもユーザにより聞き取り易く再生できる。 According to this, the audio data of the English word "think" that includes the pronunciation element (phoneme) [θ] that the user is not good at is played back, compared to the case where the setting method of the speech speed conversion interval is <phoneme unit>. The words can be reproduced in a way that is easier for the user to hear than other words included in the target sentence.

Further, when the audio data to be played selected in the audio selection process (S1) is, for example, audio data of a passage made up of several consecutive sentences selected from the other content (22e), and the setting method for the speech speed conversion section is specified as <sentence unit>, the playback section of a sentence is specified, that is, the sentence-unit pronunciation portion that has a word containing the weak pronunciation element (phoneme) [θ]. Within the whole passage, the playback speed is then switched to slow (for example, 2.7 times slower) by speech speed conversion in the playback section of the sentence that has a word containing the weak pronunciation element (phoneme) [θ], while the playback sections of the other sentences are played at the normal playback speed without speech speed conversion (steps S8, S9a, S10).
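Widening the pronunciation timing to the configured unit (step S6) can be sketched as a lookup of the enclosing word or sentence span. This is a minimal sketch under assumptions: the description does not say how word and sentence boundaries are obtained, so they are passed in as precomputed spans, the function names are hypothetical, and all timings in the example are invented.

```python
def containing_span(timing, spans):
    """Return the first (start, end) span that fully contains timing, or None."""
    start, end = timing
    for span_start, span_end in spans:
        if span_start <= start and end <= span_end:
            return (span_start, span_end)
    return None


def playback_section(timing, unit, word_spans, sentence_spans):
    """Expand a phoneme timing to the phoneme, word, or sentence unit."""
    if unit == "phoneme":
        return timing
    if unit == "word":
        return containing_span(timing, word_spans)
    if unit == "sentence":
        return containing_span(timing, sentence_spans)
    raise ValueError(f"unknown unit: {unit}")


# "Where do you think she lives?" with invented boundaries in msec.
WORDS = [(0, 300), (300, 450), (450, 650), (650, 1050), (1050, 1300), (1300, 1800)]
SENTENCES = [(0, 1800)]
TH_TIMING = (650, 750)  # hypothetical timing of [θ] in "think"
```

With these values, the word unit selects the span of "think" and the sentence unit selects the whole sentence.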

According to this, compared with the cases where the setting method for the speech speed conversion section is <phoneme unit> or <word unit>, the audio data of a sentence that has a word containing the weak pronunciation element (phoneme) [θ] can be played so that it is easier for the user to hear than the other sentences in the passage to be played.
Furthermore, for a phrase formed by joining multiple words, where the sound changes in the part where the words run together, the pronunciation portion of those words can be made easier to hear.

なお、図７を参照して説明した発音タイミング特定処理（Ｓ４）において、再生対象の音声データに、ユーザが苦手な発音要素に対応する発音部分が含まれないと判定された場合（ステップＳ４３（Ｎｏ）／Ｓ４７（Ｎｏ）／Ｓ４９（Ｎｏ））には、当該音声データに話速変換の対象となる再生区間は特定されないので、同音声データはその全体の再生区間において通常の再生速度で再生される（ステップＳ５～Ｓ８（Ｎｏ），Ｓ９ｂ，Ｓ１０，終了）。 Note that in the pronunciation timing specifying process (S4) described with reference to FIG. No) / S47 (No) / S49 (No)), since the playback section that is subject to speech speed conversion is not specified in the audio data, the same audio data is played back at the normal playback speed in the entire playback section. (Steps S5 to S8 (No), S9b, S10, End).

According to the audio playback process (1) of the first embodiment of the learning support device 10 configured as described above, when audio data corresponding to, for example, English text data arbitrarily selected by the user from the dictionary data (22d), the learning content data (22c), or the like is played back and the audio playback mode (22i) is set to the weak pronunciation listening mode, the pronunciation element of a phonetic symbol that the user is weak at is specified, either by the user arbitrarily selecting it from a list of phonetic symbols or from those registered in the weak pronunciation table (22g) or the pronunciation change idiom table (22h).

Then, the pronunciation timing corresponding to the weak pronunciation element in the audio data to be played back is specified either on the basis of the position of the phonetic symbol added to the text data corresponding to the audio data, or by performing speech recognition on the audio data, decomposing it into the phoneme sections that make it up, and determining the phoneme section that matches the audio data of the weak pronunciation element. The playback section of the audio data for the pronunciation portion that includes the specified pronunciation timing of the weak pronunciation element is then specified.
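
As a rough illustration of the second route (matching phoneme sections against the weak pronunciation element), the following Python sketch assumes that speech recognition has already produced a list of `(symbol, start_ms, end_ms)` tuples. That data layout and the function name `find_pronunciation_sections` are hypothetical. The boundary values reuse the 0-100-170-270-400 msec example given for FIG. 12; assigning the symbols ɪ, ŋ and k to the later sections is an assumption made only for this sketch.

```python
def find_pronunciation_sections(phoneme_sections, weak_phoneme):
    """Return the (start_ms, end_ms) spans whose symbol equals weak_phoneme."""
    return [(start, end)
            for symbol, start, end in phoneme_sections
            if symbol == weak_phoneme]

# Hypothetical result of decomposing "think" into phoneme sections (msec).
think_sections = [("θ", 0, 100), ("ɪ", 100, 170), ("ŋ", 170, 270), ("k", 270, 400)]
targets = find_pronunciation_sections(think_sections, "θ")
```

An empty result corresponds to the case in which no playback section subject to speech speed conversion is specified.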

Then, normal playback of the audio data to be played back is started, and in the playback section of the pronunciation portion that includes the pronunciation timing of the weak pronunciation element, the playback speed is switched to a slower speed and the audio is played back with speech speed conversion.
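
The switching between normal and slowed playback can be pictured as a list of segments, each with its own speed factor. The Python sketch below is an assumption-laden illustration, not the device's implementation: the function name `playback_plan`, the tuple layout, and the requirement that target sections do not overlap are all hypothetical. When no target section is given, the whole audio data is a single normal-speed segment.

```python
def playback_plan(total_ms, target_sections, slow_factor=2.7):
    """Split [0, total_ms) into (start_ms, end_ms, factor) segments.

    target_sections are non-overlapping (start_ms, end_ms) spans that are
    slowed by slow_factor; everything else keeps factor 1.0.
    """
    plan, cursor = [], 0
    for start, end in sorted(target_sections):
        if start > cursor:
            plan.append((cursor, start, 1.0))   # normal speed before the target
        plan.append((start, end, slow_factor))  # slowed target section
        cursor = end
    if cursor < total_ms:
        plan.append((cursor, total_ms, 1.0))    # normal speed after the last target
    return plan

with_target = playback_plan(400, [(0, 100)])
without_target = playback_plan(400, [])
```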

This allows the user to practice listening to pronunciation portions that include the phonemes the user is weak at in a natural way, without hindering listening to the whole of the text selected by the user.

Further, according to the audio playback process (1) of the first embodiment of the learning support device 10, the playback section of the pronunciation portion that includes the pronunciation timing of the weak pronunciation element in the audio data to be played back is specified in accordance with the method of setting the speech speed conversion section (<phoneme unit>, <word unit>, or <sentence unit>), which is either arbitrarily selected by the user or specified according to the user's language level (22f).

Therefore, for example, when the user is an advanced language learner, only the pronunciation portion of the phoneme that is the pronunciation element the user is weak at is specified, within the audio data to be played back, as the playback section subject to speech speed conversion. When the user is, for example, an intermediate or beginner language learner, the whole word or the whole sentence is specified as the playback section subject to speech speed conversion, as the pronunciation portion that includes the weak pronunciation element (phoneme). The pronunciation portion of the audio data that includes the pronunciation element (phoneme) the user is weak at can thus be played back over a range that, according to the user's language level, is easy for that user to hear and effective for learning.
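
The choice of unit can be sketched in Python as follows. The description fixes only that an advanced user gets the phoneme unit and that an intermediate or beginner user gets a whole word or a whole sentence; the exact split used in `UNIT_BY_LEVEL`, the rule that a user's own choice takes priority, and all names in the sketch are assumptions made for illustration.

```python
# Hypothetical mapping; the description fixes only that advanced users get the
# phoneme unit and that intermediate or beginner users get a word or sentence.
UNIT_BY_LEVEL = {"advanced": "phoneme", "intermediate": "word", "beginner": "sentence"}

def conversion_unit(language_level, user_choice=None):
    """Assumed rule: a unit chosen by the user overrides the level-based default."""
    return user_choice if user_choice is not None else UNIT_BY_LEVEL[language_level]

def target_section(unit, phoneme_span, word_span, sentence_span):
    """Pick the playback section (start_ms, end_ms) that matches the unit."""
    spans = {"phoneme": phoneme_span, "word": word_span, "sentence": sentence_span}
    return spans[unit]

unit = conversion_unit("advanced")
section = target_section(unit, (0, 100), (0, 400), (0, 1500))
```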

The audio playback process (1) of the first embodiment has been described above as an example in which, when the audio data to be played back is played back, the playback section of the pronunciation portion that includes the pronunciation element (phoneme) the user is weak at is specified in the audio data, and the playback speed of the specified playback section is switched (changed) to a slower speed for playback.

The audio playback process (2) of the second embodiment is described below as an example in which the audio data of two words (which may also be idioms, set phrases, or the like), each including one of two similar pronunciation elements (phonemes), are played back so that the user can practice telling the similar phonemes apart. In the audio data of each of the two words, the playback section of the pronunciation portion that includes the similar pronunciation element (phoneme) is specified, and the specified playback section is played back with its playback speed changed to a slower speed by speech speed conversion.

(Second embodiment)
FIG. 10 is a flowchart showing the audio playback process (2) of the second embodiment of the learning support device 10.

FIG. 11 is a flowchart showing the speech speed conversion pronunciation section specifying process (A4) included in the audio playback process (2).

FIG. 12 is a diagram comparing the normal playback timing of the audio data of two words, each including one of two similar phonemes, according to the audio playback process (2), with the playback timing when the playback speed is changed and the audio is played back with speech speed conversion for the pronunciation portions of the similar phonemes.

When the audio playback process (2) is started in response to a user operation, the CPU 21 determines, based on the playback mode data stored in the audio playback mode data storage unit 22i, whether the playback mode is the similar phoneme discrimination mode (step A1).

When it is determined that the mode is the similar phoneme discrimination mode (step A1 (Yes)), the CPU 21 causes the display unit 17 to display a list of the pairs (16 pairs) of phonetic symbols of two similar phonemes that the user is weak at and that are difficult to tell apart, as described in, for example, the weak pronunciation table (22g: see FIG. 3), and has the user select the pair of phonetic symbols of two similar phonemes to be distinguished (step A2).

Here, it is assumed that the pair of phonetic symbols ([∫]:[θ]) classified under language level 1 (beginner) in the weak pronunciation table (22g) is selected as the pair of phonetic symbols of two similar phonemes to be distinguished.

The CPU 21 selects audio data that respectively include the two similar phonemes selected for discrimination (step A3).

Here, it is assumed that, based on the phonetic symbols ([∫]:[θ]) of the two similar phonemes selected in step A2, the audio data corresponding to two words (“sink” and “think”) that respectively include those phonemes are selected from the dictionary data (22d) or the learning content data (22c).

The CPU 21 then proceeds to the speech speed conversion pronunciation section specifying process (A4: see FIG. 11) and, for the audio data corresponding to the two selected words (“sink” and “think”), specifies the pronunciation timing corresponding to the similar pronunciation elements ([∫]:[θ]) in each piece of audio data, and applies speech speed conversion processing at a slower playback speed to the portion of the audio data corresponding to the specified pronunciation timing.

That is, on proceeding to the speech speed conversion pronunciation section specifying process (A4), the CPU 21 first determines whether the two pieces of audio data selected for discrimination are similar sounds, based on, for example, the degree of match between the phonetic symbols added to the text corresponding to each piece of audio data (step A41).
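
One possible reading of the "degree of match" check in step A41 is a position-by-position comparison of the two phonetic-symbol sequences. The Python sketch below is a guess at such a check, not the method of the embodiment: the names `match_degree` and `is_similar`, the threshold of 0.5, and the choice to treat sequences of different length as not matching are all assumptions. The first symbol of “sink” is written [∫] as in the description.

```python
def match_degree(symbols_a, symbols_b):
    """Fraction of positions at which the two phonetic-symbol sequences agree."""
    if not symbols_a or len(symbols_a) != len(symbols_b):
        return 0.0   # assumed: different lengths are never treated as similar
    same = sum(a == b for a, b in zip(symbols_a, symbols_b))
    return same / len(symbols_a)

def is_similar(symbols_a, symbols_b, threshold=0.5):
    """Assumed decision rule for step A41: similar when the degree reaches threshold."""
    return match_degree(symbols_a, symbols_b) >= threshold

sink = ["∫", "ɪ", "ŋ", "k"]    # first symbol as written in the description
think = ["θ", "ɪ", "ŋ", "k"]
degree = match_degree(sink, think)
```

With these inputs three of the four positions agree, so the pair would take the Yes branch of step A41 under the assumed threshold.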

When the two pieces of audio data selected for discrimination are the audio data corresponding to the two words (“sink” and “think”) and they are determined to be similar sounds based on the degree of match between the phonetic symbols corresponding to each piece of audio data (step A41 (Yes)), the CPU 21 performs speech recognition on each of the two pieces of audio data and decomposes each, as shown for example in (A1) and (B1) of FIG. 12, into the pronunciation timings (0-100-170-270-400 msec) of the phoneme sections that make up the audio from the start time of 0 msec to the end time of 400 msec (step A42).

Then, the two pieces of audio data are compared to specify the pronunciation timing (0-100 msec) corresponding to the phoneme sections of the parts that differ (the “s” [∫] part of “sink” and the “th” [θ] part of “think”), and, as shown for example in (A2) and (B2) of FIG. 12, speech speed conversion processing is applied at a slower playback speed (here, 2.7 times slower) to the portion of the audio data corresponding to the specified pronunciation timing (step A43).
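
Steps A42 and A43 can be traced with a small Python sketch that finds the differing phoneme section and recomputes the section boundaries after slowing. It assumes that "2.7 times slower" means the duration of the section is multiplied by 2.7. The function names and the symbol lists are hypothetical, and the stretched boundary values are computed from that assumption rather than read from FIG. 12.

```python
def differing_spans(symbols_a, symbols_b, boundaries_ms):
    """Return the (start_ms, end_ms) spans where the two symbol sequences differ."""
    return [(boundaries_ms[i], boundaries_ms[i + 1])
            for i, (a, b) in enumerate(zip(symbols_a, symbols_b))
            if a != b]

def stretched_boundaries(boundaries_ms, slow_spans, factor=2.7):
    """Recompute the boundaries after multiplying the slow spans' durations by factor."""
    out = [boundaries_ms[0]]
    for start, end in zip(boundaries_ms, boundaries_ms[1:]):
        scale = factor if (start, end) in slow_spans else 1.0
        out.append(out[-1] + round((end - start) * scale))
    return out

sink = ["∫", "ɪ", "ŋ", "k"]    # first symbol as written in the description
think = ["θ", "ɪ", "ŋ", "k"]
boundaries = [0, 100, 170, 270, 400]          # msec, from the FIG. 12 example

slow = differing_spans(sink, think, boundaries)
stretched = stretched_boundaries(boundaries, slow)
```

Under this assumption the 100 msec section becomes 270 msec, so the 400 msec word is played back over 570 msec in total while the other three phoneme sections keep their original durations.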

On the other hand, when it is determined that the two pieces of audio data selected for discrimination are not similar sounds (step A41 (No)), the CPU 21 specifies, in each of the two pieces of audio data, the pronunciation timing corresponding to the phonetic symbols ([∫]:[θ]) of the selected similar phonemes, and applies speech speed conversion processing at a slower playback speed to the portion of the audio data corresponding to the specified pronunciation timing (step A44).

The CPU 21 plays back, in order, the one piece of audio data “sink” and the other piece of audio data “think” out of the two pieces of audio data (“sink” and “think”) that have undergone speech speed conversion processing in accordance with the speech speed conversion pronunciation section specifying process (A4), as shown in (A2) and (B2) of FIG. 12 (steps A5, A6).

Here, the CPU 21 may play back the one piece of audio data “sink” and the other piece of audio data “think” each time they are designated in turn by the user, or may play them back automatically in sequence.

According to the audio playback process (2) of the second embodiment of the learning support device 10 configured as described above, when the audio data of two words (which may also be idioms, set phrases, or the like), each including one of two similar pronunciation elements (phonemes), are selected as the audio data to be distinguished, the playback section of the pronunciation portion that includes the similar pronunciation element (phoneme) is specified in the audio data of each of the two words, and the specified playback section is played back with its playback speed changed to a slower speed by speech speed conversion.

As a result, when audio data such as two English words, each including one of two similar pronunciation elements (phonemes) that the user has difficulty hearing, are played back, the user can easily hear the parts containing the similar pronunciation elements (phonemes) and can practice telling them apart effectively.

Note that, in the audio playback processes of the first and second embodiments by the learning support device 10 described above, when the audio data to be played back is played back, the playback speed may be switched in stages at the timing of switching to the slower playback speed for the specified playback section of the pronunciation portion that includes the pronunciation element (phoneme) that the user is weak at or that is to be distinguished, and at the timing of switching back to the original normal playback speed, so that the user does not perceive the transition as unnatural.
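
The description does not say how the stages are chosen. Purely as an illustration, the Python sketch below moves the speed factor linearly over a fixed number of steps; the function name `stepped_factors`, the linear spacing, and the step count of 3 are assumptions.

```python
def stepped_factors(from_factor, to_factor, steps):
    """Return `steps` factors that move linearly from from_factor to to_factor.

    The last value is exactly to_factor, so the section itself plays at the
    intended speed once the transition is over.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    factors = [from_factor + (to_factor - from_factor) * i / steps
               for i in range(1, steps)]
    factors.append(to_factor)
    return factors

slow_down = stepped_factors(1.0, 2.7, 3)   # entering the target section
speed_up = stepped_factors(2.7, 1.0, 3)    # returning to normal speed
```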

Further, the audio playback processes of the first and second embodiments are configured so that, by switching (changing) the playback speed to a slower speed in the specific playback section of the audio data that includes the pronunciation element (phoneme) the user is weak at or that is to be distinguished, the pronunciation portion including that pronunciation element is played back in a way that is easy for the user to hear, allowing the user to practice effectively.

Conversely, the configuration may be such that the playback speed of the specific playback section of the audio data to be played back is switched (changed) to a faster speed, so that the pronunciation portion including the pronunciation element (phoneme) the user is weak at or that is to be distinguished is played back in a way that is harder for the user to hear, enabling effective practice for, for example, users with a high language level.

Furthermore, instead of changing the playback speed of the specific playback section of the audio data to be played back, the configuration may be such that the playback volume of the specific playback section is made louder or quieter to emphasize it during playback, so that the user practices listening to the pronunciation portion including the pronunciation element (phoneme) the user is weak at or that is to be distinguished.
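
The volume-based variation could look like the following Python sketch, which scales only the samples inside the specified playback section. The function name `emphasize_section`, the normalized sample range of -1.0 to 1.0, the clipping, and the default gain of 2.0 are assumptions for illustration; a gain below 1.0 makes the section quieter instead.

```python
def emphasize_section(samples, sample_rate, section_ms, gain=2.0):
    """Scale the samples inside section_ms by gain and clip to [-1.0, 1.0]."""
    start = section_ms[0] * sample_rate // 1000   # first sample index of the section
    end = section_ms[1] * sample_rate // 1000     # one past the last sample index
    return [max(-1.0, min(1.0, s * gain)) if start <= i < end else s
            for i, s in enumerate(samples)]

samples = [0.25] * 8                        # 8 msec of audio at 1000 samples/s
louder = emphasize_section(samples, 1000, (2, 5))
```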

The methods of the processes performed by the electronic device (learning support device 10) described in each of the above embodiments, that is, the audio playback process (1) of the first embodiment shown in the flowcharts of FIGS. 4 to 8 and the audio playback process (2) of the second embodiment shown in the flowcharts of FIGS. 10 and 11, can each be stored and distributed, as a program executable by a computer, on a medium of an external recording device such as a memory card (ROM card, RAM card, etc.), a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, etc.), or a semiconductor memory. The computer (CPU) of the electronic device reads the program recorded on the medium of the external recording device into a storage device, and its operation is controlled by the read program, thereby realizing the audio playback functions described in each of the above embodiments and executing the same processing by the methods described above.

The data of the program for realizing each of the above methods can also be transmitted over a communication network (N) in the form of program code, and the audio playback functions described above can also be realized by loading the program data into the electronic device from a computer device (program server) connected to the communication network (N) and storing it in the storage device.

The present invention is not limited to the embodiments described above and can be modified in various ways at the implementation stage without departing from its gist. Further, the embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent features. For example, even if some constituent features are deleted from all the constituent features shown in each embodiment, or some constituent features are combined in a different form, a configuration from which those constituent features have been deleted or in which they have been combined can be extracted as an invention, provided that the problem described in the section on the problem to be solved by the invention can be solved and the effect described in the section on the effect of the invention can be obtained.

The inventions described in the claims as originally filed in the present application are appended below.

[Supplementary note 1]
An electronic device comprising a processor,
wherein the processor is configured to:
specify a pronunciation element to be learned;
specify, as a target section, a partial playback section that includes the specified pronunciation element within audio data to be played back; and
change, during playback of the audio data, the playback state in the specified target section relative to the playback state in the other playback sections.

[Supplementary note 2]
The electronic device according to supplementary note 1,
wherein the processor is configured to:
play back audio data corresponding to text arbitrarily designated by the user as the playback target; and
when a learning mode is set, change the playback state in the specified target section relative to the playback state in the other playback sections, and when the learning mode is not set, play back the whole of the audio data in the same playback state.

[Supplementary note 3]
The electronic device according to supplementary note 1 or 2,
wherein the processor is configured to, during playback of the audio data, either emphasize the audio played back in the specified target section over the audio played back in the other playback sections, or make the playback speed in the specified target section slower than the playback speed in the other playback sections.

[Supplementary note 4]
The electronic device according to any one of supplementary notes 1 to 3,
wherein the processor is configured to change the playback speed in the specified target section by speech speed conversion, which changes the playback speed without changing the pitch.

[Supplementary note 5]
The electronic device according to any one of supplementary notes 1 to 4,
wherein the processor is configured to execute at least one of:
a first process of, when the text to be played back includes a word, specifying as the target section the pronunciation portion of some of the pronunciation elements included in the word;
a second process of, when the text to be played back includes a sentence, specifying as the target section the pronunciation portion of some of the words included in the sentence; and
a third process of, when the text to be played back is a passage, specifying as the target section the pronunciation portion of some of the sentences included in the passage.

[Supplementary note 6]
The electronic device according to supplementary note 5,
wherein the processor has the user select one of the first process, the second process, and the third process.

[Supplementary note 7]
The electronic device according to any one of supplementary notes 1 to 6, further comprising:
a display; and
a storage,
wherein the processor is configured to specify the pronunciation element to be learned either by having the user select it from among a plurality of phonetic symbols displayed on the display, or based on data, stored in advance in the storage, of pronunciation elements that the user is weak at.

[Supplementary note 8]
The electronic device according to supplementary note 7,
wherein the storage stores text data with which audio data is associated,
the processor is configured to display the text of the text data stored in the storage on the display, and
the audio data to be played back is audio data corresponding to text arbitrarily selected by the user from the text displayed on the display.

[Supplementary note 9]
The electronic device according to supplementary note 7 or 8,
wherein the processor is configured to specify the playback section of the audio data that includes the specified weak pronunciation element as a playback section in phoneme units, word units, or sentence units that includes the phoneme serving as the weak pronunciation element.

[Supplementary note 10]
The electronic device according to supplementary note 9,
wherein the processor specifies the playback section in phoneme units, word units, or sentence units that includes the phoneme serving as the weak pronunciation element either by having the user select from selection items for phoneme units, word units, and sentence units displayed on the display, or according to data on the user's language level stored in the storage.

[Supplementary note 11]
The electronic device according to supplementary note 7 or 8,
wherein the data of weak pronunciation elements stored in advance in the storage is data of the pronunciation element of the part where the pronunciation changes in a pronunciation-change phrase, that is, a phrase formed by connecting a plurality of words whose pronunciation changes compared with when the words are pronounced individually.

[Supplementary note 12]
The electronic device according to any one of supplementary notes 1 to 11,
wherein the processor is configured to change the playback speed of the audio data in the specified playback section so that it is slower than the playback speed outside the specified playback section.

[Supplementary note 13]
The electronic device according to any one of supplementary notes 1 to 12,
wherein the processor is configured to:
specify two pronunciation elements to be the target of discrimination practice;
specify, when playing back the audio data of two words that respectively include the two specified pronunciation elements, the playback section that includes the corresponding pronunciation element in each of the two pieces of audio data; and
change the playback speed in each of the specified playback sections of the two pieces of audio data.

[Supplementary note 14]
An audio playback method in which a processor of an electronic device:
specifies a pronunciation element to be learned;
specifies, as a target section, a partial playback section that includes the specified pronunciation element within audio data to be played back; and
changes, during playback of the audio data, the playback state in the specified target section relative to the playback state in the other playback sections.

[Supplementary note 15]
A program for causing a processor of an electronic device to function so as to:
specify a pronunciation element to be learned;
specify, as a target section, a partial playback section that includes the specified pronunciation element within audio data to be played back; and
change, during playback of the audio data, the playback state in the specified target section relative to the playback state in the other playback sections.

10 ... Learning support device (electronic device)
14 ... Key input unit (keyboard)
14S ... [Audio] key
BS ... [Audio] touch key
15 ... Audio output unit
15S ... Main body speaker
17 ... Touch panel display unit (display)
21 ... CPU (processor)
22 ... Storage unit (storage)
22a ... Learning support processing program
22b ... Audio playback processing program
22c ... Learning content storage unit
22d ... Dictionary data storage unit
22e ... Other content storage unit
22f ... Language level data storage unit
22g ... Weak pronunciation table storage unit
22h ... Pronunciation change idiom table storage unit
22i ... Audio playback mode data storage unit
22j ... Speech speed conversion section setting data storage unit
22k ... Speech speed conversion playback section data storage unit
23 ... External recording medium
24 ... Recording medium reading unit
25 ... Communication unit
27 ... Earphone microphone
30 ... Web server (program server)
N ... Communication network (Internet)

Claims

determining means for determining whether or not a word to be learned includes a phoneme that is pronounced as a predetermined pronunciation;
When the determining means determines that the word includes a phoneme that produces the predetermined pronunciation, the playback speed at which the word is audibly reproduced is set to a reproduction speed of the phoneme that produces the predetermined pronunciation. a setting means for setting the playback speed to be slower than the playback speed of other phonemes;
A learning support device comprising:

selection means for selecting, as learning targets, a first word that includes a phoneme that is one of a pair of pre-correlated pronunciations and a second word that includes a phoneme that is the other pronunciation;
The playback speed at which the first word and the second word selected by the selection means are reproduced aloud is determined by determining the playback speed for the phoneme that is the one pronunciation and the playback speed for the phoneme that is the other pronunciation. a setting means for setting the playback speed of the phoneme so that the playback speed of the phoneme is slower than the playback speed of other phonemes;
Equipped with
The selection means selects the first word and the first word as the learning targets so that the pronunciations of phonemes in corresponding positions match each other, except for the phoneme that is pronounced in the one direction or the phoneme in the other direction. Select 2 words,
A learning support device characterized by:

comprising a reproduction means for audibly reproducing the word at a reproduction speed set by the setting means,
The playback means changes the playback speed by speech speed conversion that changes the playback speed without changing the pitch.
The learning support device according to claim 1 or 2, characterized in that:

A learning support method executed by a learning support device, the method comprising:
a determination process of determining whether or not a word to be learned includes a phoneme having a predetermined pronunciation; and
a setting process of setting, when the determination process determines that the word includes a phoneme having the predetermined pronunciation, the playback speed at which the word is reproduced as audio such that the playback speed for the phoneme having the predetermined pronunciation is slower than the playback speed for the other phonemes.
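
The determination and setting processes of this method can be sketched as follows. This is a minimal illustration rather than the claimed implementation: representing a word as a list of phoneme strings, the set `TARGET_PHONEMES`, the rate values 1.0 and 0.5, and both function names are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical set of phonemes having the "predetermined pronunciation".
TARGET_PHONEMES = {"l", "r"}

NORMAL_RATE = 1.0  # assumed playback rate for the other phonemes
SLOW_RATE = 0.5    # assumed slower rate for the target phonemes


def contains_target(phonemes: List[str]) -> bool:
    """Determination process: does the word include a target phoneme?"""
    return any(p in TARGET_PHONEMES for p in phonemes)


def set_playback_rates(phonemes: List[str]) -> List[Tuple[str, float]]:
    """Setting process: assign a playback rate to each phoneme of the word."""
    if not contains_target(phonemes):
        # No target phoneme: the whole word plays at the normal rate.
        return [(p, NORMAL_RATE) for p in phonemes]
    return [(p, SLOW_RATE if p in TARGET_PHONEMES else NORMAL_RATE)
            for p in phonemes]


# Rough phoneme sequence for "light" (illustrative transcription only).
rates = set_playback_rates(["l", "ai", "t"])
```

Here `rates` pairs the phoneme "l" with the slower rate and leaves "ai" and "t" at the normal rate.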

A learning support method executed by a learning support device, the method comprising:
a selection process of selecting, as learning targets, a first word that includes a phoneme having one pronunciation of a pre-associated pair of pronunciations and a second word that includes a phoneme having the other pronunciation of the pair; and
a setting process of setting the playback speed at which the first word and the second word selected by the selection process are reproduced as audio such that the playback speed for the phoneme having the one pronunciation and the playback speed for the phoneme having the other pronunciation are slower than the playback speed for the other phonemes,
wherein the selection process selects, as the learning targets, the first word and the second word such that the pronunciations of the phonemes at corresponding positions match each other, except for the phoneme having the one pronunciation and the phoneme having the other pronunciation.
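
The selection process, which picks two words whose phonemes match at every position except where the paired pronunciations occur, can be sketched as below. The lexicon contents, the rough transcriptions, and the names `is_matching_pair` and `select_pairs` are hypothetical and serve only to illustrate the matching condition.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: word -> rough phoneme sequence (illustration only).
LEXICON: Dict[str, List[str]] = {
    "light": ["l", "ai", "t"],
    "right": ["r", "ai", "t"],
    "lead": ["l", "i:", "d"],
    "read": ["r", "i:", "d"],
    "late": ["l", "ei", "t"],
}


def is_matching_pair(a: List[str], b: List[str], one: str, other: str) -> bool:
    """True if a and b match position by position, except where a has
    `one` and b has `other`, and they differ in at least one such place."""
    if len(a) != len(b):
        return False
    differs = False
    for pa, pb in zip(a, b):
        if pa == pb:
            continue
        if pa == one and pb == other:
            differs = True
        else:
            return False  # mismatch outside the paired pronunciations
    return differs


def select_pairs(lexicon: Dict[str, List[str]],
                 one: str, other: str) -> List[Tuple[str, str]]:
    """Return (first word, second word) candidates as learning targets."""
    pairs = []
    for first, pa in lexicon.items():
        for second, pb in lexicon.items():
            if is_matching_pair(pa, pb, one, other):
                pairs.append((first, second))
    return sorted(pairs)


pairs = select_pairs(LEXICON, "l", "r")
```

With this lexicon, "late" is not selected because no word matches it at the remaining positions.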

A program causing a computer to function as:
determining means for determining whether or not a word to be learned includes a phoneme having a predetermined pronunciation; and
setting means for setting, when the determining means determines that the word includes a phoneme having the predetermined pronunciation, the playback speed at which the word is reproduced as audio such that the playback speed for the phoneme having the predetermined pronunciation is slower than the playback speed for the other phonemes.

A program causing a computer to function as:
selection means for selecting, as learning targets, a first word that includes a phoneme having one pronunciation of a pre-associated pair of pronunciations and a second word that includes a phoneme having the other pronunciation of the pair; and
setting means for setting the playback speed at which the first word and the second word selected by the selection means are reproduced as audio such that the playback speed for the phoneme having the one pronunciation and the playback speed for the phoneme having the other pronunciation are slower than the playback speed for the other phonemes,
wherein the selection means selects, as the learning targets, the first word and the second word such that the pronunciations of the phonemes at corresponding positions match each other, except for the phoneme having the one pronunciation and the phoneme having the other pronunciation.