JP5706379B2

JP5706379B2 - Feature detection apparatus, feature detection method and program thereof

Info

Publication number: JP5706379B2
Application number: JP2012192937A
Authority: JP
Inventors: Yasuhiro Minami; Tetsuo Kobayashi; Hiroaki Sugiyama
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-09-03
Filing date: 2012-09-03
Publication date: 2015-04-22
Anticipated expiration: 2032-09-03
Also published as: JP2014049006A

Description

The present invention relates to a technique for detecting features of a vocabulary learning curve.

Human language development can provide important scientific knowledge and insight into the question of what it means to be human, yet many issues concerning language development remain unresolved. As a result, there has been almost no progress in measurement technology for language development or in related commercial services. In particular, among basic abilities such as speech perception, vocabulary acquisition, and grammatical operation, science and technology related to vocabulary acquisition has made little progress. However, given the need for education that gently supports healthy development, and for early detection of and support for developmental disorders including language delay, technical development related to vocabulary acquisition is considered highly significant.

One of the most characteristic features of infant language development, and one of the most important for capturing individual differences, is the vocabulary learning curve: a graph showing the relationship between the age in days at which an infant began to speak each new word and the cumulative number of words the infant had come to speak by that age. The features of this curve cannot be discussed without the phenomenon known as the vocabulary explosion (or vocabulary spurt). The vocabulary explosion, which developmental psychologists have studied since the mid-twentieth century, refers to a rapid change in vocabulary learning speed said to occur in the latter half of age one. Typically, infants utter their first words around their first birthday and then learn words at a very slow rate for some time. After about one and a half years of age, however, they suddenly begin to produce many words, and this dramatic change is called an "explosion" or "spurt". Because the change is dramatic enough that many parents notice it, the vocabulary explosion is well known not only in psychology but also in the childcare industry. For this reason, it has been considered that the vocabulary explosion must be modeled in order to quantify a child's vocabulary development (learning).

Conventionally, developmental psychology has confirmed the vocabulary explosion in multiple languages using large-scale group data collected with vocabulary checklists (questionnaire surveys answered by parents). When the group mean is plotted for each age in months, the result is a gradually rising quadratic curve whose inflection point appears at around 18 to 20 months. From such group data, the vocabulary explosion is regarded as a general phenomenon seen in many children.

As ways of examining the features of the vocabulary learning curve, the following six methods of estimating the vocabulary explosion time have been proposed, from the viewpoint of when the vocabulary explosion occurs in each individual and how the vocabulary explosion time (the time at which the vocabulary explosion starts) is detected and estimated.

(1) A visual method, in which a graph is drawn without any particular calculation and the vocabulary explosion time is judged by eye.

(2) A method that approximates the vocabulary learning curve with a gradually rising quadratic curve and takes the inflection point appearing at around 18 to 20 months as the vocabulary explosion time.

(3) A 50-word criterion method, which defines the vocabulary explosion time as the point at which 50 words have been learned.

(4) A specific-period criterion method, which takes as the vocabulary explosion time the point at which an achievement criterion (for example, 30 words or more) is met within a specific period (for example, three weeks).

(5) A logistic regression approximation method, in which the velocity component of the vocabulary acquisition data along the time axis is fitted with a logistic regression equation and its inflection point is taken as the vocabulary explosion time (see Non-Patent Document 1).

(6) A method that approximates the vocabulary acquisition line before and after the vocabulary explosion with two straight lines so that the sum of the errors is minimized, and takes their intersection as the vocabulary explosion time (see Non-Patent Document 2).
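Methods (3) and (4) are simple enough to sketch in a few lines. The following is a minimal illustration, not the implementation used in the cited literature; the function names, the inclusive window convention, and the default values (taken from the examples in the text) are assumptions.

```python
from bisect import bisect_left


def fifty_word_age(ages, target=50):
    """Method (3): age in days at which the cumulative word count reaches `target`.

    `ages` holds one entry per word: the age in days at which it was first spoken.
    Returns None if fewer than `target` words have been recorded.
    """
    ages = sorted(ages)
    return ages[target - 1] if len(ages) >= target else None


def period_criterion_age(ages, window_days=21, min_words=30):
    """Method (4): first age in days at which at least `min_words` new words
    fall inside a window of `window_days` days ending on that day (inclusive).

    Returns None if the criterion is never met.
    """
    ages = sorted(ages)
    for j, day in enumerate(ages):
        # Index of the first word whose age lies inside the window.
        first = bisect_left(ages, day - window_days + 1)
        if j - first + 1 >= min_words:
            return day
    return None
```

Both functions return a single age per child, which illustrates why these criteria reduce an entire curve to one point.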

Until now, the features of the vocabulary learning curve have been grasped by determining the vocabulary explosion time with these methods and then either obtaining the vocabulary learning speed before and after that time or using the approximate curve that was fitted to determine it.

[Non-Patent Document 1] Ganger, J., Brent, M. R., "Reexamining the vocabulary spurt", Developmental Psychology, 2004, Vol.40, No.4, p.621-632.
[Non-Patent Document 2] Yasuhiro Minami, Tetsuo Kobayashi, Hiroaki Sugiyama, "Estimation of the Start of Vocabulary Explosion by Line Approximation", IEICE Transactions, March 2012.

Methods (1) to (4) average the data of many infants, obtain a single vocabulary explosion time, and treat it as the feature of the vocabulary learning curve. Because the data are averaged, features of the vocabulary learning curve specific to each infant cannot be obtained.

Methods (5) and (6) use per-individual data and can therefore capture features specific to each infant in more detail. However, these methods also assume either a single vocabulary explosion time or no vocabulary explosion at all, so they cannot capture finer features of the vocabulary learning curve. The reason is explained using the actual vocabulary learning curves of individual infants. FIG. 1 shows the vocabulary learning curve of infant 1, and FIG. 2 that of infant 2. In these figures, the horizontal axis indicates the infant's age in days and the vertical axis the cumulative number of learned words. A complex vocabulary learning curve that exhibits an S-shaped curve, as in the part enclosed by the broken line in FIG. 1, or a large discontinuity, as in the part enclosed by the broken line in FIG. 2, is difficult to model with a single vocabulary explosion using methods (5) and (6).

The present invention therefore obtains the features of the vocabulary learning curve from a viewpoint different from the conventional methods. An object of the present invention is to provide a feature detection apparatus, a feature detection method, and a program therefor that detect, without assuming a vocabulary explosion, a feature of the vocabulary learning curve that differs from the vocabulary explosion.

To solve the above problem, according to a first aspect of the present invention, a feature detection apparatus includes: a first plateau detection unit that, based on a vocabulary learning curve showing the relationship between the age in days at which an infant began to speak each new word and the cumulative number of words the infant had come to speak by that age, detects a portion where the interval between the ages at which new words began to be spoken is greater than a threshold; and a second plateau detection unit that smooths the vocabulary learning curve and detects a portion where the vocabulary learning speed of the smoothed vocabulary learning curve is at a local minimum.

To solve the above problem, according to a second aspect of the present invention, a feature detection method in a feature detection apparatus including a first plateau detection unit and a second plateau detection unit includes: a first plateau detection step in which the first plateau detection unit, based on a vocabulary learning curve showing the relationship between the age in days at which an infant began to speak each new word and the cumulative number of words the infant had come to speak by that age, detects a portion where the interval between the ages at which new words began to be spoken is greater than a threshold; and a second plateau detection step in which the second plateau detection unit smooths the vocabulary learning curve and detects a portion where the vocabulary learning speed of the smoothed vocabulary learning curve is at a local minimum.

The present invention has the effect of being able to detect plateaus, a feature of the vocabulary learning curve that differs from the vocabulary explosion.

FIG. 1 is a diagram showing the vocabulary learning curve of infant 1.
FIG. 2 is a diagram showing the vocabulary learning curve of infant 2.
FIG. 3 is a diagram showing the positions of plateaus in the vocabulary learning curve of infant 1.
FIG. 4 is a diagram showing the positions of plateaus in the vocabulary learning curve of infant 2.
FIG. 5 is a functional block diagram of the feature detection apparatus according to the first embodiment.
FIG. 6 is a diagram showing the processing flow of the feature detection apparatus according to the first embodiment.
FIG. 7 is a diagram showing the processing flow of the first plateau detection unit 110.
FIG. 8 is a diagram showing the vocabulary learning curve of infant 1 with the cumulative number of words i on the horizontal axis.
FIG. 9 is a diagram showing the series of the number of days needed to learn one word, obtained when smoothing the vocabulary learning curve of infant 1.
FIG. 10 is a diagram showing the portions where the number of days needed to learn one word is at a local maximum for infant 1.
FIG. 11 is a diagram showing both the vocabulary learning curve of infant 1 and the portions where the number of days needed to learn one word is at a local maximum.
FIG. 12 is a diagram showing the vocabulary learning curve of infant 2 with the cumulative number of words i on the horizontal axis.
FIG. 13 is a diagram showing the series of the number of days needed to learn one word, obtained when smoothing the vocabulary learning curve of infant 2.
FIG. 14 is a diagram showing the portions where the number of days needed to learn one word is at a local maximum for infant 2.
FIG. 15 is a diagram showing both the vocabulary learning curve of infant 2 and the portions where the number of days needed to learn one word is at a local maximum.
FIG. 16 is a diagram showing the processing flow of the merging unit.
FIG. 17 is a diagram showing the plateau positions obtained by merging FIG. 3 and FIG. 11.
FIG. 18 is a diagram showing the plateau positions obtained by merging FIG. 4 and FIG. 15.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and redundant description is omitted.

<Summary of invention>
First, the inventors discovered that the vocabulary explosion is not a simple phenomenon in which vocabulary increases rapidly at one particular time, but one in which several steep vocabulary increases overlap (see Reference 1).
[Reference 1] Y. Minami, H. Sugiyama, T. Kobayashi, "Multiple Vocabulary Spurts in Japanese Children", 12th International Congress for the Study of Child Language (IASCL2011), 2011

Furthermore, on closer observation of vocabulary learning curves, the inventors found that these increases can be expressed by a model that inserts, into a simple curve (a straight line or a quadratic curve), multiple sections during which no new word is spoken for several days or more. Such a section (portion) was named a plateau (a technical term in learning psychology, from the word for "plain"). Indeed, FIGS. 1 and 2 show that plateaus exist (some are shown in the parts enclosed by solid lines). Because this modeling assumes that the underlying vocabulary learning curve is a very simple curve (a straight line or a quadratic curve), these plateaus are key feature quantities for understanding the vocabulary explosion and can be regarded as features of the vocabulary learning curve.

A simple way to find plateau positions is to look in the vocabulary learning curve for periods longer than a threshold $p$ (for example, $p = 6$) during which not a single word is learned. Let $y_i$ be the age in days at which the $i$-th word was learned. When $y_i - y_{i-1} > p$, the period from $y_{i-1}$ to $y_i - 1$ is a plateau, with $y_{i-1}$ as its start position and $y_i - 1$ as its end position. In other words, a plateau is the period from the day the cumulative number reaches $i-1$ to the day before it reaches $i$. With this method, plateaus can be detected easily, as shown in FIGS. 3 and 4. FIGS. 3 and 4 show FIGS. 1 and 2, respectively, with the y-axis and x-axis swapped, so that the y-axis is the age in days and the x-axis is the cumulative number of words. In the figures, the positions marked with □ correspond to plateau end positions. More precisely, the day before the position marked with □ is the end position of the plateau.
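The threshold rule can be illustrated with a small worked example. This is a minimal sketch: the function name `plateau_intervals` and the sample ages are made up for illustration, while the rule itself ($y_i - y_{i-1} > p$, start at $y_{i-1}$, end at $y_i - 1$) follows the definition in the text.

```python
def plateau_intervals(ages, p=6):
    """Return (start_age, end_age) pairs, in days, for every gap longer than p.

    ages[k] is the age in days at which word k+1 was first spoken (ascending).
    A plateau starts on the day of the earlier word and ends on the day
    before the next new word.
    """
    intervals = []
    for prev, cur in zip(ages, ages[1:]):
        if cur - prev > p:
            intervals.append((prev, cur - 1))
    return intervals


# Hypothetical ages for five words: gaps of 5, 10, 2 and 13 days.
sample_ages = [365, 370, 380, 382, 395]
print(plateau_intervals(sample_ages))  # [(370, 379), (382, 394)]
```

Only the gaps of 10 and 13 days exceed $p = 6$, so two plateaus are reported.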

As in the part enclosed by the solid line in FIG. 4, there are sections (portions) where the human eye can see that the vocabulary learning curve is stepped, but which are not detected as plateaus because the interval is smaller than the threshold $p$. If such sections are called plateau candidates, a considerable number of them can be observed in a vocabulary learning curve.

These plateau candidates fall outside the plateau definition given above. However, in the sense that the vocabulary learning curve changes steeply and the vocabulary learning speed drops sharply, they are considered to be points that carry features of the vocabulary learning curve. The definition of a plateau is therefore extended to include sections (portions) where the vocabulary learning speed is at a local minimum.

That is, a plateau is defined as (i) a section (portion) in which no new word is spoken for longer than a predetermined period (threshold), or (ii) a section (portion) in which the vocabulary learning speed is at a local minimum. When the vocabulary learning speed is at a local minimum, (a) the slope of the vocabulary learning curve becomes steep if the vertical axis is the age in days and the horizontal axis is the cumulative number of words, and (b) the slope becomes gentle if the vertical axis is the cumulative number of words and the horizontal axis is the age in days.

The embodiment describes an apparatus that detects these plateaus.

<First embodiment>
FIG. 5 is a functional block diagram of the feature detection apparatus 100 according to the first embodiment, and FIG. 6 shows its processing flow.

The feature detection apparatus 100 includes a first plateau detection unit 110, a second plateau detection unit 120, and a merging unit 130. The second plateau detection unit 120 includes a smoothing unit 121 and a detection unit 123.

The feature detection apparatus 100 receives a data set $\{(1, y_1), (2, y_2), \ldots, (i, y_i), \ldots, (I, y_I)\}$ consisting of the age in days $y_i$ at which the infant began to speak a new word and the cumulative number $i$ of words the infant had come to speak by that age, and obtains and outputs a set of plateau positions $\{r_1, r_2, \ldots, r_k, \ldots, r_K\}$, where $K$ is the number of plateaus contained in the vocabulary learning curve. Because the data set represents a per-infant vocabulary learning curve such as those shown in FIGS. 1 to 4, it is also simply called the vocabulary learning curve. Equivalently, $i$ is the order in which the infant newly learned a word and $y_i$ is the age in days at which the $i$-th word was spoken. An outline of the embodiment follows.

The first plateau detection unit 110 detects (i) sections (portions) in which no new word is spoken for longer than a predetermined period (threshold).

The second plateau detection unit 120 detects (ii) sections (portions) in which the vocabulary learning speed (the derivative of the vocabulary learning curve) is at a local minimum. Because the observed values contain randomness, computing local minima of the derivative directly from the vocabulary learning curve would produce many spurious minimum points. In this embodiment, therefore, the vocabulary learning curve is smoothed with a Kalman filter known as a trend model before the derivative is calculated. For this purpose the vertical and horizontal axes of FIGS. 1 and 2 are swapped, so that, as in FIGS. 3 and 4, the horizontal axis is the cumulative number and the vertical axis is the age in days. This is done to satisfy the requirement that the time steps of the Kalman filter be evenly spaced. As a consequence, to find plateaus, instead of differentiating the data series of the cumulative number $i$ with respect to the age $y_i$ and detecting local minima, it is necessary to differentiate the data series of the age $y_i$ with respect to $i$ and detect local maxima.
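The switch from local minima to local maxima follows from the inverse-function rule. The relation can be written out as follows, assuming the smoothed curve is strictly increasing and differentiable; the symbols $n$ (cumulative number of words, treated as continuous) and $t$ (age in days) are introduced here only for illustration.

```latex
\[
\frac{dn}{dt} = \left( \frac{dt}{dn} \right)^{-1},
\qquad \frac{dt}{dn} > 0 .
\]
% Because u \mapsto 1/u is strictly decreasing for u > 0,
\[
\frac{dn}{dt} \ \text{has a local minimum}
\iff
\frac{dt}{dn} \ \text{has a local maximum at the corresponding point.}
\]
```

Here $dn/dt$ is the vocabulary learning speed and $dt/dn$ is the number of days needed per word, so a plateau appears as a peak in days per word.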

The merging unit 130 merges the plateaus detected by the first plateau detection unit 110 and the second plateau detection unit 120 and outputs the result.

From the inventors' observation of vocabulary learning curves to date, it is known that plateau intervals tend to become shorter as age increases. With the above configuration, the first plateau detection unit 110 can detect plateaus in the early stage of vocabulary learning, and the second plateau detection unit 120 can detect plateaus from the middle stage onward. Details are described below.

(About collecting data sets)
To estimate plateaus in an infant's vocabulary learning curve, the first question is what data to refer to. If all of an infant's utterances could be recorded on electronic media such as a digital video recorder, analyzing those recordings would be the most accurate method. However, the cost of acquiring such data is enormous, and no engineering technology yet exists that can automatically recognize an infant's ambiguous speech and analyze it at the word level, so this is very difficult to realize. Another approach is to have parents answer a questionnaire at fixed intervals (for example, once every three months) and track the change in the number of words newly spoken by the infant. In this case, if the interval is long, it is difficult to determine the exact timing of the vocabulary explosion, and if the interval is short, the burden on the respondents (the infant's parents) increases. In practice, therefore, a method is desirable that reduces the burden on the parents who record the data while still allowing data to be acquired at fine time points.

Therefore, this embodiment applies data acquisition based on the web diary method. In this method, when an infant newly learns (speaks) a word, a specific website is accessed over a network from a mobile phone or personal computer, and the words the infant has learned are recorded together with a diary entry for that day (see References 2 and 3).
[Reference 2] Tetsuo Kobayashi, Masaaki Nagata, "Infant Language Development Research Using the Web: A Trial of Large Scale Longitudinal Data Collection", Proc. 15th Annual Conference of the Language Processing Society, 2009, p. 534-537
[Reference 3] Tetsuo Kobayashi, Masaaki Nagata, "Reliability Verification of Infant Vocabulary Development Data Collected on the Web", Proc. 16th Annual Conference of the Language Processing Society, 2010, p. 403-406
A strength of this method is that its validity has been scientifically verified.

Another advantage of acquiring data this way is that, while recording is relatively easy for parents, the age in days at which the infant learned a new word can be calculated from the difference between the recording date (the date on which the infant learned the new word) and the infant's date of birth.

The calculated acquisition ages of the words are sorted in ascending order and assigned the integer sequence 1, 2, ... starting from the smallest, which gives the cumulative number of words $i$. Here $i$ represents both the cumulative number of words and the order in which the corresponding word was learned. This produces a data set of pairs of age in days and cumulative number of words.
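The construction of the data set from diary records can be sketched as follows. This is a minimal illustration, assuming each record is reduced to the calendar date on which a new word was noted; the function name `build_dataset` and the sample dates are hypothetical.

```python
from datetime import date


def build_dataset(birth_date, record_dates):
    """Build [(1, y_1), (2, y_2), ...] from a birth date and recording dates.

    y_i is the age in days (recording date minus birth date) of the i-th
    word after sorting the ages in ascending order.
    """
    ages = sorted((d - birth_date).days for d in record_dates)
    return [(i, y) for i, y in enumerate(ages, start=1)]


# Hypothetical diary: three words, two of them recorded on the same day.
birth = date(2010, 1, 1)
records = [date(2011, 1, 10), date(2011, 1, 1), date(2011, 1, 10)]
print(build_dataset(birth, records))  # [(1, 365), (2, 374), (3, 374)]
```

Words recorded on the same day receive consecutive cumulative numbers with the same age, so the interval between them is zero.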

This embodiment assumes that the data set $\{(1, y_1), (2, y_2), \ldots, (i, y_i), \ldots, (I, y_I)\}$ of cumulative word numbers $i$ and ages in days $y_i$ is given directly as input.

<First Plateau Detection Unit 110>
The first plateau detection unit 110 receives the data set $\{(1, y_1), (2, y_2), \ldots, (i, y_i), \ldots, (I, y_I)\}$, detects, based on the ages $y_i$ at which the infant began to speak new words, the portions where the interval $y_i - y_{i-1}$ between those ages is greater than the threshold $p$ (s110), and outputs the set of cumulative word numbers $i$ corresponding to the end positions of those portions to the merging unit 130 as the set of plateau positions $\{r_{1,1}, r_{1,2}, \ldots, r_{1,v}, \ldots, r_{1,V}\}$. When the interval $y_i - y_{i-1}$ is greater than the threshold $p$, the cumulative number at the start position of the plateau is $i-1$ and the cumulative number corresponding to the end position is $i$. Note that $i$ is not the cumulative number at the end position of the plateau but the cumulative number corresponding to it: the plateau ends on the day before the cumulative number of words becomes $i$ (expressed as an age in days, $y_i - 1$).

FIG. 7 shows an example of the processing flow of the first plateau detection unit 110.

First, initialization is performed (s111). Next, it is determined whether the interval $y_i - y_{i-1}$ between the ages at which new words began to be spoken is greater than the threshold $p$ (s112). If it is, the cumulative number of words $i$ is assigned to the plateau position $r_{1,v}$ (s113) and $v$ is incremented (s114).

Steps s112 to s114 are performed for all cumulative numbers $i$ (s115, s116), after which the number of plateaus detected by the first plateau detection unit 110, $v-1$, is assigned to $V$ (s117).
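The flow of steps s111 to s117 could be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the function name `detect_first_plateaus` is hypothetical, the loop is assumed to start at $i = 2$ (the first index with a predecessor), and a Python list replaces the explicit counter $v$, so $V$ is simply the list length.

```python
def detect_first_plateaus(y, p=6):
    """Return (r1, V): plateau positions as 1-based cumulative numbers, and their count.

    y[0] holds y_1, y[1] holds y_2, and so on (ages in days, ascending).
    """
    r1 = []                              # s111: initialization
    for i in range(2, len(y) + 1):       # s115, s116: loop over cumulative numbers
        if y[i - 1] - y[i - 2] > p:      # s112: y_i - y_{i-1} > p ?
            r1.append(i)                 # s113, s114: store i as r_{1,v}, advance v
    return r1, len(r1)                   # s117: V = number of detected plateaus


# Hypothetical ages for five words; the gaps before words 3 and 5 exceed p = 6.
print(detect_first_plateaus([365, 370, 380, 382, 395]))  # ([3, 5], 2)
```

Each returned value is the cumulative number corresponding to a plateau end position, matching the convention that the plateau itself ends the day before word $i$ is spoken.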

For example, the cumulative numbers of words $i$ corresponding to the plateau positions detected by the first plateau detection unit 110 are those corresponding to the positions marked with □ in FIGS. 3 and 4.

<Second Plateau Detection Unit 120>
The second plateau detection unit 120 includes a smoothing unit 121 and a detection unit 123. The second plateau detection unit 120 smooths the vocabulary learning curve, detects the portions where the vocabulary learning speed of the smoothed vocabulary learning curve is at a local minimum, and outputs the cumulative numbers of words $i$ corresponding to those portions to the merging unit 130 as plateau positions $r_{2,v}$. The processing of the smoothing unit 121 and the detection unit 123 is described below.

(Smoothing unit 121)
The smoothing unit 121 receives the data set $\{(1, y_1), (2, y_2), \ldots, (i, y_i), \ldots, (I, y_I)\}$ and, using a filter that removes the influence of noise, smooths the vocabulary learning curve represented by the data set (s121).

For example, a Kalman smoother represented by the following state equation is used as the filter.

Here, v_{1,i}, v_{2,i}, and w_i are all variables drawn from Gaussian distributions with mean 0 and variances 0.05, 0.05, and 0.1, respectively. Running the Kalman smoother with this state equation smooths the age data series (the vocabulary learning curve) {y_1, y_2, ..., y_i, ..., y_I} and yields a smooth age data series (vocabulary learning curve) {x_1, x_2, ..., x_i, ..., x_I}.

With this Kalman smoother, the number of days needed to learn one word is assigned to δx_i. This value δx_i corresponds to the derivative of the vocabulary learning curve. The series of days needed to learn one word, {δx_1, δx_2, ..., δx_i, ..., δx_I}, can therefore be obtained directly, without differentiating the resulting vocabulary learning curve {x_1, x_2, ..., x_i, ..., x_I}.

However, instead of using δx_i, the smooth vocabulary learning curve {x_1, x_2, ..., x_i, ..., x_I} may be differentiated to obtain the derivative. As noted above, the derivative corresponds to the number of days δx_i needed to learn one word. Alternatively, the difference in days (δx_i = x_i − x_{i−1}), that is, the number of days needed to utter one more word, may be computed from the smooth vocabulary learning curve.

The smoothing unit 121 outputs to the detection unit 123 the data set {(1, δx_1), (2, δx_2), ..., (i, δx_i), ..., (I, δx_I)}, which pairs each cumulative word count i with the number of days δx_i needed to learn one word, obtained by any of the above methods.
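The state equation itself is not reproduced in this text, so the following is only a sketch under an assumed local linear trend model that is consistent with the variable definitions above: x_i = x_{i−1} + δx_{i−1} + v_{1,i}, δx_i = δx_{i−1} + v_{2,i}, and y_i = x_i + w_i. The function name `kalman_smooth`, the diffuse prior variance `p0`, and the initial state are illustrative assumptions, not details taken from the patent. The sketch runs a forward Kalman filter followed by a Rauch-Tung-Striebel backward pass and returns both the smoothed ages x_i and the slope states δx_i.

```python
def kalman_smooth(y, q1=0.05, q2=0.05, r=0.1, p0=1e6):
    """Smooth y under an ASSUMED local linear trend model.

    State is [x, dx] with transition F = [[1, 1], [0, 1]] and
    observation y_i = x_i + w_i. Returns (x, dx) as two lists.
    """
    n = len(y)
    m = [y[0], 0.0]                       # assumed initial state mean
    P = [[p0, 0.0], [0.0, p0]]            # assumed diffuse prior
    filt_m, filt_P, pred_m, pred_P = [], [], [], []
    for i in range(n):
        if i > 0:                         # predict: m <- F m, P <- F P F^T + Q
            a, b, c, d = P[0][0], P[0][1], P[1][0], P[1][1]
            m = [m[0] + m[1], m[1]]
            P = [[a + b + c + d + q1, b + d], [c + d, d + q2]]
        pred_m.append(m)
        pred_P.append(P)
        s = P[0][0] + r                   # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s # Kalman gain
        e = y[i] - m[0]                   # innovation
        m = [m[0] + k0 * e, m[1] + k1 * e]
        P = [[P[0][0] - k0 * P[0][0], P[0][1] - k0 * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        filt_m.append(m)
        filt_P.append(P)
    sm = list(filt_m)                     # backward (RTS) pass on the means
    for i in range(n - 2, -1, -1):
        Pf, Pp = filt_P[i], pred_P[i + 1]
        g = [[Pf[0][0] + Pf[0][1], Pf[0][1]],      # Pf F^T
             [Pf[1][0] + Pf[1][1], Pf[1][1]]]
        det = Pp[0][0] * Pp[1][1] - Pp[0][1] * Pp[1][0]
        inv = [[Pp[1][1] / det, -Pp[0][1] / det],
               [-Pp[1][0] / det, Pp[0][0] / det]]
        C = [[g[rr][0] * inv[0][cc] + g[rr][1] * inv[1][cc]
              for cc in range(2)] for rr in range(2)]
        dz = [sm[i + 1][0] - pred_m[i + 1][0], sm[i + 1][1] - pred_m[i + 1][1]]
        sm[i] = [filt_m[i][0] + C[0][0] * dz[0] + C[0][1] * dz[1],
                 filt_m[i][1] + C[1][0] * dz[0] + C[1][1] * dz[1]]
    return [s_[0] for s_ in sm], [s_[1] for s_ in sm]
```

Under this assumed model the second returned list is the series {δx_1, ..., δx_I}, so no separate differentiation step is needed. The default variances 0.05, 0.05, and 0.1 are the ones stated in the description.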

(Detection Unit 123)
The detection unit 123 receives the data set {(1, δx_1), (2, δx_2), ..., (i, δx_i), ..., (I, δx_I)}, detects from the series {δx_1, δx_2, ..., δx_i, ..., δx_I} the portions where the number of days δx_i needed to learn one word is at a local maximum (s123), and outputs the set of cumulative word counts i corresponding to those portions to the merging unit 130 as the set of plateau positions {r_{2,1}, r_{2,2}, ..., r_{2,w}, ..., r_{2,W}}. With the cumulative word count on the horizontal axis, a portion where δx_i is at a local maximum is a portion where the slope of the vocabulary learning curve becomes steep (a plateau), and where the vocabulary learning speed is at a local minimum.

FIGS. 8 and 12 show the vocabulary learning curves of infant 1 and infant 2, respectively, with the cumulative word count i on the horizontal axis. FIGS. 9 and 13 show, for infant 1 and infant 2 respectively, the series of days needed to learn one word obtained when smoothing the vocabulary learning curve. FIGS. 10 and 14 show, for infant 1 and infant 2 respectively, the portions where the number of days needed to learn one word is at a local maximum. FIGS. 11 and 15 show, for infant 1 and infant 2 respectively, the vocabulary learning curve together with those portions.

FIG. 15 shows that the step-like portions of FIG. 4 have been extracted.

Because i is discrete, the local maxima of δx_i can be found with a simple computation. For example, δx_i may be taken as a local maximum when it is larger than both of its neighbors δx_{i−1} and δx_{i+1}. Alternatively, an approximating expression may be fitted to the series {δx_1, δx_2, ..., δx_i, ..., δx_I} and its local maxima found with Newton's method or a similar technique.
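The neighbor comparison described above can be sketched in a few lines. The function name `local_maxima` is an assumption for illustration, and the sketch uses strict inequalities, so flat-topped peaks (equal neighboring values) are not reported; handling those would need an extra rule that the description does not specify.

```python
def local_maxima(dx):
    """Return 1-based indices i where dx_i exceeds both dx_{i-1} and dx_{i+1}.

    dx[k] is assumed to hold the days needed to learn the (k+1)-th word.
    End points have only one neighbor and are skipped.
    """
    return [k + 1                          # convert 0-based k to 1-based i
            for k in range(1, len(dx) - 1)
            if dx[k - 1] < dx[k] > dx[k + 1]]


# Hypothetical series with peaks at the 2nd and 5th words.
peaks = local_maxima([1.0, 3.0, 2.0, 2.0, 5.0, 4.0])
```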

<Merging Unit 130>
The merging unit 130 receives the sets of plateau positions {r_{1,1}, r_{1,2}, ..., r_{1,v}, ..., r_{1,V}} and {r_{2,1}, r_{2,2}, ..., r_{2,w}, ..., r_{2,W}}, merges them while removing duplicates (s130), and outputs the merged set of plateau positions {r_1, r_2, ..., r_k, ..., r_K} as the output of the feature detection device 100.

FIG. 16 shows an example of the processing flow of the merging unit 130.

First, initialization is performed (s130a). Next, r_{1,v} and r_{2,w} are compared (s130b, s130h).

If r_{1,v} is smaller (s130b), r_{1,v} is assigned to r_k (s130c), k and v are incremented (s130d), and it is determined whether v is greater than V, the number of plateaus detected by the first plateau detection unit 110 (s130e). If v is V or less, processing returns to s130b. If v is greater than V, process (1) below is performed (s130f), k−1 is assigned to K (s130g), and the merge ends. Process (1) adds all remaining elements r_{2,w} to the set of plateau positions. Specifically, r_{2,w} is assigned to r_k (s130f-1), k and w are incremented (s130f-2), and s130f-1 and s130f-2 are repeated until w exceeds W, the number of plateaus detected by the second plateau detection unit 120 (s130f-3).

If r_{2,w} is smaller (s130b, s130h), r_{2,w} is assigned to r_k (s130i), k and w are incremented (s130j), and it is determined whether w is greater than W, the number of plateaus detected by the second plateau detection unit 120 (s130k). If w is W or less, processing returns to s130b. If w is greater than W, process (2) below is performed (s130l), k−1 is assigned to K (s130g), and the merge ends. Process (2) adds all remaining elements r_{1,v} to the set of plateau positions. Specifically, r_{1,v} is assigned to r_k (s130l-1), k and v are incremented (s130l-2), and s130l-1 and s130l-2 are repeated until v exceeds V, the number of plateaus detected by the first plateau detection unit 110 (s130l-3).

If r_{1,v} and r_{2,w} are equal (s130b, s130h), either value (for example r_{2,w}) is assigned to r_k (s130m), k, v, and w are all incremented (s130n), and it is determined whether v is greater than V, the number of plateaus detected by the first plateau detection unit 110, and whether w is greater than W, the number of plateaus detected by the second plateau detection unit 120 (s130o, s130p, s130r). Incrementing the index of the set whose value was not assigned (v in this example) has the same effect as deleting the duplicate value. If v > V and w > W, k−1 is assigned to K (s130g) and the merge ends. If v ≤ V and w ≤ W, processing returns to s130b. If v > V and w ≤ W, process (1) is performed (s130q), k−1 is assigned to K (s130g), and the merge ends. If v ≤ V and w > W, process (2) is performed (s130s), k−1 is assigned to K (s130g), and the merge ends.
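The flow of s130a to s130s amounts to a two-pointer merge of two sorted lists with duplicate removal. The sketch below is one possible rendering, assuming both inputs are sorted in ascending order and contain no internal duplicates; the function name `merge_plateaus` is hypothetical, and the indices are 0-based where the flowchart's v, w, and k are 1-based.

```python
def merge_plateaus(r1, r2):
    """Merge two ascending lists of plateau positions, dropping duplicates."""
    merged = []
    v = w = 0
    while v < len(r1) and w < len(r2):
        if r1[v] < r2[w]:                 # s130c, s130d
            merged.append(r1[v])
            v += 1
        elif r2[w] < r1[v]:               # s130i, s130j
            merged.append(r2[w])
            w += 1
        else:                             # equal: keep one copy (s130m, s130n)
            merged.append(r2[w])
            v += 1
            w += 1
    merged.extend(r2[w:])                 # process (1): rest of r2, if any
    merged.extend(r1[v:])                 # process (2): rest of r1, if any
    return merged


# Hypothetical positions from the two detection units; 6 appears in both.
final = merge_plateaus([4, 6, 30], [6, 12, 45])
```

The length of the returned list corresponds to K. At most one of the two `extend` calls adds anything, because the loop ends only when one list is exhausted.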

FIGS. 17 and 18 show the plateau positions obtained by merging those of FIG. 3 with those of FIG. 11, and those of FIG. 4 with those of FIG. 15, respectively.

The processing of the second plateau detection unit 120 alone cannot detect plateaus in the early stage of word learning, where plateaus occur frequently. The merging unit 130 therefore merges its result with the plateaus obtained by the first plateau detection unit 110, and the merged result is taken as the final set of plateaus.

<Effect>
With this configuration, plateaus, a feature of the vocabulary learning curve distinct from the vocabulary explosion, can be detected. The plateaus can be presented automatically via an output unit (a display, printer, or the like; not shown). The vocabulary acquisition characteristics of each individual can be extracted from the infant-specific plateaus contained in the vocabulary learning curve. Ultimately, this can make made-to-order education matched to each child's development more effective, and can serve as a commercially valuable indicator.

In addition, it is highly likely that an important phenomenon in the vocabulary learning process is taking place when a plateau occurs, so determining when plateaus occur is important for a scientific understanding of that process.

<Modifications>
In this embodiment, data is passed directly between the units, but it may instead be passed via a storage unit (not shown).

The state equation above was used for the Kalman smoother, but any state equation that computes the trend of the age data series may be used. Depending on the state equation, however, it may not contain a variable for the number of days taken to utter one word. In that case, the number of days taken to utter one word can be calculated by taking the derivative of the vocabulary learning curve smoothed by the filter. Also, although the result of the Kalman smoother based on the state equation was used here as the filter result, the result of a Kalman filter using the same state equation may be used instead.

Furthermore, although a state equation was used as the form of the filter, another smoothing method such as a low-pass filter may be used. In this case too, the number of days taken to utter one word can be calculated by taking the derivative of the vocabulary learning curve smoothed by the filter.
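As one illustration of this variation, a centered moving average can stand in for the low-pass filter, followed by first differences to approximate the days needed per word. The description does not specify which low-pass filter to use, so the choice of a moving average, the `window` parameter, and the function names are all assumptions.

```python
def moving_average(y, window=5):
    """Centered moving average; the window shrinks near both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        smoothed.append(sum(y[lo:hi]) / (hi - lo))
    return smoothed


def days_per_word(x):
    """First differences x_i - x_{i-1} of the smoothed age series."""
    return [x[i] - x[i - 1] for i in range(1, len(x))]


# Hypothetical ages in days for five consecutive words.
x = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
dx = days_per_word(x)
```

Note that the difference series has one element fewer than the smoothed series, so its first entry corresponds to i = 2.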

In this embodiment, the data series of ages arranged in order of cumulative count is smoothed, but the data series of cumulative counts arranged in order of age may be smoothed instead. That is, smoothing may be performed with the cumulative word count on the vertical axis and the age in days on the horizontal axis. In that case, however, filtering with a Kalman smoother or the like requires evenly spaced time intervals, so the cumulative count at each age must be interpolated. Because interpolation increases both the amount of data and the processing, the first embodiment is more advantageous. Moreover, since the first embodiment does not use interpolated values, its detection accuracy is expected to be equal to or better than that obtained with interpolated values.
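To make the cost of this variation concrete, the sketch below fills in the cumulative count for every day between the first and last recorded ages. The description does not say how the interpolation is done; holding the last known count until the next word appears is an assumption, as is the function name `cumulative_by_day`.

```python
def cumulative_by_day(ages):
    """Return the cumulative word count for each day from ages[0] to ages[-1].

    ages is assumed to be a non-decreasing list of integer ages in days,
    one entry per newly spoken word.
    """
    counts = []
    k = 0                                  # words spoken so far
    for day in range(ages[0], ages[-1] + 1):
        while k < len(ages) and ages[k] <= day:
            k += 1
        counts.append(k)
    return counts


# Four recorded words expand to six daily samples (days 10 through 15).
daily = cumulative_by_day([10, 12, 12, 15])
```

Even in this tiny example the series grows from four points to six, and long plateaus inflate it further, which is the data increase the paragraph above refers to.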

In this embodiment, the first plateau detection unit 110 uses the cumulative count i corresponding to the end position as the plateau position, but it may instead use the cumulative count i−1 of the start position.

The merging unit 130 may also perform the merge using another algorithm. For example, the merge may be performed with a union function such as those provided by Lisp (a programming language), MATLAB (registered trademark), or the R language. Furthermore, the merging unit 130 may be omitted. In that case, the plateaus detected by the first plateau detection unit 110 and those detected by the second plateau detection unit 120 may be output as they are, or each may be output with a label indicating which detection unit detected it.
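Both alternatives mentioned here are short to express with built-in set operations. The sketch below uses Python's `set` union in place of the Lisp, MATLAB, or R functions named above, and the label strings "first" and "second" are hypothetical names for the two detection units.

```python
def union_plateaus(r1, r2):
    """Set union of the two position lists, returned in ascending order."""
    return sorted(set(r1) | set(r2))


def labelled_plateaus(r1, r2):
    """Pair each position with the detection unit(s) that reported it."""
    labels = {}
    for pos in r1:
        labels.setdefault(pos, []).append("first")
    for pos in r2:
        labels.setdefault(pos, []).append("second")
    return [(pos, labels[pos]) for pos in sorted(labels)]


merged = union_plateaus([4, 6, 30], [6, 12, 45])
tagged = labelled_plateaus([4, 6], [6, 12])
```

Unlike the two-pointer flow of FIG. 16, the set-based version does not require the inputs to be sorted beforehand.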

The present invention is not limited to the embodiment and modifications described above. For example, the various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capability of the device executing them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.

<Program and Recording Medium>
The feature detection device described above can also be implemented on a computer. In that case, a program that causes the computer to function as the target device (a device having the functional configuration shown in the drawings of the embodiments), or a program that causes the computer to execute each step of its processing procedure (as shown in each embodiment), is loaded into the computer from a recording medium such as a CD-ROM, a magnetic disk, or a semiconductor storage device, or downloaded via a communication line, and then executed.

100 feature detection device
110 first plateau detection unit
120 second plateau detection unit
121 smoothing unit
123 detection unit
130 merging unit

Claims

A feature detection device comprising:
a first plateau detection unit that, based on a vocabulary learning curve indicating the relationship between the age in days at which an infant began to speak a new word and the cumulative number of words the infant had come to speak by that age, detects a portion where the interval between the ages at which new words began to be spoken is greater than a threshold; and
a second plateau detection unit that smooths the vocabulary learning curve and detects a portion where the vocabulary learning speed of the smoothed vocabulary learning curve is at a local minimum.

The feature detection device according to claim 1,
wherein the second plateau detection unit includes:
a smoothing unit that smooths the vocabulary learning curve using a Kalman smoother; and
a detection unit that detects the portion where the vocabulary learning speed is at a local minimum from the series of days needed to learn one word obtained when performing the smoothing.

The feature detection device according to claim 2,
wherein the vocabulary learning curve consists of a data series of ages in days arranged in order of the cumulative number,
the smoothing unit smooths the age data series using a Kalman smoother, and
the detection unit detects, from the series of days needed to learn one word obtained when performing the smoothing, a portion where the number of days needed to learn one word is at a local maximum as the portion where the vocabulary learning speed is at a local minimum.

A feature detection method performed by a feature detection device including a first plateau detection unit and a second plateau detection unit, the method comprising:
a first plateau detection step in which the first plateau detection unit, based on a vocabulary learning curve indicating the relationship between the age in days at which an infant began to speak a new word and the cumulative number of words the infant had come to speak by that age, detects a portion where the interval between the ages at which new words began to be spoken is greater than a threshold; and
a second plateau detection step in which the second plateau detection unit smooths the vocabulary learning curve and detects a portion where the vocabulary learning speed of the smoothed vocabulary learning curve is at a local minimum.

A program for causing a computer to function as each unit of the feature detection device according to any one of claims 1 to 3.