WO2014073206A1 - Information processing device and information processing method - Google Patents
Information processing device and information processing method
- Publication number
- WO2014073206A1 (PCT/JP2013/006555)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- language model
- context
- information processing
- feature function
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
- G06F40/40—Processing or translation of natural language
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- the generation probability P(w_1^m) of the word string w_1^m consisting of the m words w_1, w_2, ..., w_m can be expressed using the conditional probability of each word as follows:
- P(w_1^m) = Π_{i=1}^{m} P(w_i | w_1^{i-1})
- under the N-gram approximation, the conditional probability P(w_i | w_{i−N+1}^{i−1}) can be estimated using, for example, training data composed of stored word strings. Using maximum likelihood estimation:
- P(w_i | w_{i−N+1}^{i−1}) = C(w_{i−N+1}^{i}) / C(w_{i−N+1}^{i−1})
- where C(w_{i−N+1}^{i}) is the number of times the word string w_{i−N+1}^{i} appears in the training data, and C(w_{i−N+1}^{i−1}) is the number of times the word string w_{i−N+1}^{i−1} appears in the training data.
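As a concrete illustration of the maximum likelihood estimate above, the following Python sketch counts N-grams in a toy corpus; the corpus and the sentence-boundary padding symbols are assumptions made for the example.

```python
from collections import Counter

def train_ngram_counts(corpus, n):
    """Count N-grams C(w_{i-N+1}^i) and histories C(w_{i-N+1}^{i-1})."""
    ngrams, histories = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] * (n - 1) + sentence + ["</s>"]
        for i in range(n - 1, len(tokens)):
            history = tuple(tokens[i - n + 1:i])
            ngrams[history + (tokens[i],)] += 1
            histories[history] += 1
    return ngrams, histories

def ngram_probability(word, history, ngrams, histories):
    """Maximum likelihood estimate P(w_i | history) = C(history + w_i) / C(history)."""
    history = tuple(history)
    if histories[history] == 0:
        return 0.0  # unseen history; a practical system would smooth instead
    return ngrams[history + (word,)] / histories[history]

# Toy bigram example (N = 2)
corpus = [["the", "moon", "landing"], ["the", "space", "station"]]
ngrams, histories = train_ngram_counts(corpus, 2)
print(ngram_probability("moon", ["the"], ngrams, histories))  # 0.5
```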
- the information processing apparatus 9 includes a global context extraction unit 910, a trigger feature calculation unit 920, a language model generation unit 930, a language model learning data storage unit 940, and a language model storage unit 950.
- the language model storage unit 950 stores a language model.
- the technique of Non-Patent Document 1 has the problem that the generation probability of a subsequent word cannot be calculated with high accuracy.
- An object of the present invention is to provide an information processing apparatus and an information processing method capable of solving the above problems and generating a highly accurate language model.
- the present invention does not particularly limit the language processing unit (the vocabulary unit of the language model).
- the processing unit of the present invention may be a word, a word string such as an idiom or a clause containing a plurality of words, or an individual character.
- hereinafter, these processing units are collectively referred to as "words".
- the global context extraction unit 10 identifies each word included in the received language model learning data as a processing target (hereinafter also referred to as a "specific word"), and extracts, for each specific word, the set of words appearing around it as a global context.
- the set of words (global context) extracted by the global context extraction unit 10 of the present embodiment is not particularly limited.
- the global context extraction unit 10 may extract a sentence including a specific word as the global context.
- the global context extraction unit 10 may extract, as the global context, the set of words within a predetermined range (distance) immediately before or immediately after a specific word.
- when the global context extraction unit 10 extracts the set of words in a predetermined range before a specific word as the global context, the specific word becomes a subsequent word with respect to the global context.
- the global context extraction unit 10 may extract a set of words in a predetermined range (distance) before and after the specific word as the global context.
- the front range and the back range may be the same distance or different distances.
- the "distance" described here is a distance measured on the language data.
- the distance is the number of words from a specific word or the number of sentences from a sentence including the specific word.
- the global context extraction unit 10 extracts nouns and verbs as global contexts.
- the global context extraction unit 10 of the present embodiment is not limited to this.
- the global context extraction unit 10 may select using other criteria (for example, parts of speech such as adjectives or vocabulary sets), or may extract all words.
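To make the extraction concrete, here is a minimal Python sketch; the window size and the sample sentence are assumptions chosen for illustration, and the part-of-speech restriction described above is noted but omitted.

```python
def extract_global_context(tokens, i, before=50, after=0):
    """Collect the set of words within a predetermined distance (in words)
    before and/or after the specific word tokens[i], excluding the word itself.
    The embodiment further restricts the set, e.g. to nouns and verbs, which
    would require a part-of-speech tagger and is omitted here."""
    left = tokens[max(0, i - before):i]
    right = tokens[i + 1:i + 1 + after]
    return set(left + right)

# With after=0 the context lies entirely before the specific word, so the
# specific word is a subsequent word relative to its global context.
tokens = "the apollo crew prepared for the moon landing".split()
print(extract_global_context(tokens, tokens.index("moon"), before=5))
```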
- the global context extraction unit 10 sends the extracted global context data to the global context classification unit 20.
- emotion 1 “joy”, emotion 2 “sadness”, emotion 3 “anger”, etc. can be considered as classes to be classified.
- assigning a global context to one class represents that the global context is related to that class. For example, if the probability that the global context belongs to the topic "moon landing" is 1.0, this corresponds to assigning the global context to the single topic class "moon landing".
- in this description, "classifying" the global context is not limited to assigning it to a single class; it also includes creating information (for example, the posterior probability of each class) that indicates how the global context relates to a plurality of classes. Therefore, "classify the global context based on a predetermined viewpoint" can also be read as "classify the global context based on the predetermined viewpoint, or calculate information indicating its relation to the classes of the predetermined viewpoint".
- the global context classification unit 20 will be described as calculating the posterior probability of each class when the global context is a condition. That is, the global context classification unit 20 calculates a posterior probability of each class when a global context is given using a global context classification model as a result of classification.
- the global context classification model can be created, for example, by learning a maximum entropy model, a support vector machine, a neural network, etc. using a large amount of text data to which class information is added.
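As an illustration of the posterior computation, here is a minimal sketch of a maximum entropy (log-linear) classifier over the bag of words in a global context d; the class names and weights are hypothetical stand-ins for a trained global context classification model.

```python
import math

def class_posteriors(context_words, weights, classes):
    """P(t | d) = exp(score_t) / Σ_t' exp(score_t'), where score_t sums the
    weights of (class, word) pairs for the words in the global context d."""
    scores = {t: sum(weights.get((t, w), 0.0) for w in context_words) for t in classes}
    z = sum(math.exp(s) for s in scores.values())
    return {t: math.exp(s) / z for t, s in scores.items()}

# Hypothetical pre-trained weights for two topic classes
weights = {("moon landing", "apollo"): 2.0, ("moon landing", "crew"): 0.5,
           ("space station construction", "module"): 1.5}
d = {"apollo", "crew", "launch"}
print(class_posteriors(d, weights, ["moon landing", "space station construction"]))
```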
- FIG. 3 is a diagram illustrating an example of a result obtained by classifying the global context extracted in FIG. 2 from the viewpoint of classification of “topics”.
- t is a class and d is a global context.
- the posterior probability P(t_{moon landing} | d) of the class of topic 1 "moon landing" is 0.7, the posterior probability P(t_{space station construction} | d) of the class of topic 2 "space station construction" is 0.1, and the posterior probability of the class of topic k is 0.0.
- the global context classification unit 20 classifies, for each word (specific word) identified in the language model learning data by the global context extraction unit 10, the global context corresponding to that specific word, and (in the present embodiment) calculates the posterior probability of each class.
- the global context extraction unit 10 sets a plurality of different words in the language model learning data as specific words, repeats the extraction of the global context for each specific word, and sends the obtained global context to the global context classification unit 20.
- the global context classification unit 20 executes the classification processing described so far for all received global contexts.
- the global context extraction unit 10 may set all words in the language model learning data as specific words, may set only words belonging to a specific part of speech as specific words, or may set only words included in a predetermined vocabulary set as specific words.
- the global context classification unit 20 sends the classification result to the language model generation unit 30.
- the language model generation unit 30 generates a language model for calculating the generation probability of each specific word using the classification result of the global context classification unit 20. More specifically, it is as follows. It can be said that the generation of the language model using the classification result generates the language model based on learning using the classification result. Therefore, the language model generation unit 30 can also be called a language model learning unit.
- the language model generation unit 30 can use various methods for learning such a model.
- the language model generation unit 30 may use the maximum entropy model already described.
- the language model generation unit 30 of the present embodiment generates a language model using the posterior probabilities of classes calculated based on the global context. Therefore, the language model generation unit 30 can generate a language model based on the global context.
- for example, when the posterior probability of the class "moon landing" is large, the language model generation unit 30 can generate a language model in which the generation probability of the specific word w = "moon" is large.
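The sketch below shows, with hypothetical weights, how a large posterior such as P(moon landing | d) = 0.7 can be used as a feature that raises the generation probability of "moon" in a maximum entropy language model.

```python
import math

def word_probability(word, features, vocab, lam):
    """Maximum entropy language model: P(w | h) = exp(Σ_j λ_j f_j(h, w)) / Z(h).
    Here each feature f_j is a class posterior of the global context, active
    with weight λ for the words associated with that class."""
    def score(w):
        return sum(lam.get((name, w), 0.0) * value for name, value in features.items())
    z = sum(math.exp(score(w)) for w in vocab)  # normalization over the vocabulary
    return math.exp(score(word)) / z

# Hypothetical weight: the "moon landing" posterior feature boosts "moon"
lam = {("P(moon landing|d)", "moon"): 3.0}
features = {"P(moon landing|d)": 0.7, "P(space station construction|d)": 0.1}
vocab = ["moon", "station", "the"]
print(word_probability("moon", features, vocab, lam))  # ≈ 0.80
```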
- FIG. 4 is a flowchart showing an example of the operation of the information processing apparatus 1.
- the global context extraction unit 10 of the information processing apparatus 1 extracts the set of words around a word (specific word) in the language model learning data as global context data (step S210).
- the global context classification unit 20 of the information processing apparatus 1 classifies the global context using the context classification model (step S220).
- the information processing apparatus 1 determines whether or not the processing has been completed for all words in the language model learning data (step S230). Note that the processing target words of the information processing device 1 need not be all the words included in the language model learning data.
- the information processing apparatus 1 may use a predetermined partial word of the language model learning data as a specific word. In this case, the information processing apparatus 1 determines whether all the words included in the predetermined vocabulary set have been processed as specific words.
- if the processing has not been completed (NO in step S230), the information processing apparatus 1 returns to step S210 and processes the next specific word.
- if the processing has been completed (YES in step S230), the language model generation unit 30 of the information processing device 1 generates a language model that calculates the generation probability of each specific word, using the result of classifying the global contexts (for example, the posterior probabilities of the classes) (step S240).
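Putting steps S210 to S240 together, the flow of FIG. 4 can be sketched as the following loop; extract, classify, and train stand in for the units described above and are assumptions of this sketch.

```python
def generate_language_model(words, extract, classify, train):
    """Sketch of FIG. 4: for every specific word, extract its global context
    (step S210) and classify it (step S220); when all words are processed
    (step S230), train the language model from the results (step S240)."""
    classified = []
    for i, word in enumerate(words):
        context = extract(words, i)      # step S210
        posteriors = classify(context)   # step S220
        classified.append((word, posteriors))
    return train(classified)             # step S240
```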
- the information processing apparatus 1 configured as described above can obtain an effect of generating a language model with high accuracy.
- the reason is as follows: the information processing apparatus 1 extracts the global context from the language model learning data, classifies the extracted global context using the context classification model, and generates a language model based on the classification result. The information processing apparatus 1 can therefore generate a language model grounded in the global context.
- for example, for the global context in FIG. 2, the global context classification unit 20 calculates a large value as the posterior probability of the class "moon landing".
- the language model generation unit 30 generates a model for calculating the word generation probability using the posterior probability of the class as a feature. Therefore, the language model generated according to the present embodiment can calculate a high probability that "moon" appears as the subsequent word of the global context in FIG. 2.
- the information processing apparatus 1 can obtain the effect of reducing the deterioration of the estimation accuracy of subsequent words even when an error is included in the global context.
- the information processing apparatus 1 extracts a global context having a predetermined size. Therefore, even if a few of the words included in the global context contain an error, the ratio of errors to the whole global context is small, and the classification result of the global context does not change significantly.
- the configuration of the information processing apparatus 1 is not limited to the above description.
- the information processing apparatus 1 may divide each configuration into a plurality of configurations.
- the information processing apparatus 1 may divide the global context extraction unit 10 into a language model learning data reception unit (not shown), a processing unit that extracts a global context, and a transmission unit that transmits a global context.
- the information processing apparatus 1 may also combine two or more components into a single component.
- the information processing apparatus 1 may include the global context extraction unit 10 and the global context classification unit 20 as one configuration.
- part of the information processing apparatus 1 may be configured as another device connected via a network (not shown).
- FIG. 5 is a block diagram showing an example of the configuration of the information processing apparatus 2 which is another configuration of the present embodiment.
- the information processing apparatus 2 includes a CPU 610, a ROM 620, a RAM 630, an IO (Input / Output) 640, a storage device 650, an input device 660, and a display device 670, and constitutes a computer.
- the CPU 610 reads a program from the ROM 620 or, via the IO 640, from the storage device 650. Then, the CPU 610 realizes each function as the global context extraction unit 10, the global context classification unit 20, and the language model generation unit 30 of the information processing apparatus 1 of FIG. 1 based on the read program.
- the CPU 610 uses the RAM 630 and the storage device 650 as temporary storage when realizing each function.
- the CPU 610 receives input data from the input device 660 via the IO 640 and displays the data on the display device 670.
- the CPU 610 may read a program included in the storage medium 700 that stores the program so as to be readable by a computer using a storage medium reading device (not shown). Alternatively, the CPU 610 may receive a program from an external device via a network (not shown).
- ROM 620 stores programs executed by CPU 610 and fixed data.
- the ROM 620 is, for example, a P-ROM (Programmable-ROM) or a flash ROM.
- the RAM 630 temporarily stores programs executed by the CPU 610 and data.
- the RAM 630 is, for example, a D-RAM (Dynamic-RAM).
- the IO 640 mediates data between the CPU 610, the storage device 650, the input device 660, and the display device 670.
- the IO 640 is, for example, an IO interface card.
- the storage device 650 stores data and programs stored in the information processing device 2 for a long time. Further, the storage device 650 may operate as a temporary storage device for the CPU 610. Further, the storage device 650 may store part or all of the information of the present embodiment illustrated in FIG. 1 such as language model learning data.
- the storage device 650 is, for example, a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), or a disk array device.
- the input device 660 is an input unit that receives an input instruction from an operator of the information processing apparatus 2.
- the input device 660 is, for example, a keyboard, a mouse, or a touch panel.
- the display device 670 is a display unit of the information processing apparatus 2.
- the display device 670 is a liquid crystal display, for example.
- the information processing apparatus 2 configured in this way can obtain the same effects as the information processing apparatus 1.
- the global context extraction unit 10, the global context classification unit 20, and the language model generation unit 30 are the same as those in the first embodiment, so descriptions that duplicate the first embodiment are omitted as appropriate.
- the global context extraction unit 10 receives language model learning data from the language model learning data storage unit 110. Since other operations of the global context extraction unit 10 are the same as those in the first embodiment, a detailed description thereof will be omitted.
- the context classification model generation unit 40 can, for example, generate a context classification model that classifies the global context from the viewpoint of "emotion". Note that the viewpoint of the classes assigned to the learning data used as context classification model learning data is not limited to the "topic", "emotion", and "time" viewpoints described so far.
- the context classification model generation unit 40 may operate as follows.
- the context classification model generation unit 40 clusters words or documents included in the context classification model learning data and collects them into a plurality of clusters (unsupervised clustering).
- the clustering technique used by the context classification model generation unit 40 is not particularly limited.
- the context classification model generation unit 40 may use agglomerative clustering or k-means method as a clustering method. By regarding each cluster classified in this way as a class, the context classification model generation unit 40 can learn the context classification model.
- FIG. 8 is a schematic diagram illustrating the clustering operation of the context classification model generation unit 40.
- the context classification model generation unit 40 divides the context classification model learning data, which has no class information, into a plurality of clusters (cluster 1, cluster 2, ..., cluster L) using agglomerative clustering.
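For the unsupervised alternative, a sketch using k-means (one of the two methods named above) follows; the use of scikit-learn and TF-IDF features is an assumption of this example, not something the text prescribes.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Cluster unlabeled context classification model learning data and treat
# each resulting cluster as a class for the context classification model.
documents = ["apollo crew moon landing", "space station module docking",
             "lunar surface moon rover", "orbital station crew rotation"]
X = TfidfVectorizer().fit_transform(documents)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # the cluster id of each document serves as its class label
```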
- the context classification model generation unit 40 sends the generated context classification model to the context classification model storage unit 130 and stores it.
- the context classification model storage unit 130 stores the context classification model generated by the context classification model generation unit 40.
- the global context classification unit 20 classifies the global context based on the context classification model stored in the context classification model storage unit 130 as in the first embodiment.
- the information processing device 3 does not need to generate a context classification model each time the language model learning data is processed.
- the global context classification unit 20 of the information processing device 3 may apply the same context classification model to different language model learning data.
- the information processing apparatus 3 may cause the context classification model generation unit 40 to generate a context classification model as necessary. For example, when the information processing apparatus 3 receives context classification model learning data via a network (not shown), the information processing apparatus 3 may cause the context classification model generation unit 40 to generate a context classification model.
- the global context classification unit 20 sends the classification result to the language model generation unit 30.
- the language model generation unit 30 generates a language model based on the classification result. Since the language model generation unit 30 is the same as that of the first embodiment except that the generated language model is stored in the language model storage unit 140, detailed description thereof is omitted.
- the language model storage unit 140 stores the language model generated by the language model generation unit 30.
- the information processing apparatus 3 of the present embodiment configured as described above can obtain an effect of generating a more accurate language model in addition to the effect of the first embodiment.
- the reason is that the context classification model generation unit 40 of the information processing apparatus 3 of this embodiment generates a context classification model based on the context classification model learning data.
- the global context classification unit 20 uses the generated context classification model. This is because the information processing apparatus 3 can perform processing using an appropriate context classification model.
- the information processing apparatus 3 of the present embodiment may be realized by a computer including the CPU 610, the ROM 620, and the RAM 630, similarly to the information processing apparatus 2 shown in FIG.
- the storage device 650 may operate as each storage unit of the present embodiment.
- FIG. 9 shows the information stored when the storage device 650 operates as the language model learning data storage unit 110, the context classification model learning data storage unit 120, the context classification model storage unit 130, and the language model storage unit 140 of the present embodiment.
- the information processing device 4 differs from the information processing device 3 of the second embodiment in that it additionally includes a trigger feature calculation unit 50 and includes a language model generation unit 34 instead of the language model generation unit 30.
- the information processing apparatus 4 of the present embodiment may be realized by a computer including the CPU 610, the ROM 620, and the RAM 630, similarly to the information processing apparatus 2 illustrated in FIG.
- the trigger feature calculation unit 50 calculates the feature function of the extracted trigger pair.
- the feature function of the trigger pair from the word a to the word b is, in the standard binary form, f_{a→b}(d, w_i) = 1 if the word a is included in the global context d and w_i = b, and f_{a→b}(d, w_i) = 0 otherwise.
- the information processing device 4 according to the third embodiment configured as described above can obtain an effect of further improving the accuracy of the word generation probability in addition to the effect of the information processing device 3 of the second embodiment.
- the feature function of the trigger pair indicates the relationship between the two words of the trigger pair (for example, the strength of co-occurrence).
- the language model generation unit 34 of the information processing apparatus 4 generates a language model that predicts the word generation probability in consideration of the relationship between specific two words that are likely to co-occur in addition to the classification result of the global context. Because.
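A minimal sketch of the trigger-pair feature in its standard binary form follows; the example words are hypothetical.

```python
def trigger_feature(a, b, context, word):
    """Trigger-pair feature from word a to word b: fires (returns 1.0) when
    a appears in the global context and the current word is b."""
    return 1.0 if a in context and word == b else 0.0

# "apollo" in the global context triggers the subsequent word "moon"
print(trigger_feature("apollo", "moon", {"apollo", "crew"}, "moon"))  # 1.0
print(trigger_feature("apollo", "moon", {"module", "dock"}, "moon"))  # 0.0
```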
- FIG. 11 is a block diagram illustrating an example of the configuration of the information processing apparatus 5 according to the fourth embodiment.
- the information processing device 5 is different in that it includes an N-gram feature calculation unit 60 and a language model generation unit 35 instead of the language model generation unit 30 in addition to the configuration of the information processing device 3 of the second embodiment.
- the information processing apparatus 5 of the present embodiment may be realized by a computer including the CPU 610, the ROM 620, and the RAM 630, similarly to the information processing apparatus 2 illustrated in FIG.
- the N-gram feature calculation unit 60 calculates a feature function for the extracted word string.
- when the current word is w_i and the immediately preceding N−1 word string is w_{i−N+1}^{i−1}, the feature function of the N-gram is, in the standard binary form, f(w_{i−N+1}^{i−1}, w_i) = 1 if the history w_{i−N+1}^{i−1} matches a given word string and w_i matches a given word, and 0 otherwise.
- the N-gram feature calculation unit 60 sends the calculated N-gram feature function to the language model generation unit 35.
- the language model generation unit 35 generates a language model using the feature function from the N-gram feature calculation unit 60 in addition to the classification result from the global context classification unit 20.
- the information processing apparatus 5 according to the fourth embodiment configured as described above can obtain the effect of further improving the accuracy of the word generation probability in addition to the effect of the information processing apparatus 3 of the second embodiment.
- the N-gram feature function is a function that takes into account local word chain restrictions.
- the language model generation unit 35 of the information processing device 5 generates a language model that predicts the word generation probability in consideration of local word restrictions in addition to the global context classification result.
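Analogously, the binary N-gram feature can be sketched as follows; the bigram example is an assumption for illustration.

```python
def ngram_feature(given_history, given_word, history, word):
    """Binary N-gram feature: fires when the immediately preceding N-1 words
    equal a given word string and the current word equals a given word."""
    return 1.0 if tuple(history) == tuple(given_history) and word == given_word else 0.0

# Bigram (N = 2) feature for the word string "the moon"
print(ngram_feature(("the",), "moon", ("the",), "moon"))  # 1.0
print(ngram_feature(("the",), "moon", ("a",), "moon"))    # 0.0
```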
- FIG. 12 is a block diagram illustrating an example of the configuration of the information processing apparatus 6 according to the fifth embodiment.
- the information processing device 6 includes a trigger feature calculation unit 50 similar to that of the third embodiment and an N-gram feature calculation unit 60 similar to that of the fourth embodiment, and includes a language model generation unit 36 instead of the language model generation unit 34.
- since the configuration of the information processing device 6 other than the language model generation unit 36 is the same as that of the information processing device 4 or the information processing device 5, only the configuration and operation unique to the present embodiment are described, and descriptions that duplicate the third and fourth embodiments are omitted.
- the information processing apparatus 6 of the present embodiment may be realized by a computer including the CPU 610, the ROM 620, and the RAM 630, similarly to the information processing apparatus 2 illustrated in FIG.
- the language model generation unit 36 generates a language model using a global context classification, a feature function of a trigger pair, and an N-gram feature function.
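A minimal sketch of how the three feature groups can enter a single log-linear score follows; all feature names, values, and weights here are hypothetical.

```python
import math

def combined_score(word, features, lam):
    """Unnormalized maximum entropy score combining class-posterior,
    trigger-pair, and N-gram features; dividing this score by its sum
    over the vocabulary yields the word generation probability."""
    return math.exp(sum(lam.get((name, word), 0.0) * v for name, v in features.items()))

lam = {("P(moon landing|d)", "moon"): 2.0, ("trigger:apollo->moon", "moon"): 1.0,
       ("2gram:the->moon", "moon"): 0.5}
features = {"P(moon landing|d)": 0.7, "trigger:apollo->moon": 1.0, "2gram:the->moon": 1.0}
print(combined_score("moon", features, lam))  # exp(1.4 + 1.0 + 0.5)
```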
- the information processing apparatus 6 of the fifth embodiment configured as described above can realize the effects of the information processing apparatus 4 of the third embodiment and the information processing apparatus 5 of the fourth embodiment.
- a global context extraction unit that identifies a word, character, or word string included in the data as a specific word, and extracts a set of words included in at least a predetermined range from the specific word as a global context;
- a context classification means for classifying the global context based on a predetermined viewpoint and outputting a classification result;
- An information processing apparatus comprising: a language model generating unit that generates a language model for calculating the generation probability of the specific word using the classification result.
- Context classification model generation means for generating a context classification model indicating a relationship between the set of words and a class based on the predetermined viewpoint based on predetermined language data;
- the information processing apparatus according to Supplementary Note 1, wherein the context classification unit classifies the global context using the context classification model.
- the information processing apparatus according to Supplementary Note 2, wherein the context classification model generation means generates a model for calculating the posterior probability of a class when a set of words is given, using sets of words with class information assigned as learning data.
- the information processing apparatus according to Supplementary Note 2 or 3, wherein the language model generation means uses a maximum entropy model with the posterior probability of the class as a feature function.
- Trigger feature calculating means for calculating a feature function of a trigger pair between a word included in the global context and the specific word;
- the information processing apparatus according to any one of Supplementary Notes 1 to 4, wherein the language model generation unit generates a language model using the classification result and the feature function of the trigger pair.
- Trigger feature calculating means for calculating a feature function of a trigger pair between a word included in the global context and the specific word;
- a feature function calculating means for calculating a feature function of N-gram immediately before the specific word,
- Supplementary Note 8: an information processing method comprising: identifying a word, character, or word string included in data as a specific word; extracting a set of words included in at least a predetermined range from the specific word as a global context; classifying the global context based on a predetermined viewpoint and outputting a classification result; and generating, using the classification result, a language model for calculating the generation probability of the specific word.
- Supplementary Note 11: the information processing method according to Supplementary Note 9 or 10, wherein a maximum entropy model with the posterior probability of the class as a feature function is used.
- Supplementary Note 17: a computer-readable recording medium recording the program according to Supplementary Note 16, which causes a computer to execute a process of calculating the posterior probability of a class when a set of words is given, using sets of words with class information assigned as learning data.
- Supplementary Note 19: a computer-readable recording medium recording the program according to any one of Supplementary Notes 15 to 18, which causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, and a process of generating a language model using the classification result and the feature function of the trigger pair.
- Supplementary Note 21: a computer-readable recording medium recording the program according to any one of Supplementary Notes 15 to 20, which causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, a process of calculating a feature function of the N-gram immediately before the specific word, and a process of generating a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
- the present invention can improve the accuracy of statistical language models used in fields such as speech recognition, character recognition, and spell checking.
Abstract
Description
FIG. 1 is a block diagram showing an example of the configuration of the information processing apparatus 1 according to the first embodiment of the present invention.
Note that the configuration of the information processing apparatus 1 according to the present embodiment is not limited to the description so far. The information processing apparatus 1 may divide each component into a plurality of components. For example, the information processing apparatus 1 may divide the global context extraction unit 10 into a receiving unit (not shown) for language model learning data, a processing unit that extracts the global context, and a transmitting unit that transmits the global context.
FIG. 6 is a block diagram showing an example of the configuration of the information processing apparatus 3 according to the second embodiment of the present invention.
FIG. 10 is a block diagram showing an example of the configuration of the information processing apparatus 4 according to the third embodiment.
FIG. 11 is a block diagram showing an example of the configuration of the information processing apparatus 5 according to the fourth embodiment.
FIG. 12 is a block diagram showing an example of the configuration of the information processing apparatus 6 according to the fifth embodiment.
(Supplementary Note 1) An information processing apparatus comprising:
global context extraction means for identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context;
context classification means for classifying the global context based on a predetermined viewpoint and outputting a classification result; and
language model generation means for generating, using the classification result, a language model for calculating the generation probability of the specific word.
(Supplementary Note 2) The information processing apparatus according to Supplementary Note 1, further comprising context classification model generation means for generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint, wherein the context classification means classifies the global context using the context classification model.
(Supplementary Note 3) The information processing apparatus according to Supplementary Note 2, wherein the context classification model generation means generates, using sets of words with class information assigned as learning data, a model that calculates the posterior probability of a class when a set of words is given.
(Supplementary Note 4) The information processing apparatus according to Supplementary Note 2 or 3, wherein the language model generation means uses a maximum entropy model with the posterior probability of the class as a feature function.
(Supplementary Note 5) The information processing apparatus according to any one of Supplementary Notes 1 to 4, further comprising trigger feature calculation means for calculating a feature function of a trigger pair between a word included in the global context and the specific word, wherein the language model generation means generates a language model using the classification result and the feature function of the trigger pair.
(Supplementary Note 6) The information processing apparatus according to any one of Supplementary Notes 1 to 5, further comprising feature function calculation means for calculating a feature function of the N-gram immediately preceding the specific word, wherein the language model generation means generates a language model using the classification result and the feature function of the N-gram.
(Supplementary Note 7) The information processing apparatus according to any one of Supplementary Notes 1 to 6, further comprising: trigger feature calculation means for calculating a feature function of a trigger pair between a word included in the global context and the specific word; and feature function calculation means for calculating a feature function of the N-gram immediately preceding the specific word, wherein the language model generation means generates a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
(Supplementary Note 8) An information processing method comprising: identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context; classifying the global context based on a predetermined viewpoint and outputting a classification result; and generating, using the classification result, a language model for calculating the generation probability of the specific word.
(Supplementary Note 9) The information processing method according to Supplementary Note 8, further comprising generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint, and classifying the global context using the context classification model.
(Supplementary Note 10) The information processing method according to Supplementary Note 9, further comprising generating, using sets of words with class information assigned as learning data, a model that calculates the posterior probability of a class when a set of words is given.
(Supplementary Note 11) The information processing method according to Supplementary Note 9 or 10, wherein a maximum entropy model with the posterior probability of the class as a feature function is used.
(Supplementary Note 12) The information processing method according to any one of Supplementary Notes 8 to 11, further comprising calculating a feature function of a trigger pair between a word included in the global context and the specific word, and generating a language model using the classification result and the feature function of the trigger pair.
(Supplementary Note 13) The information processing method according to any one of Supplementary Notes 8 to 12, further comprising calculating a feature function of the N-gram immediately preceding the specific word, and generating a language model using the classification result and the feature function of the N-gram.
(Supplementary Note 14) The information processing method according to any one of Supplementary Notes 8 to 13, further comprising calculating a feature function of a trigger pair between a word included in the global context and the specific word, calculating a feature function of the N-gram immediately preceding the specific word, and generating a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
(Supplementary Note 15) A computer-readable recording medium recording a program that causes a computer to execute: a process of identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context; a process of classifying the global context based on a predetermined viewpoint and outputting a classification result; and a process of generating, using the classification result, a language model for calculating the generation probability of the specific word.
(Supplementary Note 16) A computer-readable recording medium recording the program according to Supplementary Note 15, which further causes a computer to execute: a process of generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint; and a process of classifying the global context using the context classification model.
(Supplementary Note 17) A computer-readable recording medium recording the program according to Supplementary Note 16, which further causes a computer to execute a process of calculating the posterior probability of a class when a set of words is given, using sets of words with class information assigned as learning data.
(Supplementary Note 18) A computer-readable recording medium recording the program according to Supplementary Note 15 or 16, wherein a maximum entropy model with the posterior probability of the class as a feature function is used.
(Supplementary Note 19) A computer-readable recording medium recording the program according to any one of Supplementary Notes 15 to 18, which further causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, and a process of generating a language model using the classification result and the feature function of the trigger pair.
(Supplementary Note 20) A computer-readable recording medium recording the program according to any one of Supplementary Notes 15 to 19, which further causes a computer to execute a process of calculating a feature function of the N-gram immediately preceding the specific word, and a process of generating a language model using the classification result and the feature function of the N-gram.
(Supplementary Note 21) A computer-readable recording medium recording the program according to any one of Supplementary Notes 15 to 20, which further causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, a process of calculating a feature function of the N-gram immediately preceding the specific word, and a process of generating a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
2 Information processing apparatus
3 Information processing apparatus
4 Information processing apparatus
5 Information processing apparatus
6 Information processing apparatus
9 Information processing apparatus
10 Global context extraction unit
20 Global context classification unit
30 Language model generation unit
34 Language model generation unit
35 Language model generation unit
36 Language model generation unit
40 Context classification model generation unit
50 Trigger feature calculation unit
60 N-gram feature calculation unit
110 Language model learning data storage unit
120 Context classification model learning data storage unit
130 Context classification model storage unit
140 Language model storage unit
610 CPU
620 ROM
630 RAM
640 IO
650 Storage device
660 Input device
670 Display device
700 Storage medium
910 Global context extraction unit
920 Trigger feature calculation unit
930 Language model generation unit
940 Language model learning data storage unit
950 Language model storage unit
Claims (21)
- 1. An information processing apparatus comprising: global context extraction means for identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context; context classification means for classifying the global context based on a predetermined viewpoint and outputting a classification result; and language model generation means for generating, using the classification result, a language model for calculating the generation probability of the specific word.
- 2. The information processing apparatus according to claim 1, further comprising context classification model generation means for generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint, wherein the context classification means classifies the global context using the context classification model.
- 3. The information processing apparatus according to claim 2, wherein the context classification model generation means generates, using sets of words with class information assigned as learning data, a model that calculates the posterior probability of a class when a set of words is given.
- 4. The information processing apparatus according to claim 2 or 3, wherein the language model generation means uses a maximum entropy model with the posterior probability of the class as a feature function.
- 5. The information processing apparatus according to any one of claims 1 to 4, further comprising trigger feature calculation means for calculating a feature function of a trigger pair between a word included in the global context and the specific word, wherein the language model generation means generates a language model using the classification result and the feature function of the trigger pair.
- 6. The information processing apparatus according to any one of claims 1 to 5, further comprising feature function calculation means for calculating a feature function of the N-gram immediately preceding the specific word, wherein the language model generation means generates a language model using the classification result and the feature function of the N-gram.
- 7. The information processing apparatus according to any one of claims 1 to 6, further comprising: trigger feature calculation means for calculating a feature function of a trigger pair between a word included in the global context and the specific word; and feature function calculation means for calculating a feature function of the N-gram immediately preceding the specific word, wherein the language model generation means generates a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
- 8. An information processing method comprising: identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context; classifying the global context based on a predetermined viewpoint and outputting a classification result; and generating, using the classification result, a language model for calculating the generation probability of the specific word.
- 9. The information processing method according to claim 8, further comprising generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint, and classifying the global context using the context classification model.
- 10. The information processing method according to claim 9, further comprising generating, using sets of words with class information assigned as learning data, a model that calculates the posterior probability of a class when a set of words is given.
- 11. The information processing method according to claim 9 or 10, wherein a maximum entropy model with the posterior probability of the class as a feature function is used.
- 12. The information processing method according to any one of claims 8 to 11, further comprising calculating a feature function of a trigger pair between a word included in the global context and the specific word, and generating a language model using the classification result and the feature function of the trigger pair.
- 13. The information processing method according to any one of claims 8 to 12, further comprising calculating a feature function of the N-gram immediately preceding the specific word, and generating a language model using the classification result and the feature function of the N-gram.
- 14. The information processing method according to any one of claims 8 to 13, further comprising calculating a feature function of a trigger pair between a word included in the global context and the specific word, calculating a feature function of the N-gram immediately preceding the specific word, and generating a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
- 15. A computer-readable recording medium recording a program that causes a computer to execute: a process of identifying a word, character, or word string included in data as a specific word and extracting a set of words included in at least a predetermined range from the specific word as a global context; a process of classifying the global context based on a predetermined viewpoint and outputting a classification result; and a process of generating, using the classification result, a language model for calculating the generation probability of the specific word.
- 16. A computer-readable recording medium recording the program according to claim 15, which further causes a computer to execute: a process of generating, based on predetermined language data, a context classification model indicating the relationship between the set of words and classes based on the predetermined viewpoint; and a process of classifying the global context using the context classification model.
- 17. A computer-readable recording medium recording the program according to claim 16, which further causes a computer to execute a process of calculating the posterior probability of a class when a set of words is given, using sets of words with class information assigned as learning data.
- 18. A computer-readable recording medium recording the program according to claim 15 or 16, wherein a maximum entropy model with the posterior probability of the class as a feature function is used.
- 19. A computer-readable recording medium recording the program according to any one of claims 15 to 18, which further causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, and a process of generating a language model using the classification result and the feature function of the trigger pair.
- 20. A computer-readable recording medium recording the program according to any one of claims 15 to 19, which further causes a computer to execute a process of calculating a feature function of the N-gram immediately preceding the specific word, and a process of generating a language model using the classification result and the feature function of the N-gram.
- 21. A computer-readable recording medium recording the program according to any one of claims 15 to 20, which further causes a computer to execute a process of calculating a feature function of a trigger pair between a word included in the global context and the specific word, a process of calculating a feature function of the N-gram immediately preceding the specific word, and a process of generating a language model using the classification result, the feature function of the trigger pair, and the feature function of the N-gram.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/440,931 US20150278194A1 (en) | 2012-11-07 | 2013-11-07 | Information processing device, information processing method and medium |
JP2014545575A JPWO2014073206A1 (ja) | 2012-11-07 | 2013-11-07 | Information processing device and information processing method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012245003 | 2012-11-07 | ||
JP2012-245003 | 2012-11-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014073206A1 true WO2014073206A1 (ja) | 2014-05-15 |
Family
ID=50684331
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/006555 WO2014073206A1 (ja) | 2012-11-07 | 2013-11-07 | Information processing device and information processing method |
Country Status (3)
Country | Link |
---|---|
US (1) | US20150278194A1 (ja) |
JP (1) | JPWO2014073206A1 (ja) |
WO (1) | WO2014073206A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108694443A (zh) * | 2017-04-05 | 2018-10-23 | 富士通株式会社 | 基于神经网络的语言模型训练方法和装置 |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9812130B1 (en) * | 2014-03-11 | 2017-11-07 | Nvoq Incorporated | Apparatus and methods for dynamically changing a language model based on recognized text |
US10643616B1 (en) * | 2014-03-11 | 2020-05-05 | Nvoq Incorporated | Apparatus and methods for dynamically changing a speech resource based on recognized text |
US10268684B1 (en) | 2015-09-28 | 2019-04-23 | Amazon Technologies, Inc. | Optimized statistical machine translation system with rapid adaptation capability |
US10185713B1 (en) * | 2015-09-28 | 2019-01-22 | Amazon Technologies, Inc. | Optimized statistical machine translation system with rapid adaptation capability |
CN106506327B (zh) * | 2016-10-11 | 2021-02-19 | 东软集团股份有限公司 | 一种垃圾邮件识别方法及装置 |
CN112673421B (zh) * | 2018-11-28 | 2024-07-16 | 谷歌有限责任公司 | 训练和/或使用语言选择模型以自动确定用于口头话语的话音辨识的语言 |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010051654A1 (en) * | 2008-11-05 | 2010-05-14 | Google Inc. | Custom language models |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5839106A (en) * | 1996-12-17 | 1998-11-17 | Apple Computer, Inc. | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
ATE383640T1 (de) * | 1998-10-02 | 2008-01-15 | Ibm | Vorrichtung und verfahren zur bereitstellung von netzwerk-koordinierten konversationsdiensten |
US6374217B1 (en) * | 1999-03-12 | 2002-04-16 | Apple Computer, Inc. | Fast update implementation for efficient latent semantic language modeling |
US6484136B1 (en) * | 1999-10-21 | 2002-11-19 | International Business Machines Corporation | Language model adaptation via network of similar users |
US6697793B2 (en) * | 2001-03-02 | 2004-02-24 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for generating phrases from a database |
GB0905457D0 (en) * | 2009-03-30 | 2009-05-13 | Touchtype Ltd | System and method for inputting text into electronic devices |
US8566097B2 (en) * | 2009-06-02 | 2013-10-22 | Honda Motor Co., Ltd. | Lexical acquisition apparatus, multi dialogue behavior system, and lexical acquisition program |
US8874432B2 (en) * | 2010-04-28 | 2014-10-28 | Nec Laboratories America, Inc. | Systems and methods for semi-supervised relationship extraction |
US8346563B1 (en) * | 2012-04-10 | 2013-01-01 | Artificial Solutions Ltd. | System and methods for delivering advanced natural language interaction applications |
2013
- 2013-11-07 JP JP2014545575A patent/JPWO2014073206A1/ja active Pending
- 2013-11-07 US US14/440,931 patent/US20150278194A1/en not_active Abandoned
- 2013-11-07 WO PCT/JP2013/006555 patent/WO2014073206A1/ja active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010051654A1 (en) * | 2008-11-05 | 2010-05-14 | Google Inc. | Custom language models |
Non-Patent Citations (2)
Title |
---|
MASATAKA IZUMI ET AL.: "Blog Chosha Nendai Suitei no Tameno Entropy ni yoru Tokuchogo Chushutsu", THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS DAI 19 KAI DATA ENGINEERING WORKSHOP RONBUNSHU, 25 June 2009 (2009-06-25) * |
RONALD ROSENFELD: "A Maximum Entropy Approach to Adaptive Statistical Language Modeling", A MAXIMUM ENTROPY APPROACH TO ADAPTIVE STATISTICAL LANGUAGE MODELING, 21 May 1996 (1996-05-21), pages 1 - 37, Retrieved from the Internet <URL:http://www.cs.cmu.edu/afs/cs/Web/People/roni/papers/me-csl-revised.pdf> [retrieved on 20130109] * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108694443A (zh) * | 2017-04-05 | 2018-10-23 | 富士通株式会社 | 基于神经网络的语言模型训练方法和装置 |
CN108694443B (zh) * | 2017-04-05 | 2021-09-17 | 富士通株式会社 | 基于神经网络的语言模型训练方法和装置 |
Also Published As
Publication number | Publication date |
---|---|
JPWO2014073206A1 (ja) | 2016-09-08 |
US20150278194A1 (en) | 2015-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11604956B2 (en) | Sequence-to-sequence prediction using a neural network model | |
US11157693B2 (en) | Stylistic text rewriting for a target author | |
CN108363790B (zh) | 用于对评论进行评估的方法、装置、设备和存储介质 | |
CN107808011B (zh) | 信息的分类抽取方法、装置、计算机设备和存储介质 | |
US20190354810A1 (en) | Active learning to reduce noise in labels | |
WO2014073206A1 (ja) | 情報処理装置、及び、情報処理方法 | |
US20190347571A1 (en) | Classifier training | |
US20150095017A1 (en) | System and method for learning word embeddings using neural language models | |
JP2020520492A (ja) | 文書要約自動抽出方法、装置、コンピュータ機器及び記憶媒体 | |
KR101715118B1 (ko) | 문서 감정 분류용 딥러닝 인코딩 장치 및 방법. | |
WO2020244065A1 (zh) | 基于人工智能的字向量定义方法、装置、设备及存储介质 | |
US20210035556A1 (en) | Fine-tuning language models for supervised learning tasks via dataset preprocessing | |
US20200175229A1 (en) | Summary generation method and summary generation apparatus | |
CN111985228B (zh) | 文本关键词提取方法、装置、计算机设备和存储介质 | |
US12073181B2 (en) | Systems and methods for natural language processing (NLP) model robustness determination | |
US9348901B2 (en) | System and method for rule based classification of a text fragment | |
US20210133279A1 (en) | Utilizing a neural network to generate label distributions for text emphasis selection | |
Ranjan et al. | A comparative study on code-mixed data of Indian social media vs formal text | |
JP6312467B2 (ja) | 情報処理装置、情報処理方法、およびプログラム | |
US20230177251A1 (en) | Method, device, and system for analyzing unstructured document | |
CN112101042A (zh) | 文本情绪识别方法、装置、终端设备和存储介质 | |
CN115329075A (zh) | 基于分布式机器学习的文本分类方法 | |
US20220122586A1 (en) | Fast Emit Low-latency Streaming ASR with Sequence-level Emission Regularization | |
JP2017538226A (ja) | スケーラブルなウェブデータの抽出 | |
CN114896404A (zh) | 文档分类方法及装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13852500 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2014545575 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14440931 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13852500 Country of ref document: EP Kind code of ref document: A1 |