WO2011083528A1 - Data processing device, computer program therefor, and data processing method - Google Patents
- Publication number
- WO2011083528A1 (PCT/JP2010/007021)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pruning
- threshold
- hypothesis
- data
- hypotheses
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
Definitions
- The present invention relates to a data processing apparatus that searches for a hypothesis from input target data, and more particularly to a data processing apparatus that prunes hypotheses exceeding a pruning threshold during the search, together with its computer program and data processing method.
- To improve search efficiency, a beam search is often performed, which reduces the amount of calculation by pruning, during the search, any hypothesis whose pruning measure exceeds the pruning threshold.
- Two pruning measures are widely used for beam search: the score difference from the maximum likelihood hypothesis and the number of hypotheses.
- The score difference threshold is used to prune hypotheses whose score difference from the maximum likelihood hypothesis is larger than the threshold.
- The hypothesis number threshold is used to prune hypotheses whose hypothesis rank is larger than the threshold.
- These thresholds may be statically fixed values, or may be changed dynamically for each speech frame according to some criterion. For example, a technique has been proposed in which the acoustic reliability of each speech frame is calculated and the score difference threshold is adjusted dynamically according to the result.
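The two conventional pruning measures described above can be illustrated with a minimal Python sketch. The `Hypothesis` class, the `beam_prune` function, and the example scores are illustrative assumptions, not part of the patent; scores are assumed to be cumulative log-likelihoods, so higher is better.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    label: str
    score: float  # assumed cumulative log-likelihood; higher is better


def beam_prune(hyps, max_score_diff, max_hyps):
    """Prune with two independent, fixed thresholds.

    A hypothesis is dropped if its score difference from the best
    hypothesis exceeds max_score_diff, or if its rank exceeds max_hyps.
    """
    ranked = sorted(hyps, key=lambda h: h.score, reverse=True)
    if not ranked:
        return []
    best = ranked[0].score
    within_beam = [h for h in ranked if best - h.score <= max_score_diff]
    return within_beam[:max_hyps]


hyps = [
    Hypothesis("a", -10.0),
    Hypothesis("b", -12.5),
    Hypothesis("c", -11.0),
    Hypothesis("d", -30.0),
]
survivors = beam_prune(hyps, max_score_diff=5.0, max_hyps=2)
print([h.label for h in survivors])
```

In this sketch each threshold acts on its own, which is the behavior the following paragraphs identify as a source of search errors.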
- This conventional data processing apparatus includes a data input means 101, a feature quantity extraction means 102, a hypothesis score calculation means 103, a statistical model 104, a dynamic threshold setting means 105, a hypothesis pruning means 106, and a result output means 107.
- The conventional data processing apparatus having this configuration operates as follows. The data input means 101 inputs the data to be searched, the feature quantity extraction means 102 extracts the feature quantity from the target data, and the hypothesis score calculation means 103 calculates a score from the feature quantity using the statistical model 104. The dynamic threshold setting means 105 then sets the threshold for each measure used for pruning, the hypothesis pruning means 106 prunes hypotheses based on the pruning thresholds, and the result output means 107 finally outputs the hypothesis with the highest score as the result (Non-patent Document 1).
- The former is equivalent to pruning with a score difference threshold of a, and the latter to pruning with a score difference threshold of b. When pruning is performed with b, the smaller score difference threshold, the correct hypothesis is more likely to be pruned by mistake, causing a search error.
- Even when the other pruning measures are nowhere near their thresholds, or are only close to them, pruning is performed using only the measure that has exceeded its threshold, so search errors are likely to occur.
- The present invention has been made in view of the above problems, and provides a data processing apparatus, a computer program therefor, and a data processing method in which at least one of recognition speed and recognition accuracy is higher than in the prior art.
- The data processing apparatus of the present invention has: data input means that inputs, for each predetermined input unit, test data whose correct hypothesis is known in the learning mode and target data for the hypothesis search in the search mode;
- feature quantity extraction means that analyzes the input test data and target data and extracts a feature quantity from each; hypothesis measure calculation means that uses the extracted feature quantities to calculate a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
- data plotting means that plots the plurality of hypotheses of the input test data, according to their calculated pruning measures, in a threshold space defined by the plurality of pruning measures;
- equal density surface setting means that sets a plurality of equal density surfaces in the threshold space according to the density of the plotted hypotheses;
- threshold curved surface generation means that generates, in the threshold space, a threshold curved surface composed of a plurality of pruning thresholds, which includes part of one surface selected from the plurality of equal density surfaces and on which at least one of the pruning measures increases when at least one of them decreases;
- hypothesis curved surface generation means that generates, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data according to their calculated pruning measures; and hypothesis pruning means that prunes the plurality of hypotheses of the target data using the position where the generated hypothesis curved surface intersects the threshold curved surface as the pruning threshold.
- The computer program of the present invention is a computer program for the data processing apparatus of the present invention, and causes the data processing apparatus to execute: data input processing that inputs, for each predetermined input unit, test data whose correct hypothesis is known in the learning mode and target data for the hypothesis search in the search mode;
- feature quantity extraction processing that analyzes the input test data and target data and extracts a feature quantity from each; hypothesis measure calculation processing that uses the extracted feature quantities to calculate a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
- data plot processing that plots the plurality of hypotheses of the input test data, according to their calculated pruning measures, in a threshold space defined by the plurality of pruning measures; equal density surface setting processing that sets a plurality of equal density surfaces in the threshold space according to the density of the plotted hypotheses;
- threshold curved surface generation processing that generates, in the threshold space, a threshold curved surface composed of a plurality of pruning thresholds, which includes part of one surface selected from the plurality of equal density surfaces and on which at least one of the pruning measures increases when at least one of them decreases; hypothesis curved surface generation processing that generates, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data according to their calculated pruning measures; and hypothesis pruning processing that prunes the plurality of hypotheses of the target data using the position where the generated hypothesis curved surface intersects the threshold curved surface as the pruning threshold.
- The data processing method of the present invention is a data processing method for the data processing apparatus of the present invention, and has: a data input operation that inputs, for each predetermined input unit, test data whose correct hypothesis is known in the learning mode and target data for the hypothesis search in the search mode;
- a feature quantity extraction operation that analyzes the input test data and target data and extracts a feature quantity from each; a hypothesis measure calculation operation that uses the extracted feature quantities to calculate a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
- a data plotting operation that plots the plurality of hypotheses of the input test data, according to their calculated pruning measures, in a threshold space defined by the plurality of pruning measures; an equal density surface setting operation that sets a plurality of equal density surfaces in the threshold space according to the density of the plotted hypotheses; a threshold curved surface generation operation that generates, in the threshold space, a threshold curved surface composed of a plurality of pruning thresholds, which includes part of one surface selected from the plurality of equal density surfaces and on which at least one of the pruning measures increases when at least one of them decreases; a hypothesis curved surface generation operation that generates, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data according to their calculated pruning measures; and a hypothesis pruning operation that prunes the plurality of hypotheses of the target data using the position where the generated hypothesis curved surface intersects the threshold curved surface as the pruning threshold.
- The various components of the present invention need only be formed so as to realize their functions. For example, each can be realized as dedicated hardware that exhibits a predetermined function, as a data processing apparatus to which a predetermined function is given by a computer program, as a predetermined function realized in the data processing apparatus by a computer program, or as any combination of these.
- The components need not be independent of one another. A plurality of components may be formed as a single member, a single component may be formed of a plurality of members, one component may be part of another component, and part of one component may overlap with part of another.
- The order of the plurality of processes and operations can be changed within a range that does not interfere with their content.
- The computer program and the data processing method of the present invention are not limited to executing the plurality of processes and operations at individually different times. Another process or operation may therefore occur while a given process or operation is being executed, and the execution timing of one process or operation may partly or wholly overlap with that of another.
- The data processing apparatus can be implemented as hardware built from general-purpose devices such as a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and an I/F (Interface) unit so that it can read a computer program and execute the corresponding processing operations, as a dedicated logic circuit built to execute predetermined processing operations, or as a combination of these.
- Causing the data processing apparatus to execute various operations corresponding to the computer program also includes causing the data processing apparatus to control the operation of various devices.
- For example, storing various data in the data processing apparatus means that the CPU stores the data in an information storage medium fixed to the apparatus, such as an HDD (Hard Disc Drive), or that the CPU causes a CD drive to store the data in an exchangeable information storage medium loaded in the apparatus, such as a CD-R (Compact Disc-Recordable).
- In the data processing apparatus of the present invention, the data input means inputs, for each predetermined input unit, test data whose correct hypothesis is known in the learning mode and target data for the hypothesis search in the search mode.
- The feature quantity extraction means analyzes the input test data and target data and extracts a feature quantity from each.
- The hypothesis measure calculation means uses the extracted feature quantities to calculate a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data.
- The data plotting means plots the plurality of hypotheses of the input test data, according to their calculated pruning measures, in a threshold space defined by the plurality of pruning measures. The equal density surface setting means sets a plurality of equal density surfaces in the threshold space according to the density of the plotted hypotheses.
- The threshold curved surface generation means generates, in the threshold space, a threshold curved surface composed of a plurality of pruning thresholds, which includes part of one surface selected from the plurality of equal density surfaces and on which at least one of the pruning measures increases when at least one of them decreases.
- The hypothesis curved surface generation means generates, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data according to their calculated pruning measures.
- The hypothesis pruning means prunes the plurality of hypotheses of the target data using the position where the generated hypothesis curved surface intersects the threshold curved surface as the pruning threshold. For this reason, when one hypothesis is searched for from the target data in the search mode, the pruning thresholds for the plurality of pruning measures change appropriately. It is therefore possible to provide a data processing apparatus in which at least one of recognition speed and recognition accuracy is higher than in the prior art.
- The data processing apparatus 200 has a data input unit 201 that inputs, for each predetermined input unit, test data TD whose correct hypothesis is known in the learning mode and target data CD for the hypothesis search in the search mode.
- It also has a feature quantity extraction unit 202 that analyzes the input test data TD and target data CD and extracts a feature quantity CV from each.
- It also has a hypothesis measure calculation unit 203 that uses the extracted feature quantity CV to calculate a plurality of pruning measures PM for each of a plurality of hypotheses of the test data TD and the target data CD, and a data plotting unit 204 that plots the plurality of hypotheses of the input test data TD, according to their calculated pruning measures PM, in a threshold space SS defined by the plurality of pruning measures PM.
- It also has an equal density surface setting unit 205 that sets a plurality of equal density surfaces EC in the threshold space SS according to the density of the plotted hypotheses.
- It also has a threshold curved surface generation unit 206 that generates, in the threshold space SS, a threshold curved surface SC composed of a plurality of pruning thresholds PS, which includes part of one surface selected from the plurality of equal density surfaces EC and on which at least one of the pruning measures PM increases when at least one of them decreases.
- It also has a hypothesis curved surface generation unit 207 that generates, in the threshold space SS, a hypothesis curved surface HC composed of the plurality of hypotheses of the target data CD according to their calculated pruning measures PM, and a hypothesis pruning unit 208 that prunes the plurality of hypotheses of the target data CD using the position where the generated hypothesis curved surface HC intersects the threshold curved surface SC as the pruning threshold PS.
- In addition, the data processing apparatus 200 has a statistical model 210 used to calculate the score of the target data, and a result output unit 209 that outputs, as the search result SR, the hypothesis with the maximum cumulative score among the plurality of hypotheses remaining after pruning in the search mode.
- The threshold space SS is a two-dimensional threshold plane SS defined by two pruning measures PM: the score difference SD of a hypothesis from the maximum likelihood hypothesis, and the hypothesis rank HR.
- The hypothesis measure calculation unit 203 uses the extracted feature quantity CV to calculate a score for each of the plurality of hypotheses of the test data TD and the target data CD, and calculates the score difference SD and the hypothesis rank HR as the pruning measures PM.
- The equal density surface setting unit 205 sets, on the threshold plane SS, a plurality of equal density lines EC, which serve as the equal density surfaces EC, according to the density of the plotted hypotheses.
- The threshold curved surface generation unit 206 generates on the threshold plane SS, as the threshold curved surface SC, a threshold curve SC composed of a plurality of pruning thresholds PS, which includes part of one line selected from the plurality of equal density lines EC and on which one of the score difference SD and the hypothesis rank HR increases when the other decreases.
- The hypothesis curved surface generation unit 207 generates on the threshold plane SS, as the hypothesis curved surface HC, a hypothesis curve HC composed of the plurality of hypotheses of the target data CD according to their calculated score difference SD and hypothesis rank HR.
- The data processing apparatus 200 is realized, for example, as a computer apparatus in which a computer program is installed.
- The computer program causes the data processing apparatus 200 to execute, for example, data input processing that inputs, for each predetermined input unit, test data TD whose correct hypothesis is known in the learning mode and target data CD for the hypothesis search in the search mode,
- feature quantity extraction processing, hypothesis measure calculation processing, data plot processing, and equal density surface setting processing corresponding to the units described above,
- threshold curved surface generation processing that generates on the threshold plane SS a threshold curve SC composed of a plurality of pruning thresholds PS, which includes part of one line selected from the plurality of equal density lines EC and on which at least one of the pruning measures PM increases when at least one of them decreases,
- hypothesis curved surface generation processing that generates on the threshold plane SS a hypothesis curved surface HC composed of the plurality of hypotheses of the target data CD according to their calculated pruning measures PM,
- hypothesis pruning processing that prunes the plurality of hypotheses of the target data CD using the position where the generated hypothesis curved surface HC intersects the threshold curve SC as the pruning threshold PS,
- and result output processing that outputs, as the search result SR, the hypothesis with the maximum cumulative score among the plurality of hypotheses remaining after pruning in the search mode.
- The data processing apparatus 200 has, for example, a learning mode and a search mode as switchable operation modes.
- In the learning mode, a threshold curve SC serving as the threshold curved surface SC is generated from the input test data TD and set in the data processing apparatus 200.
- In the search mode, the set threshold curve SC is used to output one hypothesis as the search result from the input target data CD.
- The following description takes the case where the test data TD and the target data CD are speech data and the hypothesis search is speech recognition.
- When the learning mode is set (step S1-Y), test data TD whose correct hypothesis is known is input for each speech frame, which is the predetermined input unit. At this time, a sufficient amount of test data TD is input under a sufficiently wide beam width.
- The input test data TD is analyzed to extract the feature quantity CV (step S3). This extraction is performed, for example, by detecting the MFCC (Mel Frequency Cepstrum Coefficient) from the spectrum of the input speech of the test data TD input for each speech frame.
- A plurality of pruning measures PM are calculated for each of the plurality of hypotheses of the test data TD using the extracted feature quantity CV (step S4). More specifically, a score, which is a likelihood, is obtained from the feature quantity CV of the extracted test data TD and the statistical model 210, and the score of each hypothesis is calculated by adding it to the cumulative score.
- In speech recognition, this score calculation is executed, for example, by adding an acoustic score and a language score.
- As described above, the score difference SD of each hypothesis from the maximum likelihood hypothesis and the hypothesis rank HR are then calculated as the pruning measures PM for speech recognition.
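The score accumulation and the two pruning measures described above can be sketched in Python as follows. The function names `accumulate` and `pruning_measures`, the dictionary layout, and the example scores are illustrative assumptions; scores are assumed to be log-likelihoods, and ranks are assumed to start at 1.

```python
def accumulate(cumulative, acoustic, language):
    """Add the frame's acoustic and language scores to the cumulative score."""
    return cumulative + acoustic + language


def pruning_measures(scores):
    """Map each hypothesis name to (score difference SD, hypothesis rank HR).

    SD is the distance from the best (maximum likelihood) hypothesis and
    HR is the 1-based position in descending score order.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best = ranked[0][1]
    return {
        name: (best - score, rank)
        for rank, (name, score) in enumerate(ranked, start=1)
    }


scores = {"a": -10.0, "b": -12.5, "c": -11.0}
measures = pruning_measures(scores)
print(measures)
```

Each resulting `(SD, HR)` pair is one point on the two-dimensional threshold plane.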
- Next, as shown in the drawings, the plurality of hypotheses of the input test data TD are plotted, according to the score difference SD from the maximum likelihood hypothesis and the hypothesis rank HR calculated as described above,
- on the threshold plane SS, which is the two-dimensional threshold space SS defined by the pruning measures PM (step S5).
- A plurality of equal density lines EC, which are a special case of the equal density surfaces EC, are then set on the threshold plane SS according to the density of the plotted hypotheses, as shown in the drawings (step S6).
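One crude way to approximate the plotting and density steps is a two-dimensional histogram. The sketch below is an assumption made for illustration only: the patent does not say how density is estimated, and the bin width, the function names `density_grid` and `region_at_level`, and the sample points are hypothetical. The boundary of the returned region plays the role of one equal density line.

```python
from collections import Counter


def density_grid(points, sd_bin=1.0):
    """Count plotted hypotheses per (rank, score-difference bin) cell.

    points is an iterable of (score_difference, rank) pairs.
    """
    return Counter((rank, int(sd // sd_bin)) for sd, rank in points)


def region_at_level(grid, level):
    """Cells whose count is at least `level`; the edge of this region is a
    rough stand-in for one equal density line."""
    return {cell for cell, count in grid.items() if count >= level}


points = [(0.0, 1), (0.2, 1), (0.7, 1), (1.5, 2), (1.8, 2), (3.2, 3)]
grid = density_grid(points)
dense = region_at_level(grid, level=2)
print(sorted(dense))
```

Choosing a lower `level` gives a wider region, which corresponds to selecting an equal density line farther from the densest area.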
- Next, one of the plurality of equal density lines EC is selected, as shown in FIG. 5, according to the performance and specifications of the data processing apparatus 200 and the required recognition accuracy,
- and a threshold curve SC, which serves as the threshold curved surface SC, is generated on the threshold plane SS (step S7).
- The threshold curve SC is generated by connecting a specific curve, such as a parabola, to part of the selected equal density line EC, so that the curve includes part of the equal density line EC and one of the two pruning measures PM increases when the other decreases.
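As a hedged illustration of this construction, the sketch below builds a threshold curve from a segment of an equal density line and a parabolic tail. Everything specific here is an assumption: the segment is given as `(rank, score_difference)` points sorted by rank with decreasing score difference, the curve is held flat before the segment, the tail reaches zero at a hypothetical `rank_max` greater than the last segment rank, and the name `make_threshold_curve` is invented for the example.

```python
def make_threshold_curve(segment, rank_max):
    """Return f(rank) -> largest allowed score difference at that rank.

    segment: points (rank, score_difference) taken from the selected equal
    density line, sorted by rank, with score_difference decreasing.
    rank_max: rank at which the parabolic tail reaches zero (> last rank).
    """
    r_first, sd_first = segment[0]
    r_last, sd_last = segment[-1]

    def threshold(rank):
        if rank <= r_first:
            return sd_first  # flat before the segment starts
        if rank >= rank_max:
            return 0.0
        if rank > r_last:
            # Parabolic tail joined to the end of the equal density segment.
            t = (rank_max - rank) / (rank_max - r_last)
            return sd_last * t * t
        # Linear interpolation along the equal density segment.
        for (r0, s0), (r1, s1) in zip(segment, segment[1:]):
            if r0 <= rank <= r1:
                return s0 + (s1 - s0) * (rank - r0) / (r1 - r0)

    return threshold


thr = make_threshold_curve([(1, 8.0), (3, 6.0), (5, 4.0)], rank_max=9)
print([thr(r) for r in range(1, 10)])
```

The resulting curve never rises as the rank grows, so a larger allowed rank always comes with a smaller allowed score difference, and vice versa.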
- The threshold curve SC generated in this way is set in the hypothesis pruning unit 208 (step S8), and the learning mode of the data processing apparatus 200 is completed.
- The data processing apparatus 200 that has completed learning can perform speech recognition using the prepared threshold curve SC.
- When the data processing apparatus 200 is set to the search mode (step T1-Y), the target speech, which is the target data CD of the hypothesis search, is input for each speech frame, which is the input unit (step T2).
- The input target data CD is analyzed to extract the feature quantity CV (step T3).
- Using the extracted feature quantity CV, the score difference SD from the maximum likelihood hypothesis and the hypothesis rank HR are calculated as the plurality of pruning measures PM for each of the plurality of hypotheses of the target data CD (step T4).
- A hypothesis curve HC, which is a special case of the hypothesis curved surface HC composed of the plurality of hypotheses of the target data CD, is generated on the threshold plane SS according to the calculated pruning measures PM (step T6).
- The hypothesis curve HC of the target data CD for each speech frame intersects the threshold curve SC. The plurality of hypotheses of the target data CD are therefore pruned using the position where the hypothesis curve HC intersects the threshold curve SC as the pruning threshold PS (step T7).
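The intersection-based pruning of step T7 can be sketched as follows. This is a minimal illustration under stated assumptions: the hypothesis curve is taken to be the sequence of `(rank, score difference)` points of one frame in rank order, the crossing is taken to be the first rank at which the score difference exceeds the threshold curve, and both the function name `prune_at_intersection` and the linear threshold used in the example are hypothetical.

```python
def prune_at_intersection(scores, threshold):
    """Keep hypotheses up to the point where the hypothesis curve crosses
    the threshold curve.

    scores: dict of hypothesis name -> cumulative score (higher is better).
    threshold: function rank -> largest allowed score difference.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    best = ranked[0][1]
    kept = []
    for rank, (name, score) in enumerate(ranked, start=1):
        if best - score > threshold(rank):
            break  # the hypothesis curve has crossed the threshold curve
        kept.append(name)
    return kept


def example_threshold(rank):
    # Hypothetical decreasing threshold curve, for illustration only.
    return max(0.0, 10.0 - 2.0 * rank)


frame_scores = {"a": -10.0, "b": -12.5, "c": -11.0, "d": -13.0, "e": -30.0}
kept = prune_at_intersection(frame_scores, example_threshold)
print(kept)
```

Because the crossing point depends on how the frame's hypotheses are distributed, the effective score difference and rank limits vary from frame to frame rather than being fixed independently.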
- It is then determined for each speech frame whether it is the final speech frame of the target data CD (step T8). If it is not the final speech frame (step T8-N), the next speech frame of the target data CD is input (step T2).
- If it is the final speech frame (step T8-Y), the cumulative scores of the plurality of hypotheses remaining after pruning are compared (step T9), and the hypothesis with the maximum cumulative score is output as the search result SR (step T10).
- In the data processing apparatus 200, the data input unit 201 inputs, for each predetermined input unit, test data TD whose correct hypothesis is known in the learning mode and target data CD for the hypothesis search in the search mode.
- The feature quantity extraction unit 202 analyzes the input test data TD and target data CD and extracts the feature quantity CV from each.
- The hypothesis measure calculation unit 203 uses the extracted feature quantity CV to calculate a plurality of pruning measures PM for each of the plurality of hypotheses of the test data TD and the target data CD.
- The data plotting unit 204 plots the plurality of hypotheses of the input test data TD, according to their calculated pruning measures PM, on the threshold plane SS defined by the plurality of pruning measures PM.
- The equal density surface setting unit 205 sets a plurality of equal density lines EC on the threshold plane SS according to the density of the plotted hypotheses.
- The threshold curved surface generation unit 206 generates on the threshold plane SS a threshold curve SC composed of a plurality of pruning thresholds PS, which includes part of one line selected from the plurality of equal density lines EC and on which at least one of the pruning measures PM increases when at least one of them decreases.
- The hypothesis curved surface generation unit 207 generates on the threshold plane SS a hypothesis curve HC composed of the plurality of hypotheses of the target data CD according to their calculated pruning measures PM.
- The hypothesis pruning unit 208 prunes the plurality of hypotheses of the target data CD using the position where the generated hypothesis curve HC intersects the threshold curve SC as the pruning threshold PS.
- As a result, the pruning thresholds PS for the plurality of pruning measures PM change appropriately. It is therefore possible to provide a data processing apparatus 200 in which at least one of recognition speed and recognition accuracy is higher than in the prior art.
- In the above embodiment, hypothesis pruning is executed by generating the threshold curve SC and the hypothesis curve HC on the two-dimensional threshold plane SS defined by two pruning measures PM.
- However, hypothesis pruning may also be executed by generating a threshold curved surface SC and a hypothesis curved surface HC in a threshold space SS of three or more dimensions defined by three or more pruning measures PM.
- When the threshold space SS has four or more dimensions, the threshold curved surface SC and the hypothesis curved surface HC are expressed as mathematical hypersurfaces (not shown).
- In the above embodiment, the test data TD and the target data CD are input speech, and the data processing apparatus 200 performs speech recognition.
- However, the data processing apparatus 200 of the present embodiment can also be used for image recognition and the like.
- In the above embodiment, each unit of the data processing apparatus is logically realized as a function by a computer program.
- However, each of these units can also be formed as dedicated hardware, or realized as a combination of software and hardware.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (5)
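- A data processing apparatus comprising:
data input means for inputting, for each predetermined input unit, test data for which a correct hypothesis is confirmed in a learning mode, and for inputting target data of a hypothesis search for each input unit in a search mode;
feature quantity extraction means for analyzing the input test data and target data and extracting a feature quantity from each;
hypothesis measure calculation means for calculating, using the extracted feature quantities, a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
data plotting means for plotting the plurality of hypotheses of the input test data, in correspondence with the respectively calculated pruning measures, in a threshold space defined by the plurality of pruning measures;
equal density surface setting means for setting a plurality of equal density surfaces in the threshold space in correspondence with the density of the plotted hypotheses;
threshold curved surface generation means for generating, in the threshold space, a threshold curved surface that includes part of one surface selected from the plurality of equal density surfaces and is composed of a plurality of pruning thresholds in which at least one of the pruning measures increases when at least one of the pruning measures decreases;
hypothesis curved surface generation means for generating, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data in correspondence with the respectively calculated pruning measures; and
hypothesis pruning means for pruning the plurality of hypotheses of the target data using, as the pruning threshold, the position at which the generated hypothesis curved surface intersects the threshold curved surface.
- The data processing apparatus according to claim 1, wherein the threshold space consists of a two-dimensional threshold plane defined by two pruning measures, namely the score difference of the hypothesis from the maximum likelihood hypothesis and the hypothesis rank,
the hypothesis measure calculation means calculates a score for each of the plurality of hypotheses of the test data and the target data using the extracted feature quantities, and calculates the score difference and the hypothesis rank as the pruning measures,
the equal density surface setting means sets on the threshold plane, in correspondence with the density of the plotted hypotheses, a plurality of equal density lines serving as the equal density surfaces,
the threshold curved surface generation means generates on the threshold plane, as the threshold curved surface, a threshold curve that includes part of one line selected from the plurality of equal density lines and is composed of a plurality of pruning thresholds in which one of the score difference and the hypothesis rank increases when the other decreases, and
the hypothesis curved surface generation means generates on the threshold plane, as the hypothesis curved surface, a hypothesis curve composed of the plurality of hypotheses of the target data in correspondence with the respectively calculated score difference and hypothesis rank.
- The data processing apparatus according to claim 2, further comprising result output means for outputting, as a search result, the hypothesis with the maximum accumulated score among the plurality of hypotheses pruned in the search mode.
- A computer program for the data processing apparatus according to any one of claims 1 to 3, the computer program causing the data processing apparatus to execute:
data input processing for inputting, for each predetermined input unit, test data for which a correct hypothesis is confirmed in a learning mode, and for inputting target data of a hypothesis search for each input unit in a search mode;
feature quantity extraction processing for analyzing the input test data and target data and extracting a feature quantity from each;
hypothesis measure calculation processing for calculating, using the extracted feature quantities, a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
data plot processing for plotting the plurality of hypotheses of the input test data, in correspondence with the respectively calculated pruning measures, in a threshold space defined by the plurality of pruning measures;
equal density surface setting processing for setting a plurality of equal density surfaces in the threshold space in correspondence with the density of the plotted hypotheses;
threshold curved surface generation processing for generating, in the threshold space, a threshold curved surface that includes part of one surface selected from the plurality of equal density surfaces and is composed of a plurality of pruning thresholds in which at least one of the pruning measures increases when at least one of the pruning measures decreases;
hypothesis curved surface generation processing for generating, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data in correspondence with the respectively calculated pruning measures; and
hypothesis pruning processing for pruning the plurality of hypotheses of the target data using, as the pruning threshold, the position at which the generated hypothesis curved surface intersects the threshold curved surface.
- A data processing method for the data processing apparatus according to any one of claims 1 to 3, the method comprising:
a data input operation of inputting, for each predetermined input unit, test data for which a correct hypothesis is confirmed in a learning mode, and of inputting target data of a hypothesis search for each input unit in a search mode;
a feature quantity extraction operation of analyzing the input test data and target data and extracting a feature quantity from each;
a hypothesis measure calculation operation of calculating, using the extracted feature quantities, a plurality of pruning measures for each of a plurality of hypotheses of the test data and the target data;
a data plotting operation of plotting the plurality of hypotheses of the input test data, in correspondence with the respectively calculated pruning measures, in a threshold space defined by the plurality of pruning measures;
an equal density surface setting operation of setting a plurality of equal density surfaces in the threshold space in correspondence with the density of the plotted hypotheses;
a threshold curved surface generation operation of generating, in the threshold space, a threshold curved surface that includes part of one surface selected from the plurality of equal density surfaces and is composed of a plurality of pruning thresholds in which at least one of the pruning measures increases when at least one of the pruning measures decreases;
a hypothesis curved surface generation operation of generating, in the threshold space, a hypothesis curved surface composed of the plurality of hypotheses of the target data in correspondence with the respectively calculated pruning measures; and
a hypothesis pruning operation of pruning the plurality of hypotheses of the target data using, as the pruning threshold, the position at which the generated hypothesis curved surface intersects the threshold curved surface.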
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011548868A JP5786717B2 (ja) | 2010-01-06 | 2010-12-02 | Data processing device, computer program therefor, and data processing method |
US13/520,728 US9047562B2 (en) | 2010-01-06 | 2010-12-02 | Data processing device, information storage medium storing computer program therefor and data processing method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-000940 | 2010-01-06 | ||
JP2010000940 | 2010-01-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011083528A1 (ja) | 2011-07-14 |
Family
ID=44305275
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/007021 WO2011083528A1 (ja) | Data processing device, computer program therefor, and data processing method | 2010-01-06 | 2010-12-02 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9047562B2 (ja) |
JP (1) | JP5786717B2 (ja) |
WO (1) | WO2011083528A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013125203A1 (ja) * | 2012-02-21 | 2013-08-29 | NEC Corporation | Speech recognition device, speech recognition method, and computer program |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5786717B2 (ja) * | 2010-01-06 | 2015-09-30 | NEC Corporation | Data processing device, computer program therefor, and data processing method |
JP7005463B2 (ja) * | 2018-09-27 | 2022-01-21 | Toshiba Corporation | Learning device, learning method, and program |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02300798A (ja) * | 1989-05-15 | 1990-12-12 | A T R Jido Honyaku Denwa Kenkyusho:Kk | Beam control method in a speech recognition device |
JPH04298796A (ja) * | 1991-03-28 | 1992-10-22 | Nec Corp | Speech recognition device |
JPH0535292A (ja) * | 1991-07-26 | 1993-02-12 | Fujitsu Ltd | Dynamic programming matching device |
JPH06282295A (ja) * | 1993-03-29 | 1994-10-07 | A T R Jido Honyaku Denwa Kenkyusho:Kk | Adaptive search method |
JPH10153999A (ja) * | 1996-11-25 | 1998-06-09 | Nec Corp | Speech recognition device |
JPH10254496A (ja) * | 1997-03-11 | 1998-09-25 | Mitsubishi Electric Corp | Speech recognition method |
JP2001075596A (ja) * | 1999-09-03 | 2001-03-23 | Mitsubishi Electric Corp | Speech recognition device, speech recognition method, and recording medium storing a speech recognition program |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6285786B1 (en) * | 1998-04-30 | 2001-09-04 | Motorola, Inc. | Text recognizer and method using non-cumulative character scoring in a forward search |
JP3004254B2 (ja) * | 1998-06-12 | 2000-01-31 | ATR Interpreting Telecommunications Research Laboratories | Statistical sequence model generation device, statistical language model generation device, and speech recognition device |
JP3660137B2 (ja) * | 1998-09-25 | 2005-06-15 | Toshiba Corporation | Simulation method, simulator, recording medium storing a simulation program, and method of manufacturing a semiconductor device |
WO2003005344A1 (en) * | 2001-07-03 | 2003-01-16 | Intel Zao | Method and apparatus for dynamic beam control in viterbi search |
US6788243B2 (en) * | 2001-09-06 | 2004-09-07 | Minister Of National Defence Of Her Majestry's Canadian Government The Secretary Of State For Defence | Hidden Markov modeling for radar electronic warfare |
US7603267B2 (en) * | 2003-05-01 | 2009-10-13 | Microsoft Corporation | Rules-based grammar for slots and statistical model for preterminals in natural language understanding system |
JP2005107743A (ja) * | 2003-09-29 | 2005-04-21 | Nec Corp | Learning system |
US7946493B2 (en) * | 2007-09-27 | 2011-05-24 | Hand Held Products, Inc. | Wireless bar code transaction device |
JP5381988B2 (ja) * | 2008-07-28 | 2014-01-08 | NEC Corporation | Dialogue speech recognition system, dialogue speech recognition method, and dialogue speech recognition program |
US8386401B2 (en) * | 2008-09-10 | 2013-02-26 | Digital Infuzion, Inc. | Machine learning methods and systems for identifying patterns in data using a plurality of learning machines wherein the learning machine that optimizes a performance function is selected |
JP5786717B2 (ja) * | 2010-01-06 | 2015-09-30 | NEC Corporation | Data processing device, computer program therefor, and data processing method |
US8762009B2 (en) * | 2010-11-18 | 2014-06-24 | I.D. Systems, Inc. | Impact sensor calibration tool |
US20130268271A1 (en) * | 2011-01-07 | 2013-10-10 | Nec Corporation | Speech recognition system, speech recognition method, and speech recognition program |
JPWO2012093661A1 (ja) * | 2011-01-07 | 2014-06-09 | NEC Corporation | Speech recognition device, speech recognition method, and speech recognition program |
2010
- 2010-12-02: JP application JP2011548868A, patent JP5786717B2 (ja), status: Active
- 2010-12-02: WO application PCT/JP2010/007021, published as WO2011083528A1 (ja), status: Application Filing
- 2010-12-02: US application US13/520,728, patent US9047562B2 (en), status: Active
Also Published As
Publication number | Publication date |
---|---|
JPWO2011083528A1 (ja) | 2013-05-13 |
US20120310866A1 (en) | 2012-12-06 |
JP5786717B2 (ja) | 2015-09-30 |
US9047562B2 (en) | 2015-06-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101805976B1 (ko) | Speech recognition apparatus and method | |
US20180254039A1 (en) | Speech recognition method and device | |
US10832685B2 (en) | Speech processing device, speech processing method, and computer program product | |
EP3121810A1 (en) | Apparatus and method of acoustic score calculation and speech recognition | |
JP2017016131A (ja) | Speech recognition apparatus and method, and electronic device | |
US20150310335A1 (en) | Determining a performance prediction model for a target data analytics application | |
CN104538024A (zh) | Speech synthesis method, apparatus, and device | |
US9905224B2 (en) | System and method for automatic language model generation | |
KR20140028174A (ko) | Speech recognition method and electronic device applying the same | |
US11227580B2 (en) | Speech recognition accuracy deterioration factor estimation device, speech recognition accuracy deterioration factor estimation method, and program | |
US20170169009A1 (en) | Apparatus and method for amending language analysis error | |
EP2988298B1 (en) | Response generation method, response generation apparatus, and response generation program | |
WO2018232591A1 (en) | Sequence recognition processing | |
US20150255090A1 (en) | Method and apparatus for detecting speech segment | |
Kim et al. | Sequential labeling for tracking dynamic dialog states | |
JP5786717B2 (ja) | Data processing device, computer program therefor, and data processing method | |
JP6276513B2 (ja) | Speech recognition device and speech recognition program | |
US20220270637A1 (en) | Utterance section detection device, utterance section detection method, and program | |
CN109727603B (zh) | Speech processing method, apparatus, user equipment, and storage medium | |
CN112259084A (zh) | Speech recognition method, apparatus, and storage medium | |
KR20200102309A (ko) | Speech recognition system using word similarity and method thereof | |
KR102144044B1 (ko) | Apparatus and method for classifying false alarms in machine-learning-based software static testing | |
McDonough et al. | An algorithm for fast composition of weighted finite-state transducers | |
JP4735958B2 (ja) | Text mining device, text mining method, and text mining program | |
WO2021101500A1 (en) | Rescoring automatic speech recognition hypotheses using audio-visual matching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10842048; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2011548868; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 13520728; Country of ref document: US |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 10842048; Country of ref document: EP; Kind code of ref document: A1 |