CN105912521A - Method and apparatus for parsing voice content - Google Patents
Method and apparatus for parsing voice content
- Publication number
- CN105912521A (application number CN201510995231.5A)
- Authority
- CN
- China
- Prior art keywords
- phrases
- word
- word segmentation
- dictionary
- corpus
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Concepts
Name | Sections | Count |
---|---|---|
method | title, claims, abstract, description | 44 |
segmentation | claims, abstract, description | 144 |
cutting process | claims, description | 50 |
analytical method | claims, description | 5 |
function | description | 20 |
process | description | 11 |
training | description | 9 |
diagram | description | 8 |
beating | description | 7 |
fragment | description | 4 |
beneficial effect | description | 3 |
material | description | 3 |
adaptation | description | 2 |
overproduction | description | 2 |
partition | description | 2 |
Homo sapiens | description | 1 |
adaptive effect | description | 1 |
change | description | 1 |
development | description | 1 |
developmental process | description | 1 |
general method | description | 1 |
information processing | description | 1 |
modification | description | 1 |
optical effect | description | 1 |
storage | description | 1 |
substitution reaction | description | 1 |
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/216—Parsing using statistical methods
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
- G06F40/253—Grammatical analysis; Style critique
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510995231.5A CN105912521A (zh) | 2015-12-25 | 2015-12-25 | Method and apparatus for parsing voice content |
PCT/CN2016/096186 WO2017107518A1 (fr) | 2015-12-25 | 2016-08-22 | Method and apparatus for parsing voice content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510995231.5A CN105912521A (zh) | 2015-12-25 | 2015-12-25 | Method and apparatus for parsing voice content |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105912521A (zh) | 2016-08-31 |
Family
ID=56744050
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510995231.5A (Pending) CN105912521A (zh) | Method and apparatus for parsing voice content | 2015-12-25 | 2015-12-25 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105912521A (fr) |
WO (1) | WO2017107518A1 (fr) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399919A (zh) * | 2017-02-06 | 2018-08-14 | 中兴通讯股份有限公司 | Semantic recognition method and apparatus |
US20180342241A1 (en) * | 2017-05-25 | 2018-11-29 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and Apparatus of Recognizing Field of Semantic Parsing Information, Device and Readable Medium |
CN109447863A (zh) * | 2018-10-23 | 2019-03-08 | 广州努比互联网科技有限公司 | 4MAT real-time analysis method and system |
CN109446376A (zh) * | 2018-10-31 | 2019-03-08 | 广东小天才科技有限公司 | Method and system for classifying speech by word segmentation |
CN109635270A (zh) * | 2017-10-06 | 2019-04-16 | 声音猎手公司 | Bidirectional probabilistic natural language rewriting and selection |
CN110998590A (zh) * | 2017-08-17 | 2020-04-10 | 国际商业机器公司 | Domain-specific lexically-driven pre-parser |
CN111831832A (zh) * | 2020-07-27 | 2020-10-27 | 北京世纪好未来教育科技有限公司 | Vocabulary construction method, electronic device, and computer-readable medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10769375B2 (en) | 2017-08-17 | 2020-09-08 | International Business Machines Corporation | Domain-specific lexical analysis |
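CN110390002A (zh) * | 2019-06-18 | 2019-10-29 | 深圳壹账通智能科技有限公司 | Call resource allocation method and apparatus, computer-readable storage medium, and server |
CN112016297B (zh) * | 2020-08-27 | 2023-03-28 | 深圳壹账通智能科技有限公司 | Intent recognition model testing method and apparatus, computer device, and storage medium |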
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101404035A (zh) * | 2008-11-21 | 2009-04-08 | 北京得意音通技术有限责任公司 | Information search method based on text or speech |
CN103294666A (zh) * | 2013-05-28 | 2013-09-11 | 百度在线网络技术(北京)有限公司 | Grammar compilation method, semantic parsing method, and corresponding apparatuses |
US20140249802A1 (en) * | 2013-03-01 | 2014-09-04 | The Software Shop, Inc. | Systems and methods for improving the efficiency of syntactic and semantic analysis in automated processes for natural language understanding using argument ordering |
CN104077275A (zh) * | 2014-06-27 | 2014-10-01 | 北京奇虎科技有限公司 | Method and apparatus for context-based word segmentation |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8612207B2 (en) * | 2004-03-18 | 2013-12-17 | Nec Corporation | Text mining device, method thereof, and program |
US8392453B2 (en) * | 2004-06-25 | 2013-03-05 | Google Inc. | Nonstandard text entry |
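CN100405362C (zh) * | 2005-10-13 | 2008-07-23 | 中国科学院自动化研究所 | Method and apparatus for parsing spoken Chinese |
CN101788989A (zh) * | 2009-01-22 | 2010-07-28 | 蔡亮华 | Vocabulary information processing method and system |
CN105096933B (zh) * | 2015-05-29 | 2017-06-20 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating a word segmentation dictionary, and method and apparatus for speech synthesis |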
- 2015-12-25: CN application CN201510995231.5A, published as CN105912521A (zh), status: Pending
- 2016-08-22: WO application PCT/CN2016/096186, published as WO2017107518A1 (fr), status: Application Filing
Non-Patent Citations (1)
Title |
---|
《计算机与数字工程》 (Computer & Digital Engineering) * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399919A (zh) * | 2017-02-06 | 2018-08-14 | 中兴通讯股份有限公司 | Semantic recognition method and apparatus |
US20180342241A1 (en) * | 2017-05-25 | 2018-11-29 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and Apparatus of Recognizing Field of Semantic Parsing Information, Device and Readable Medium |
US10777192B2 (en) * | 2017-05-25 | 2020-09-15 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus of recognizing field of semantic parsing information, device and readable medium |
CN110998590A (zh) * | 2017-08-17 | 2020-04-10 | 国际商业机器公司 | Domain-specific lexically-driven pre-parser |
CN110998590B (zh) * | 2017-08-17 | 2024-01-26 | 国际商业机器公司 | Domain-specific lexically-driven pre-parser |
CN109635270A (zh) * | 2017-10-06 | 2019-04-16 | 声音猎手公司 | Bidirectional probabilistic natural language rewriting and selection |
CN109635270B (zh) * | 2017-10-06 | 2023-03-07 | 声音猎手公司 | Bidirectional probabilistic natural language rewriting and selection |
CN109447863A (zh) * | 2018-10-23 | 2019-03-08 | 广州努比互联网科技有限公司 | 4MAT real-time analysis method and system |
CN109446376A (zh) * | 2018-10-31 | 2019-03-08 | 广东小天才科技有限公司 | Method and system for classifying speech by word segmentation |
CN109446376B (zh) * | 2018-10-31 | 2021-06-25 | 广东小天才科技有限公司 | Method and system for classifying speech by word segmentation |
CN111831832A (zh) * | 2020-07-27 | 2020-10-27 | 北京世纪好未来教育科技有限公司 | Vocabulary construction method, electronic device, and computer-readable medium |
Also Published As
Publication number | Publication date |
---|---|
WO2017107518A1 (fr) | 2017-06-29 |
Similar Documents
Publication | Title |
---|---|
CN105912521A (zh) | Method and apparatus for parsing voice content |
JP6675463B2 (ja) | Bidirectional probabilistic rewriting and selection of natural language |
US11113234B2 (en) | Semantic extraction method and apparatus for natural language, and computer storage medium |
CN105917327B (zh) | System and method for inputting text into an electronic device |
KR101744861B1 (ko) | Compound word splitting |
US8719021B2 (en) | Speech recognition dictionary compilation assisting system, speech recognition dictionary compilation assisting method and speech recognition dictionary compilation assisting program |
US8612206B2 (en) | Transliterating semitic languages including diacritics |
CN104166462B (zh) | Text input method and system |
CN1135485C (zh) | Recognition of words in Japanese text using a computer system |
CN108124477A (zh) | Improving a word segmenter based on pseudo data to process natural language |
JP6817556B2 (ja) | Similar sentence generation method, similar sentence generation program, similar sentence generation device, and similar sentence generation system |
JP2011118689A (ja) | Search method and system |
Álvarez et al. | Towards customized automatic segmentation of subtitles |
CN118246412A (zh) | Text polishing training data screening method and apparatus, related device, and computer program product |
JP2012037790A (ja) | Spoken dialogue device |
KR100509917B1 (ko) | Apparatus and method for word spacing and spelling correction using word-phrase n-grams |
CN107861937B (zh) | Update method, update device, and recording medium for a parallel translation corpus |
CN112949286A (zh) | Automatic Chinese syntactic parser based on sentence-pattern structure |
JP2001229180A (ja) | Content search device |
Kuo et al. | Morphological and syntactic features for Arabic speech recognition |
JP6260208B2 (ja) | Text summarization device |
US20110106849A1 (en) | New case generation device, new case generation method, and new case generation program |
Gupta et al. | Quality Estimation of Machine Translation Outputs Through Stemming |
Ray et al. | Iterative delexicalization for improved spoken language understanding |
KR101982490B1 (ko) | Keyword search method based on character data conversion, and apparatus therefor |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160831 |