CN111354339B - Vocabulary phoneme list construction method, device, equipment and storage medium - Google Patents

Publication number
CN111354339B
Authority
CN
China
Prior art keywords
vocabulary
phonetic
phonetic symbols
symbols
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010150627.0A
Other languages
Chinese (zh)
Other versions
CN111354339A (en)
Inventor
赵伟伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202010150627.0A priority Critical patent/CN111354339B/en
Publication of CN111354339A publication Critical patent/CN111354339A/en
Application granted granted Critical
Publication of CN111354339B publication Critical patent/CN111354339B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method, device, equipment and storage medium for constructing a vocabulary phoneme table. The method includes: selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, so that several phonetic symbols are obtained for each word to be annotated; selecting, based on a voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and converting the target phonetic symbol into phonemes to generate the vocabulary phoneme table. Because several tools annotate each word and the target phonetic symbol is determined by voting, the quality of the vocabulary phoneme table and the efficiency of constructing it are both improved.

Description

Vocabulary phoneme table construction method, device, equipment and storage medium

Technical field

The present invention relates to the technical field of speech recognition, and in particular to a method, device, equipment and storage medium for constructing a vocabulary phoneme table.

Background

With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.

The vocabulary phoneme table (lexicon) is a key part of building a hybrid speech recognition system. To improve recognition performance, a speech recognition system usually needs to convert words into finer-grained phonemes. At present a vocabulary phoneme table is built either by manual annotation or from an open-source lexicon. Manual annotation is time-consuming and labor-intensive, while the quality of a table built from an open-source lexicon cannot be guaranteed and, in most cases, such a table lacks the specialized vocabulary of professional fields.

Summary of the invention

The present invention provides a method, device, equipment and storage medium for constructing a vocabulary phoneme table, aiming to improve the quality of the vocabulary phoneme table and the efficiency of constructing it.

To achieve the above object, the present invention provides a method for constructing a vocabulary phoneme table, the method including:

selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols for each word to be annotated;

selecting, based on a voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and

converting the target phonetic symbol into phonemes to generate the vocabulary phoneme table.

Preferably, the voting strategy determines the winning phonetic symbol according to the number of votes each phonetic symbol receives;

the step of selecting, based on the voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol includes:

assigning one original vote to each of the several phonetic symbols;

if identical phonetic symbols exist among the several phonetic symbols, merging the original votes of the identical phonetic symbols, marking each merged phonetic symbol as a candidate phonetic symbol, and counting the votes of each candidate phonetic symbol; and

sorting the candidate phonetic symbols by number of votes, determining the winning phonetic symbol from the sorting result, and using the winning phonetic symbol as the target phonetic symbol.

Preferably, after the step of assigning one original vote to each of the several phonetic symbols, the method further includes:

if no identical phonetic symbols exist among the several phonetic symbols, determining that there is no winning phonetic symbol among them;

marking the corresponding word to be annotated as an ambiguous word, and moving the ambiguous word into an ambiguous word pool; and

annotating the ambiguous words in the ambiguous word pool with several backup word-to-phonetic-symbol tools until a winning phonetic symbol is obtained for each ambiguous word.

Preferably, before the step of selecting, based on the voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol, the method further includes:

determining whether the word to be annotated is a polyphonic word (a word with more than one pronunciation) according to the number of phonetic symbols produced by each word-to-phonetic-symbol tool:

if one or more of the tools produce more than one phonetic symbol for the word, determining that the word to be annotated is a polyphonic word;

if the word to be annotated is a polyphonic word, splitting the polyphonic word into several sub-words to be annotated; and

storing each sub-word to be annotated in association with its corresponding phonetic symbols and, once the several phonetic symbols of each sub-word have been obtained, performing the step of selecting, based on the voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol.
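The polyphonic-word check described above can be sketched as follows. This is a minimal illustration only: the function name `is_polyphonic`, the tool names and the sample outputs are assumptions, and the splitting into sub-words is not shown because the text does not specify how it is carried out.

```python
from typing import Dict, List


def is_polyphonic(readings_by_tool: Dict[str, List[str]]) -> bool:
    """Return True if at least one tool produced more than one phonetic symbol."""
    return any(len(readings) > 1 for readings in readings_by_tool.values())


# Hypothetical tool names and outputs, for illustration only.
single = {"tool_a": ["zhong1"], "tool_b": ["zhong1"], "tool_c": ["zong1"]}
multi = {"tool_a": ["zhong1", "zhong4"], "tool_b": ["zhong1"], "tool_c": ["zhong1"]}

print(is_polyphonic(single))  # False
print(is_polyphonic(multi))   # True
```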

Preferably, after the step of selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols for each word to be annotated, the method further includes:

normalizing the several phonetic symbols to obtain several phonetic symbols in a consistent format, from which the winning phonetic symbol is then selected.

Preferably, the step of converting the target phonetic symbol into phonemes to generate the vocabulary phoneme table includes:

converting the target phonetic symbol into phonemes based on a phoneme format, and generating the vocabulary phoneme table from the phonemes of the words to be annotated.

Preferably, after the step of converting the target phonetic symbol into phonemes to generate the vocabulary phoneme table, the method further includes:

receiving a vocabulary phoneme table update request, and obtaining a target word to update and a target update operation from the request; and

performing, based on the target update operation, the corresponding update operation on the phonemes of the target word.
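One way to picture the update step is the sketch below. The text does not list the supported operations, so the operation names "add", "modify" and "delete", the function `apply_update` and the dictionary layout are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

Lexicon = Dict[str, List[str]]  # word -> phoneme sequence (assumed layout)


def apply_update(lexicon: Lexicon, word: str, operation: str,
                 phonemes: Optional[List[str]] = None) -> None:
    """Apply one update request to the lexicon in place."""
    if operation in ("add", "modify"):
        if not phonemes:
            raise ValueError("add/modify requires a phoneme sequence")
        lexicon[word] = list(phonemes)
    elif operation == "delete":
        lexicon.pop(word, None)  # deleting a missing word is a no-op
    else:
        raise ValueError(f"unknown operation: {operation}")


lexicon = {"中": ["zh", "ong1"]}
apply_update(lexicon, "麻", "add", ["m", "a2"])
apply_update(lexicon, "中", "delete")
print(lexicon)  # {'麻': ['m', 'a2']}
```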

In addition, to achieve the above object, the present invention further provides a vocabulary phoneme table construction device, which includes:

a selection module, configured to select several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols for each word to be annotated;

a voting module, configured to select, based on a voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and

a conversion module, configured to convert the target phonetic symbol into phonemes to generate the vocabulary phoneme table.

In addition, to achieve the above object, the present invention further provides vocabulary phoneme table construction equipment, which includes a processor, a memory, and a vocabulary phoneme table construction program stored in the memory. When the program is run by the processor, the steps of the vocabulary phoneme table construction method described above are implemented.

In addition, to achieve the above object, the present invention further provides a computer storage medium on which a vocabulary phoneme table construction program is stored. When the program is run by a processor, the steps of the vocabulary phoneme table construction method described above are implemented.

Compared with the prior art, the present invention provides a method, device, equipment and storage medium for constructing a vocabulary phoneme table. The method includes: selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols for each word to be annotated; selecting, based on a voting strategy, the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and converting the target phonetic symbol into phonemes to generate the vocabulary phoneme table. Annotating each word with several tools and determining the target phonetic symbol by voting removes the effort of manual annotation and avoids the inaccuracies that arise when manual annotation or an open-source lexicon lacks certain words, such as rare words and specialized terms. The quality of the vocabulary phoneme table and the efficiency of constructing it are thereby improved.

Description of the drawings

Figure 1 is a schematic diagram of the hardware structure of the vocabulary phoneme table construction equipment involved in the embodiments of the present invention;

Figure 2 is a schematic flowchart of the first embodiment of the vocabulary phoneme table construction method of the present invention;

Figure 3 is a schematic flowchart of the second embodiment of the vocabulary phoneme table construction method of the present invention;

Figure 4 is a schematic flowchart of the third embodiment of the vocabulary phoneme table construction method of the present invention;

Figure 5 is a schematic diagram of the functional modules of the first embodiment of the vocabulary phoneme table construction device of the present invention.

The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.

The vocabulary phoneme table construction equipment mainly involved in the embodiments of the present invention is network-connected equipment capable of establishing a network connection; it may be a server, a cloud platform, or the like.

Referring to Figure 1, Figure 1 is a schematic diagram of the hardware structure of the vocabulary phoneme table construction equipment involved in the embodiments of the present invention. In the embodiments, the equipment may include a processor 1001 (for example a central processing unit, CPU), a communication bus 1002, an input port 1003, an output port 1004 and a memory 1005. The communication bus 1002 provides the connections and communication between these components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be high-speed RAM or non-volatile memory, such as disk storage; optionally, the memory 1005 may also be a storage device independent of the processor 1001. Those skilled in the art will understand that the hardware structure shown in Figure 1 does not limit the present invention, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

Continuing to refer to Figure 1, the memory 1005, as a readable storage medium, may include an operating system, a network communication module, an application module and a vocabulary phoneme table construction program. In Figure 1, the network communication module is mainly used to connect to a server and exchange data with it, while the processor 1001 can call the vocabulary phoneme table construction program stored in the memory 1005 and execute the vocabulary phoneme table construction method provided by the embodiments of the present invention.

An embodiment of the present invention provides a method for constructing a vocabulary phoneme table.

Referring to Figure 2, Figure 2 is a schematic flowchart of the first embodiment of the vocabulary phoneme table construction method of the present invention.

In this embodiment, the vocabulary phoneme table construction method is applied to vocabulary phoneme table construction equipment, and the method includes:

Step S101: select several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols for each word to be annotated;

The vocabulary phoneme table is a key part of building a hybrid speech recognition system. To ensure good recognition, a speech recognition system usually needs to convert words into finer-grained phonemes. A vocabulary may contain millions of words, whereas a well-designed phoneme set often has only a few hundred phonemes. By modeling this much smaller set of phonemes rather than whole words, the speech recognition system can significantly reduce its search space and improve recognition. In this embodiment, the words to be annotated are prepared in advance; they may include Chinese, English, Japanese, French and other words, covering common words, rare words, specialized terms and so on.

Further, several word-to-phonetic-symbol tools are selected; the number of tools is greater than or equal to 3. The tools include, but are not limited to, tools that convert Chinese words to pinyin and tools that convert English words to American-pronunciation phonetic symbols.

A word-to-phonetic-symbol tool converts a word into its corresponding phonetic symbol. For example, the Chinese word "专利" (patent) can be converted into the phonetic symbol [zhuan1 li4], and the English word "patent" into the American phonetic symbol ['peitnt].

Different tools may output phonetic symbols in different forms. Therefore, after the several phonetic symbols of a word are obtained, they are normalized into a consistent format. Specifically, the format of the normalized phonetic symbols is set first, and all of the phonetic symbols are then normalized into that format. For example, for Chinese words the format can be set to "pinyin + tone", with the tone represented by the digits 1 (first tone), 2 (second tone), 3 (third tone), 4 (fourth tone) and 5 (neutral tone). If a tool marks tones with "ˉ, ˊ, ˇ, ˋ", these are converted into the numeric tones. For example, if a tool annotates the word "麻" as [má], it is normalized to [ma2].
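The tone normalization described above might look like the following sketch for a single pinyin syllable. The function name `normalize_syllable` is hypothetical, and the approach (Unicode decomposition, then mapping the combining tone mark to a digit) is one possible implementation rather than the one used in the patent. A syllable without any tone mark is assumed to carry the neutral tone.

```python
import unicodedata

# Combining diacritics for the four Mandarin tones, mapped to tone digits.
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}


def normalize_syllable(syllable: str) -> str:
    """Convert one tone-marked pinyin syllable to 'letters + tone digit'."""
    if syllable and syllable[-1] in "12345":
        return syllable  # already in the target format
    tone = "5"  # assumption: no tone mark means neutral tone
    kept = []
    # NFD splits e.g. "á" into "a" plus a combining acute accent.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # NFC recomposes any remaining marks, e.g. the diaeresis of "ü".
    return unicodedata.normalize("NFC", "".join(kept)) + tone


print(normalize_syllable("má"))   # ma2
print(normalize_syllable("ma2"))  # ma2
```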

It can be understood that, after the words to be annotated have been annotated by the several tools, each word has several corresponding phonetic symbols, and the number of phonetic symbols corresponds to the number of tools. For example, if 5 tools are used and the word has a single pronunciation, the word has 5 corresponding phonetic symbols; if the word is a polyphonic word (a word with more than one pronunciation), it has more than 5.

Step S102: based on a voting strategy, select the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol;

It can be understood that no word-to-phonetic-symbol tool is 100% accurate, so the phonetic symbols that different tools produce for the same word may differ. This embodiment uses a voting strategy to pick the correct phonetic symbol from the several phonetic symbols of the word to be annotated.

Specifically, step S102, selecting the winning phonetic symbol from the several phonetic symbols as the target phonetic symbol based on the voting strategy, includes:

Step S102a: assign one original vote to each of the several phonetic symbols;

Each phonetic symbol is a voting candidate and receives original votes. In this embodiment the original votes are virtual votes, and every phonetic symbol is given the same number of them. To simplify counting, this embodiment sets that number to one.

Step S102b: if identical phonetic symbols exist among the several phonetic symbols, merge the original votes of the identical phonetic symbols, mark each merged phonetic symbol as a candidate phonetic symbol, and count the votes of each candidate phonetic symbol;

In general, even though the tools are not 100% accurate, different tools annotating the same word will still produce identical phonetic symbols. The phonetic symbols produced by the tools are compared, identical ones are identified, and their original votes are merged. In this embodiment, the votes can be merged using scripts, statistical tools and the like.

Further, each merged phonetic symbol is marked as a candidate phonetic symbol, and its vote count is computed and saved for the subsequent selection of the winning phonetic symbol.

Step S102c: sort the candidate phonetic symbols by number of votes, determine the winning phonetic symbol from the sorting result, and use the winning phonetic symbol as the target phonetic symbol.

After merging, the vote count of each candidate phonetic symbol is calculated and the candidates are sorted by vote count according to a sorting rule. If the rule is descending order (highest vote count first), the first candidate is the winning phonetic symbol; if the rule is ascending order (lowest vote count first), the last candidate is the winning phonetic symbol. The winning phonetic symbol is used as the target phonetic symbol.

For example, for the word "中", suppose there are 5 phonetic symbols: [zhong1], [zhong1], [zong1], [zhong1] and [zong1]. Each is given one original vote; the three [zhong1] votes are merged and the two [zong1] votes are merged, making [zhong1] and [zong1] the candidate phonetic symbols. With descending order as the sorting rule, [zhong1] has 3 votes and [zong1] has 2, so the sorting result is [zhong1] > [zong1]; [zhong1] ranks first and is therefore the winning phonetic symbol.
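The vote-merge-sort procedure of steps S102a to S102c can be sketched with a counter. The function name `vote` is hypothetical. Returning `None` when no two tools agree follows the text; returning `None` on a tie for first place is an added assumption, since the text does not say how ties are handled.

```python
from collections import Counter
from typing import List, Optional


def vote(symbols: List[str]) -> Optional[str]:
    """Pick the winning phonetic symbol, or None if there is no winner."""
    if not symbols:
        return None
    # One original vote per symbol; identical symbols merge their votes.
    counts = Counter(symbols)
    # most_common() sorts candidates by vote count, highest first.
    ranked = counts.most_common()
    top_symbol, top_votes = ranked[0]
    if top_votes < 2:
        return None  # no identical symbols, so no winner
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None  # assumption: a tie for first place is treated as no winner
    return top_symbol


print(vote(["zhong1", "zhong1", "zong1", "zhong1", "zong1"]))  # zhong1
```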

If the several phonetic symbols of a word include one or more that can be merged and one or more that cannot, a merged phonetic symbol necessarily has more votes than an unmerged one, so the unmerged phonetic symbols can be ignored.

In addition, a minimum vote count for the winning phonetic symbol can be set according to the number of tools. For example, if there are 7 tools, the minimum can be set to 5. This further improves the accuracy of the vocabulary phoneme table.
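The minimum-vote rule can be illustrated as below, using the 7-tool, 5-vote figures from the text. The function name `winner_with_minimum` and the sample symbols are assumptions for illustration.

```python
from collections import Counter
from typing import List, Optional


def winner_with_minimum(symbols: List[str], min_votes: int) -> Optional[str]:
    """Return the top-voted symbol only if it reaches the minimum vote count."""
    if not symbols:
        return None
    symbol, votes = Counter(symbols).most_common(1)[0]
    return symbol if votes >= min_votes else None


# 7 tools, winner must have at least 5 votes.
strong = ["zhong1"] * 5 + ["zong1"] * 2
weak = ["zhong1"] * 4 + ["zong1"] * 3
print(winner_with_minimum(strong, 5))  # zhong1
print(winner_with_minimum(weak, 5))    # None
```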

Further, after the step of assigning one original vote to each of the several phonetic symbols, the method further includes:

Step S102a1: if no identical phonetic symbols exist among the several phonetic symbols, determine that there is no winning phonetic symbol among them;

If no identical phonetic symbols exist, every tool has produced a different phonetic symbol, and it is difficult to determine which tool is correct; it is therefore determined that there is no winning phonetic symbol.

Step S102a2: mark the corresponding word to be annotated as an ambiguous word, and move the ambiguous word into the ambiguous word pool;

Without a winning phonetic symbol, the accuracy of the word's annotation cannot be guaranteed. The word is therefore marked as an ambiguous word and moved into the ambiguous word pool.

Step S102a3: annotate the ambiguous words in the ambiguous word pool with several backup word-to-phonetic-symbol tools until a winning phonetic symbol is obtained for each ambiguous word.

For the ambiguous words, several backup tools are selected to re-annotate the words in the ambiguous word pool. At least one of the backup tools differs from the tools used previously. The words in the pool may also be annotated manually. Actual tests show that the ambiguous words produced by the technical solution of this embodiment account for less than 1% of the words to be annotated, so even manual annotation of them requires little labor.
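A rough sketch of working through the ambiguous word pool follows. It assumes each tool can be modeled as a function from a word to one phonetic symbol, that each round of backup tools votes on its own outputs only, and that words still unresolved after all rounds are left for manual annotation. The names `resolve_pool`, `winner` and the lambda tools are hypothetical, and ties are not handled in this sketch.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]  # assumed tool interface: word -> phonetic symbol


def winner(symbols: List[str]) -> Optional[str]:
    """Top-voted symbol, or None when no two tools agree."""
    if not symbols:
        return None
    symbol, votes = Counter(symbols).most_common(1)[0]
    return symbol if votes >= 2 else None


def resolve_pool(pool: List[str],
                 backup_rounds: List[List[Tool]]) -> Tuple[Dict[str, str], List[str]]:
    """Re-annotate ambiguous words round by round with backup tools."""
    resolved: Dict[str, str] = {}
    remaining = list(pool)
    for tools in backup_rounds:
        still_ambiguous = []
        for word in remaining:
            result = winner([tool(word) for tool in tools])
            if result is None:
                still_ambiguous.append(word)
            else:
                resolved[word] = result
        remaining = still_ambiguous
        if not remaining:
            break
    return resolved, remaining  # remaining words would go to manual annotation


# Hypothetical backup tools that return fixed outputs, for illustration only.
backup = [lambda w: "hang2", lambda w: "hang2", lambda w: "xing2"]
resolved, remaining = resolve_pool(["行"], [backup])
print(resolved, remaining)  # {'行': 'hang2'} []
```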

It can be understood that, in other embodiments, a different voting strategy may be set and the winning phonetic symbol obtained according to that strategy. For example, the several phonetic symbols are first grouped so that identical phonetic symbols fall into the same group, yielding multiple groups of phonetic symbols; the number of phonetic symbols in each group is then counted as that group's vote count, the candidate phonetic symbols are sorted by vote count, and the winning phonetic symbol is determined from the sorted result.
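A minimal sketch of the group, count, and sort strategy described above. The function name `vote` is hypothetical, and the tie rule (no winner when the two largest groups are the same size) is an assumption, since the text only says the winner is determined from the sorted result.

```python
from collections import defaultdict

def vote(symbols):
    """Group identical phonetic symbols, count each group, sort, pick the top."""
    groups = defaultdict(list)
    for symbol in symbols:
        groups[symbol].append(symbol)  # identical symbols share one group
    # Each group's size is its vote count; sort from most to fewest votes.
    ranked = sorted(
        ((len(members), symbol) for symbol, members in groups.items()),
        reverse=True,
    )
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][0] == ranked[1][0]:
        return None  # assumption: a tie at the top means no winning symbol
    return ranked[0][1]
```

Under this assumed rule, three tools that all disagree produce three groups of size one, which is a tie and therefore yields no winner, matching the ambiguous-word case.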

Step S103: Convert the target phonetic symbol into phonemes and generate a vocabulary phoneme table.

Specifically, the target phonetic symbol is converted into phonemes based on a predetermined phoneme format, and the phonemes of the words to be annotated are saved as a vocabulary phoneme table. The phoneme format includes the phoneme connection method, the phoneme arrangement order, and so on; the phoneme connection method includes connection by dots, connection by spaces, and so on. For example, for the word to be annotated "普通话" (Mandarin), the corresponding target phonetic symbol is [pu1tong1hua4], and the corresponding phonemes are represented as [p u1 t ong1 h ua4].
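The conversion in step S103 can be illustrated with a short sketch that reproduces the example above. It assumes the phonetic symbol is toned pinyin with a tone digit after every syllable, and that each syllable is split into an optional initial plus a final that carries the tone. The `INITIALS` tuple, the treatment of `y` and `w` as initials, and the name `to_phonemes` are assumptions for illustration, not part of the source.

```python
import re

# Assumed pinyin initials; two-letter initials come first so they match first.
INITIALS = (
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "r", "z", "c", "s", "y", "w",
)

def to_phonemes(symbol, sep=" "):
    """Split a toned-pinyin phonetic symbol into phonemes joined by `sep`."""
    phonemes = []
    # A syllable is a run of letters followed by a tone digit 1-5.
    for syllable in re.findall(r"[a-z]+[1-5]", symbol):
        initial = next((i for i in INITIALS if syllable.startswith(i)), "")
        if initial:
            phonemes.append(initial)
        phonemes.append(syllable[len(initial):])  # final plus tone digit
    return sep.join(phonemes)
```

Passing `sep="."` gives the dot-connected form mentioned in the phoneme format, while the default gives the space-connected form.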

After the phonemes of all the words to be annotated have been obtained, the phonemes are saved and the vocabulary phoneme table is generated.

Through the above solution, this embodiment selects several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, so that several phonetic symbols are obtained for each word to be annotated; based on the voting strategy, a winning phonetic symbol is selected from the several phonetic symbols as the target phonetic symbol; and the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table. Annotating the words with multiple word-to-phonetic-symbol tools and determining the target phonetic symbol by voting improves both the quality of the vocabulary phoneme table and the efficiency of constructing it.

As shown in Figure 3, the second embodiment of the present invention proposes a vocabulary phoneme table construction method. Based on the first embodiment shown in Figure 2, before the step of selecting a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol based on the voting strategy, the method further includes:

Step S1011: Determine whether the word to be annotated is a polyphonic word according to the number of phonetic symbols annotated by each word-to-phonetic-symbol tool;

A polyphonic word is a word that has two or more phonetic symbols; the term is short for a word that is written with the same characters but has different pronunciations. Specifically, whether the word to be annotated is a polyphonic word is determined according to the number of phonetic symbols annotated by each word-to-phonetic-symbol tool.

Step S1012: If the number of phonetic symbols annotated by one or more of the word-to-phonetic-symbol tools is greater than 1, determine that the word to be annotated is a polyphonic word;

In this embodiment, whether the word to be annotated is a polyphonic word is determined according to the number of phonetic symbols. If the number of phonetic symbols annotated by one or more of the word-to-phonetic-symbol tools is greater than 1, the word to be annotated is determined to be a polyphonic word. For example, the word "朝阳" has two phonetic symbols, [zhao1 yang2] and [chao2 yang2]. If the word is annotated by multiple word-to-phonetic-symbol tools, then even allowing for the tools' error rates, at least one tool can be expected to annotate both phonetic symbols.
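The check in steps S1011 and S1012 reduces to a single test over the tools' outputs. The sketch below assumes each tool's result for a word is available as a list of phonetic symbols keyed by tool name; the function name `is_polyphonic` and the tool names in the usage example are hypothetical.

```python
def is_polyphonic(annotations):
    """Return True if any tool annotated more than one phonetic symbol.

    annotations: mapping of tool name -> list of phonetic symbols for one word.
    """
    return any(len(symbols) > 1 for symbols in annotations.values())


# Usage: one tool reports both readings of "朝阳", so the word is polyphonic.
example = {
    "tool_a": ["zhao1 yang2"],
    "tool_b": ["zhao1 yang2", "chao2 yang2"],
}
print(is_polyphonic(example))
```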

Step S1013: If the word to be annotated is a polyphonic word, split the word to be annotated into several sub-words to be annotated.

In this embodiment, the number of sub-words to be annotated is the same as the number of phonetic symbols. For example, "朝阳" can be split into two sub-words.

Further, when splitting into sub-words, the word to be annotated itself is mapped to its most commonly used phonetic symbol, and the remaining phonetic symbols are mapped in the form "word + suffix". The suffix may consist of letters, digits, and so on. In this embodiment, the most commonly used phonetic symbol is the one used most frequently in the sentences that contain the word across the sentence libraries. For example, the polyphonic word "万" has the phonetic symbols [wan4] and [mo4], of which [wan4] is the most commonly used, so the mappings are [万 wan4] and [万_2 mo4].
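The "word + suffix" mapping can be sketched as below. The `_2`, `_3`, ... suffix pattern is inferred from the single [万_2 mo4] example, the usage counts are made-up placeholders rather than measured corpus frequencies, and the name `split_polyphonic` is hypothetical.

```python
def split_polyphonic(word, reading_counts):
    """Map a polyphonic word to sub-word entries, most frequent reading first.

    reading_counts: mapping of phonetic symbol -> usage count in a sentence
    library (hypothetical numbers in the example below).
    """
    ranked = sorted(reading_counts, key=reading_counts.get, reverse=True)
    entries = {word: ranked[0]}  # the bare word keeps the most common reading
    for index, reading in enumerate(ranked[1:], start=2):
        entries[f"{word}_{index}"] = reading  # assumed suffix pattern "_N"
    return entries


# Usage with placeholder counts: {"万": "wan4", "万_2": "mo4"}
print(split_polyphonic("万", {"wan4": 980, "mo4": 20}))
```

Each resulting sub-word then has its own set of candidate phonetic symbols and can be passed to the voting step independently.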

Step S1014: Save each of the several sub-words to be annotated in association with its corresponding phonetic symbol, and, after the several phonetic symbols of the sub-words to be annotated have been obtained, perform the step of selecting a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol based on the voting strategy.

The several sub-words to be annotated are saved in association with their corresponding phonetic symbols, and the mapping result of each sub-word to be annotated is marked. In this way, the several phonetic symbols of the sub-words to be annotated are obtained.

After the several phonetic symbols of the sub-words to be annotated have been obtained, step S102 is performed: based on the voting strategy, a winning phonetic symbol is selected from the several phonetic symbols as the target phonetic symbol.

Through the above solution, this embodiment selects several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, so that several phonetic symbols are obtained for each word to be annotated; based on the voting strategy, a winning phonetic symbol is selected from the several phonetic symbols as the target phonetic symbol; and the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table. Annotating the words with multiple word-to-phonetic-symbol tools and determining the target phonetic symbol by voting improves both the quality of the vocabulary phoneme table and the efficiency of constructing it.

As shown in Figure 4, the third embodiment of the present invention proposes a vocabulary phoneme table construction method. Based on the first and second embodiments shown in Figures 2 and 3, after the step of converting the target phonetic symbol into phonemes and generating a vocabulary phoneme table, the method further includes:

Step S104: Receive a vocabulary phoneme table update request, and obtain the target word to be updated and the target update operation from the vocabulary phoneme table update request;

After the vocabulary phoneme table has been constructed, it needs to be modified, added to, or deleted from in order to obtain a more complete and accurate vocabulary phoneme table.

Specifically, a vocabulary phoneme table update request is received. The request includes an update operation, and the update operations include modification, deletion, and addition. The word to be updated and the corresponding target update operation are obtained from the request; the target update operation is one of modification, addition, and deletion.

Step S105: Based on the target update operation, perform the corresponding update operation on the phonemes of the target word.

The target word to be updated, and the target update operation corresponding to the target word, are obtained from the vocabulary phoneme table update request. If the target update operation is a modification, the modified phonemes are also obtained, and the phonemes of the target word in the vocabulary phoneme table are replaced with the modified phonemes. If the target update operation is an addition, the target word to be added and its phonemes are obtained, and the target word and its phonemes are saved to the vocabulary phoneme table.
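A minimal sketch of steps S104 and S105, assuming the vocabulary phoneme table is held as a dictionary from word to phoneme string and the update request is a dictionary with the hypothetical keys `word`, `operation`, and `phonemes`. The description lists deletion among the update operations without detailing it, so the delete branch here is an assumption.

```python
def apply_update(table, request):
    """Apply one update request to a word -> phonemes table and return it."""
    word = request["word"]
    operation = request["operation"]
    if operation == "modify":
        if word not in table:
            raise KeyError(f"cannot modify unknown word: {word}")
        table[word] = request["phonemes"]  # replace with the modified phonemes
    elif operation == "add":
        table[word] = request["phonemes"]  # save the new word and its phonemes
    elif operation == "delete":
        table.pop(word, None)  # assumed behavior: remove the entry if present
    else:
        raise ValueError(f"unsupported update operation: {operation}")
    return table
```

The phonemes carried by an add or modify request could come from manual annotation or from the voting pipeline of the first embodiment.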

In this embodiment, the added or modified phonemes may be manually annotated phonemes, or phonemes obtained by the solution of the first embodiment of the present invention.

Through the above solution, this embodiment selects several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, so that several phonetic symbols are obtained for each word to be annotated; based on the voting strategy, a winning phonetic symbol is selected from the several phonetic symbols as the target phonetic symbol; the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table; a vocabulary phoneme table update request is received, and the target word to be updated and the target update operation are obtained from it; and, based on the target update operation, the corresponding update operation is performed on the phonemes of the target word. Annotating the words with multiple word-to-phonetic-symbol tools and determining the target phonetic symbol by voting improves both the quality of the vocabulary phoneme table and the efficiency of constructing it, and the table can also be updated quickly.

In addition, this embodiment also provides a vocabulary phoneme table construction device. Referring to Figure 5, Figure 5 is a schematic diagram of the functional modules of the first embodiment of the vocabulary phoneme table construction device of the present invention.

In this embodiment, the vocabulary phoneme table construction device is a virtual device stored in the memory 1005 of the vocabulary phoneme table construction equipment shown in Figure 1, so as to implement all functions of the vocabulary phoneme table construction program: selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols of the words to be annotated; selecting, based on a voting strategy, a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and converting the target phonetic symbol into phonemes to generate a vocabulary phoneme table.

Specifically, the vocabulary phoneme table construction device includes:

A selection module 10, used to select several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols of the words to be annotated;

A voting module 20, used to select, based on a voting strategy, a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol;

A conversion module 30, used to convert the target phonetic symbol into phonemes and generate a vocabulary phoneme table.

Further, the voting module includes:

An assigning unit, used to assign one original vote to each of the several phonetic symbols;

A vote-merging unit, used to, if identical phonetic symbols exist among the several phonetic symbols, merge the original votes of the identical phonetic symbols, mark the phonetic symbols whose votes have been merged as candidate phonetic symbols, and count the votes of the candidate phonetic symbols;

A determining unit, used to sort the candidate phonetic symbols by number of votes and determine the winning phonetic symbol from the sorted result.

Further, the assigning unit also includes:

A judging subunit, used to determine that there is no winning phonetic symbol among the several phonetic symbols if no identical phonetic symbols exist among them;

A marking subunit, used to mark the corresponding word to be annotated as an ambiguous word and transfer the ambiguous word into the ambiguous word pool;

An obtaining subunit, used to annotate the ambiguous words in the ambiguous word pool with several backup word-to-phonetic-symbol tools until the winning phonetic symbol of each ambiguous word is obtained.

Further, the voting module also includes:

A judging unit, used to judge whether the word to be annotated is a polyphonic word according to the number of phonetic symbols annotated by each word-to-phonetic-symbol tool;

A determination unit, used to determine that the word to be annotated is a polyphonic word if the number of phonetic symbols annotated by one or more of the word-to-phonetic-symbol tools is greater than 1;

A splitting unit, used to split the polyphonic word into several sub-words to be annotated if the word to be annotated is a polyphonic word;

A saving unit, used to save each of the several sub-words to be annotated in association with its corresponding phonetic symbol, and, after the several phonetic symbols of the sub-words to be annotated have been obtained, to perform the step of selecting a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol based on the voting strategy.

Further, the selection module is also used for:

Normalizing the several phonetic symbols to obtain several phonetic symbols in a consistent format, so that the winning phonetic symbol can be selected from the consistently formatted phonetic symbols.

Further, the conversion module is also used for:

Converting the target phonetic symbol into phonemes based on a phoneme format, and generating a vocabulary phoneme table from the phonemes of the words to be annotated.

Further, the conversion module also includes:

An acquisition unit, used to receive a vocabulary phoneme table update request and obtain the target word to be updated and the target update operation from the vocabulary phoneme table update request;

An update unit, used to perform, based on the target update operation, the corresponding update operation on the phonemes of the target word.

In addition, an embodiment of the present invention also provides a computer storage medium on which a vocabulary phoneme table construction program is stored. When the program is run by a processor, it implements the steps of the vocabulary phoneme table construction method described above, which are not repeated here.

Compared with the prior art, the present invention proposes a vocabulary phoneme table construction method, device, equipment and storage medium. The method includes: selecting several word-to-phonetic-symbol tools, each of which annotates the words to be annotated with phonetic symbols, to obtain several phonetic symbols of the words to be annotated; selecting, based on a voting strategy, a winning phonetic symbol from the several phonetic symbols as the target phonetic symbol; and converting the target phonetic symbol into phonemes to generate a vocabulary phoneme table. Annotating the words with multiple word-to-phonetic-symbol tools and determining the target phonetic symbol by voting improves both the quality of the vocabulary phoneme table and the efficiency of constructing it.

It should be noted that, as used herein, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or system that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "includes a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that includes that element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that cause a terminal device to execute the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims (9)

1. A vocabulary phoneme list construction method, the method comprising:
selecting a plurality of vocabulary phonetic symbol conversion tools, and annotating the vocabulary to be annotated with phonetic symbols by each of the plurality of vocabulary phonetic symbol conversion tools, to obtain a plurality of phonetic symbols of the vocabulary to be annotated;
selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy;
converting the target phonetic symbol into phonemes to generate a vocabulary phoneme list;
wherein before the step of selecting the winning phonetic symbol from the plurality of phonetic symbols as the target phonetic symbol based on the voting strategy, the method further comprises:
judging whether the vocabulary to be annotated is a polyphonic word according to the number of phonetic symbols annotated by each vocabulary phonetic symbol conversion tool;
if the number of phonetic symbols annotated by one or more of the vocabulary phonetic symbol conversion tools is greater than 1, determining that the vocabulary to be annotated is a polyphonic word;
if the vocabulary to be annotated is a polyphonic word, splitting the polyphonic word into a plurality of sub-vocabularies to be annotated;
storing each of the plurality of sub-vocabularies to be annotated in association with its corresponding phonetic symbol, and, after obtaining the plurality of phonetic symbols of the sub-vocabularies to be annotated, executing the step of: selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
2. The method of claim 1, wherein the voting strategy determines the winning phonetic symbol based on the number of votes for each phonetic symbol;
the step of selecting the winning phonetic symbol from the plurality of phonetic symbols as the target phonetic symbol based on the voting strategy comprises:
assigning one original vote to each of the plurality of phonetic symbols;
if identical phonetic symbols exist among the plurality of phonetic symbols, merging the original votes of the identical phonetic symbols, marking the phonetic symbols whose votes have been merged as candidate phonetic symbols, and counting the number of votes of the candidate phonetic symbols;
sorting the candidate phonetic symbols by number of votes, determining the winning phonetic symbol from the sorted result, and taking the winning phonetic symbol as the target phonetic symbol.
3. The method of claim 2, wherein after the step of assigning one original vote to each of the plurality of phonetic symbols, the method further comprises:
if no identical phonetic symbols exist among the plurality of phonetic symbols, determining that no winning phonetic symbol exists among the plurality of phonetic symbols;
marking the corresponding vocabulary to be annotated as ambiguous vocabulary, and transferring the ambiguous vocabulary into an ambiguous vocabulary pool;
annotating the ambiguous vocabulary in the ambiguous vocabulary pool with phonetic symbols by a plurality of backup vocabulary phonetic symbol conversion tools until the winning phonetic symbol of the ambiguous vocabulary is obtained.
4. The method of claim 1, wherein after the step of selecting a plurality of vocabulary phonetic symbol conversion tools and annotating the vocabulary to be annotated with phonetic symbols by each of the plurality of vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be annotated, the method further comprises:
normalizing the plurality of phonetic symbols to obtain a plurality of phonetic symbols in a consistent format, so that the winning phonetic symbol is selected from the plurality of phonetic symbols in the consistent format.
5. The method of claim 1, wherein the step of converting the target phonetic symbol into phonemes to generate a vocabulary phoneme list comprises:
converting the target phonetic symbol into phonemes based on a phoneme format, and generating the vocabulary phoneme list from the phonemes of the vocabulary to be annotated.
6. The method of claim 1, wherein the step of converting the target phonetic symbol into phonemes to generate a vocabulary phoneme list further comprises:
receiving a vocabulary phoneme list update request, and obtaining a target update vocabulary and a target update operation from the vocabulary phoneme list update request;
based on the target update operation, executing the corresponding update operation on the phonemes of the target update vocabulary.
7. A vocabulary phoneme list construction device, characterized in that the vocabulary phoneme list construction device comprises:
a selection module, used for selecting a plurality of vocabulary phonetic symbol conversion tools, each of which annotates the vocabulary to be annotated with phonetic symbols, to obtain a plurality of phonetic symbols of the vocabulary to be annotated;
a voting module, used for selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy;
a conversion module, used for converting the target phonetic symbol into phonemes to generate a vocabulary phoneme list;
the voting module further comprises:
a judging unit, used for judging whether the vocabulary to be annotated is a polyphonic word according to the number of phonetic symbols annotated by each vocabulary phonetic symbol conversion tool;
a determination unit, used for determining that the vocabulary to be annotated is a polyphonic word if the number of phonetic symbols annotated by one or more of the vocabulary phonetic symbol conversion tools is greater than 1;
a splitting unit, used for splitting the polyphonic word into a plurality of sub-vocabularies to be annotated if the vocabulary to be annotated is a polyphonic word;
a storage unit, used for storing each of the plurality of sub-vocabularies to be annotated in association with its corresponding phonetic symbol, and, after obtaining the plurality of phonetic symbols of the sub-vocabularies to be annotated, executing the step of: selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
8. Vocabulary phoneme list construction equipment, comprising a processor, a memory, and a vocabulary phoneme list construction program stored in the memory, wherein the vocabulary phoneme list construction program, when executed by the processor, implements the steps of the vocabulary phoneme list construction method according to any one of claims 1 to 6.
9. A computer storage medium having stored thereon a vocabulary phoneme list construction program which, when executed by a processor, performs the steps of the vocabulary phoneme list construction method of any one of claims 1 to 6.
CN202010150627.0A 2020-03-05 2020-03-05 Vocabulary phoneme list construction method, device, equipment and storage medium Active CN111354339B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010150627.0A CN111354339B (en) 2020-03-05 2020-03-05 Vocabulary phoneme list construction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111354339A CN111354339A (en) 2020-06-30
CN111354339B true CN111354339B (en) 2023-11-03

Family

ID=71194340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010150627.0A Active CN111354339B (en) 2020-03-05 2020-03-05 Vocabulary phoneme list construction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111354339B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112530402B (en) * 2020-11-30 2024-01-12 深圳市优必选科技股份有限公司 Speech synthesis method, speech synthesis device and intelligent equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004006123A2 (en) * 2002-07-03 2004-01-15 2012244 Ontario Inc. Method and system of creating and using chinese language data and user-corrected data
CN103578467A (en) * 2013-10-18 2014-02-12 威盛电子股份有限公司 Acoustic model building method, speech recognition method and electronic device thereof
JP2014164260A (en) * 2013-02-27 2014-09-08 Canon Inc Information processor and information processing method
CN109117463A (en) * 2018-07-26 2019-01-01 掌阅科技股份有限公司 Text pinyin marking method, electronic equipment, storage medium
CN109918619A (en) * 2019-01-07 2019-06-21 平安科技(深圳)有限公司 A kind of pronunciation mask method and device based on basic dictionary mark
CN109977361A (en) * 2019-03-01 2019-07-05 广州多益网络股份有限公司 A kind of Chinese phonetic alphabet mask method, device and storage medium based on similar word
CN110827803A (en) * 2019-11-11 2020-02-21 广州国音智能科技有限公司 Method, device and equipment for constructing dialect pronunciation dictionary and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
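Context-independent phoneme recognition using a K-Nearest Neighbour classification approach; Ladan Golipour et al.; 2009 IEEE International Conference on Acoustics, Speech and Signal Processing; full text *
Analysis of basic phonemes in Mandarin speech recognition; 黄中伟 et al.; 深圳大学学报理工版 (Journal of Shenzhen University Science and Engineering); full text *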

Also Published As

Publication number Publication date
CN111354339A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN1029170C (en) Language translation system
WO2020186778A1 (en) Error word correction method and device, computer device, and storage medium
CN111666427B (en) Entity relationship joint extraction method, device, equipment and medium
US7636657B2 (en) Method and apparatus for automatic grammar generation from data entries
KR101435265B1 (en) Method for disambiguating multiple readings in language conversion
US9508341B1 (en) Active learning for lexical annotations
CN114547329A (en) Method for establishing pre-training language model, semantic analysis method and device
CN110428823B (en) Spoken language understanding device and spoken language understanding method using the same
WO2021174871A1 (en) Data query method and system, computer device, and storage medium
CN111462748B (en) Speech recognition processing method and device, electronic equipment and storage medium
CN107798123A (en) Knowledge base and its foundation, modification, intelligent answer method, apparatus and equipment
TW201822190A (en) Speech recognition system and method thereof, vocabulary establishing method and computer program product
CN109299471A (en) A kind of method, apparatus and terminal of text matches
CN113393830B (en) Hybrid acoustic model training and lyric timestamp generation method, device and medium
CN111354339B (en) Vocabulary phoneme list construction method, device, equipment and storage medium
WO2023045186A1 (en) Intention recognition method and apparatus, and electronic device and storage medium
CN115731921A (en) Training end-to-end spoken language understanding system with out-of-order entities
CN110750967B (en) Pronunciation labeling method and device, computer equipment and storage medium
WO2024067471A1 (en) Speech recognition method, and server, speech recognition system and readable storage medium
CN111611793B (en) Data processing method, device, equipment and storage medium
CN116894092A (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN114613359A (en) Language model training method, audio recognition method and computer equipment
CN113889115A (en) Dialect commentary method based on voice model and related device
CN102918587B (en) Hierarchical quick note to allow dictated code phrases to be transcribed to standard clauses
CN118748009A (en) Method, device, equipment and medium for processing multiple pronunciation problems in speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant