WO2015172448A1 - Chinese personal name broadcast method and device - Google Patents

Chinese personal name broadcast method and device

Info

Publication number
WO2015172448A1
Authority
WO
WIPO (PCT)
Prior art keywords
pronunciation
string
name
network side
knowledge base
Prior art date
Application number
PCT/CN2014/084267
Other languages
English (en)
French (fr)
Inventor
刘伟
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2015172448A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems

Definitions

  • The present invention relates to the field of mobile communications, and in particular to a Chinese personal name broadcast method and a related device.
  • Lexical analysis: in computer science, the process of converting a sequence of characters into a sequence of words.
  • The program or function that performs lexical analysis is called a lexical analyzer, also known as a scanner, and is called by the parser. Because Chinese words, unlike English words, are not separated by spaces, lexical analysis of Chinese is generally combined with semantic analysis techniques.
  • Syntactic analysis (parsing): the analysis of the grammatical function of the words in a sentence. For example, in "我来晚了" ("I came late"), "我" ("I") is the subject, "来" ("came") is the predicate, and "晚了" ("late") is the complement.
  • Syntactic analysis is currently applied mainly in Chinese information processing, such as machine translation. It is a direct implementation of the idea of chunking, which simplifies the description of a sentence by identifying high-level structural units. One way to find chunk patterns across different sentences is to learn a grammar that can explain the chunk structures found; this falls under grammar induction.
  • Speech synthesis (Text To Speech, TTS): the process of converting text into speech output. Its main tasks are to decompose the input text into phonemes character by character or word by word; to analyze symbols in the text that need special handling, such as numbers, currency units, word inflections, and punctuation; and to generate digital audio from the phonemes, which is then played through a speaker or saved as a sound file for later playback with multimedia software.
  • An embodiment of the present invention provides a Chinese personal name broadcast method and device that marks the pronunciation of polyphonic characters in the name string of a text string to be broadcast and broadcasts the name string according to the marked pronunciation, solving the problem of inaccurate broadcasting of polyphonic characters in Chinese personal names.
  • According to one embodiment, a Chinese personal name broadcast method is provided, including: sending, by the terminal side, a text string to be broadcast that contains a name string to the network side;
  • broadcasting the name string according to the pronunciation string.
  • According to another embodiment, a Chinese personal name broadcast method is provided, including: searching, by the network side, a preset pronunciation database for the name string in a text string to be broadcast; marking the pronunciation of the name string with a pronunciation string from the pronunciation database; and sending the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
  • Preferably, before the step of searching the preset pronunciation database for the name string in the text string to be broadcast, the method further includes:
  • receiving, by the network side, the text string to be broadcast from the terminal side;
  • Preferably, the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network side searches the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
  • Preferably, the step of marking the pronunciation of the name string with the pronunciation string from the pronunciation database includes:
  • extracting, by the network side, the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base; and inserting the pronunciation string at a specified position in the text string to be broadcast, thereby marking the pronunciation of the name.
  • According to another embodiment, a Chinese personal name broadcast device is provided, including: a terminal-side transceiver module, configured to send a text string to be broadcast that contains a name string to the network side, and to receive from the network side the text string to be broadcast in which the pronunciation of the name string has been marked; and
  • a terminal-side broadcast module, configured to broadcast the name string according to the pronunciation string.
  • According to another embodiment, a Chinese personal name broadcast device is provided, including: a network-side search module, configured to search a preset pronunciation database for the name string in a text string to be broadcast;
  • a network-side marking module, configured to mark the pronunciation of the name string with a pronunciation string from the pronunciation database; and a network-side sending module, configured to send the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
  • Preferably, the device further includes:
  • a network-side receiving module, configured to receive the text string to be broadcast from the terminal side before the network-side search module searches for the name string in the text string to be broadcast; and a network-side analysis module, configured to perform semantic analysis on the text string to be broadcast to obtain the name string in it.
  • Preferably, the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base,
  • and the network-side search module is configured to search the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
  • Preferably, the network-side marking module is configured to extract the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and to insert the extracted pronunciation string at a specified position in the text string to be broadcast, thereby marking the pronunciation of the name.
  • An embodiment of the present invention further provides a computer program including program instructions that, when executed by the terminal side, enable the terminal side to perform the above method.
  • An embodiment of the present invention further provides a computer program including program instructions that, when executed by the network side, enable the network side to perform the above method.
  • An embodiment of the present invention further provides a carrier carrying any of the above computer programs.
  • An embodiment of the present invention parses out the personal name through the semantics of the text and marks it, so that when the terminal-side TTS broadcasts, the polyphonic characters in the name can be broadcast with the correct pronunciation according to the marks.
  • FIG. 1 is a schematic block diagram of the terminal-side Chinese personal name broadcast method provided by an embodiment of the present invention;
  • FIG. 2 is a block diagram of the terminal-side Chinese personal name broadcast device provided by an embodiment of the present invention;
  • FIG. 3 is a schematic block diagram of the network-side Chinese personal name broadcast method provided by an embodiment of the present invention;
  • FIG. 4 is a block diagram of the network-side Chinese personal name broadcast device provided by an embodiment of the present invention;
  • FIG. 5 shows the two syntax trees obtained after syntactic and grammatical analysis of the sentence "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。" ("Zeng Yiting reads the novel Romance of the Three Kingdoms; he likes the character Guan Yunchang very much."), provided by an embodiment of the present invention;
  • FIG. 6 is a flow chart of Chinese personal name broadcasting provided by an embodiment of the present invention.
  • FIG. 1 is a schematic block diagram of the terminal-side Chinese personal name broadcast method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes:
  • Step S101: The terminal side sends a text string to be broadcast that contains a name string to the network side.
  • Step S102: The terminal side receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked.
  • Step S103: The name string is broadcast according to the pronunciation string.
  • To broadcast Chinese personal names correctly, the terminal side relies on the network side to process the name string. If the name contains a polyphonic character, the network side marks the pronunciation of that character, so that the speech synthesis engine on the terminal side can broadcast the polyphonic character in the name correctly according to the marked pronunciation.
  • FIG. 2 is a block diagram of the terminal-side Chinese personal name broadcast device provided by an embodiment of the present invention. As shown in FIG. 2, the device includes a terminal-side transceiver module 21 and a terminal-side broadcast module 22, wherein:
  • the terminal-side transceiver module 21 sends a text string to be broadcast that contains a name string to the network side, and receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked;
  • the terminal-side broadcast module 22 broadcasts the name string according to the pronunciation string.
  • FIG. 3 is a schematic block diagram of the network-side Chinese personal name broadcast method provided by an embodiment of the present invention. As shown in FIG. 3, the method includes:
  • Step S301: After receiving the text string to be broadcast from the terminal side, the network side performs semantic analysis on it to obtain the name string it contains.
  • The name string is then searched for in the preset pronunciation database. If a matching name string is found, the name contains a polyphonic character.
  • The pronunciation database in step S301 includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network side searches the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
  • Step S302: The pronunciation of the name string is marked with a pronunciation string from the pronunciation database.
  • In this step, the network side extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base,
  • and inserts the pronunciation string at a specified position in the text string to be broadcast, thereby marking the pronunciation of the name.
  • Step S303: The marked text string to be broadcast is sent to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
  • FIG. 4 is a block diagram of the network-side Chinese personal name broadcast device provided by an embodiment of the present invention. As shown in FIG. 4, the device includes a network-side receiving module 41, a network-side analysis module 42, a network-side search module 43, a network-side marking module 44, and a network-side sending module 45, wherein:
  • the network-side receiving module 41 receives the text string to be broadcast from the terminal side.
  • The network-side analysis module 42 performs semantic analysis on the text string to be broadcast received by the network-side receiving module 41 and obtains the name string it contains.
  • The network-side search module 43 searches the preset pronunciation database for the name string obtained by the network-side analysis module 42.
  • The pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base.
  • When the network-side search module 43 finds the name string, the network-side marking module 44 marks the pronunciation of the name string with a pronunciation string from the pronunciation database. Specifically, the marking module 44 extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and inserts the extracted pronunciation string at a specified position in the text string to be broadcast, thereby marking the pronunciation of the name.
  • The network-side sending module 45 sends the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string in the text string to be broadcast according to the pronunciation string marked by the network-side marking module 44.
  • Step 1: Define a pronunciation slot in the string to mark the pronunciation of polyphonic characters in a personal name, so that the TTS engine can broadcast them with the correct pronunciation according to the mark in the pronunciation slot.
  • Step 2: Build a historical name pronunciation knowledge base, marking the special pronunciations of characters in historical names with pronunciation slots as defined in Step 1. Also build a surname pronunciation knowledge base, marking special surname pronunciations with pronunciation slots as defined in Step 1.
  • Step 3: Use lexical and syntactic analysis to locate the name string nodes in the syntax tree. (If the data source is a contact entry, the search and match can be done directly in the surname pronunciation knowledge base and the pronunciation slot added.)
  • Step 4: Search the historical name knowledge base for the name string nodes in the syntax tree generated from the text string to be broadcast; if one is found, mark the matched historical name string in the text string with a pronunciation slot.
  • Step 5: If a name string node is not found in the historical name knowledge base, search the surname pronunciation knowledge base and mark any matched polyphonic surname with a pronunciation slot.
  • Step 6: The TTS broadcast engine determines its broadcast strategy for the polyphonic characters in the name according to the pronunciation slots.
  • An embodiment of the present invention further provides a computer program including program instructions that, when executed by the terminal side, enable the terminal side to perform the above method.
  • An embodiment of the present invention further provides a computer program including program instructions that, when executed by the network side, enable the network side to perform the above method.
  • An embodiment of the present invention further provides a carrier carrying any of the above computer programs.
  • FIG. 6 is a flow chart of Chinese personal name broadcasting provided by an embodiment of the present invention. As shown in FIG. 6, this embodiment takes as its only example the TTS voice broadcast, on a telephone terminal, of the text "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。" ("Zeng Yiting reads the novel Romance of the Three Kingdoms; he likes the character Guan Yunchang very much.").
  • First, define the pronunciation slot in the string. For example, the name "曾一庭" (Zeng Yiting) can be written as "曾[Pronounce:zeng]一庭"; a string marker in a format like [Pronounce:xxx] is called a pronunciation slot.
  • Table 1 is the historical name pronunciation knowledge base.
  • Table 2 is the surname pronunciation knowledge base.
  • "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。"
  • This text to be broadcast by TTS is transmitted by the mobile phone terminal to the server over HTTP. The server performs lexical and syntactic analysis and obtains two syntax trees, which contain two Name nodes ("曾一庭" and "关云长"), as shown in FIG. 5.
  • Step 1: Match against the historical name pronunciation knowledge base. If a name string is matched in the historical name pronunciation knowledge base, perform Step 2; otherwise, perform Step 3.
  • Step 2: Mark the pronunciation slot.
  • Step 3: Match against the surname pronunciation knowledge base. If a name string is matched in the surname pronunciation knowledge base, perform Step 4; otherwise, perform Step 5.
  • Step 4: Mark the pronunciation slot.
  • Those skilled in the art should understand that the components of the device and/or system provided by the embodiments of the present invention described above, as well as the steps of the method, can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as an individual integrated circuit module, or several of the modules or steps can be fabricated as a single integrated circuit module.
  • Thus, the invention is not limited to any particular combination of hardware and software.
  • An embodiment of the present invention parses out the personal name through the semantics of the text and marks it, so that when the terminal-side TTS broadcasts, the polyphonic characters in the name can be broadcast with the correct pronunciation according to the marks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

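A Chinese personal name broadcast method and device. The method includes: searching a preset pronunciation database for the name string in a text string to be broadcast; marking the pronunciation of the name string with a pronunciation string from the pronunciation database; and sending the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.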

Description

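Chinese Personal Name Broadcast Method and Device
Technical Field
The present invention relates to the field of mobile communications, and in particular to a Chinese personal name broadcast method and a related device.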
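Background Art
Lexical analysis: in computer science, the process of converting a sequence of characters into a sequence of words. The program or function that performs lexical analysis is called a lexical analyzer, also known as a scanner, and is called by the parser. Because Chinese words, unlike English words, are not separated by spaces, lexical analysis of Chinese is generally combined with semantic analysis techniques.
Syntactic analysis (parsing): the analysis of the grammatical function of the words in a sentence. For example, in "我来晚了" ("I came late"), "我" ("I") is the subject, "来" ("came") is the predicate, and "晚了" ("late") is the complement. Syntactic analysis is currently applied mainly in Chinese information processing, such as machine translation. It is a direct implementation of the idea of chunking, which simplifies the description of a sentence by identifying high-level structural units. One way to find chunk patterns across different sentences is to learn a grammar that can explain the chunk structures found; this falls under grammar induction.
Speech synthesis (Text To Speech, TTS): the process of converting text into speech output. Its main tasks are to decompose the input text into phonemes character by character or word by word; to analyze symbols in the text that need special handling, such as numbers, currency units, word inflections, and punctuation; and to generate digital audio from the phonemes, which is then played through a speaker or saved as a sound file for later playback with multimedia software.
Polyphonic characters are common in Chinese personal names, and some of their readings occur only in surnames. For example, the surnames 曾 (zeng), 沈 (shen), 翟 (zhai), and 单 (shan) are read in some common words as 曾 (ceng), 沈 (chen), 翟 (di), and 单 (dan). Some historical names also contain special readings, for example 刘禅 (shan), 关云长 (chang), and 贾平凹 (wa), whose characters are read in some common words as 禅 (chan), 长 (zhang), and 凹 (ao). For polyphonic characters in names, TTS broadcast engines generally use the common-word reading, which often produces absurd results, for example the name 曾 (ceng) 一庭.
Summary of the Invention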
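An embodiment of the present invention provides a Chinese personal name broadcast method and device that marks the pronunciation of polyphonic characters in the name string of a text string to be broadcast and broadcasts the name string according to the marked pronunciation, solving the problem of inaccurate broadcasting of polyphonic characters in Chinese personal names.
According to one embodiment of the present invention, a Chinese personal name broadcast method is provided, including:
the terminal side sends a text string to be broadcast that contains a name string to the network side;
the terminal side receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked; and
the name string is broadcast according to the pronunciation string.
According to another embodiment of the present invention, a Chinese personal name broadcast method is provided, including:
the network side searches a preset pronunciation database for the name string in a text string to be broadcast;
the pronunciation of the name string is marked with a pronunciation string from the pronunciation database; and
the marked text string to be broadcast is sent to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
Preferably, before the step of searching the preset pronunciation database for the name string in the text string to be broadcast, the method further includes:
the network side receives the text string to be broadcast from the terminal side; and
semantic analysis is performed on the text string to be broadcast to obtain the name string in it.
Preferably, the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network side searches the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
Preferably, the step of marking the pronunciation of the name string with the pronunciation string from the pronunciation database includes:
the network side extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base; and
the pronunciation string is inserted at a specified position in the text string to be broadcast, marking the pronunciation of the name.
According to another embodiment of the present invention, a Chinese personal name broadcast device is provided, including:
a terminal-side transceiver module, configured to send a text string to be broadcast that contains a name string to the network side, and to receive from the network side the text string to be broadcast in which the pronunciation of the name string has been marked; and
a terminal-side broadcast module, configured to broadcast the name string according to the pronunciation string.
According to another embodiment of the present invention, a Chinese personal name broadcast device is provided, including:
a network-side search module, configured to search a preset pronunciation database for the name string in a text string to be broadcast;
a network-side marking module, configured to mark the pronunciation of the name string with a pronunciation string from the pronunciation database; and
a network-side sending module, configured to send the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
Preferably, the device further includes:
a network-side receiving module, configured to receive the text string to be broadcast from the terminal side before the network-side search module searches for the name string in the text string to be broadcast; and
a network-side analysis module, configured to perform semantic analysis on the text string to be broadcast to obtain the name string in it.
Preferably, the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network-side search module is configured to search the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
Preferably, the network-side marking module is configured to extract the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and to insert the extracted pronunciation string at a specified position in the text string to be broadcast, marking the pronunciation of the name.
An embodiment of the present invention further provides a computer program including program instructions that, when executed by the terminal side, enable the terminal side to perform the above method.
An embodiment of the present invention further provides a computer program including program instructions that, when executed by the network side, enable the network side to perform the above method.
An embodiment of the present invention further provides a carrier carrying any of the above computer programs.
An embodiment of the present invention parses out the personal name through the semantics of the text and marks it, so that when the terminal-side TTS broadcasts, the polyphonic characters in the name can be broadcast with the correct pronunciation according to the marks.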
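Brief Description of the Drawings
FIG. 1 is a schematic block diagram of the terminal-side Chinese personal name broadcast method provided by an embodiment of the present invention;
FIG. 2 is a block diagram of the terminal-side Chinese personal name broadcast device provided by an embodiment of the present invention;
FIG. 3 is a schematic block diagram of the network-side Chinese personal name broadcast method provided by an embodiment of the present invention;
FIG. 4 is a block diagram of the network-side Chinese personal name broadcast device provided by an embodiment of the present invention;
FIG. 5 shows the two syntax trees obtained after syntactic and grammatical analysis of the sentence "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。" ("Zeng Yiting reads the novel Romance of the Three Kingdoms; he likes the character Guan Yunchang very much."), provided by an embodiment of the present invention;
FIG. 6 is a flow chart of Chinese personal name broadcasting provided by an embodiment of the present invention.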
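Preferred Embodiments of the Invention
Preferred embodiments of the present invention are described in detail below with reference to the drawings. Note that, where no conflict arises, the embodiments of this application and the features within them may be combined with one another.
FIG. 1 is a schematic block diagram of the terminal-side Chinese personal name broadcast method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes:
Step S101: The terminal side sends a text string to be broadcast that contains a name string to the network side.
Step S102: The terminal side receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked.
Step S103: The name string is broadcast according to the pronunciation string.
To broadcast Chinese personal names correctly, the terminal side relies on the network side to process the name string. If the name contains a polyphonic character, the network side marks the pronunciation of that character, so that the speech synthesis engine on the terminal side can broadcast the polyphonic character in the name correctly according to the marked pronunciation.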
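The terminal-side flow of steps S101 to S103 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `broadcast_text` and the two callables `annotate_on_network` (standing in for the round trip to the network side) and `speak` (standing in for the local TTS engine) are hypothetical and do not come from the patent.

```python
from typing import Callable


def broadcast_text(
    text: str,
    annotate_on_network: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Terminal-side flow: S101 send, S102 receive, S103 broadcast."""
    # S101 + S102: hand the raw text to the network side and receive the
    # same text back with pronunciation marks inserted into the names.
    annotated = annotate_on_network(text)
    # S103: pass the marked text to the local TTS engine for playback.
    speak(annotated)
    return annotated


if __name__ == "__main__":
    # Stand-ins for the real network call and TTS engine (assumptions).
    fake_network = lambda t: t.replace("曾", "曾[Pronounce:zeng]")
    broadcast_text("曾一庭", fake_network, print)
```

Passing the network call and the TTS engine in as parameters keeps the sketch independent of any particular transport or speech library.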
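FIG. 2 is a block diagram of the terminal-side Chinese personal name broadcast device provided by an embodiment of the present invention. As shown in FIG. 2, the device includes a terminal-side transceiver module 21 and a terminal-side broadcast module 22, wherein:
the terminal-side transceiver module 21 sends a text string to be broadcast that contains a name string to the network side, and receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked; and the terminal-side broadcast module 22 broadcasts the name string according to the pronunciation string.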
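FIG. 3 is a schematic block diagram of the network-side Chinese personal name broadcast method provided by an embodiment of the present invention. As shown in FIG. 3, the method includes:
Step S301: After receiving the text string to be broadcast from the terminal side, the network side performs semantic analysis on it to obtain the name string it contains. The name string is then searched for in the preset pronunciation database. If a matching name string is found, the name contains a polyphonic character.
The pronunciation database in step S301 includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network side searches the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
Step S302: The pronunciation of the name string is marked with a pronunciation string from the pronunciation database.
In this step, the network side extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and inserts the pronunciation string at a specified position in the text string to be broadcast, marking the pronunciation of the name.
Step S303: The marked text string to be broadcast is sent to the terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.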
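FIG. 4 is a block diagram of the network-side Chinese personal name broadcast device provided by an embodiment of the present invention. As shown in FIG. 4, the device includes a network-side receiving module 41, a network-side analysis module 42, a network-side search module 43, a network-side marking module 44, and a network-side sending module 45, wherein:
the network-side receiving module 41 receives the text string to be broadcast from the terminal side.
The network-side analysis module 42 performs semantic analysis on the text string to be broadcast received by the network-side receiving module 41 and obtains the name string it contains.
The network-side search module 43 searches the preset pronunciation database for the name string obtained by the network-side analysis module 42. The pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base.
When the network-side search module 43 finds the name string, the network-side marking module 44 marks the pronunciation of the name string with a pronunciation string from the pronunciation database. Specifically, the marking module 44 extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and inserts the extracted pronunciation string at a specified position in the text string to be broadcast, marking the pronunciation of the name.
The network-side sending module 45 sends the marked text string to be broadcast to the terminal side, so that the terminal side broadcasts the name string in the text string to be broadcast according to the pronunciation string marked by the network-side marking module 44.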
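In a specific implementation, the following steps may be followed:
Step 1: Define a pronunciation slot in the string to mark the pronunciation of polyphonic characters in a personal name, so that the TTS engine can broadcast them with the correct pronunciation according to the mark in the pronunciation slot.
Step 2: Build a historical name pronunciation knowledge base, marking the special pronunciations of characters in historical names with pronunciation slots as defined in Step 1. Also build a surname pronunciation knowledge base, marking special surname pronunciations with pronunciation slots as defined in Step 1.
Step 3: Use lexical and syntactic analysis to locate the name string nodes in the syntax tree. (If the data source is a contact entry, the search and match can be done directly in the surname pronunciation knowledge base and the pronunciation slot added.)
Step 4: Search the historical name knowledge base for the name string nodes in the syntax tree generated from the text string to be broadcast; if one is found, mark the matched historical name string in the text string with a pronunciation slot.
Step 5: If a name string node is not found in the historical name knowledge base, search the surname pronunciation knowledge base and mark any matched polyphonic surname with a pronunciation slot.
Step 6: The TTS broadcast engine determines its broadcast strategy for the polyphonic characters in the name according to the pronunciation slots.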
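The lookup order in steps four and five (historical names first, surnames as the fallback) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the patent: the knowledge bases are plain dictionaries, the surname entries are taken from the examples in the background section, surnames are assumed to be a single character, and the list of names is assumed to come from the analysis in step three, which is not implemented. The names `HISTORICAL_NAMES`, `SURNAMES`, `mark_name`, and `mark_text` are hypothetical.

```python
# Illustrative knowledge bases (assumed layout, not the patent's schema).
# Historical names: full name -> (special character, its reading).
HISTORICAL_NAMES = {
    "关云长": ("长", "chang"),
    "刘禅": ("禅", "shan"),
    "贾平凹": ("凹", "wa"),
}
# Surnames: surname character -> its reading as a surname.
SURNAMES = {"曾": "zeng", "沈": "shen", "翟": "zhai", "单": "shan"}


def mark_name(name: str) -> str:
    """Return the name with a [Pronounce:xxx] slot, or unchanged if unknown."""
    # Step 4: try the historical name knowledge base first.
    if name in HISTORICAL_NAMES:
        char, pinyin = HISTORICAL_NAMES[name]
        cut = name.index(char) + 1  # the slot goes right after the character
        return f"{name[:cut]}[Pronounce:{pinyin}]{name[cut:]}"
    # Step 5: fall back to the surname knowledge base.
    # Simplification: treat the first character as the surname.
    surname = name[:1]
    if surname in SURNAMES:
        return f"{surname}[Pronounce:{SURNAMES[surname]}]{name[1:]}"
    return name


def mark_text(text: str, names: list) -> str:
    """Mark every name found by step three inside the full text."""
    for name in names:
        text = text.replace(name, mark_name(name))
    return text


if __name__ == "__main__":
    print(mark_text("曾一庭非常喜欢关云长这个角色。", ["曾一庭", "关云长"]))
```

Names that match neither knowledge base pass through unchanged, which leaves the TTS engine free to use its default reading.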
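An embodiment of the present invention further provides a computer program including program instructions that, when executed by the terminal side, enable the terminal side to perform the above method.
An embodiment of the present invention further provides a computer program including program instructions that, when executed by the network side, enable the network side to perform the above method.
An embodiment of the present invention further provides a carrier carrying any of the above computer programs.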
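FIG. 6 is a flow chart of Chinese personal name broadcasting provided by an embodiment of the present invention. As shown in FIG. 6, this embodiment takes as its only example the TTS voice broadcast, on a telephone terminal, of the text "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。" ("Zeng Yiting reads the novel Romance of the Three Kingdoms; he likes the character Guan Yunchang very much.").
First, define the pronunciation slot in the string. For example, the name "曾一庭" (Zeng Yiting) can be written as "曾[Pronounce:zeng]一庭"; a string marker in a format like [Pronounce:xxx] is called a pronunciation slot.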
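As a rough illustration of the `[Pronounce:xxx]` slot format, the Python sketch below builds a slot, inserts it after a given character, and strips slots again to recover the plain text. The helper names (`make_slot`, `insert_slot`, `strip_slots`) are hypothetical, and the pattern assumes that the reading is written in lowercase pinyin letters and tolerates optional whitespace after the colon; the patent itself only gives the general format.

```python
import re

# Matches a slot such as [Pronounce:zeng]; the lowercase-letters-only
# reading and the optional whitespace are assumptions of this sketch.
SLOT_PATTERN = re.compile(r"\[Pronounce:\s*([a-z]+)\]")


def make_slot(pinyin: str) -> str:
    """Build a pronunciation slot for the given reading."""
    return f"[Pronounce:{pinyin}]"


def insert_slot(name: str, char_index: int, pinyin: str) -> str:
    """Insert a slot directly after the character at char_index."""
    cut = char_index + 1
    return name[:cut] + make_slot(pinyin) + name[cut:]


def strip_slots(text: str) -> str:
    """Remove all slots, giving back the text as it would be displayed."""
    return SLOT_PATTERN.sub("", text)


if __name__ == "__main__":
    marked = insert_slot("曾一庭", 0, "zeng")
    print(marked)
    print(strip_slots(marked))
```

Keeping the slot inline, directly after the character it describes, means the marked text can travel as an ordinary string between the network side and the terminal side.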
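Second, the historical name pronunciation knowledge base and the surname pronunciation knowledge base are built on the server side as databases, as shown in Table 1 and Table 2 below. Table 1 is the historical name pronunciation knowledge base, and Table 2 is the surname pronunciation knowledge base. The text to be broadcast by TTS, "曾一庭看《三国演义》这部小说, 他非常喜欢关云长这个角色。", is transmitted by the mobile phone terminal to the server over HTTP. The server performs lexical and syntactic analysis and obtains two syntax trees, which contain two Name nodes ("曾一庭" and "关云长"), as shown in FIG. 5.
Table 1
Id | History name | Pronounce char | Pronounce slot
1 | 关云长 | 长 | [Pronounce:chang]
2 | 刘禅 | 禅 | [Pronounce:shan]
3 | 贾平凹 | 凹 | [Pronounce:wa]
Table 2
[Table 2, the surname pronunciation knowledge base, appears only as an image (imgf000010_0001) in the source; its contents are not reproduced here.]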
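Then, the Chinese names are marked and broadcast according to the following steps:
Step 1: Match against the historical name pronunciation knowledge base. If a name string is matched in the historical name pronunciation knowledge base, perform Step 2; otherwise, perform Step 3.
Step 2: Mark the pronunciation slot.
The two Name nodes ("曾一庭" and "关云长") are used as search conditions in the historical name pronunciation knowledge base with an SQL query (similar to select * from table1 h where h.History_name = "关云长"), which matches "关云长[Pronounce:chang]". This pronunciation slot is inserted into the text string, for example: "曾一庭看《三国演义》这部小说, 他喜欢关云长[Pronounce:chang]这个角色。"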
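The SQL lookup described above can be tried out with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the table name `history_names` and the lowercase column names are hypothetical stand-ins modelled on the columns of Table 1, the database is an in-memory one, and a parameterized query replaces the literal query string so that the name is never spliced into the SQL text.

```python
import sqlite3

# In-memory stand-in for the server-side historical name knowledge base.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_names ("
    "id INTEGER PRIMARY KEY, history_name TEXT UNIQUE, "
    "pronounce_char TEXT, pronounce_slot TEXT)"
)
conn.executemany(
    "INSERT INTO history_names VALUES (?, ?, ?, ?)",
    [
        (1, "关云长", "长", "[Pronounce:chang]"),
        (2, "刘禅", "禅", "[Pronounce:shan]"),
        (3, "贾平凹", "凹", "[Pronounce:wa]"),
    ],
)


def lookup_historical(name: str):
    """Return (pronounce_char, pronounce_slot) for a name, or None."""
    return conn.execute(
        "SELECT pronounce_char, pronounce_slot FROM history_names "
        "WHERE history_name = ?",
        (name,),
    ).fetchone()


def annotate(text: str, name: str) -> str:
    """Insert the slot after the special character of a matched name."""
    row = lookup_historical(name)
    if row is None:
        return text  # no match: leave the text for the surname lookup
    char, slot = row
    marked = name.replace(char, char + slot, 1)
    return text.replace(name, marked)


if __name__ == "__main__":
    print(annotate("他喜欢关云长这个角色。", "关云长"))
```

A name with no row in the table, such as "曾一庭" here, is returned unchanged, which corresponds to falling through to the surname match in step three.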
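Step 3: Match against the surname pronunciation knowledge base. If a name string is matched in the surname pronunciation knowledge base, perform Step 4; otherwise, perform Step 5.
Step 4: Mark the pronunciation slot.
The remaining Name node ("曾一庭") is searched for in the surname pronunciation knowledge base with an SQL query, which matches "曾[Pronounce:zeng]一庭". Applying this to the text processed in the previous step yields "曾[Pronounce:zeng]一庭看《三国演义》这部小说, 他非常喜欢关云长[Pronounce:chang]这个角色。"
[...] (for example, a mobile phone). The terminal then passes the processed text to the TTS engine as a parameter, and the TTS engine determines its broadcast strategy from the two pronunciation slots "曾[Pronounce:zeng]" and "长[Pronounce:chang]"; that is, the TTS engine broadcasts the names using the readings given in the pronunciation slots.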
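On the terminal side, one possible way for a TTS front end to consume the marked text is to split it into characters, each paired with an optional reading override taken from the slot that follows it. The Python sketch below shows this under stated assumptions: the function name `reading_plan` is hypothetical, readings are assumed to be lowercase pinyin, and how a real TTS engine accepts such overrides is outside the scope of the patent text and of this example.

```python
import re
from typing import List, Optional, Tuple

# Either a character followed by a slot (groups 1 and 2),
# or a plain character with no slot (group 3).
TOKEN_PATTERN = re.compile(r"(.)\[Pronounce:\s*([a-z]+)\]|(.)", re.DOTALL)


def reading_plan(text: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each character with its forced reading, or None if unmarked."""
    plan = []
    for match in TOKEN_PATTERN.finditer(text):
        if match.group(1) is not None:
            # Character carries a pronunciation slot: use that reading.
            plan.append((match.group(1), match.group(2)))
        else:
            # Unmarked character: the engine falls back to its default.
            plan.append((match.group(3), None))
    return plan


if __name__ == "__main__":
    for char, reading in reading_plan("曾[Pronounce:zeng]一庭"):
        print(char, reading)
```

The slots never reach the audio output as literal text in this scheme; they only select the reading for the character they follow.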
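Although the present invention has been described in detail above, it is not limited to this description, and those skilled in the art may make various modifications according to the principles of the present invention. Any modification made according to the principles of the present invention should therefore be understood as falling within the scope of protection of the present invention.
Those skilled in the art should understand that the components of the device and/or system provided by the embodiments of the present invention described above, as well as the steps of the method, can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as an individual integrated circuit module, or several of the modules or steps can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any particular combination of hardware and software.
Industrial Applicability
An embodiment of the present invention parses out the personal name through the semantics of the text and marks it, so that when the terminal-side TTS broadcasts, the polyphonic characters in the name can be broadcast with the correct pronunciation according to the marks.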

Claims

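1. A Chinese personal name broadcast method, including:
a terminal side sends a text string to be broadcast that contains a name string to a network side;
the terminal side receives from the network side the text string to be broadcast in which the pronunciation of the name string has been marked;
the terminal side broadcasts the name string according to the pronunciation string.
2. A Chinese personal name broadcast method, including:
a network side searches a preset pronunciation database for the name string in a text string to be broadcast;
the network side marks the pronunciation of the name string with a pronunciation string from the pronunciation database;
the network side sends the marked text string to be broadcast to a terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
3. The method according to claim 2, wherein, before the step of searching the preset pronunciation database for the name string in the text string to be broadcast, the method further includes:
the network side receives the text string to be broadcast from the terminal side;
the network side performs semantic analysis on the text string to be broadcast to obtain the name string in the text string to be broadcast.
4. The method according to claim 2 or 3, wherein the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network side searches the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
5. The method according to claim 4, wherein the step of marking the pronunciation of the name string with the pronunciation string from the pronunciation database includes:
the network side extracts the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base;
the network side inserts the pronunciation string at a specified position in the text string to be broadcast, marking the pronunciation of the name.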
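6. A Chinese personal name broadcast device, including:
a terminal-side transceiver module, configured to send a text string to be broadcast that contains a name string to a network side, and to receive from the network side the text string to be broadcast in which the pronunciation of the name string has been marked; and
a terminal-side broadcast module, configured to broadcast the name string according to the pronunciation string.
7. A Chinese personal name broadcast device, including:
a network-side search module, configured to search a preset pronunciation database for the name string in a text string to be broadcast;
a network-side marking module, configured to mark the pronunciation of the name string with a pronunciation string from the pronunciation database; and
a network-side sending module, configured to send the marked text string to be broadcast to a terminal side, so that the terminal side broadcasts the name string according to the pronunciation string.
8. The device according to claim 7, further including:
a network-side receiving module, configured to receive the text string to be broadcast from the terminal side before the network-side search module searches for the name string in the text string to be broadcast; and
a network-side analysis module, configured to perform semantic analysis on the text string to be broadcast to obtain the name string in the text string to be broadcast.
9. The device according to claim 7 or 8, wherein the pronunciation database includes a historical name pronunciation knowledge base and/or a surname pronunciation knowledge base, and the network-side search module is configured to search the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base for the name string in the text string to be broadcast.
10. The device according to claim 9, wherein the network-side marking module is configured to extract the pronunciation string corresponding to the name string from the historical name pronunciation knowledge base and/or the surname pronunciation knowledge base, and to insert the extracted pronunciation string at a specified position in the text string to be broadcast, marking the pronunciation of the name.
11. A computer program including program instructions that, when executed by a terminal side, enable the terminal side to perform the method according to claim 1.
12. A carrier carrying the computer program according to claim 11.
13. A computer program including program instructions that, when executed by a network side, enable the network side to perform the method according to any one of claims 2 to 5.
14. A carrier carrying the computer program according to claim 13.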
PCT/CN2014/084267 2014-05-14 2014-08-13 Chinese personal name broadcast method and device WO2015172448A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410204353.3A CN105095180A (zh) 2014-05-14 2014-05-14 Chinese personal name broadcast method and device
CN201410204353.3 2014-05-14

Publications (1)

Publication Number Publication Date
WO2015172448A1 (zh) 2015-11-19

Family

ID=54479225

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/084267 WO2015172448A1 (zh) 2014-05-14 2014-08-13 Chinese personal name broadcast method and device

Country Status (2)

Country Link
CN (1) CN105095180A (zh)
WO (1) WO2015172448A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
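CN108305611B (zh) * 2017-06-27 2022-02-11 腾讯科技(深圳)有限公司 Text-to-speech method, apparatus, storage medium and computer device
CN110032626B (zh) * 2019-04-19 2022-04-12 百度在线网络技术(北京)有限公司 Voice broadcast method and apparatus
CN110111778B (zh) * 2019-04-30 2021-11-12 北京大米科技有限公司 Voice processing method, apparatus, storage medium and electronic device
CN111401059A (zh) * 2020-03-16 2020-07-10 深圳市子瑜杰恩科技有限公司 Method for reading novels aloud
CN115329156B (zh) * 2022-10-14 2022-12-23 北京云成金融信息服务有限公司 Data governance method and system based on historical data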

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030216920A1 (en) * 2002-05-16 2003-11-20 Jianghua Bao Method and apparatus for processing number in a text to speech (TTS) application
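CN1889171A (zh) * 2005-06-29 2007-01-03 诺基亚公司 Speech recognition method and system for recognizing characters/character strings
CN1983387A (zh) * 2005-12-14 2007-06-20 英业达股份有限公司 System and method for producing character sounds with different effects using parameters corresponding to character features
JP2008186376A (ja) * 2007-01-31 2008-08-14 Casio Comput Co Ltd Voice output device and voice output program
CN103024172A (zh) * 2012-12-10 2013-04-03 广东欧珀移动通信有限公司 Voice broadcast method for incoming mobile phone calls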

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1835077B (zh) * 2005-03-14 2011-05-11 台达电子工业股份有限公司 Automatic speech recognition input method and system for Chinese personal names
CN201063245Y (zh) * 2007-04-12 2008-05-21 南京卡巴自动设备有限公司 Intelligent voice announcer
CN101519980B (zh) * 2009-04-02 2011-04-20 昆明理工大学 Text-to-speech-based real-time coal mine gas alarm system and method
CN101778149A (zh) * 2009-12-31 2010-07-14 中兴通讯股份有限公司 Mobile terminal and method for implementing its voice broadcast function
CN103165126A (zh) * 2011-12-15 2013-06-19 无锡中星微电子有限公司 Method for voice playback of mobile phone text messages
US20140115451A1 (en) * 2012-06-28 2014-04-24 Madeleine Brett Sheldon-Dante System and method for generating highly customized books, movies, and other products
CN103533519A (zh) * 2012-07-06 2014-01-22 盛乐信息技术(上海)有限公司 Short message broadcast method and system

Also Published As

Publication number Publication date
CN105095180A (zh) 2015-11-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14891744

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14891744

Country of ref document: EP

Kind code of ref document: A1