CN115223588B - Child voice phrase matching method based on pinyin distance and sliding window - Google Patents
- Publication number
- CN115223588B CN115223588B CN202210292844.2A CN202210292844A CN115223588B CN 115223588 B CN115223588 B CN 115223588B CN 202210292844 A CN202210292844 A CN 202210292844A CN 115223588 B CN115223588 B CN 115223588B
- Authority
- CN
- China
- Prior art keywords
- distance
- target text
- pinyin
- phrase
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/169—Annotation, e.g. comment data or footnotes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
Description
Technical Field

The present invention relates to the field of natural language processing, and in particular to a method for matching phrases in children's speech based on pinyin distance and a sliding window.

Background Art

Assessing children's cognitive ability is one direction of brain-science research. One approach asks a child to describe a picture or scene with a short phrase, which is then matched against a target text phrase to judge cognitive correctness. Because young children often cannot yet read, the assessment must rely on what they say, which requires collecting, transcribing, and judging audio and greatly increases the volunteers' workload. Machines can take over the transcription and judgment steps to reduce labor costs. With the development of speech recognition technology, recognition accuracy for adult speech now exceeds 95%, and related products are widely used. Children, however, may speak indistinctly, and existing speech recognition models struggle to correct ambiguously pronounced segments, which makes matching the target text phrase difficult and increases the number of recordings misjudged as cognitive errors.

From the pinyin perspective, if two entirely different Chinese characters are pronounced similarly, their pinyin forms are also similar to some degree. Measuring pinyin distance, so that characters with similar pronunciations may match within a tolerance, addresses the problem above well. At present, pinyin distance is usually expressed as the edit distance between the letter strings of the two pinyin syllables. This is workable, but it ignores how similar the initials (shengmu) or finals (yunmu) of the syllables actually sound.
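As a reference point, the plain letter-string edit distance described above can be sketched as follows. This is a generic illustration, not code from the patent:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance between two strings, row by row."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))  # distances for the empty prefix of a
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution (or match)
        prev = curr
    return prev[n]

# "chi" vs "ci": a single deletion of "h"
print(edit_distance("chi", "ci"))  # → 1
```

Note that this plain distance gives "chi" vs. "ci" the same cost as any other single-letter edit, even though the two syllables sound very close — which is the shortcoming the invention's initial/final weighting addresses.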
During audio collection, given young children's cognitive level, it is difficult to limit the length of what a child says, so transcripts often contain many redundant words that interfere with matching the target text phrase. Searching a longer transcript of children's speech for possibly matching target text phrases therefore calls for a sliding-window strategy, which is practical to implement.
Summary of the Invention

In view of this, the object of the present invention is to provide a method for matching phrases in children's speech based on pinyin distance and a sliding window, so as to find possibly matching target text phrases in what a child says and reduce the adverse effects of children's ambiguous pronunciation.

To achieve the above object, the present invention adopts the following technical solution:

A method for matching phrases in children's speech based on pinyin distance and a sliding window comprises the following steps:

Step 1: Given the target text phrases, collect audio of children's phrases, obtain transcripts of the audio through a speech recognition model, and label each recording according to whether its content includes a target text phrase.
Step 2: Convert the target text phrase and the transcript into their corresponding pinyin sequences. Within the transcript's pinyin sequence, use a sliding-window algorithm to find the subsequence with the smallest pinyin distance to the target text phrase, and record that minimum distance. Specifically:

2.1) Ignoring tones, convert the target text phrase and the transcript into their corresponding pinyin sequences.

2.2) Apply a sliding window whose size equals the number of characters in the target text phrase, sliding one character to the right at a time across the transcript's pinyin sequence, to find the subsequence (its length equal to the window size) with the smallest pinyin distance to the target text phrase, and record that minimum distance. If there are multiple target text phrases, repeat this operation for each phrase to obtain a set of minimum distances, with one element per target text phrase, and take the smallest value in this set as the minimum distance between the transcript and the target text phrases.
2.3) For two pinyin sequences S = {s1, s2, ..., sn} and Q = {q1, q2, ..., qn}:

d(S, Q) = [d(s1, q1) + d(s2, q2) + ... + d(sn, qn)] ÷ n

where d is the pinyin distance. For the pinyin si and qi of two individual characters, split si and qi each into an initial part and a final part; then:

d(si, qi) = initial distance(si, qi) + final distance(si, qi)

initial distance(si, qi) = initial edit distance(si, qi) × initial weight(si, qi)

where the initial weight(si, qi) is designed by hand according to how similar the initials of si and qi sound, with values in the range [0.5, 1.5]; the final distance(si, qi) is computed in the same way as the initial distance.
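A minimal runnable sketch of Step 2, under stated assumptions: tone-free pinyin sequences are taken as given (a library such as pypinyin could produce them from text), and the weight tables below contain only a few illustrative entries, with every unlisted pair defaulting to 1.0 — the patent's full hand-designed weight tables are not reproduced here.

```python
# Initials recognized when splitting a syllable; two-letter initials first
# so that the longest prefix matches.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

# Illustrative hand-designed weights in [0.5, 1.5] for confusable pairs;
# unlisted pairs default to 1.0 (assumed values, not the patent's tables).
INITIAL_WEIGHTS = {("ch", "c"): 0.5, ("zh", "z"): 0.5, ("sh", "s"): 0.5}
FINAL_WEIGHTS = {("in", "ing"): 0.5, ("an", "ang"): 0.5}

def edit_distance(a, b):
    """Levenshtein distance between two letter strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def split_syllable(p):
    """Split a tone-free pinyin syllable into (initial, final)."""
    for ini in INITIALS:
        if p.startswith(ini):
            return ini, p[len(ini):]
    return "", p  # zero-initial syllable, e.g. "ai"

def weight(table, x, y):
    return table.get((x, y), table.get((y, x), 1.0))

def char_distance(s, q):
    """d(si, qi) = initial edit distance x weight + final edit distance x weight."""
    s_ini, s_fin = split_syllable(s)
    q_ini, q_fin = split_syllable(q)
    return (edit_distance(s_ini, q_ini) * weight(INITIAL_WEIGHTS, s_ini, q_ini)
            + edit_distance(s_fin, q_fin) * weight(FINAL_WEIGHTS, s_fin, q_fin))

def seq_distance(S, Q):
    """d(S, Q): mean per-character pinyin distance of two equal-length sequences."""
    return sum(char_distance(s, q) for s, q in zip(S, Q)) / len(S)

def min_window_distance(target, transcript):
    """Slide a window of len(target) one syllable at a time over the transcript."""
    n = len(target)
    return min(seq_distance(target, transcript[i:i + n])
               for i in range(len(transcript) - n + 1))

# The embodiment's example: target {ya, chi}, transcript {zhe, shi, ya, ci}
print(min_window_distance(["ya", "chi"], ["zhe", "shi", "ya", "ci"]))  # → 0.25
```

With multiple target phrases, `min_window_distance` would simply be called once per phrase and the smallest result kept, as step 2.2) describes.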
Step 3: For all labeled data from Step 1, compute the minimum distances using the method of Step 2 to obtain a set of minimum distances, and derive a judgment interval from a preset ratio of manual participation. For each minimum distance: if it is less than the left endpoint of the interval, the target text phrase is matched successfully; if it is greater than or equal to the right endpoint, the match fails; if it falls within the interval (including the left endpoint but excluding the right endpoint), a human decides whether the target text phrase is matched. Based on the labels, for each preset manual-participation ratio, a sliding-window algorithm finds the judgment interval that maximizes accuracy. Specifically:

3.1) Let the judgment interval be [left, right). If the minimum distance < left, the target text phrase is matched successfully; if the minimum distance ≥ right, the match fails; if left ≤ minimum distance < right, a human decides whether the target text phrase is matched.

3.2) The preset manual-participation ratios form the sequence {0, k1%, k2%, ..., kt%}. Sort the m minimum distances of all labeled data computed in Step 2 in ascending order to obtain the ordered array a = {d1, d2, ..., di, ..., dj, ..., dm}. For a manual ratio of kr%, run a sliding window of size m × kr% over a; with the current window (di, dj), so that j - i + 1 = m × kr%, the judgment interval [left, right) is determined from the window boundaries (in the embodiment below, each endpoint is the average of the boundary value and its immediate neighbors).

For each candidate judgment interval, all data are judged with the rules of step 3.1), the cases requiring manual judgment are filtered out, the remainder is compared against the labels, and the accuracy of the current judgment is computed. The window starts at i = 0 and moves one position to the right at a time; the judgment interval that maximizes accuracy is taken as the optimal judgment interval for the manual ratio kr%.
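The interval search of step 3.2) can be sketched as below. The boundary rule — averaging each window boundary with its immediate neighbors in the sorted array — follows the worked example in the embodiment; the distances and labels in the usage line are toy data, and `best_interval` is a name introduced here for illustration.

```python
def best_interval(distances, labels, ratio):
    """
    distances: minimum pinyin distance per labeled recording
    labels:    True if the recording really contains the target phrase
    ratio:     fraction of data handed over to manual judgment, e.g. 0.05

    Slides a window of size m*ratio over the sorted distances. Data inside
    [left, right) is excluded as "manual"; the rest is auto-judged
    (match if d < left, no match if d >= right) and scored against labels.
    Returns (left, right, accuracy) of the best interval found.
    """
    m = len(distances)
    a = sorted(distances)                 # ascending, as in step 3.2)
    w = max(1, round(m * ratio))          # window size m x kr%

    def smooth(k):
        # Average a[k] with its immediate neighbors (clipped at the ends),
        # mirroring left = (d[i-1] + d[i] + d[i+1]) / 3 in the embodiment.
        lo, hi = max(0, k - 1), min(m - 1, k + 1)
        return sum(a[lo:hi + 1]) / (hi - lo + 1)

    best = (0.0, 0.0, -1.0)
    for i in range(m - w + 1):
        j = i + w - 1                     # window (a[i], a[j]), j - i + 1 == w
        left, right = smooth(i), smooth(j)
        auto = [(d, y) for d, y in zip(distances, labels)
                if not (left <= d < right)]   # filter out manual cases
        if not auto:
            continue
        correct = sum((d < left) == y for d, y in auto)
        acc = correct / len(auto)
        if acc > best[2]:
            best = (left, right, acc)
    return best

# Toy data: three true matches with small distances, two non-matches
print(best_interval([0.2, 0.3, 1.0, 2.5, 3.0],
                    [True, True, True, False, False], 0.4))
```

In the patent's setting this search is repeated for each preset ratio kr%, yielding one optimal interval per level of manual participation.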
Compared with the prior art, the present invention has the following technical effects:

The present invention matches phrases in children's speech based on pinyin distance and a sliding window. Whereas earlier work computed pinyin similarity using only the edit distance between pinyin strings, the present method also considers how similar the initials and finals sound, building a weight matrix over the edit distances between initials and between finals and thereby further refining the pinyin-distance computation. In addition, the judgment interval is determined from a large amount of data and is therefore statistically meaningful.

The present invention accounts for the ambiguity of children's pronunciation and the redundancy of their speech, improves the precision of target-text-phrase matching, judges children's cognitive level more accurately, and is practical to implement.
Brief Description of the Drawings

FIG. 1 is a schematic flow chart of an embodiment of the present invention.

Detailed Description

The present invention is further described below with reference to specific embodiments and the accompanying drawings.
Embodiment

Referring to FIG. 1, the present invention is a method for matching phrases in children's speech based on pinyin distance and a sliding window, comprising the following steps:

Step 1: Through a speech recognition model, obtain the transcript {这是牙此} of a child's indistinctly pronounced speech (roughly "this is ya-ci"), with the given target text phrase {牙齿} ("teeth");

Step 2: Convert the target text phrase and the transcript into their corresponding pinyin sequences. Within the transcript's pinyin sequence, use a sliding-window algorithm to find the subsequence with the smallest pinyin distance to the target text phrase, and record that minimum distance. Specifically:

2.1) Ignoring tones, convert the target text phrase and the transcript into the corresponding pinyin sequences {ya, chi} and {zhe, shi, ya, ci} respectively;

2.2) Apply a sliding window whose size equals the number of characters in the target text phrase (here, 2 characters), sliding one character to the right at a time across the transcript's pinyin sequence {zhe, shi, ya, ci}, to find the subsequence (its length equal to the window size) with the smallest pinyin distance to the target text phrase, and record the minimum distance dmin = min{d({ya, chi}, {zhe, shi}), d({ya, chi}, {shi, ya}), d({ya, chi}, {ya, ci})}, where d is the pinyin distance. If there are multiple target text phrases, repeat this operation for each phrase to obtain a set of minimum distances, with one element per target text phrase, and take the smallest value in this set as the minimum distance between the transcript and the target text phrases;
2.3) The distance between the two pinyin sequences S = {ya, chi} and Q = {ya, ci} is:

d(S, Q) = [d(ya, ya) + d(chi, ci)] ÷ 2 (d is the pinyin distance)

For the pinyin chi and ci of the two individual characters, split chi and ci into the initial parts ch and c and the final parts i and i respectively; then:

d(chi, ci) = d(ch, c) + d(i, i)

d(ch, c) = edit distance(ch, c) × weight(ch, c)

where the weight(ch, c), designed by hand according to how similar ch and c sound, is 0.5, so d(ch, c) = 1 × 0.5 = 0.5 and d(i, i) = 0 × 1.0 = 0.
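The arithmetic of this example can be checked directly with the values just stated:

```python
# d(ch, c): edit distance 1 (delete "h"), hand-designed weight 0.5
d_ch_c = 1 * 0.5
# d(i, i): edit distance 0, default weight 1.0
d_i_i = 0 * 1.0
# d(chi, ci) = initial distance + final distance
d_chi_ci = d_ch_c + d_i_i
# d(S, Q) = [d(ya, ya) + d(chi, ci)] / 2, with d(ya, ya) = 0
d_S_Q = (0 + d_chi_ci) / 2
print(d_chi_ci, d_S_Q)  # → 0.5 0.25
```

So the window {ya, ci} scores 0.25, which is indeed the smallest of the three window distances recorded in step 2.2).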
Step 3: For all labeled data from Step 1, compute the minimum distances using the method of Step 2 to obtain a set of minimum distances, and derive a judgment interval from a preset ratio of manual participation. For each minimum distance: if it is less than the left endpoint of the interval, the target text phrase is matched successfully; if it is greater than or equal to the right endpoint, the match fails; if it falls within the interval (including the left endpoint but excluding the right endpoint), a human decides whether the target text phrase is matched. Based on the labels, for each preset manual-participation ratio, a sliding-window algorithm finds the judgment interval that maximizes accuracy. Specifically:

3.1) Let the judgment interval be [left, right). If the minimum distance < left, the target text phrase is matched successfully; if the minimum distance ≥ right, the match fails; if left ≤ minimum distance < right, a human decides whether the target text phrase is matched.

3.2) The preset manual-participation ratios form the sequence {0, 5%, 10%, ..., 50%}. Sort the m = 5000 minimum distances of all labeled data computed in Step 2 in ascending order to obtain the ordered array a = {d1, d2, ..., di-1, di, di+1, ..., dj-1, dj, dj+1, ..., dm} = {0, 0, ..., 1.4, 1.5, 1.5, ..., 1.9, 1.9, 1.9, ..., 4.0}. For a manual ratio of 5%, run a sliding window of size 5000 × 5% = 250 over a, so that i and j satisfy j - i + 1 = 250; there exist i and j such that the window is (di, dj) = (1.5, 1.9), and the judgment interval [left, right) is determined as:

left = (1.4 + 1.5 + 1.5) ÷ 3 ≈ 1.47

right = (1.9 + 1.9 + 1.9) ÷ 3 = 1.9

that is, each endpoint is the average of the corresponding window boundary value and its immediate neighbors in a. For each candidate judgment interval, all data are judged with the rules of step 3.1), the cases requiring manual judgment are filtered out, the remainder is compared against the labels, and the accuracy of the current judgment is computed. Starting from i = 0 and moving the window one position to the right at a time, the judgment interval [1.5, 1.9) that maximizes the accuracy, at 89.29%, is taken as the optimal judgment interval for a manual ratio of 5%.
The above describes only a preferred embodiment of the present invention; certain modifications may be made to it within the scope defined by the claims of the present invention, and all such modifications will fall within the protection scope of the present invention.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210292844.2A CN115223588B (en) | 2022-03-24 | 2022-03-24 | Child voice phrase matching method based on pinyin distance and sliding window |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210292844.2A CN115223588B (en) | 2022-03-24 | 2022-03-24 | Child voice phrase matching method based on pinyin distance and sliding window |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115223588A CN115223588A (en) | 2022-10-21 |
CN115223588B true CN115223588B (en) | 2024-08-13 |
Family
ID=83606923
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210292844.2A Active CN115223588B (en) | 2022-03-24 | 2022-03-24 | Child voice phrase matching method based on pinyin distance and sliding window |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115223588B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118627465B (en) * | 2024-08-14 | 2025-02-14 | 江西风向标智能科技有限公司 | A method and system for segmenting science test paper text |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003900584A0 (en) * | 2003-02-11 | 2003-02-27 | Telstra New Wave Pty Ltd | System for predicting speech recognition accuracy and development for a dialog system |
CN107967916A (en) * | 2016-10-20 | 2018-04-27 | 谷歌有限责任公司 | Determine voice relation |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9418152B2 (en) * | 2011-02-09 | 2016-08-16 | Nice-Systems Ltd. | System and method for flexible speech to text search mechanism |
EP3837681A1 (en) * | 2018-09-04 | 2021-06-23 | Google LLC | Reading progress estimation based on phonetic fuzzy matching and confidence interval |
CN109256152A (en) * | 2018-11-08 | 2019-01-22 | 上海起作业信息科技有限公司 | Speech assessment method and device, electronic equipment, storage medium |
CN112149406B (en) * | 2020-09-25 | 2023-09-08 | 中国电子科技集团公司第十五研究所 | Chinese text error correction method and system |
CN112509609B (en) * | 2020-12-16 | 2022-06-10 | 北京乐学帮网络技术有限公司 | Audio processing method and device, electronic equipment and storage medium |
CN113486155B (en) * | 2021-07-28 | 2022-05-20 | 国际关系学院 | Chinese naming method fusing fixed phrase information |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003900584A0 (en) * | 2003-02-11 | 2003-02-27 | Telstra New Wave Pty Ltd | System for predicting speech recognition accuracy and development for a dialog system |
CN107967916A (en) * | 2016-10-20 | 2018-04-27 | 谷歌有限责任公司 | Determine voice relation |
Also Published As
Publication number | Publication date |
---|---|
CN115223588A (en) | 2022-10-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||