WO2016138605A1 - Generating method for shielding signals used for protecting chinese speech privacy - Google Patents

Generating method for shielding signals used for protecting Chinese speech privacy

Info

Publication number
WO2016138605A1
Authority
WO
WIPO (PCT)
Prior art keywords
probability table
chinese
segment
syllable
interval
Prior art date
Application number
PCT/CN2015/000255
Other languages
French (fr)
Chinese (zh)
Inventor
李晔
马晓凤
郝秋赟
樊燕红
姜竞赛
张鹏
Original Assignee
山东省计算中心(国家超级计算济南中心)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 山东省计算中心(国家超级计算济南中心)
Publication of WO2016138605A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/1752 Masking
    • G10K11/1754 Speech masking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal

Definitions

  • the invention relates to a method for generating a masking signal for protecting the privacy of Chinese speech, and more particularly to a method that produces a masking signal which carries no actual meaning, closely resembles normal speech, and reduces the negative auditory impact.
  • the channels for unintentional leakage of sound signals mainly include: doors, windows, walls, and various pipes.
  • the method proposed in this paper is mainly for the unintentional leakage of sound signals.
  • at present, unintentional leakage of sound signals is mostly countered with acoustic masking techniques.
  • interference sources are placed at locations or along paths where sound may leak, and the interference signal they generate masks the useful speech signal, thereby protecting against sound leakage.
  • the above interference signal is called a masking signal.
  • the choice of masking signal should consider two factors, one is the masking effect, and the other is the psychological and physiological influence of the masking signal on people.
  • the common masking signals mainly include white noise, pink noise, and HVAC noise.
  • White noise and pink noise usually have relatively stable statistical characteristics, but the masking efficiency is low.
  • HVAC noise is discontinuous, unstable, unevenly distributed, or too loud; it sometimes becomes a noise source itself, with a relatively large psychological and physiological impact on people and obvious negative effects.
  • the main purpose of the invention is to synthesize a new masking signal by exploiting the characteristics of Chinese pronunciation, including the statistical characteristics of characters, words, and sentences. Because its statistical characteristics resemble those of normal speech, the signal is hard to crack and masks well, while also reducing the psychological and physiological impact of the masking signal on people and being somewhat deceptive.
  • the method for generating a masking signal for protecting the privacy of Chinese speech of the invention is characterized in that it is implemented by the following steps:
  • Compile the statement probability table: taking a representative Chinese corpus as the statistical sample, count the statements contained in each paragraph of the corpus to obtain the probability table of the number of statements per paragraph, $[J_1, J_2, J_3, \ldots, J_m]$, called the statement probability table, where $J_i$ is the percentage of all paragraphs that contain $i$ statements, $1 \le i \le m$;
  • Compile the phrase probability table: count the phrases contained in every segment of the corpus to obtain the probability table of the number of phrases per segment, $[C_1, C_2, C_3, \ldots, C_q]$, called the phrase probability table, where $C_i$ is the percentage of all segments that contain $i$ phrases, $1 \le i \le q$;
  • Compile the Chinese character probability table: count the Chinese characters contained in every phrase of the corpus to obtain the probability table of the number of Chinese characters per phrase, $[Z_1, Z_2, Z_3, \ldots, Z_p]$, called the Chinese character probability table, where $Z_i$ is the percentage of all phrases that contain $i$ Chinese characters, $1 \le i \le p$;
  • Compile the syllable probability table: first sort the syllables in alphabetical order, denoted $[H_1, H_2, H_3, \ldots, H_k]$, then obtain the syllable probability table $[h_1, h_2, h_3, \ldots, h_k]$ from the probability with which each syllable appears in everyday language, where $h_i$ is the frequency with which the syllable $H_i$ appears in everyday language, $1 \le i \le k$;
  • for example, if the random number $r_3 \in [0, C_1]$, the segment contains 1 phrase; if $r_3 \in [C_1, C_1 + C_2]$, the segment contains 2 phrases; and so on;
  • for example, if the random number $r_4 \in [0, Z_1]$, the phrase contains 1 Chinese character; if $r_4 \in [Z_1, Z_1 + Z_2]$, the phrase contains 2 Chinese characters; and so on;
  • according to the number of Chinese characters in the phrase, a seed can be used to generate the same number of random numbers $[\gamma_1, \gamma_2, \ldots, \gamma_n]$; if $\gamma_1 \in [0, h_1]$, the syllable $H_1$ is selected; if $\gamma_2 \in [h_1, h_1 + h_2]$, the syllable $H_2$ is selected; and so on;
  • Speech synthesis uses a voice library to synthesize the random text generated in the previous step into the masking signal output.
  • the voice library is recorded in a professional recording studio and covers all the common syllables of Chinese speech.
  • the name of each syllable in the voice library corresponds one-to-one to the syllable name used in the generated random text.
  • for example, the syllable pronounced "ah" in the first tone is named "a1.wav" in the voice library, and "ah" in the second tone is named "a2.wav" accordingly.
  • during speech synthesis, the random text "text.txt" generated in the previous step is read and matched against the voice library. For example, if the syllable "bai1" is read from the random text, it is mapped to "bai1.wav" in the voice library, and so on, until all syllables have been mapped one-to-one to pronunciations in the library and the masking signal output is finally synthesized.
  • a silent segment is added between paragraphs, between statements, and between segments.
  • the marks at the end of a statement are the period, question mark, and exclamation mark; the marks at the end of a segment are the colon, comma, and semicolon; and the marks at the end of a paragraph are the carriage return and line feed.
  • the pre-recorded silent segment is stored in the voice library.
  • the name of the silent segment must differ from the names of all syllables in the voice library. For example, the silent segment is named jyin.wav.
  • in the method for generating a masking signal for protecting the privacy of Chinese speech according to the invention, the statement, segment, phrase, and Chinese character probabilities in steps a), b), c), and d) are all accurate to 0.01, and the syllable probability in step e) is accurate to 0.0001.
  • in the method for generating a masking signal for protecting the privacy of Chinese speech, the corpus described in step a) is the Modern Chinese General Balanced Corpus built under a National Language Committee project.
  • the beneficial effect of the invention is that the method for generating the masking signal fully considers the sound-masking requirements of conference rooms and the characteristics of Chinese speech, and abandons the traditional use of steady-state noise as the masking signal in favor of a signal based on the statistical characteristics of the characters, words, and sentences of the Chinese language.
  • FIG. 1 is a flow chart of a method for generating a masking signal for protecting the privacy of Chinese speech according to the present invention.
  • the random text generation involves the following probability tables:
  • phrase probability table
  • syllables form phrases, which requires the probability table of the number of Chinese characters per phrase, called the Chinese character probability table;
  • the corpus material for the probability tables above comes from the Modern Chinese General Balanced Corpus.
  • the corpus was built under a National Language Committee project and holds about 100 million characters in total. Material from before 1997 accounts for about 70 million characters, all keyed in manually from printed sources; material from after 1997 accounts for about 30 million characters, half keyed in manually and half taken from electronic texts.
  • the corpus is characterized by a large time span of corpus samples, a wide distribution of fields, and a more balanced ratio, which can better represent the whole picture of modern Chinese.
  • Word count: 19455541; number of words: 12842566; number of segments: 1768545
  • phrase probability calculation formula is:
  • Phrase probability (%) = (number of segments of a given length / total number of segments in the corpus) × 100
  • Chinese characters are distinguished by syllables, and each Chinese character corresponds to one syllable.
  • a1, a2, a3, and a4 denote the four tonal syllables pronounced "ah", and the frequency with which each syllable occurs in everyday speech is counted separately under this rule, as shown in Table 6.
  • the program generates a random number $r_5$. If $r_5 \in [0, h_1]$, i.e. $[0, 215]$, it corresponds to the syllable $H_1$ (a1); if $r_5 \in [h_1, h_1 + h_2]$, i.e. $[215, 240]$, it corresponds to the syllable $H_2$ (a2); and so on, determining in turn each syllable that makes up the phrase.
  • during speech synthesis, the random text "text.txt" generated in the previous step is read and matched against the voice library. For example, if the syllable "bai1" is read from the random text, it is mapped to "bai1.wav" in the voice library, and so on, until all syllables have been mapped one-to-one to pronunciations in the library and the masking signal output is finally synthesized.
  • a silent segment is added between paragraphs, between statements, and between segments to simulate the pauses of normal speech.
  • the synthesized masking signal sounds smoothest when the silent segment length is set to 0.5 s.

Abstract

A generating method for shielding signals used for protecting Chinese speech privacy comprises: (a) obtaining a probability table of sentences through statistics; (b) obtaining a probability table of sentence segments through statistics; (c) obtaining a probability table of phrases through statistics; (d) obtaining a probability table of Chinese characters through statistics; (e) obtaining a probability table of syllables through statistics; (f) generating text information on the basis of the determined number of sentences in a paragraph, the number of sentence segments in a sentence, the number of phrases in a sentence segment, the number of Chinese characters in a phrase, and the syllables of the Chinese characters; and (g) carrying out sound synthesis. The method fully considers the requirement of shielding sound in a meeting room and the features of Chinese speech and abandons the conventional mode of shielding with stationary noise and the like. Shielding signals that carry no actual meaning but closely resemble a normal speaking voice are generated on the basis of the statistical features of the characters, phrases, and sentences of the Chinese language, using a voice database of human utterances. Compared with conventional shielding noise, these shielding signals greatly reduce the side effects on hearing and enhance the sound-shielding effect.

Description

Method for generating a masking signal for protecting Chinese speech privacy
Technical field
The invention relates to a method for generating a masking signal for protecting the privacy of Chinese speech, and more particularly to a method that produces a masking signal which carries no actual meaning, closely resembles normal speech, and reduces the negative auditory impact.
Background art
Conference-room confidentiality concerns the protection of state, commercial, and scientific and technological secrets and belongs to the field of information security. The need is pressing from national security through to commercial applications: commercial eavesdropping can cause the country economic losses of tens of billions of yuan each year. Sound is the most basic form of information in a confidential conference room and is therefore the focus of protection. Sound information leaks from a confidential conference room in two main ways: active leakage and unintentional leakage. Active leakage is caused by eavesdropping equipment installed inside the conference room. Unintentional leakage occurs when, during a meeting, sound escapes through airborne or structure-borne transmission and is heard by unauthorized persons. Specifically, the channels for unintentional leakage mainly include doors, windows, walls, and various pipes. The method proposed here mainly targets unintentional leakage of sound signals. At present, such leakage is mostly countered with acoustic masking: interference sources are placed at locations or along paths where sound may leak, and the interference signal they generate masks the useful speech signal, thereby protecting against sound leakage. This interference signal is called a masking signal.
The choice of masking signal must weigh two factors: the masking effect, and the psychological and physiological influence of the masking signal on people. Common masking signals today include white noise, pink noise, and HVAC noise. White noise and pink noise usually have relatively stable statistical characteristics, but their masking efficiency is low. HVAC noise is discontinuous, unstable, unevenly distributed, or too loud; it sometimes becomes a noise source itself, with a relatively large psychological and physiological impact on people and obvious negative effects.
Summary of the invention
The main purpose of the invention is to synthesize a new masking signal by exploiting the characteristics of Chinese pronunciation, including the statistical characteristics of characters, words, and sentences. Because its statistical characteristics resemble those of normal speech, the signal is hard to crack and masks well, while also reducing the psychological and physiological impact of the masking signal on people and being somewhat deceptive.
The method for generating a masking signal for protecting the privacy of Chinese speech of the invention is characterized in that it is implemented by the following steps:
a). Compile the statement probability table: taking a representative Chinese corpus as the statistical sample, count the statements contained in each paragraph of the corpus to obtain the probability table of the number of statements per paragraph, $[J_1, J_2, J_3, \ldots, J_m]$, called the statement probability table, where $J_i$ is the percentage of all paragraphs that contain $i$ statements, $1 \le i \le m$;
b). Compile the segment probability table: count the segments contained in every statement of the corpus to obtain the probability table of the number of segments per statement, $[D_1, D_2, D_3, \ldots, D_l]$, called the segment probability table, where $D_i$ is the percentage of all statements that contain $i$ segments, $1 \le i \le l$;
c). Compile the phrase probability table: count the phrases contained in every segment of the corpus to obtain the probability table of the number of phrases per segment, $[C_1, C_2, C_3, \ldots, C_q]$, called the phrase probability table, where $C_i$ is the percentage of all segments that contain $i$ phrases, $1 \le i \le q$;
d). Compile the Chinese character probability table: count the Chinese characters contained in every phrase of the corpus to obtain the probability table of the number of Chinese characters per phrase, $[Z_1, Z_2, Z_3, \ldots, Z_p]$, called the Chinese character probability table, where $Z_i$ is the percentage of all phrases that contain $i$ Chinese characters, $1 \le i \le p$;
e). Compile the syllable probability table: first sort the syllables in alphabetical order, denoted $[H_1, H_2, H_3, \ldots, H_k]$, then obtain the syllable probability table $[h_1, h_2, h_3, \ldots, h_k]$ from the probability with which each syllable appears in everyday language, where $h_i$ is the frequency with which the syllable $H_i$ appears in everyday language, $1 \le i \le k$;
f). Generate text information: generate the text corresponding to the speech according to the following steps:
f-1). Determine the number of statements in a paragraph: generate a random number $r_1$ in the range $\left[0, \sum_{i=1}^{m} J_i\right]$ and determine the interval to which it belongs; if $r_1$ lies in the interval $\left[\sum_{i=0}^{n_1-1} J_i, \sum_{i=0}^{n_1} J_i\right]$, the paragraph contains $n_1$ statements, where $1 \le n_1 \le m$ and $J_0 = 0$; each statement in the paragraph is then determined by step f-2);
For example, if the random number $r_1 \in [0, J_1]$, the paragraph contains 1 statement; if $r_1 \in [J_1, J_1 + J_2]$, the paragraph contains 2 statements; and so on;
f-2). Determine the number of segments in a statement: generate a random number $r_2$ in the range $\left[0, \sum_{i=1}^{l} D_i\right]$ and determine the interval to which it belongs; if $r_2$ lies in the interval $\left[\sum_{i=0}^{n_2-1} D_i, \sum_{i=0}^{n_2} D_i\right]$, the statement contains $n_2$ segments, where $1 \le n_2 \le l$ and $D_0 = 0$; the segments of each statement are then determined by step f-3);
For example, if the random number $r_2 \in [0, D_1]$, the statement contains 1 segment; if $r_2 \in [D_1, D_1 + D_2]$, the statement contains 2 segments; and so on;
f-3). Determine the number of phrases in a segment: generate a random number $r_3$ in the range $\left[0, \sum_{i=1}^{q} C_i\right]$ and determine the interval to which it belongs; if $r_3$ lies in the interval $\left[\sum_{i=0}^{n_3-1} C_i, \sum_{i=0}^{n_3} C_i\right]$, the segment contains $n_3$ phrases, where $1 \le n_3 \le q$ and $C_0 = 0$; the phrases of each segment are then determined by step f-4);
For example, if the random number $r_3 \in [0, C_1]$, the segment contains 1 phrase; if $r_3 \in [C_1, C_1 + C_2]$, the segment contains 2 phrases; and so on;
f-4). Determine the number of Chinese characters in a phrase: generate a random number $r_4$ in the range $\left[0, \sum_{i=1}^{p} Z_i\right]$ and determine the interval to which it belongs; if $r_4$ lies in the interval $\left[\sum_{i=0}^{n_4-1} Z_i, \sum_{i=0}^{n_4} Z_i\right]$, the phrase contains $n_4$ Chinese characters, where $1 \le n_4 \le p$ and $Z_0 = 0$; the number of Chinese characters equals the number of syllables, since each Chinese character corresponds to one syllable; the syllable of each Chinese character is then determined by step f-5);
For example, if the random number $r_4 \in [0, Z_1]$, the phrase contains 1 Chinese character; if $r_4 \in [Z_1, Z_1 + Z_2]$, the phrase contains 2 Chinese characters; and so on;
f-5). Determine the syllables: generate a random number $r_5$ in the range $\left[0, \sum_{i=1}^{k} h_i\right]$ and determine the interval to which it belongs; if $r_5$ lies in the interval $\left[\sum_{i=0}^{n_5-1} h_i, \sum_{i=0}^{n_5} h_i\right]$, the syllable corresponding to the Chinese character is $H_{n_5}$, where $1 \le n_5 \le k$ and $h_0 = 0$; this is repeated until the syllables of all Chinese characters in the phrase have been determined;
In this step, according to the number of Chinese characters in the phrase, a seed can be used to generate the same number of random numbers $[\gamma_1, \gamma_2, \ldots, \gamma_n]$; if $\gamma_1 \in [0, h_1]$, the syllable $H_1$ is selected; if $\gamma_2 \in [h_1, h_1 + h_2]$, the syllable $H_2$ is selected; and so on;
Paragraph text is generated according to steps f-1) to f-5) until the number of generated paragraphs meets the requirement;
g). Speech synthesis: using a voice library that holds the pronunciation of each syllable, map the syllables in the paragraph text obtained in step f) one-to-one to the pronunciations in the voice library to form the corresponding speech data. Playing this speech data at the sound-leakage locations of a confidential meeting yields a speech masking signal whose statistical characteristics resemble those of normal speech, which masks well, and which has little effect on the meeting participants.
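The interval test used in steps f-1) to f-5) amounts to a lookup against the running sum of a probability table. The following Python sketch illustrates that idea only; the function names `pick_count` and `draw_count` and the sample table `J` are hypothetical and do not come from the patent. Because the text gives closed intervals that share their endpoints, the sketch assumes a value on a boundary is assigned to the lower count.

```python
import random
from itertools import accumulate


def pick_count(r, table):
    """Return n such that r lies in [sum(table[:n-1]), sum(table[:n])]."""
    for n, upper in enumerate(accumulate(table), start=1):
        if r <= upper:
            return n
    return len(table)  # guard against floating-point overshoot


def draw_count(table, rng=random):
    """Draw r uniformly over [0, sum(table)] and look up its interval."""
    r = rng.uniform(0, sum(table))
    return pick_count(r, table)


# Hypothetical statement probability table [J_1, J_2, J_3], in percent.
J = [50.0, 30.0, 20.0]
n_statements = draw_count(J)  # 1, 2 or 3
```

The same two functions would serve for the segment, phrase, Chinese character, and syllable tables, since each step differs only in which table is consulted.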
Speech synthesis uses the voice library to synthesize the random text generated in the previous step into the masking signal output. The voice library is recorded in a professional recording studio and covers all the common syllables of Chinese speech. The name of each syllable in the voice library corresponds one-to-one to the syllable name used in the generated random text. For example, the syllable pronounced "ah" in the first tone is named "a1.wav", and "ah" in the second tone is named "a2.wav" accordingly. During speech synthesis, the random text "text.txt" generated in the previous step is read and matched against the voice library: for example, if the syllable "bai1" is read from the random text, it is mapped to "bai1.wav" in the voice library, and so on, until all syllables have been mapped one-to-one to pronunciations in the library and the masking signal output is finally synthesized.
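The one-to-one mapping from syllable names to recordings can be illustrated by concatenating WAV files with Python's standard `wave` module. This is a minimal sketch under stated assumptions, not the patent's implementation: the function name `synthesize`, the directory layout, and the requirement that all clips share one audio format are assumptions made for the example.

```python
import wave
from pathlib import Path


def synthesize(syllables, library_dir, out_path):
    """Concatenate <syllable>.wav clips from library_dir into out_path."""
    library_dir = Path(library_dir)
    params = None
    frames = []
    for name in syllables:
        # Each syllable name maps directly to a file name, e.g. a1 -> a1.wav.
        with wave.open(str(library_dir / f"{name}.wav"), "rb") as clip:
            fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError(f"{name}.wav has a different audio format")
            frames.append(clip.readframes(clip.getnframes()))
    if params is None:
        raise ValueError("no syllables to synthesize")
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(params[0])
        out.setsampwidth(params[1])
        out.setframerate(params[2])
        out.writeframes(b"".join(frames))
```

A call such as `synthesize(["a1", "a2"], "voice_library", "mask.wav")` would then join the two recordings in order, provided both files exist in the hypothetical `voice_library` directory.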
To make the synthesized masking signal sound smoother and more natural, silent segments are added between paragraphs, between statements, and between segments. The end-of-statement marks are the period, question mark, and exclamation mark; the end-of-segment marks are the colon, comma, and semicolon; and the end-of-paragraph marks are the carriage return and line feed. A pre-recorded silent segment is stored in the voice library under a name that must differ from every syllable name in the library, for example jyin.wav. When the random text is read and one of these end marks is encountered, the corresponding silent segment is read directly from the voice library to produce the pause.
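The handling of end marks can be sketched as a small lookup that turns each token of the random text into a voice-library file name. The sketch assumes the text has already been split into a list of syllable names and end marks, and it accepts both full-width and ASCII punctuation; the names `SILENCE`, `PAUSE_MARKS`, and `to_filenames` are hypothetical.

```python
SILENCE = "jyin.wav"  # silent-segment name given as the example in the text

# Statement, segment and paragraph end marks (full-width and ASCII forms).
PAUSE_MARKS = set("。?!:,;.?!:,;\r\n")


def to_filenames(tokens):
    """Map syllable names and end marks to voice-library file names."""
    files = []
    for token in tokens:
        if token in PAUSE_MARKS:
            files.append(SILENCE)  # an end mark becomes a pause
        else:
            files.append(f"{token}.wav")  # a syllable maps to its recording
    return files
```

Keeping the silent segment as an ordinary file in the library means the synthesis stage needs no special case for pauses: it simply concatenates whatever file names this mapping produces.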
In the method for generating a masking signal for protecting the privacy of Chinese speech of the invention, during generation of the speech text in step f), the mark at the end of a statement is a period, question mark, or exclamation mark; the mark at the end of a segment is a colon, comma, or semicolon; and the mark at the end of a paragraph is a carriage return or line feed. When speech data is generated from the text, silent segments are added between paragraphs, between statements, and between segments.
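Steps f-1) to f-5) and the end marks can be combined into one nested loop that emits a paragraph as a list of tokens. The Python sketch below is illustrative only. The patent does not say how a particular end mark is chosen, so the sketch assumes a uniform choice among the allowed marks and assumes the last segment of a statement ends with a statement mark rather than a segment mark; the function names are hypothetical.

```python
import random
from itertools import accumulate


def draw_index(table, rng):
    """Draw a 1-based index by locating r in the table's running sum."""
    r = rng.uniform(0, sum(table))
    for n, upper in enumerate(accumulate(table), start=1):
        if r <= upper:
            return n
    return len(table)


def generate_paragraph(J, D, C, Z, h, H, rng):
    """Return one paragraph as a list of syllable names and end marks."""
    tokens = []
    for _ in range(draw_index(J, rng)):              # statements per paragraph
        n_segments = draw_index(D, rng)               # segments per statement
        for s in range(n_segments):
            for _ in range(draw_index(C, rng)):       # phrases per segment
                for _ in range(draw_index(Z, rng)):   # characters per phrase
                    tokens.append(H[draw_index(h, rng) - 1])
            is_last = s == n_segments - 1
            # Assumed rule: uniform choice among the allowed end marks.
            tokens.append(rng.choice(".?!") if is_last else rng.choice(":,;"))
    tokens.append("\n")                               # end of paragraph
    return tokens
```

Passing a seeded generator such as `random.Random(42)` as `rng` makes the output reproducible, which is convenient when checking a generated text against the tables.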
In the method for generating a masking signal for protecting the privacy of Chinese speech of the invention, the statement, segment, phrase, and Chinese character probabilities in steps a), b), c), and d) are all accurate to 0.01, and the syllable probability in step e) is accurate to 0.0001.
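Rounding table entries to a fixed precision can leave their sum slightly different from 100. The short Python sketch below shows this with hypothetical counts; it suggests, as an interpretation rather than something the patent states, that drawing the random number over the sum of the table keeps every interval reachable. The name `to_percent_table` is an assumption for illustration.

```python
def to_percent_table(counts, digits=2):
    """Convert raw counts to percentages rounded to the given precision."""
    total = sum(counts)
    return [round(100.0 * c / total, digits) for c in counts]


counts = [1, 1, 1]                 # hypothetical counts for i = 1, 2, 3
table = to_percent_table(counts)   # each entry is 33.33
upper_bound = sum(table)           # slightly below 100 after rounding
```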
In the method for generating a masking signal for protecting the privacy of Chinese speech of the invention, the corpus described in step a) is the Modern Chinese General Balanced Corpus built under a National Language Committee project.
The beneficial effects of the invention are as follows. The method for generating the masking signal fully considers the sound-masking requirements of conference rooms and the characteristics of Chinese speech, and abandons the traditional use of steady-state noise and similar masking signals. Based on the statistical characteristics of the characters, words, and sentences of the Chinese language, and using a voice library of human speech, it generates a masking signal that carries no actual meaning yet closely resembles normal speech. Compared with traditional masking noise, this masking signal greatly reduces the negative auditory effects and improves the sound-masking effect.
Brief description of the drawings
FIG. 1 is a flow chart of the method for generating a masking signal for protecting the privacy of Chinese speech according to the invention.
Detailed description
The invention is further described below with reference to the drawing and an embodiment.
FIG. 1 shows the flow chart of the method for generating a masking signal for protecting the privacy of Chinese speech according to the invention. Random text generation involves the following probability tables:
1) Paragraphs are formed from sentences, which requires a probability table of the number of sentences per paragraph, referred to as the sentence probability table;
2) Sentences are formed from segments, which requires a probability table of the number of segments per sentence, referred to as the segment probability table;
3) Segments are formed from phrases, which requires a probability table of the number of phrases per segment, referred to as the phrase probability table;
4) Phrases are formed from syllables, which requires a probability table of the number of Chinese characters per phrase, referred to as the Chinese character probability table;
5) The probability with which each syllable occurs in everyday language, referred to as the syllable probability table.
The statistics for the probability tables above are drawn from the Modern Chinese General Balanced Corpus. The corpus was built as a project of the State Language Commission (国家语委) and contains about 100 million characters in total. Of these, about 70 million characters predate 1997 and were all entered manually from printed material; about 30 million characters date from after 1997, half entered manually and half taken from electronic text. The corpus is characterized by a long time span, wide domain coverage and a more balanced mix of samples, so it represents modern Chinese as a whole reasonably well.
Taking the phrase probability as an example, the statistical method for the probability tables is as follows. Basic information on the corpus used for the statistics is given in Table 1:
Table 1
Number of characters    19455541
Number of words         12842566
Number of segments      1768545
The phrase probability is then calculated as:
Phrase probability (%) = (number of segments of a given length / total number of segments in the corpus) × 100
Here the length of a segment is the number of phrases it contains.
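The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `probability_table`, the `digits` parameter and the toy input list are hypothetical, and the real tables would be computed over the full corpus rather than ten made-up segments.

```python
from collections import Counter


def probability_table(unit_sizes, digits=2):
    """Return [P1, P2, ...] where Pi is the percentage of units of size i.

    unit_sizes holds one integer per unit, e.g. the number of phrases
    in each segment of the corpus (hypothetical input format).
    """
    if not unit_sizes:
        return []
    tally = Counter(unit_sizes)
    total = len(unit_sizes)
    largest = max(tally)
    return [round(100 * tally.get(size, 0) / total, digits)
            for size in range(1, largest + 1)]


# Made-up toy data: the number of phrases in each of ten segments.
toy_segment_lengths = [1, 2, 2, 3, 3, 3, 3, 3, 4, 4]
print(probability_table(toy_segment_lengths))  # [10.0, 20.0, 50.0, 20.0]
```

The same function would apply unchanged to sentences per paragraph, segments per sentence and characters per phrase, since each table is a histogram of counts normalized by the total.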
First, the number of sentences that make up a paragraph of the random text is determined from the sentence probability table shown in Table 2.
Table 2
Figure PCTCN2015000255-appb-000013
Figure PCTCN2015000255-appb-000014
In a specific implementation, each frequency is multiplied by 100 for ease of computation, giving an integer sentence-count probability table J = [5, 10, 15, 30, …]. The number of sentences is determined by where a random number r1 falls relative to the table J: if r1 ∈ [0, J1] = [0, 5], the paragraph contains 1 sentence; if r1 ∈ [J1, J1+J2] = [5, 15], the paragraph contains 2 sentences; and so on.
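The interval lookup described above amounts to a search over cumulative sums. The sketch below is one way to write it, not the patent's own implementation: the names `interval_index` and `draw_count` are hypothetical, the table is truncated to the four values quoted in the text, and the intervals are treated as half-open so that a boundary value such as r1 = 5 maps to exactly one count (the text writes them as closed intervals).

```python
import bisect
import itertools
import random


def interval_index(table, r):
    """Return the 1-based n such that r falls in the n-th cumulative interval.

    Intervals are taken as half-open [lower, upper), which is an assumption;
    a value at or beyond the last bound is clamped to the last interval.
    """
    bounds = list(itertools.accumulate(table))  # e.g. [5, 15, 30, 60]
    return min(bisect.bisect_right(bounds, r), len(table) - 1) + 1


def draw_count(table, rng=random):
    """Draw a random number in [0, sum(table)] and look up its interval."""
    return interval_index(table, rng.uniform(0, sum(table)))


# Only the first four entries of J are given in the text; the rest are elided.
J = [5, 10, 15, 30]
print(interval_index(J, 3))   # 3 lies in [0, 5)   -> 1 sentence
print(interval_index(J, 12))  # 12 lies in [5, 15) -> 2 sentences
```

The same lookup serves the tables D, C, Z and h introduced later, since each step only differs in which table is consulted.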
Second, the number of segments in each sentence is determined from the segment probability table shown in Table 3.
Table 3
Figure PCTCN2015000255-appb-000015
Similarly, using the integer table of segments per sentence D = [14, 7, 32, 25, …]: if the generated random number r2 ∈ [0, D1] = [0, 14], the sentence contains 1 segment; if r2 ∈ [D1, D1+D2] = [14, 21], the sentence contains 2 segments; and so on.
Third, the number of phrases in each segment is determined from the phrase-count probability table shown in Table 4.
Table 4
Figure PCTCN2015000255-appb-000016
Figure PCTCN2015000255-appb-000017
The integer phrase-count probability table is C = [1, 1, 2, 2, 2, …]. If the random number r3 ∈ [0, C1] = [0, 1], the segment contains 1 phrase; if r3 ∈ [C1, C1+C2] = [1, 2], the segment contains 2 phrases; and so on.
Finally, the number of Chinese characters in each phrase is determined from the Chinese character probability table shown in Table 5.
Table 5
Figure PCTCN2015000255-appb-000018
The integer Chinese character probability table is Z = [45, 42, 2, 1, …]. A random number r4 is generated by the program: if r4 ∈ [0, Z1] = [0, 45], the phrase contains 1 Chinese character; if r4 ∈ [Z1, Z1+Z2] = [45, 87], the phrase contains 2 Chinese characters; and so on.
Here, Chinese characters are distinguished by syllable, with each character corresponding to one syllable. For example, a1, a2, a3 and a4 denote the four syllables, one per tone, pronounced "a" (啊). Following this rule, the frequency with which each syllable occurs in everyday speech is counted, as shown in Table 6.
Table 6
Figure PCTCN2015000255-appb-000019
The frequency of each syllable is multiplied by 10000 and rounded to an integer, giving the integer syllable probability table h = [215, 25, 16, 97, …]. A random number r5 is generated by the program: if r5 ∈ [0, h1] = [0, 215], the syllable is H1 (that is, a1); if r5 ∈ [h1, h1+h2] = [215, 240], the syllable is H2 (that is, a2); and so on, until each syllable of the phrase has been determined in turn.
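Putting the five lookups together, the nested generation of one paragraph could look roughly like the following sketch. Everything here is illustrative: the tables are truncated to the leading values quoted in the description (the real tables are longer), the syllable inventory is reduced to a1 through a4, and the function names `pick` and `random_paragraph` are hypothetical. Half-open intervals are assumed for the lookup.

```python
import bisect
import itertools
import random

# Truncated toy tables: only the leading values quoted in the description.
J = [5, 10, 15, 30]            # sentences per paragraph
D = [14, 7, 32, 25]            # segments per sentence
C = [1, 1, 2, 2, 2]            # phrases per segment
Z = [45, 42, 2, 1]             # Chinese characters (syllables) per phrase
H = ["a1", "a2", "a3", "a4"]   # syllable names, alphabetical order
h = [215, 25, 16, 97]          # integer syllable frequencies for H


def pick(table, rng):
    """Draw a random number and return the 1-based index of its interval."""
    bounds = list(itertools.accumulate(table))
    r = rng.random() * bounds[-1]
    return min(bisect.bisect_right(bounds, r), len(table) - 1) + 1


def random_paragraph(rng):
    """Return a paragraph as nested lists:
    paragraph -> sentences -> segments -> phrases -> syllable names."""
    paragraph = []
    for _ in range(pick(J, rng)):              # sentences in the paragraph
        sentence = []
        for _ in range(pick(D, rng)):          # segments in the sentence
            segment = []
            for _ in range(pick(C, rng)):      # phrases in the segment
                phrase = [H[pick(h, rng) - 1]  # one syllable per character
                          for _ in range(pick(Z, rng))]
                segment.append(phrase)
            sentence.append(segment)
        paragraph.append(sentence)
    return paragraph


paragraph = random_paragraph(random.Random(0))
print(len(paragraph), "sentence(s) generated")
```

Returning nested lists rather than a flat string keeps the paragraph, sentence and segment boundaries explicit, which is where pauses are later needed during synthesis.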
Speech synthesis:
The steps above produce a passage of random text with no actual meaning. Next, the speech database is used to synthesize the random text from the previous step into the masking signal output. The speech library must be recorded in a professional recording studio and covers all commonly used syllables of Chinese speech. Each syllable in the library is named to match the corresponding syllable name in the generated random text. For example, the first-tone syllable pronounced "a" (啊) is named "a1.wav", and the second-tone "a" is correspondingly named "a2.wav". During synthesis, the random text "text.txt" produced in the previous step is read and matched against the speech library. For example, when the syllable "bai1" is read from the random text, it is mapped to "bai1.wav" in the library, and so on. Every syllable is matched one-to-one with a recording in the library, and the result is synthesized into the masking signal output.
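A minimal sketch of the matching and concatenation step, using only Python's standard `wave` module, is given below. It assumes that every clip in the library is uncompressed PCM with the same channel count, sample width and sample rate, and that the syllable names have already been parsed from the text file into a list; the function name `concatenate_syllables` and the directory layout are hypothetical.

```python
import wave
from pathlib import Path


def concatenate_syllables(tokens, library_dir, out_path):
    """Join the clips '<token>.wav' from library_dir into one WAV file.

    Assumes all clips share channel count, sample width and frame rate.
    """
    library_dir = Path(library_dir)
    params = None
    chunks = []
    for token in tokens:
        with wave.open(str(library_dir / f"{token}.wav"), "rb") as clip:
            if params is None:
                params = clip.getparams()
            elif clip.getparams()[:3] != params[:3]:
                raise ValueError(f"{token}.wav has a different audio format")
            chunks.append(clip.readframes(clip.getnframes()))
    if params is None:
        raise ValueError("no syllables to synthesize")
    with wave.open(str(out_path), "wb") as out:
        out.setparams(params)               # frame count is fixed on close
        out.writeframes(b"".join(chunks))
```

For example, `concatenate_syllables(["a1", "a2"], "speech_library", "mask.wav")` would read `speech_library/a1.wav` and `speech_library/a2.wav` and write them back to back into `mask.wav`.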
To make the synthesized masking signal sound more fluent and natural, silent segments are inserted between paragraphs, between sentences and between segments to simulate the pauses of normal speech. The sentence-end symbols are defined as the period, question mark and exclamation mark; the segment-end symbols as the colon, comma and semicolon; and the paragraph-end symbols as the carriage return and line feed. A silent segment recorded in advance is stored in the speech library under a name that differs from every syllable name in the library, for example jyin.wav. When the random text is read and one of the end symbols defined above is encountered, the corresponding silent segment is read directly from the speech library to produce the pause. Research on the characteristics of everyday speech, together with extensive experiments, showed that the synthesized masking signal sounds most fluent when the silent segment is 0.5 s long.
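The pause handling can be sketched as a simple substitution of end symbols by the silence clip name. The sketch below assumes the random text has already been split into tokens with each punctuation mark as its own token, which the source does not specify; `clip_names` and `silence_pcm16` are hypothetical helper names, and the 16 kHz mono 16-bit format is an assumption chosen only for illustration.

```python
SILENCE_CLIP = "jyin"  # clip name taken from the example jyin.wav

# Sentence-end, segment-end and paragraph-end symbols listed in the text.
PAUSE_MARKS = {"。", "？", "！", "：", "，", "；", "\r", "\n"}


def clip_names(tokens):
    """Replace every end symbol by the name of the silence clip."""
    return [SILENCE_CLIP if token in PAUSE_MARKS else token
            for token in tokens]


def silence_pcm16(seconds=0.5, framerate=16000, channels=1):
    """Return raw 16-bit PCM bytes of digital silence.

    Zero-valued samples are silence for signed 16-bit PCM only; the
    frame rate and channel count here are assumed defaults.
    """
    frames = round(seconds * framerate)
    return b"\x00\x00" * frames * channels


print(clip_names(["a1", "，", "a2", "。"]))  # ['a1', 'jyin', 'a2', 'jyin']
print(len(silence_pcm16()))                 # 16000 bytes = 8000 frames
```

Generating the silence in code is an alternative to the pre-recorded clip the text describes; a recorded room-tone clip may sound more natural than exact digital zeros.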

Claims (4)

  1. A method for generating a masking signal for protecting the privacy of Chinese speech, characterized in that it is implemented by the following steps:
    a). Compile the sentence probability table: taking a representative Chinese corpus as the statistical sample, count the number of sentences in each paragraph of the corpus to obtain the probability table [J1, J2, J3, …, Jm] of the number of sentences per paragraph, referred to as the sentence probability table, where Ji is the percentage of all paragraphs that contain i sentences, 1 ≤ i ≤ m;
    b). Compile the segment probability table: count the number of segments in every sentence of the corpus to obtain the probability table [D1, D2, D3, …, Dl] of the number of segments per sentence, referred to as the segment probability table, where Di is the percentage of all sentences that contain i segments, 1 ≤ i ≤ l;
    c). Compile the phrase probability table: count the number of phrases in every segment of the corpus to obtain the probability table [C1, C2, C3, …, Cq] of the number of phrases per segment, referred to as the phrase probability table, where Ci is the percentage of all segments that contain i phrases, 1 ≤ i ≤ q;
    d). Compile the Chinese character probability table: count the number of Chinese characters in every phrase of the corpus to obtain the probability table [Z1, Z2, Z3, …, Zp] of the number of Chinese characters per phrase, referred to as the Chinese character probability table, where Zi is the percentage of all phrases that contain i Chinese characters, 1 ≤ i ≤ p;
    e). Compile the syllable probability table: first sort the syllables alphabetically, denoted [H1, H2, H3, …, Hk], then, from the probability with which each syllable occurs in everyday language, obtain the syllable probability table [h1, h2, h3, …, hk], where hi is the frequency with which syllable Hi occurs in everyday language, 1 ≤ i ≤ k;
    f). Generate the text information: generate the text information corresponding to the speech according to the following steps:
    f-1). Determine the number of sentences in the paragraph: generate a random number r1 within the range
    Figure PCTCN2015000255-appb-100001
    and determine the interval to which r1 belongs; if r1 lies in the interval
    Figure PCTCN2015000255-appb-100002
    the paragraph contains n1 sentences, where 1 ≤ n1 ≤ m and J0 = 0; each sentence in the paragraph is then determined by step f-2);
    f-2). Determine the number of segments in the sentence: generate a random number r2 within the range
    Figure PCTCN2015000255-appb-100003
    and determine the interval to which r2 belongs; if r2 lies in the interval
    Figure PCTCN2015000255-appb-100004
    the sentence contains n2 segments, where 1 ≤ n2 ≤ l and D0 = 0; the segments in each sentence are then determined by step f-3);
    f-3). Determine the number of phrases in the segment: generate a random number r3 within the range
    Figure PCTCN2015000255-appb-100005
    and determine the interval to which r3 belongs; if r3 lies in the interval
    Figure PCTCN2015000255-appb-100006
    the segment contains n3 phrases, where 1 ≤ n3 ≤ q and C0 = 0; the phrases in each segment are then determined by step f-4);
    f-4). Determine the number of Chinese characters in the phrase: generate a random number r4 within the range
    Figure PCTCN2015000255-appb-100007
    and determine the interval to which r4 belongs; if r4 lies in the interval
    Figure PCTCN2015000255-appb-100008
    the phrase contains n4 Chinese characters, where the number of characters equals the number of syllables because each Chinese character corresponds to one syllable, 1 ≤ n4 ≤ p and Z0 = 0; the syllable of each Chinese character is then determined by step f-5);
    f-5). Determine the syllable: generate a random number r5 within the range
    Figure PCTCN2015000255-appb-100009
    and determine the interval to which r5 belongs; if r5 lies in the interval
    Figure PCTCN2015000255-appb-100010
    the syllable of the Chinese character is Hn5, where 1 ≤ n5 ≤ k and h0 = 0; this is repeated until the syllables of all Chinese characters in the phrase have been determined;
    Text information for paragraphs is generated according to steps f-1) to f-5) until the number of generated paragraphs meets the requirement;
    g). Speech synthesis: using a speech library containing the pronunciation of each syllable, match the syllables in the paragraph text information obtained in step f) one-to-one with the pronunciations in the speech library to form the corresponding speech data; playing this speech data at the locations where sound leaks from a confidential meeting forms a speech masking signal whose statistical characteristics resemble those of normal speech, which masks well and which has little effect on the meeting participants.
  2. The method for generating a masking signal for protecting the privacy of Chinese speech according to claim 1, characterized in that: during generation of the speech text information in step f), the symbol at the end of a sentence is a period, question mark or exclamation mark, the symbol at the end of a segment is a colon, comma or semicolon, and the symbol at the end of a paragraph is a carriage return or line feed; when pronunciation data is generated from the text information, a silent segment is inserted between paragraphs, between sentences and between segments.
  3. The method for generating a masking signal for protecting the privacy of Chinese speech according to claim 1 or 2, characterized in that: the sentence, segment, phrase and Chinese character probabilities in steps a), b), c) and d) are all accurate to 0.01, and the syllable probability in step e) is accurate to 0.0001.
  4. The method for generating a masking signal for protecting the privacy of Chinese speech according to claim 1 or 2, characterized in that: the corpus described in step a) is the Modern Chinese General Balanced Corpus built as a project of the State Language Commission (国家语委).
PCT/CN2015/000255 2015-03-03 2015-04-13 Generating method for shielding signals used for protecting chinese speech privacy WO2016138605A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510094030.8 2015-03-03
CN201510094030.8A CN104637485B (en) 2015-03-03 2015-03-03 A kind of generation method of masking signal for protecting Chinese speech secret degree

Publications (1)

Publication Number Publication Date
WO2016138605A1 (en) 2016-09-09

Family

ID=53216156

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/000255 WO2016138605A1 (en) 2015-03-03 2015-04-13 Generating method for shielding signals used for protecting chinese speech privacy

Country Status (2)

Country Link
CN (1) CN104637485B (en)
WO (1) WO2016138605A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106558303A (en) * 2015-09-29 2017-04-05 苏州天声学科技有限公司 Array sound mask device and sound mask method
GB2545434B (en) * 2015-12-15 2020-01-08 Sonic Data Ltd Improved method, apparatus and system for embedding data within a data stream
CN109697978B (en) * 2018-12-18 2021-04-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating a model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6188771B1 (en) * 1998-03-11 2001-02-13 Acentech, Inc. Personal sound masking system
JP5103974B2 (en) * 2007-03-22 2012-12-19 ヤマハ株式会社 Masking sound generation apparatus, masking sound generation method and program
CN102543066A (en) * 2011-11-18 2012-07-04 中国科学院声学研究所 Target voice privacy protection method and system
CN102522080A (en) * 2011-12-08 2012-06-27 中国科学院声学研究所 Random interference sound signal generating system and method for protecting language privacy
CN103886858A (en) * 2014-03-11 2014-06-25 中国科学院信息工程研究所 Sound masking signal generating method and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019480A (en) * 2021-11-11 2022-09-06 艾感科技(广东)有限公司 System and method for monitoring sound and gas exposure
CN115019480B (en) * 2021-11-11 2023-12-26 艾感科技(广东)有限公司 System and method for monitoring sound and gas exposure

Also Published As

Publication number Publication date
CN104637485A (en) 2015-05-20
CN104637485B (en) 2018-05-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15883662

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15883662

Country of ref document: EP

Kind code of ref document: A1