CN110032626A - Voice broadcast method and device - Google Patents

Publication number
CN110032626A
Authority
CN
China
Prior art keywords
pronunciation
text
content
voice
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910318020.6A
Other languages
Chinese (zh)
Other versions
CN110032626B (en
Inventor
赵涛涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司
Priority to CN201910318020.6A
Publication of CN110032626A
Application granted
Publication of CN110032626B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

Embodiments of the present invention provide a voice broadcast method and device. The method includes: obtaining text content and an initial broadcast voice according to broadcast information; obtaining the correct pronunciation of a non-standard pronunciation word according to the text content; comparing whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with the correct pronunciation; and, if they are inconsistent, replacing the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation and generating a final broadcast voice. Because the correct pronunciation of each non-standard pronunciation word in the text content corresponding to the broadcast information is obtained, and compared with the pronunciation of that word in the initial broadcast voice, before the final broadcast voice is played, the accuracy of the content of the final broadcast voice is improved.

Description

Voice broadcast method and device
Technical field
The present invention relates to the field of voice processing technology, and in particular to a voice broadcast method and device.
Background
After receiving a voice playback instruction from a user, an existing smart playback device simply plays the text of the requested information using standard pronunciation. However, some characters have different pronunciations and meanings in different contexts. A voice broadcast mode that mechanically converts text into standard pronunciation can therefore make the broadcast content ambiguous, so that the user cannot accurately understand its meaning, which degrades the user experience. In particular, when a child uses a smart playback device for language learning, a wrong pronunciation can seriously mislead the child's language learning.
Summary of the invention
Embodiments of the present invention provide a voice broadcast method and device to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a voice broadcast method, comprising:
obtaining text content and an initial broadcast voice according to broadcast information;
obtaining the correct pronunciation of a non-standard pronunciation word according to the text content;
comparing whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with the correct pronunciation;
if they are inconsistent, replacing the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation, and generating a final broadcast voice.
In one embodiment, before obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further comprises:
determining whether the text content contains a non-standard pronunciation word.
In one embodiment, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the sentence in the text content in which the non-standard pronunciation word appears;
determining, according to the semantics of the sentence, the correct pronunciation of the non-standard pronunciation word in the text content.
In one embodiment, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the phrase in the text content that the non-standard pronunciation word forms part of;
determining, according to the meaning of the phrase, the correct pronunciation of the non-standard pronunciation word in the text content.
In one embodiment, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the phrase in the text content that the non-standard pronunciation word forms part of, or the sentence in which it appears;
retrieving from a database whether a voice of the phrase or the sentence exists;
if so, determining the correct pronunciation of the non-standard pronunciation word based on the voice of the phrase or the sentence in the database.
In one embodiment, after obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further comprises:
determining the position of the non-standard pronunciation word in the text content;
locating, according to the position of the non-standard pronunciation word in the text content, the position of the non-standard pronunciation word in the initial broadcast voice;
obtaining, based on the position of the non-standard pronunciation word in the initial broadcast voice, the pronunciation of the non-standard pronunciation word in the initial broadcast voice.
In one embodiment, obtaining the initial broadcast voice according to the broadcast information comprises:
searching a server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information;
if one exists, using the existing broadcast voice as the initial broadcast voice;
if none exists, obtaining the text content corresponding to the broadcast information from the server, and converting the text content into a broadcast voice.
In one embodiment, when the broadcast information is a poem title, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
searching a database for the corresponding poem information according to the poem title and the text content;
obtaining, from the poem information found, the correct pronunciation of each non-standard pronunciation word in the poem and its position in the text content.
In a second aspect, an embodiment of the present invention provides a voice broadcast device, comprising:
a first acquisition module, configured to obtain text content and an initial broadcast voice according to broadcast information;
a second acquisition module, configured to obtain the correct pronunciation of a non-standard pronunciation word according to the text content;
a comparison module, configured to compare whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with the correct pronunciation;
a replacement module, configured to, if the pronunciation of the non-standard pronunciation word in the initial broadcast voice is inconsistent with the correct pronunciation, replace the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation, and generate a final broadcast voice.
In one embodiment, the device further comprises:
a determining module, configured to determine whether the text content contains a non-standard pronunciation word.
In one embodiment, the second acquisition module comprises:
a sentence acquisition submodule, configured to obtain the sentence in the text content in which the non-standard pronunciation word appears;
a first pronunciation submodule, configured to determine, according to the semantics of the sentence, the correct pronunciation of the non-standard pronunciation word in the text content.
In one embodiment, the second acquisition module comprises:
a phrase acquisition submodule, configured to obtain the phrase in the text content that the non-standard pronunciation word forms part of;
a second pronunciation submodule, configured to determine, according to the meaning of the phrase, the correct pronunciation of the non-standard pronunciation word in the text content.
In one embodiment, the second acquisition module comprises:
a phrase-and-sentence acquisition submodule, configured to obtain the phrase in the text content that the non-standard pronunciation word forms part of, or the sentence in which it appears;
a retrieval submodule, configured to retrieve from a database whether a voice of the phrase or the sentence exists;
a third pronunciation submodule, configured to, if the database has a voice of the phrase or the sentence, determine the correct pronunciation of the non-standard pronunciation word based on the voice of the phrase or the sentence in the database.
In one embodiment, the device further comprises:
a position module, configured to determine the position of the non-standard pronunciation word in the text content;
a searching module, configured to locate, according to the position of the non-standard pronunciation word in the text content, the position of the non-standard pronunciation word in the initial broadcast voice;
a third acquisition module, configured to obtain, based on the position of the non-standard pronunciation word in the initial broadcast voice, the pronunciation of the non-standard pronunciation word in the initial broadcast voice.
In one embodiment, the first acquisition module comprises:
a searching submodule, configured to search a server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information, and, if one exists, to use the existing broadcast voice as the initial broadcast voice;
a text content acquisition submodule, configured to, if there is no existing broadcast voice corresponding to the broadcast information, obtain the text content corresponding to the broadcast information from the server and convert the text content into a broadcast voice.
In a third aspect, an embodiment of the present invention provides a voice broadcast terminal. The functions of the voice broadcast terminal may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the voice broadcast terminal includes a processor and a memory. The memory stores a program that supports the voice broadcast terminal in executing the above voice broadcast method, and the processor is configured to execute the program stored in the memory. The voice broadcast terminal may further include a communication interface for communicating with other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the voice broadcast terminal, including a program for executing the above voice broadcast method.
One of the above technical solutions has the following advantage or beneficial effect: because the correct pronunciation of each non-standard pronunciation word in the text content corresponding to the broadcast information is obtained, and compared with the pronunciation of that word in the initial broadcast voice, before the final broadcast voice is played, the accuracy of the content of the final broadcast voice is improved.
The above summary is provided for the purpose of the description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar parts or elements throughout the several figures. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed in accordance with the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 shows a flowchart of a voice broadcast method according to an embodiment of the present invention.
Fig. 2 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 3 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 4 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 5 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 6 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 7 shows a flowchart of a voice broadcast method according to another embodiment of the present invention.
Fig. 8 shows a flowchart of step S100 of a voice broadcast method according to an embodiment of the present invention.
Fig. 9 shows a flowchart of an application example of a voice broadcast method according to an embodiment of the present invention.
Fig. 10 shows a structural block diagram of a voice broadcast device according to an embodiment of the present invention.
Fig. 11 shows a structural block diagram of a voice broadcast device according to another embodiment of the present invention.
Fig. 12 shows a structural block diagram of the second acquisition module of a voice broadcast device according to an embodiment of the present invention.
Fig. 13 shows a structural block diagram of the second acquisition module of a voice broadcast device according to another embodiment of the present invention.
Fig. 14 shows a structural block diagram of the second acquisition module of a voice broadcast device according to another embodiment of the present invention.
Fig. 15 shows a structural block diagram of a voice broadcast device according to another embodiment of the present invention.
Fig. 16 shows a structural block diagram of the first acquisition module of a voice broadcast device according to an embodiment of the present invention.
Fig. 17 shows a structural block diagram of a voice broadcast terminal according to an embodiment of the present invention.
Detailed description
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature and not restrictive.
Fig. 1 shows a flowchart of a voice broadcast method according to an embodiment of the present invention. As shown in Fig. 1, the voice broadcast method may include:
S100: obtaining text content and an initial broadcast voice according to broadcast information. The broadcast information may include the title, keywords, related terms, and similar information about the content the user wants broadcast, as long as it allows that content to be found quickly and accurately. The text content may include the original text of the content the user wants broadcast. The initial broadcast voice may include a voice converted directly from the text content by TTS (Text To Speech) technology, or an existing voice obtained from a server, the cloud, a database, or the like.
S200: obtaining the correct pronunciation of a non-standard pronunciation word according to the text content. The correct pronunciation is the pronunciation of the non-standard pronunciation word in the content the user wants broadcast. Because many characters are pronounced differently in different contexts or phrases, the pronunciation of the non-standard pronunciation word in that content needs to be confirmed. Non-standard pronunciation words may include polyphonic characters, interchangeable characters, loan characters, and the like.
S300: comparing whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with its correct pronunciation in the text content the user wants broadcast.
S400: if they are inconsistent, replacing the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation, thereby generating a final broadcast voice with accurate pronunciation.
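Steps S300 and S400 can be sketched as follows. This is only a minimal illustration under strong assumptions: the voice is modeled as a list of per-character segments labeled with a pinyin reading rather than real audio, and the names `Segment` and `generate_final_voice` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    char: str     # character from the text content
    reading: str  # reading carried by the voice for this character (assumed pinyin label)


def generate_final_voice(initial_voice, correct_readings):
    """Compare (S300) and replace (S400) readings of non-standard pronunciation words.

    initial_voice: list of Segment, assumed aligned one-to-one with the text content.
    correct_readings: dict mapping text position -> correct reading (result of S200).
    """
    final_voice = []
    for pos, seg in enumerate(initial_voice):
        correct = correct_readings.get(pos)
        if correct is not None and seg.reading != correct:
            # Inconsistent: substitute the correct reading for this segment.
            final_voice.append(Segment(seg.char, correct))
        else:
            # Consistent, or not a non-standard pronunciation word: keep as is.
            final_voice.append(seg)
    return final_voice


# Illustrative data: a TTS engine that read 行 in 银行 as "xing2" instead of "hang2".
initial = [Segment("银", "yin2"), Segment("行", "xing2")]
final = generate_final_voice(initial, {1: "hang2"})
print([s.reading for s in final])  # ['yin2', 'hang2']
```

The initial voice is left untouched and a new list is returned, so the initial broadcast voice can still be used directly when no replacement is needed.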
In one embodiment, as shown in Fig. 2, before obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further includes:
S500: determining whether the text content contains a non-standard pronunciation word. If it does not, the initial broadcast voice from S100 is broadcast as the final broadcast voice. If it does, S200 is executed.
Whether the text content contains a non-standard pronunciation word may be determined in the following ways:
By identifying each character in the text content, determine whether the character is recorded in existing Chinese references as a polyphonic character, interchangeable character, loan character, or the like. If so, the character is identified as a non-standard pronunciation word.
Alternatively, by identifying each sentence or each phrase in the text content, determine whether a non-standard pronunciation word is included.
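The character-by-character check in S500 might look like the sketch below. The lexicon `NON_STANDARD_CHARS` is a tiny hypothetical stand-in for a real dictionary of polyphonic characters, and both function names are assumptions made for illustration.

```python
# Hypothetical lexicon: characters recorded as having more than one reading.
NON_STANDARD_CHARS = {
    "行": ("xing2", "hang2"),
    "长": ("chang2", "zhang3"),
    "乐": ("le4", "yue4"),
}


def find_non_standard_words(text):
    """Return (position, character) pairs for every character found in the lexicon."""
    return [(i, ch) for i, ch in enumerate(text) if ch in NON_STANDARD_CHARS]


def contains_non_standard_word(text):
    """S500: True if S200 should run, False if the initial voice can be played as is."""
    return bool(find_non_standard_words(text))


print(find_non_standard_words("我去银行"))   # [(3, '行')]
print(contains_non_standard_word("你好"))    # False
```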
In one embodiment, as shown in Fig. 3, step S300 further includes:
If the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with its correct pronunciation in the text content the user wants broadcast, the initial broadcast voice is broadcast as the final broadcast voice.
In one embodiment, as shown in Fig. 4, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content includes:
S210: obtaining the sentence in the text content in which the non-standard pronunciation word appears.
S220: determining, according to the semantics of the sentence, the correct pronunciation of the non-standard pronunciation word in the text content. Specifically, the correct pronunciation of the non-standard pronunciation word in the text content may be analyzed from the meaning of the content formed by the other characters in the sentence. Alternatively, the part of speech and meaning associated with each pronunciation of the non-standard pronunciation word may each be interpreted together with the other characters in the sentence, so that the correct pronunciation of the non-standard pronunciation word in the text content is determined by comprehensive analysis.
In one embodiment, as shown in Fig. 5, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content includes:
S230: obtaining the phrase in the text content that the non-standard pronunciation word forms part of.
S240: determining, according to the meaning of the phrase, the correct pronunciation of the non-standard pronunciation word in the text content. Specifically, the non-standard pronunciation word is combined with the character before it and with the character after it to form candidate phrases. The correct pronunciation of the non-standard pronunciation word in the text content is then determined by comprehensive analysis of the meaning of the resulting phrase and the meaning of the content formed by the other characters of the sentence containing the phrase.
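The phrase-based approach of S230 and S240 can be approximated with a dictionary lookup, as in this sketch. It assumes a hypothetical phrase table `PHRASE_READINGS` and only tries the two-character phrases formed with the preceding and following character; the semantic analysis of the rest of the sentence described above is not modeled here.

```python
# Hypothetical phrase table: phrase -> reading of the polyphonic character inside it.
PHRASE_READINGS = {
    "银行": {"行": "hang2"},
    "行走": {"行": "xing2"},
    "音乐": {"乐": "yue4"},
    "快乐": {"乐": "le4"},
}


def reading_from_phrases(text, pos):
    """Combine the character at `pos` with its neighbors and look the phrases up."""
    ch = text[pos]
    candidates = []
    if pos > 0:
        candidates.append(text[pos - 1:pos + 1])  # preceding character + word
    if pos + 1 < len(text):
        candidates.append(text[pos:pos + 2])      # word + following character
    for phrase in candidates:
        entry = PHRASE_READINGS.get(phrase)
        if entry and ch in entry:
            return entry[ch]
    return None  # no known phrase; another strategy would be needed


print(reading_from_phrases("我去银行", 3))  # hang2
print(reading_from_phrases("他行走", 1))    # xing2
```

Returning `None` for unknown phrases leaves room to fall back on the sentence-level analysis or the database lookup described in the other embodiments.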
In one embodiment, as shown in Fig. 6, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content includes:
S250: obtaining the phrase in the text content that the non-standard pronunciation word forms part of, or the sentence in which it appears.
S260: retrieving from a database whether a voice exists for the phrase or sentence containing the non-standard pronunciation word.
S270: if so, determining the correct pronunciation of the non-standard pronunciation word based on the voice of the phrase or sentence in the database. The database in the embodiments of the present invention may include an offline database and/or an online database, among others.
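A database lookup along the lines of S250 to S270 could be sketched with Python's standard `sqlite3` module. The table layout, the demo rows, and the idea of storing each recorded voice as a space-separated list of readings are all assumptions made for illustration; a real system would store and analyze audio.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical `voices` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE voices (unit TEXT PRIMARY KEY, readings TEXT)")
    conn.executemany(
        "INSERT INTO voices VALUES (?, ?)",
        [("银行", "yin2 hang2"), ("长大", "zhang3 da4")],
    )
    return conn


def reading_from_database(conn, unit, char):
    """S260/S270: look up the stored voice of a phrase or sentence.

    Returns the reading of `char` inside `unit`, or None if nothing is stored.
    """
    if char not in unit:
        return None
    row = conn.execute(
        "SELECT readings FROM voices WHERE unit = ?", (unit,)
    ).fetchone()
    if row is None:
        return None  # S260: the database has no voice for this phrase or sentence
    readings = row[0].split()
    return readings[unit.index(char)]  # one reading per character (assumed)


conn = build_demo_db()
print(reading_from_database(conn, "银行", "行"))  # hang2
print(reading_from_database(conn, "行走", "行"))  # None
```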
In one embodiment, as shown in Fig. 7, after obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further includes:
S600: determining the position of the non-standard pronunciation word in the text content.
S700: locating, according to the position of the non-standard pronunciation word in the text content, the position of the non-standard pronunciation word in the initial broadcast voice.
S800: obtaining, based on the position of the non-standard pronunciation word in the initial broadcast voice, the pronunciation of the non-standard pronunciation word in the initial broadcast voice.
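One way to picture S600 to S800 is with a character-level alignment between the text and the audio. The sketch below assumes such an alignment is available (for example from the TTS engine), which the patent does not specify; `TimedSegment` and `locate_in_voice` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TimedSegment:
    start_ms: int  # where the character starts in the initial broadcast voice
    end_ms: int    # where the character ends
    reading: str   # reading heard in that time span (assumed pinyin label)


def locate_in_voice(text, alignment, char):
    """Map a character's text position to its span and reading in the voice."""
    pos = text.find(char)                  # S600: position in the text content
    if pos == -1:
        return None
    seg = alignment[pos]                   # S700: position in the initial voice
    return pos, (seg.start_ms, seg.end_ms), seg.reading  # S800: reading there


text = "银行"
alignment = [TimedSegment(0, 300, "yin2"), TimedSegment(300, 620, "xing2")]
print(locate_in_voice(text, alignment, "行"))  # (1, (300, 620), 'xing2')
```

For simplicity only the first occurrence of the character is located; a text with repeated non-standard pronunciation words would need every occurrence handled.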
In one embodiment, as shown in Fig. 8, obtaining the initial broadcast voice according to the broadcast information includes:
S110: searching a server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information. That is, searching the network for an existing audio resource (an existing broadcast voice) corresponding to the broadcast information.
S120: if one exists, using the existing broadcast voice as the initial broadcast voice. Because the existing broadcast voice is taken directly from an existing audio resource, the accuracy of its pronunciation cannot be guaranteed. However, it saves the time and resources that converting the text content into audio would require.
S130: if none exists, obtaining the text content corresponding to the broadcast information from the server, and converting the text content into a broadcast voice.
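The branch between S120 and S130 can be sketched as below. The server is stood in for by two plain dictionaries and the TTS engine by a placeholder function; `get_initial_voice`, `fake_tts`, and the store names are hypothetical.

```python
def fake_tts(text):
    """Placeholder for a real TTS engine; returns a tagged string instead of audio."""
    return f"<tts:{text}>"


def get_initial_voice(info, audio_store, text_store, tts):
    """Return (initial broadcast voice, source) for the given broadcast information."""
    existing = audio_store.get(info)   # S110: look for an existing broadcast voice
    if existing is not None:
        return existing, "existing"    # S120: reuse it, skipping conversion
    text = text_store[info]            # S130: fetch the text content instead
    return tts(text), "tts"            # and convert it into a broadcast voice


audio_store = {"poem-a": "<audio:poem-a>"}
text_store = {"poem-a": "text of poem a", "poem-b": "text of poem b"}

print(get_initial_voice("poem-a", audio_store, text_store, fake_tts))
print(get_initial_voice("poem-b", audio_store, text_store, fake_tts))
```

Returning the source alongside the voice is a design choice of this sketch: it lets later steps know that a reused recording has not been checked for pronunciation accuracy.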
In one embodiment, when the broadcast information is a poem title, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content includes:
Searching a database for the corresponding poem information according to the poem title and the text content. Because some classical poems share the same title, the poem title and the text content are combined in a single query so that the poem information can be found more accurately.
Obtaining, from the poem information found, the correct pronunciation of each non-standard pronunciation word in the poem and its position in the text content.
It should be noted that the database may be a poem database built from big data. For every poem, the database stores the sentences that contain non-standard pronunciation words, the complete audio of each such sentence, the position of the sentence in the whole poem, the pronunciation of the non-standard pronunciation word in the sentence, and the position of the non-standard pronunciation word within the sentence. The database may be built by collecting all poems, annotating the pronunciation of the non-standard pronunciation words in each poem, and annotating the pronunciation of the sentences in which they appear. Each poem is then stored in the form: poem title, audio of the non-standard pronunciation words and sentences, and position information of the non-standard pronunciation words and sentences.
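The record layout described above, and the combined title-plus-text lookup, might be modeled as in the following sketch. The class and field names are hypothetical, and the two demo records use placeholder characters rather than real poems; they only illustrate how two poems with the same title are told apart by their text.

```python
from dataclasses import dataclass, field


@dataclass
class PoemSentence:
    text: str           # sentence containing non-standard pronunciation words
    index_in_poem: int  # position of the sentence in the whole poem
    audio_ref: str      # reference to the complete audio of the sentence
    readings: dict      # position within the sentence -> annotated reading


@dataclass
class PoemRecord:
    title: str
    sentences: list = field(default_factory=list)


def find_poem(db, title, full_text):
    """Match on the title AND the text content, since titles can repeat."""
    for rec in db:
        if rec.title == title and all(s.text in full_text for s in rec.sentences):
            return rec
    return None


# Placeholder records sharing one title (not real poems).
db = [
    PoemRecord("无题", [PoemSentence("甲乙丙", 0, "a1.wav", {1: "yi3"})]),
    PoemRecord("无题", [PoemSentence("丁戊己", 0, "a2.wav", {0: "ding1"})]),
]

match = find_poem(db, "无题", "丁戊己庚")
print(match.sentences[0].audio_ref)  # a2.wav
```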
In one example, when the user wants to play "Bring in the Wine" (将进酒) through a smart playback device, the voice broadcast method of the smart playback device is as follows:
Receive the voice broadcast instruction sent by the user: "Please play Bring in the Wine".
Identify from the voice broadcast instruction that the broadcast information is "Bring in the Wine".
Based on the broadcast information, the smart playback device obtains the full text of "Bring in the Wine" from the cloud.
Convert the full text of "Bring in the Wine" into the initial broadcast voice by TTS technology.
According to the broadcast information, look up the poem information of "Bring in the Wine" in the database, and determine from the poem information the non-standard pronunciation words contained in "Bring in the Wine".
Further obtain from the database the audio of the sentences containing non-standard pronunciation words.
Compare whether the pronunciation of the non-standard pronunciation word in the sentence audio from the database is consistent with that in the corresponding sentence audio of the initial broadcast voice.
If they are inconsistent, replace the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the pronunciation of the non-standard pronunciation word obtained from the database.
In one embodiment, as shown in Fig. 9, in a scenario where the user uses a smart voice playback device, the voice broadcast process of the present invention is as follows:
When the user asks a smart speaker to read a classical poem, for example "play Bring in the Wine" (将进酒), the speaker sends a request to the cloud server, which finds the title and full text of "Bring in the Wine". The title and full text are then sent together to the voice server. The voice server performs TTS voice conversion on the text to generate audio 1. Then, based on the database, voice, or phrase meaning, it further retrieves whether the poem contains sentences with polyphonic characters, together with their audio. If not, audio 1 is used as the final output broadcast voice. If so, the audio of each sentence containing a polyphonic character is denoted audio 2. According to the sentence position, the pronunciation of audio 2 is compared with that of the corresponding sentence in audio 1. If they differ, the corresponding part of audio 1 is replaced with audio 2. If they are the same, audio 1 is used as the final output broadcast voice.
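The audio 1 and audio 2 comparison in this scenario works at sentence granularity, which the sketch below illustrates. As an assumption made purely for illustration, each sentence's audio is represented by its list of readings; `build_final_audio` is a hypothetical name, and the demo sentences are everyday phrases rather than lines from the poem.

```python
def build_final_audio(audio1, audio2_by_index):
    """Replace sentences of audio 1 whose readings differ from audio 2.

    audio1: list of (sentence_text, readings) produced by TTS, in poem order.
    audio2_by_index: dict mapping sentence position -> readings from the database,
                     present only for sentences containing polyphonic characters.
    Returns the final audio and the indices of the sentences that were replaced.
    """
    final, replaced = [], []
    for i, (text, readings) in enumerate(audio1):
        reference = audio2_by_index.get(i)
        if reference is not None and reference != readings:
            final.append((text, reference))  # differs: use audio 2 for this part
            replaced.append(i)
        else:
            final.append((text, readings))   # same, or no audio 2: keep audio 1
    return final, replaced


audio1 = [
    ("我去银行", ["wo3", "qu4", "yin2", "xing2"]),  # TTS misread 行
    ("你好", ["ni3", "hao3"]),
]
audio2 = {0: ["wo3", "qu4", "yin2", "hang2"]}

final_audio, replaced = build_final_audio(audio1, audio2)
print(replaced)             # [0]
print(final_audio[0][1])    # ['wo3', 'qu4', 'yin2', 'hang2']
```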
Figure 10 shows the structural block diagram of sound broadcasting device according to an embodiment of the present invention.The sound broadcasting device includes:
First obtains module 10, for according to casting acquisition of information content of text and initial casting voice.
Second obtains module 20, for obtaining the right pronunciation of non-standard pronunciation word according to content of text.
Comparison module 30, for more non-standard pronunciation word initially casting voice in pronunciation and right pronunciation whether one It causes.
Replacement module 40, if inconsistent in the initial pronunciation broadcasted in voice and right pronunciation for non-standard pronunciation word, Pronunciation of the non-standard pronunciation word in initially casting voice is then replaced with into right pronunciation, and generates final casting voice.
In one embodiment, as shown in figure 11, sound broadcasting device further include:
Determining module 50, for whether determining in the content of text comprising non-standard pronunciation word.
In one embodiment, as shown in figure 12, the second acquisition module 20 includes:
Sentence acquisition submodule 21, for obtaining the sentence in content of text where non-standard pronunciation word.
First pronunciation submodule 22 determines non-standard pronunciation word in content of text just for the semanteme according to sentence True pronunciation.
In one embodiment, as shown in figure 13, the second acquisition module 20 includes:
Phrase acquisition submodule 23, for obtaining phrase composed by non-standard pronunciation word in content of text.
Second pronunciation submodule 24 determines non-standard pronunciation word in content of text just for the meaning according to phrase True pronunciation.
In one embodiment, as shown in figure 14, the second acquisition module 20 includes:
Words and phrases acquisition submodule 25, for obtaining phrase composed by non-standard pronunciation word or the language at place in content of text Sentence.
Submodule 26 is retrieved, for retrieving the voice whether with phrase or sentence in the database.
Third pronunciation submodule 27, if for the voice in database with phrase or sentence, based on word in database The voice of group or sentence determines the right pronunciation of non-standard pronunciation word.
In one embodiment, as shown in figure 15, sound broadcasting device further include:
Position module 60, for determining position of the non-standard pronunciation word in content of text.
Searching module 70 searches non-standard pronunciation word and exists for the position according to non-standard pronunciation word in content of text Position in initial casting voice.
Third obtains module 80, for the position based on non-standard pronunciation word in initially casting voice, obtains non-standard Pronunciation of the pronunciation word in initially casting voice.
In one embodiment, as shown in Figure 16, the first acquisition module 10 includes:
Lookup submodule 11, configured to search the server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information and, if one exists, to use the existing broadcast voice as the initial broadcast voice.
Text content acquisition submodule 12, configured to, if no existing broadcast voice corresponding to the broadcast information exists, obtain the text content corresponding to the broadcast information from the server and convert the text content into a broadcast voice.
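The lookup-then-synthesize behaviour of the two submodules can be sketched as below. The dictionaries standing in for the server and the `synthesize` callable standing in for a text-to-speech engine are placeholders assumed for illustration.

```python
from typing import Callable, Dict

def get_initial_voice(broadcast_info: str,
                      voice_store: Dict[str, bytes],
                      text_store: Dict[str, str],
                      synthesize: Callable[[str], bytes]) -> bytes:
    """Return an existing broadcast voice if the server has one;
    otherwise fetch the text content and convert it to voice."""
    voice = voice_store.get(broadcast_info)
    if voice is not None:
        return voice                      # reuse the existing broadcast voice
    text = text_store[broadcast_info]     # text content for this request
    return synthesize(text)               # convert the text to a broadcast voice

def fake_tts(text: str) -> bytes:
    """Stand-in synthesizer used only for the demo: tags the encoded text."""
    return b"TTS:" + text.encode("utf-8")
```

In either branch the result serves as the initial broadcast voice that the later comparison step checks against the correct pronunciation.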
For the functions of the modules of the devices in the embodiments of the present invention, refer to the corresponding descriptions of the method above; they are not repeated here.
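To show how the comparison and replacement steps of the method fit together, here is a minimal sketch that works on per-character readings rather than on audio. Representing the voice as a list of tone-numbered pinyin strings, and the function name `correct_broadcast`, are simplifications assumed for this example; an actual implementation would splice or re-synthesize audio.

```python
from typing import Dict, List, Tuple

def correct_broadcast(initial_readings: List[str],
                      correct_readings: Dict[int, str]) -> Tuple[List[str], List[int]]:
    """Compare each non-standard pronunciation word's reading in the initial
    voice with its correct reading and replace it where they differ.

    initial_readings: reading of every character in the initial voice.
    correct_readings: text index -> correct reading, for the
                      non-standard pronunciation words only.
    Returns the final readings and the indices that were replaced.
    """
    final = list(initial_readings)   # leave the input unchanged
    replaced = []
    for index, correct in correct_readings.items():
        if final[index] != correct:  # inconsistent: replace the reading
            final[index] = correct
            replaced.append(index)
    return final, replaced
```

When every reading is already consistent, the list of replaced indices is empty and the final readings equal the initial ones.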
Figure 17 shows a structural block diagram of a voice broadcast terminal according to an embodiment of the present invention. As shown in Figure 17, the terminal includes a memory 910 and a processor 920, and the memory 910 stores a computer program that can run on the processor 920. When executing the computer program, the processor 920 implements the voice broadcast method of the above embodiments. There may be one or more memories 910 and one or more processors 920.
The terminal further includes:
Communication interface 930, configured to communicate with external devices for data exchange.
Memory 910 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk storage device.
If the memory 910, the processor 920 and the communication interface 930 are implemented independently, they may be connected to one another and communicate with one another via a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in Figure 17, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920 and the communication interface 930 are integrated on a single chip, they may communicate with one another through internal interfaces.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any of the methods in the above embodiments.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless expressly and specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that comprises one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.
It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the methods of the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and is sold or used as a standalone product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (17)

1. A voice broadcast method, characterized by comprising:
obtaining text content and an initial broadcast voice according to broadcast information;
obtaining the correct pronunciation of a non-standard pronunciation word according to the text content;
comparing whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with the correct pronunciation;
if they are inconsistent, replacing the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation, and generating a final broadcast voice.
2. The method according to claim 1, characterized in that, before obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further comprises:
determining whether the text content contains a non-standard pronunciation word.
3. The method according to claim 1, characterized in that obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the sentence in the text content in which the non-standard pronunciation word is located;
determining the correct pronunciation of the non-standard pronunciation word in the text content according to the semantics of the sentence.
4. The method according to claim 1, characterized in that obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the phrase in the text content that is formed with the non-standard pronunciation word;
determining the correct pronunciation of the non-standard pronunciation word in the text content according to the meaning of the phrase.
5. The method according to claim 1, characterized in that obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
obtaining the phrase in the text content that is formed with the non-standard pronunciation word, or the sentence in which the word is located;
searching a database for a voice of the phrase or the sentence;
if such a voice exists, determining the correct pronunciation of the non-standard pronunciation word based on the voice of the phrase or the sentence in the database.
6. The method according to claim 1, characterized in that, after obtaining the correct pronunciation of the non-standard pronunciation word according to the text content, the method further comprises:
determining the position of the non-standard pronunciation word in the text content;
finding the position of the non-standard pronunciation word in the initial broadcast voice according to its position in the text content;
obtaining the pronunciation of the non-standard pronunciation word in the initial broadcast voice based on its position in the initial broadcast voice.
7. The method according to claim 1, characterized in that obtaining the initial broadcast voice according to the broadcast information comprises:
searching the server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information;
if one exists, using the existing broadcast voice as the initial broadcast voice;
if none exists, obtaining the text content corresponding to the broadcast information from the server, and converting the text content into a broadcast voice.
8. The method according to claim 1, characterized in that, when the broadcast information is a poem title, obtaining the correct pronunciation of the non-standard pronunciation word according to the text content comprises:
searching a database for corresponding poem information according to the poem title and the text content;
obtaining, according to the poem information found, the correct pronunciation of each non-standard pronunciation word in the poem and its position in the text content.
9. A voice broadcast device, characterized by comprising:
a first acquisition module, configured to obtain text content and an initial broadcast voice according to broadcast information;
a second acquisition module, configured to obtain the correct pronunciation of a non-standard pronunciation word according to the text content;
a comparison module, configured to compare whether the pronunciation of the non-standard pronunciation word in the initial broadcast voice is consistent with the correct pronunciation;
a replacement module, configured to, if the pronunciation of the non-standard pronunciation word in the initial broadcast voice is inconsistent with the correct pronunciation, replace the pronunciation of the non-standard pronunciation word in the initial broadcast voice with the correct pronunciation, and generate a final broadcast voice.
10. The device according to claim 9, characterized by further comprising:
a determining module, configured to determine whether the text content contains a non-standard pronunciation word.
11. The device according to claim 9, characterized in that the second acquisition module comprises:
a sentence acquisition submodule, configured to obtain the sentence in the text content in which the non-standard pronunciation word is located;
a first pronunciation submodule, configured to determine the correct pronunciation of the non-standard pronunciation word in the text content according to the semantics of the sentence.
12. The device according to claim 9, characterized in that the second acquisition module comprises:
a phrase acquisition submodule, configured to obtain the phrase in the text content that is formed with the non-standard pronunciation word;
a second pronunciation submodule, configured to determine the correct pronunciation of the non-standard pronunciation word in the text content according to the meaning of the phrase.
13. The device according to claim 9, characterized in that the second acquisition module comprises:
a phrase and sentence acquisition submodule, configured to obtain the phrase in the text content that is formed with the non-standard pronunciation word, or the sentence in which the word is located;
a retrieval submodule, configured to search a database for a voice of the phrase or the sentence;
a third pronunciation submodule, configured to, if the database contains a voice of the phrase or the sentence, determine the correct pronunciation of the non-standard pronunciation word based on the voice of the phrase or the sentence in the database.
14. The device according to claim 9, characterized by further comprising:
a position module, configured to determine the position of the non-standard pronunciation word in the text content;
a searching module, configured to find the position of the non-standard pronunciation word in the initial broadcast voice according to its position in the text content;
a third acquisition module, configured to obtain the pronunciation of the non-standard pronunciation word in the initial broadcast voice based on its position in the initial broadcast voice.
15. The device according to claim 9, characterized in that the first acquisition module comprises:
a lookup submodule, configured to search the server, according to the broadcast information, for an existing broadcast voice corresponding to the broadcast information and, if one exists, to use the existing broadcast voice as the initial broadcast voice;
a text content acquisition submodule, configured to, if no existing broadcast voice corresponding to the broadcast information exists, obtain the text content corresponding to the broadcast information from the server, and convert the text content into a broadcast voice.
16. A voice broadcast terminal, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 8.
17. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 8.
CN201910318020.6A 2019-04-19 2019-04-19 Voice broadcasting method and device Active CN110032626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910318020.6A CN110032626B (en) 2019-04-19 2019-04-19 Voice broadcasting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910318020.6A CN110032626B (en) 2019-04-19 2019-04-19 Voice broadcasting method and device

Publications (2)

Publication Number Publication Date
CN110032626A true CN110032626A (en) 2019-07-19
CN110032626B CN110032626B (en) 2022-04-12

Family

ID=67239178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910318020.6A Active CN110032626B (en) 2019-04-19 2019-04-19 Voice broadcasting method and device

Country Status (1)

Country Link
CN (1) CN110032626B (en)

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1282072A (en) * 1999-07-27 2001-01-31 国际商业机器公司 Error correcting method for voice identification result and voice identification system
WO2007097176A1 (en) * 2006-02-23 2007-08-30 Nec Corporation Speech recognition dictionary making supporting system, speech recognition dictionary making supporting method, and speech recognition dictionary making supporting program
CN101630316A (en) * 2008-07-18 2010-01-20 株式会社日立制作所 Word message prompting system
CN101872614A (en) * 2009-04-24 2010-10-27 韩松 Hybrid voice synthesizing system
CN103093753A (en) * 2012-12-14 2013-05-08 沈阳美行科技有限公司 Navigation system user voice custom method
CN103279508A (en) * 2012-12-31 2013-09-04 威盛电子股份有限公司 Method for voice response correction and natural language conversational system
CN103366731A (en) * 2012-03-31 2013-10-23 盛乐信息技术(上海)有限公司 Text to speech (TTS) method and system
US20140122081A1 (en) * 2012-10-26 2014-05-01 Ivona Software Sp. Z.O.O. Automated text to speech voice development
CN104200803A (en) * 2014-09-16 2014-12-10 北京开元智信通软件有限公司 Voice broadcasting method, device and system
CN104197946A (en) * 2014-09-04 2014-12-10 百度在线网络技术(北京)有限公司 Voice navigation method, device and system
CN105095180A (en) * 2014-05-14 2015-11-25 中兴通讯股份有限公司 Chinese name broadcasting method and device
CN105206260A (en) * 2015-08-31 2015-12-30 努比亚技术有限公司 Terminal voice broadcasting method, device and terminal voice operation method
CN106710585A (en) * 2016-12-22 2017-05-24 上海语知义信息技术有限公司 Method and system for broadcasting polyphonic characters in voice interaction process
CN107305483A (en) * 2016-04-25 2017-10-31 北京搜狗科技发展有限公司 A kind of voice interactive method and device based on semantics recognition
CN107437413A (en) * 2017-07-05 2017-12-05 百度在线网络技术(北京)有限公司 voice broadcast method and device
CN107515850A (en) * 2016-06-15 2017-12-26 阿里巴巴集团控股有限公司 Determine the methods, devices and systems of polyphone pronunciation
CN108280118A (en) * 2017-11-29 2018-07-13 广州市动景计算机科技有限公司 Text, which is broadcast, reads method, apparatus and client, server and storage medium
CN108305611A (en) * 2017-06-27 2018-07-20 腾讯科技(深圳)有限公司 Method, apparatus, storage medium and the computer equipment of text-to-speech
CN108877764A (en) * 2018-06-28 2018-11-23 掌阅科技股份有限公司 Audio synthetic method, electronic equipment and the computer storage medium of talking e-book
CN108984529A (en) * 2018-07-16 2018-12-11 北京华宇信息技术有限公司 Real-time court's trial speech recognition automatic error correction method, storage medium and computing device
CN112259092A (en) * 2020-10-15 2021-01-22 深圳市同行者科技有限公司 Voice broadcasting method and device and voice interaction equipment
CN114120961A (en) * 2021-11-19 2022-03-01 深圳市华宝电子科技有限公司 Voice broadcasting method, device, equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112259092A (en) * 2020-10-15 2021-01-22 深圳市同行者科技有限公司 Voice broadcasting method and device and voice interaction equipment
CN112259092B (en) * 2020-10-15 2023-09-01 深圳市同行者科技有限公司 Voice broadcasting method and device and voice interaction equipment
CN114566060A (en) * 2022-02-23 2022-05-31 成都智元汇信息技术股份有限公司 Public transport message notification processing method, device, system, electronic device and medium
CN114566060B (en) * 2022-02-23 2023-03-24 成都智元汇信息技术股份有限公司 Public transport message notification processing method, device, system, electronic device and medium

Also Published As

Publication number Publication date
CN110032626B (en) 2022-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210507

Address after: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Applicant after: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

Applicant after: Shanghai Xiaodu Technology Co.,Ltd.

Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

GR01 Patent grant