CN102436808A - Digital bidirectional intelligent voice explanation system and method thereof - Google Patents

Digital bidirectional intelligent voice explanation system and method thereof

Info

Publication number
CN102436808A
Authority
CN
China
Prior art keywords
unit
pronunciation
text
server
end device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011103531723A
Other languages
Chinese (zh)
Other versions
CN102436808B (en)
Inventor
陆德宝
吕杰
吴海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ANTONG TECHNOLOGY BUSINESS DEVELOPMENT Co Ltd
Original Assignee
ANTONG TECHNOLOGY BUSINESS DEVELOPMENT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANTONG TECHNOLOGY BUSINESS DEVELOPMENT Co Ltd
Priority to CN2011103531723A
Publication of CN102436808A
Application granted
Publication of CN102436808B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention provides a TTS-based digital bidirectional intelligent voice explanation system comprising a PC server and terminal devices. The PC server is used only to manage and configure the relevant files, and every terminal device can work independently, so the system has good stability. Unlike existing explanation systems, which can only speak from a pre-stored pronunciation text, every terminal device in this system can either speak from a pre-stored pronunciation text or directly amplify the narrator's live voice, so that the explanation is not affected by the machine; this meets the widest range of needs and provides an emergency measure when the equipment fails. Pronunciation files are stored and managed as text, which takes up few resources and gives strong flexibility and maintainability; the spoken content can be changed by editing the text. Speech recognition keywords are likewise stored and managed as text and can be reconfigured according to the needs of different venues; even when the exhibits in a venue change, they can be reconfigured quickly through the server, so flexibility is high.

Description

Digital bidirectional intelligent voice explanation system and method thereof
Technical field
The present invention relates to a voice explanation system, mainly used in various venues for intelligent voice explanation of products and exhibits and for intelligent human-machine voice interaction.
Background art
With the continuous development of electronic and information technology, and in particular of speech recognition and speech synthesis algorithms, such technology can take over the repetitive work of human narrators while still offering a user-friendly experience. Similar explanation systems already exist on the market, but most are based on storing and recalling recorded voice: the speech output is played back from voice data files stored in memory in advance, so the output is rigid and the systems are poor in user-friendliness and maintainability. Some systems are based on TTS (text-to-speech), but their algorithms must run on a PC; in practical applications the system cost is high, particularly for systems of some scale, and stability cannot be guaranteed.
Summary of the invention
The technical problem to be solved by the present invention is to provide a digital bidirectional intelligent voice explanation system in which each terminal device can work independently.
The technical solution adopted by the present invention to solve the above problem is a digital bidirectional intelligent voice explanation system comprising a PC server and terminal devices, characterized in that the PC server comprises: a network interface unit, for communication between the PC server and each terminal device; a pronunciation text library, for storing in advance the texts to be played and their corresponding terminal numbers; a speech recognition keyword library, for storing in advance the speech recognition keywords of each terminal device and their corresponding terminal numbers; a terminal monitoring and management unit, for monitoring and managing the online status and running state of each terminal device; and a terminal device database, for storing each terminal number, its status information, the current pronunciation text number and the current recognition keyword number;
the terminal device comprises: a network communication unit, for communication with the PC server; a storage unit, for storing the pronunciation text and speech recognition keywords transferred by the network communication unit; a TTS pronunciation unit, for performing speech synthesis on the pronunciation text and outputting a digital audio signal; a sound pickup unit, for waiting for voice commands issued by the user and performing local voice capture; a voice recognition unit, for modelling and recognizing the voice information captured by the sound pickup unit, comparing the recognition result with the recognition entries, and then triggering the TTS pronunciation unit to perform speech synthesis; a trigger receiving unit, for locally triggering control of the audio switching unit so as to select freely between two modes, machine explanation and manual explanation; a D/A conversion unit, for converting the digital audio signal output by the TTS pronunciation unit into an analog voice signal; an audio switching unit, for switching between the machine-synthesized analog voice signal and the analog voice signal of the manual (narrator's) microphone; and a power amplifier unit, for power-amplifying the analog voice signal output by the audio switching unit and sending it to the loudspeaker.
The running states of each terminal device include the ERST state, the explanation state, the recognition state and the standby state.
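The server-side records described above can be pictured as a small data model. The following is a minimal Python sketch under stated assumptions, not the patented implementation: every class and field name is hypothetical, and the state name "ERST" is kept exactly as it appears in the translated text because its meaning is not recoverable from it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class RunState(Enum):
    # The four running states named in the text. "ERST" is copied as-is.
    ERST = "ERST"
    EXPLAINING = "explanation"
    RECOGNIZING = "recognition"
    STANDBY = "standby"


@dataclass
class PronunciationText:
    """One entry of the pronunciation text library (hypothetical layout)."""
    text_id: int
    terminal_number: int
    text: str


@dataclass
class RecognitionKeyword:
    """One entry of the speech recognition keyword library (hypothetical layout)."""
    keyword_id: int
    terminal_number: int
    keyword: str


@dataclass
class TerminalRecord:
    """One row of the terminal device database."""
    terminal_number: int
    online: bool = False
    state: RunState = RunState.STANDBY
    current_text_id: Optional[int] = None
    current_keyword_id: Optional[int] = None


class TerminalDatabase:
    """In-memory stand-in for the terminal device database."""

    def __init__(self) -> None:
        self._records: Dict[int, TerminalRecord] = {}

    def upsert(self, record: TerminalRecord) -> None:
        # Insert a new record or overwrite the one with the same terminal number.
        self._records[record.terminal_number] = record

    def get(self, terminal_number: int) -> Optional[TerminalRecord]:
        return self._records.get(terminal_number)


db = TerminalDatabase()
db.upsert(TerminalRecord(terminal_number=1, online=True,
                         state=RunState.EXPLAINING,
                         current_text_id=10, current_keyword_id=20))
```

Keying every record by terminal number mirrors the way the text binds texts and keywords to terminals.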
A digital bidirectional intelligent voice explanation method, characterized in that it comprises the following steps:
Step 1: from the configured pronunciation text library and speech recognition keyword library, select the pronunciation text to be spoken and the speech recognition keywords, and bind them to the corresponding terminal number;
Step 2: the network communication unit of the terminal device communicates with the network interface unit of the PC server, downloads from the PC server the speech recognition keywords and pronunciation text corresponding to its terminal number, and stores them in the storage unit;
Step 3: the sound pickup unit receives the narrator's voice control command;
Step 4: the voice recognition unit recognizes the received voice control command, finds the corresponding pronunciation text in the storage unit according to the recognition result, and passes the storage address of the pronunciation text to the TTS pronunciation unit;
Step 5: the TTS pronunciation unit converts the corresponding pronunciation text into a digital audio signal and passes it to the D/A conversion unit, which converts it into an analog voice signal;
Step 6: determine whether the trigger receiving unit has received a trigger signal: if so, the audio switching unit sends the voice signal received by the sound pickup unit directly to the power amplifier unit for local sound amplification; if not, the audio switching unit switches so that the analog voice signal converted by the D/A conversion unit is passed to the power amplifier unit for power amplification and sent to the loudspeaker for playback;
Step 7: the terminal monitoring unit of the PC server monitors all terminal devices continuously: it periodically sends a query command to each terminal device, waits for the reply from the corresponding terminal device, and saves the online status and running state of that terminal in the terminal device database.
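Steps 3 to 5 can be sketched from the terminal's point of view as follows. This is only an illustration under assumptions: the recognizer is assumed to have already produced a text string, keyword matching is a plain substring test, `fake_tts` is a stand-in for the TTS pronunciation unit, and the sketch hands the text itself to the synthesizer rather than a storage address as the source describes. All names are hypothetical.

```python
from typing import Callable, Dict, Optional


class TerminalStore:
    """Local storage unit: keyword -> pronunciation text, as downloaded in step 2."""

    def __init__(self, bindings: Dict[str, str]) -> None:
        self._bindings = dict(bindings)

    def find_text(self, recognized: str) -> Optional[str]:
        # Return the text bound to the first stored keyword found in the utterance.
        for keyword, text in self._bindings.items():
            if keyword in recognized:
                return text
        return None


def handle_command(recognized: str, store: TerminalStore,
                   synthesize: Callable[[str], bytes]) -> Optional[bytes]:
    """Steps 4 and 5: look up the text for a recognized command and synthesize it."""
    text = store.find_text(recognized)
    if text is None:
        return None  # no keyword matched, so nothing is spoken
    return synthesize(text)


def fake_tts(text: str) -> bytes:
    # Stand-in for the TTS unit; a real unit would return digital audio samples.
    return text.encode("utf-8")


store = TerminalStore({
    "bronze vessel": "Exhibit 1: bronze vessel.",
    "jade": "Exhibit 2: jade disc.",
})
audio = handle_command("please introduce the bronze vessel", store, fake_tts)
```

Passing the synthesizer in as a callable keeps the lookup logic testable without any audio hardware.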
The beneficial effects of the present invention are:
1. The system is based on TTS; the PC server is used only to manage and configure the relevant files, and each terminal device can work independently, which gives better stability.
2. Unlike existing explanation systems, which can only speak from a pre-stored pronunciation text, each terminal device of this system can either speak from the pre-stored pronunciation text or directly amplify the narrator's live voice without interference from the machine, so as to meet the widest range of needs and to serve as an emergency measure when the equipment fails.
3. Pronunciation files are stored and managed as text, which takes up few resources and gives strong flexibility and maintainability; the spoken content can be changed by editing the text.
4. Speech recognition keywords are stored and managed as text and can be reconfigured according to the needs of different venues; even when the exhibits in a venue change, they can be reconfigured quickly through the server, which gives great flexibility.
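To illustrate why text-based storage makes reconfiguration cheap, here is a minimal Python sketch of loading keyword and text bindings from plain text. The file layout (`terminal_number|keyword|pronunciation text`, with `#` comments) is purely an assumption for illustration; the source does not specify any format, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def parse_bindings(config_text: str) -> Dict[int, Dict[str, str]]:
    """Parse lines of 'terminal_number|keyword|pronunciation text'."""
    bindings: Dict[int, Dict[str, str]] = defaultdict(dict)
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terminal, keyword, text = line.split("|", 2)
        bindings[int(terminal)][keyword.strip()] = text.strip()
    return dict(bindings)


def bindings_for(bindings: Dict[int, Dict[str, str]],
                 terminal_number: int) -> Dict[str, str]:
    """What one terminal would download: only the entries bound to its number."""
    return bindings.get(terminal_number, {})


SAMPLE = """\
# terminal|keyword|text
1|bronze|This is the bronze hall.
1|jade|This is the jade hall.
2|silk|This is the silk hall.
"""

parsed = parse_bindings(SAMPLE)
# Changing what terminal 2 says only requires editing one line of text.
updated = parse_bindings(SAMPLE.replace("silk hall", "textile hall"))
```

Under this assumed layout, swapping an exhibit means editing a line on the server and letting the affected terminal download its bindings again.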
Description of drawings
Fig. 1 is a structural block diagram of the terminal device.
Fig. 2 is a structural block diagram of the PC server.
Fig. 3 is a flowchart of the PC server system.
Fig. 4 is a flowchart of the terminal device system.
Embodiment
This embodiment comprises the PC server shown in Fig. 2 and the terminal device shown in Fig. 1. As many terminal devices as needed can be provided, and each terminal device is assigned a terminal number.
The PC server comprises: a network interface unit, for communication between the PC server and each terminal device; a pronunciation text library, for storing in advance the texts to be played and their corresponding terminal numbers; a speech recognition keyword library, for storing in advance the speech recognition keywords of each terminal device and their corresponding terminal numbers; a terminal monitoring and management unit, for monitoring and managing the online status and running state of each terminal device; and a terminal device database, for storing each terminal number, its status information, the current pronunciation text number and the current recognition keyword number.
The terminal device comprises: a network communication unit, for communication with the PC server; a storage unit, for storing the pronunciation text and speech recognition keywords transferred by the network communication unit; a TTS pronunciation unit, for performing speech synthesis on the pronunciation text and outputting a digital audio signal; a sound pickup unit, for waiting for voice commands issued by the user and performing local voice capture; a voice recognition unit, for modelling and recognizing the voice information captured by the sound pickup unit, comparing the recognition result with the recognition entries, and then triggering the TTS pronunciation unit to perform speech synthesis; a trigger receiving unit, for locally triggering control of the audio switching unit so as to select freely between two modes, machine explanation and manual explanation; a D/A conversion unit, for converting the digital audio signal output by the TTS pronunciation unit into an analog voice signal; an audio switching unit, for switching between the machine-synthesized analog voice signal and the analog voice signal of the manual (narrator's) microphone; and a power amplifier unit, for power-amplifying the analog voice signal output by the audio switching unit and sending it to the loudspeaker.
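The role of the trigger receiving unit and the audio switching unit can be sketched as a two-way selector in front of the amplifier. This is a software analogy only, since the source describes hardware units: signals are modelled as lists of floats, the gain value is arbitrary, and all names are hypothetical.

```python
from enum import Enum
from typing import List, Tuple


class Source(Enum):
    MACHINE = "machine"        # analog signal derived from TTS via the D/A unit
    MICROPHONE = "microphone"  # narrator's live voice


def select_source(trigger_received: bool) -> Source:
    # A received trigger selects manual explanation; otherwise the machine speaks.
    return Source.MICROPHONE if trigger_received else Source.MACHINE


def route(trigger_received: bool, tts_signal: List[float],
          mic_signal: List[float], gain: float = 2.0) -> Tuple[Source, List[float]]:
    """Pick one input and apply a simple gain, standing in for the power amplifier."""
    source = select_source(trigger_received)
    signal = mic_signal if source is Source.MICROPHONE else tts_signal
    return source, [gain * sample for sample in signal]


live = route(True, tts_signal=[0.25], mic_signal=[0.5])
synthetic = route(False, tts_signal=[0.25], mic_signal=[0.5])
```

Because the selector sits after the TTS path, the live microphone path keeps working in this model even if synthesis produces nothing, which matches the emergency use the text describes.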
Fig. 3 is the flowchart of the PC server system and Fig. 4 is the flowchart of the terminal device system; the two flows together constitute the digital bidirectional intelligent voice explanation method, which comprises the following steps:
Step 1: from the configured pronunciation text library and speech recognition keyword library, select the pronunciation text to be spoken and the speech recognition keywords, and bind them to the corresponding terminal number;
Step 2: the network communication unit of the terminal device communicates with the network interface unit of the PC server, downloads from the PC server the speech recognition keywords and pronunciation text corresponding to its terminal number, and stores them in the storage unit;
Step 3: the sound pickup unit receives the narrator's voice control command;
Step 4: the voice recognition unit recognizes the received voice control command, finds the corresponding pronunciation text in the storage unit according to the recognition result, and passes the storage address of the pronunciation text to the TTS pronunciation unit;
Step 5: the TTS pronunciation unit converts the corresponding pronunciation text into a digital audio signal and passes it to the D/A conversion unit, which converts it into an analog voice signal;
Step 6: determine whether the trigger receiving unit has received a trigger signal: if so, the audio switching unit sends the voice signal received by the sound pickup unit directly to the power amplifier unit for local sound amplification; if not, the audio switching unit switches so that the analog voice signal converted by the D/A conversion unit is passed to the power amplifier unit for power amplification and sent to the loudspeaker for playback;
Step 7: the terminal monitoring unit of the PC server monitors all terminal devices continuously: it periodically sends a query command to each terminal device, waits for the reply from the corresponding terminal device, and saves the online status and running state of that terminal device in the terminal device database.
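The monitoring loop of step 7 amounts to one polling round repeated on a timer. A minimal Python sketch of a single round follows; it assumes a hypothetical `query` callable that returns the terminal's reported state, or `None` when no reply arrives in time, and it uses a plain dictionary in place of the terminal device database. The actual transport and command format are not given in the source.

```python
from typing import Callable, Dict, Iterable, Optional

StatusTable = Dict[int, Dict[str, object]]


def poll_terminals(terminal_numbers: Iterable[int],
                   query: Callable[[int], Optional[str]],
                   database: StatusTable) -> StatusTable:
    """Run one polling round and record each terminal's status in the database."""
    for number in terminal_numbers:
        reply = query(number)  # send the query command and wait for the reply
        if reply is None:
            # No reply within the waiting period: mark the terminal offline.
            database[number] = {"online": False, "state": None}
        else:
            database[number] = {"online": True, "state": reply}
    return database


# Canned replies standing in for real network traffic; terminal 2 never answers.
replies = {1: "standby", 2: None, 3: "explanation"}
status = poll_terminals([1, 2, 3], replies.get, {})
```

A real server would call `poll_terminals` repeatedly at a fixed interval; that scheduling is left out here to keep the sketch deterministic.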
The above is merely a preferred embodiment of the present invention and is not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing embodiment, a person skilled in the art may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (3)

1. A digital bidirectional intelligent voice explanation system, comprising a PC server and terminal devices, characterized in that the PC server comprises: a network interface unit, for communication between the PC server and each terminal device; a pronunciation text library, for storing in advance the texts to be played and their corresponding terminal numbers; a speech recognition keyword library, for storing in advance the speech recognition keywords of each terminal device and their corresponding terminal numbers; a terminal monitoring and management unit, for monitoring and managing the online status and running state of each terminal device; and a terminal device database, for storing each terminal number, its status information, the current pronunciation text number and the current recognition keyword number;
the terminal device comprises: a network communication unit, for communication with the PC server; a storage unit, for storing the pronunciation text and speech recognition keywords transferred by the network communication unit; a TTS pronunciation unit, for performing speech synthesis on the pronunciation text and outputting a digital audio signal; a sound pickup unit, for waiting for voice commands issued by the user and performing local voice capture; a voice recognition unit, for modelling and recognizing the voice information captured by the sound pickup unit, comparing the recognition result with the recognition entries, and then triggering the TTS pronunciation unit to perform speech synthesis; a trigger receiving unit, for locally triggering control of the audio switching unit so as to select freely between two modes, machine explanation and manual explanation; a D/A conversion unit, for converting the digital audio signal output by the TTS pronunciation unit into an analog voice signal; an audio switching unit, for switching between the machine-synthesized analog voice signal and the analog voice signal of the manual (narrator's) microphone; and a power amplifier unit, for power-amplifying the analog voice signal output by the audio switching unit and sending it to the loudspeaker.
2. The digital bidirectional intelligent voice explanation system according to claim 1, characterized in that the running states of each terminal device include the ERST state, the explanation state, the recognition state and the standby state.
3. A digital bidirectional intelligent voice explanation method, characterized in that it comprises the following steps:
Step 1: from the configured pronunciation text library and speech recognition keyword library, select the pronunciation text to be spoken and the speech recognition keywords, and bind them to the corresponding terminal number;
Step 2: the network communication unit of the terminal device communicates with the network interface unit of the PC server, downloads from the PC server the speech recognition keywords and pronunciation text corresponding to its terminal number, and stores them in the storage unit;
Step 3: the sound pickup unit receives the narrator's voice control command;
Step 4: the voice recognition unit recognizes the received voice control command, finds the corresponding pronunciation text in the storage unit according to the recognition result, and passes the storage address of the pronunciation text to the TTS pronunciation unit;
Step 5: the TTS pronunciation unit converts the corresponding pronunciation text into a digital audio signal and passes it to the D/A conversion unit, which converts it into an analog voice signal;
Step 6: determine whether the trigger receiving unit has received a trigger signal: if so, the audio switching unit sends the voice signal received by the sound pickup unit directly to the power amplifier unit for local sound amplification; if not, the audio switching unit switches so that the analog voice signal converted by the D/A conversion unit is passed to the power amplifier unit for power amplification and sent to the loudspeaker for playback;
Step 7: the terminal monitoring unit of the PC server monitors all terminal devices continuously: it periodically sends a query command to each terminal device, waits for the reply from the corresponding terminal device, and saves the online status and running state of that terminal in the terminal device database.
CN2011103531723A 2011-11-09 2011-11-09 Digital bidirectional intelligent voice explanation system and method thereof Expired - Fee Related CN102436808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103531723A CN102436808B (en) 2011-11-09 2011-11-09 Digital bidirectional intelligent voice explanation system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103531723A CN102436808B (en) 2011-11-09 2011-11-09 Digital bidirectional intelligent voice explanation system and method thereof

Publications (2)

Publication Number Publication Date
CN102436808A (en) 2012-05-02
CN102436808B (en) 2013-03-27

Family

ID=45984831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103531723A Expired - Fee Related CN102436808B (en) 2011-11-09 2011-11-09 Digital bidirectional intelligent voice explanation system and method thereof

Country Status (1)

Country Link
CN (1) CN102436808B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107710322A (en) * 2015-06-24 2018-02-16 雅马哈株式会社 Information providing system, information providing method and computer readable recording medium storing program for performing
CN109218884A (en) * 2018-08-29 2019-01-15 北京云迹科技有限公司 Outer audio equipment for robot and the robot using the audio frequency apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006164201A (en) * 2004-12-06 2006-06-22 Isao Kono System for providing guide information for cellular phone, phs, pda and the like
CN101656545A (en) * 2008-08-18 2010-02-24 顾声飞 Method and system for handheld wireless intelligent guide
CN101751838A (en) * 2008-12-11 2010-06-23 易游达人科技(北京)有限公司 Compound positioning self-service tour guide machine
JP2010152588A (en) * 2008-12-25 2010-07-08 Logic Design:Kk System for providing guide information, method therefor, and computer program
CN201616248U (en) * 2009-10-29 2010-10-27 张丽 RFID auxiliary positioning tour description terminating machine

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107710322A (en) * 2015-06-24 2018-02-16 雅马哈株式会社 Information providing system, information providing method and computer readable recording medium storing program for performing
CN107710322B (en) * 2015-06-24 2021-04-30 雅马哈株式会社 Information providing system, information providing method, and computer-readable recording medium
CN109218884A (en) * 2018-08-29 2019-01-15 北京云迹科技有限公司 Outer audio equipment for robot and the robot using the audio frequency apparatus

Also Published As

Publication number Publication date
CN102436808B (en) 2013-03-27

Similar Documents

Publication Publication Date Title
CN101030368B (en) Method and system for communicating across channels simultaneously with emotion preservation
CN104464716A (en) Voice broadcasting system and method
CN107735804A (en) The shift learning technology of different tag sets
CN105261355A (en) Voice synthesis method and apparatus
US20120198339A1 (en) Audio-Based Application Architecture
Ince Digital Speech Processing: Speech Coding, Synthesis and Recognition
CN104392721A (en) Intelligent emergency command system based on voice recognition and voice recognition method of intelligent emergency command system based on voice recognition
CN109496332A (en) Voice dialogue device, speech dialog method and storage medium
CN105118498A (en) Training method and apparatus of speech synthesis model
CN107516511A (en) The Text To Speech learning system of intention assessment and mood
CN104380373A (en) Systems and methods for name pronunciation
CN107112014A (en) Application foci in voice-based system
US20220076674A1 (en) Cross-device voiceprint recognition
CN102549653A (en) Speech translation system, first terminal device, speech recognition server device, translation server device, and speech synthesis server device
CN102822889B (en) Pre-saved data compression for tts concatenation cost
CN112037754A (en) Method for generating speech synthesis training data and related equipment
CN105304081A (en) Smart household voice broadcasting system and voice broadcasting method
CN110473546A (en) A kind of media file recommendation method and device
CN106537497A (en) Information management system and information management method
CN109102796A (en) A kind of phoneme synthesizing method and device
CN111009245B (en) Instruction execution method, system and storage medium
CN103811000A (en) Voice recognition system and voice recognition method
CN104239442A (en) Method and device for representing search results
CN109144458A (en) For executing the electronic equipment for inputting corresponding operation with voice
CN102436808B (en) Digital bidirectional intelligent voice explanation system and method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Digital bidirectional intelligent voice explanation system and method thereof

Effective date of registration: 20130830

Granted publication date: 20130327

Pledgee: Wuhan rural commercial bank Limited by Share Ltd Optics Valley branch

Pledgor: Antong Technology Business Development Co., Ltd.

Registration number: 2013990000629

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20131031

Granted publication date: 20130327

Pledgee: Wuhan rural commercial bank Limited by Share Ltd Optics Valley branch

Pledgor: Antong Technology Business Development Co., Ltd.

Registration number: 2013990000629

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Digital bidirectional intelligent voice explanation system and method thereof

Effective date of registration: 20131031

Granted publication date: 20130327

Pledgee: Wuhan rural commercial bank Limited by Share Ltd Optics Valley branch

Pledgor: Antong Technology Business Development Co., Ltd.

Registration number: 2013990000805

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20141127

Granted publication date: 20130327

Pledgee: Wuhan rural commercial bank Limited by Share Ltd Optics Valley branch

Pledgor: Antong Technology Business Development Co., Ltd.

Registration number: 2013990000805

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130327

Termination date: 20181109