CN104021146A - Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files - Google Patents

Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files

Info

Publication number
CN104021146A
Authority
CN
China
Prior art keywords
configuration file
speaker
user
microsoft
speech recognition
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410207282.2A
Other languages
Chinese (zh)
Inventor
陆成刚
俞珊珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-05-15
Filing date
2014-05-15
Publication date
2014-09-03
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201410207282.2A
Publication of CN104021146A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker

Abstract

An automatic switching method for Microsoft speech recognition configuration files comprises the following steps: a correspondence table is created between the identity information and the configuration files of all users who use the same computer for speech recognition; a user speaks into the microphone, and the computer identifies the speaker from the timbre of the voice and outputs the user's identity information; the corresponding configuration file is looked up in the correspondence table file according to the user's identity information; and the system automatically switches to that user's configuration file. The invention further provides an automatic switching system for Microsoft speech recognition configuration files.

Description

Automatic switching method for Microsoft speech recognition configuration files and system thereof
Technical field
The present invention relates to the automatic switching of computer speech recognition configuration files, and in particular to an automatic switching method for Microsoft speech recognition configuration files and a system thereof.
Background technology
At present, the mainstream speech recognition engines in the industry include those of Microsoft, iFlytek, and Google. Microsoft's recognition engine works from a training database installed locally on the Windows platform, which means that its set of learning samples is not as large as the databases of the cloud-deployed speech recognition engines of iFlytek and Google. In general, Microsoft's engine requires the user to carry out pronunciation training to form a locally stored configuration file suited to that user. Once the user-trained configuration file has been set as the default, the speech recognition accuracy of Microsoft's engine can reach a satisfactory level.
However, when several users use the same computer for speech recognition, the engine has to be switched between different configuration files, and at present this switching relies entirely on manual operation. The switching procedure is cumbersome. For example, in Windows 8 the user must first right-click the speaker icon -> select "Recording devices" -> in the window that pops up, right-click the microphone icon -> choose the "Configure speech recognition" menu -> in the control panel that pops up, choose "Advanced speech options" at the upper left -> in the speech properties window that pops up, choose the user's configuration file -> confirm with OK to exit. Switching the configuration file this way takes 7 steps in total. If the microphone is opened and the configuration file is switched through the Control Panel in Windows 8, 10 steps are needed. These operations are a heavy burden for users who are unfamiliar with Windows, such as ordinary clerical staff who write documents by dictation. The present invention therefore proposes a one-step method for switching the configuration file automatically.
Summary of the invention
The present invention aims to overcome the prior art's reliance on manual operation, and provides an automatic switching method for Microsoft speech recognition configuration files and a system thereof.
An automatic switching method for configuration files, characterized in that it comprises:
Step 1: in the system initialization phase, create a correspondence table between the identity information and the configuration files of the users who use the same computer for speech recognition;
Step 2: each time before a person uses speech recognition, the user opens the microphone and speaks into it; the computer identifies the speaker from the voice and outputs this user's identity information;
Step 3: the system then looks up, in the correspondence table file, the configuration file name corresponding to this user according to the user's identity information;
Step 4: the system sets the default configuration file according to the configuration file name obtained in the previous step, thereby switching to this user's configuration file, and speech recognition then begins.
Further, in step 2 the computer identifies the speaker by opening the microphone and analyzing the features of the input audio. In step 3, each configuration file in the speaker-identity/configuration-file correspondence table is represented by a string identical to its name in the speech recognition configuration file list, in one-to-one correspondence.
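Steps 1 and 3 amount to building and querying a small lookup table. A minimal sketch in Python follows; the JSON storage format, the function names, and the example identities and configuration file names are all assumptions for illustration, since the description does not specify how the correspondence table file is stored.

```python
import json
import os
import tempfile


def create_table(path, pairs):
    """Step 1: persist the identity -> configuration file name table.

    JSON is an assumed storage format, not one named in the description.
    """
    table = dict(pairs)
    # The description calls the mapping one-to-one, so reject shared names.
    if len(set(table.values())) != len(table):
        raise ValueError("each configuration file must belong to exactly one user")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f, ensure_ascii=False)


def lookup_profile(path, identity):
    """Step 3: return the configuration file name for an identity, or None."""
    with open(path, encoding="utf-8") as f:
        return json.load(f).get(identity)


# Hypothetical users and configuration file names.
table_path = os.path.join(tempfile.mkdtemp(), "profiles.json")
create_table(table_path, [("alice", "Alice Profile"), ("bob", "Bob Profile")])
bob_profile = lookup_profile(table_path, "bob")
```

Storing the name exactly as it appears in the engine's configuration file list is what lets step 4 find the right entry by plain string comparison.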
An automatic switching system for Microsoft speech recognition configuration files comprises a microphone location module, a speaker recognition module, a speaker-identity/configuration-file correspondence table, the configuration file list of the Microsoft speech recognition engine, the Helper functions of the Microsoft SAPI library, and an automatic switching module;
The microphone location module opens the microphone, collects the acoustic signal from the user's environment, and outputs it to the speaker recognition module;
The speaker recognition module analyzes the timbre of the speaker's voice from the collected speech signal and outputs the speaker's identity information to the automatic switching module;
The automatic switching module automatically changes the default configuration file to this user's configuration file, without cumbersome manual operation;
The speaker-identity/configuration-file correspondence table is queried by the automatic switching module to obtain the configuration file name corresponding to the speaker;
The configuration file list of the Microsoft speech recognition engine holds the file names of the voice training feature data that the engine stores locally for each user; the switching module traverses this list when setting the default configuration file;
The Helper functions of the Microsoft SAPI library provide the switching module with the API for modifying the default configuration file.
The advantage of the present invention is that switching between different configuration files is performed automatically on top of the Microsoft speech recognition engine, without manual operation.
Brief description of the drawings
Fig. 1 is a schematic diagram of the implementation logic of the configuration file automatic switching method according to an embodiment of the present invention; in the figure, the configuration file k shown in bold in the speech recognition configuration file list is the current user's default configuration file.
Fig. 2 is a sequence diagram of the functional operation logic of the system according to an embodiment of the present invention.
Fig. 3 is a component diagram of the system according to an embodiment of the present invention; the connector between components in the figure denotes "depends on".
Embodiment
With reference to the accompanying drawings:
An automatic switching method for configuration files, characterized in that it comprises:
Step 1: in the system initialization phase, create a correspondence table between the identity information and the configuration files of the users who use the same computer for speech recognition;
Step 2: each time before a person uses speech recognition, the user opens the microphone and speaks into it; the computer identifies the speaker from the voice and outputs this user's identity information;
Step 3: the system then looks up, in the correspondence table file, the configuration file name corresponding to this user according to the user's identity information;
Step 4: the system sets the default configuration file according to the configuration file name obtained in the previous step, thereby switching to this user's configuration file, and speech recognition then begins.
In step 2, the computer identifies the speaker by opening the microphone and analyzing the features of the input audio. In step 3, each configuration file in the speaker-identity/configuration-file correspondence table is represented by a string identical to its name in the speech recognition configuration file list, in one-to-one correspondence.
Referring to Fig. 1, which is a schematic diagram of the implementation logic of the configuration file automatic switching method, the details are as follows:
Create a correspondence table file between the identity information and the configuration files of all users who use the same computer. When the microphone receives speech input, the computer identifies the speaker from the voice and outputs the speaker's identity information. The speaker's identity information is then used to look up the corresponding configuration file in the correspondence table file, and the configuration file is switched automatically.
The multiple elements in the speech recognition configuration file list in the figure indicate that several configuration files have been trained in the speech recognition system, of which only one is the default. Where automatic switching points to the speech recognition configuration file list, this indicates a check of whether the default configuration file is already the current user's; if it is not, the default configuration file is automatically changed to the current user's. In the figure, the configuration file of user k is set as the default configuration file.
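The check described for Fig. 1 (change the default only when it is not already the current user's) can be sketched as follows. This is an in-memory model only: the `ProfileList` class and `ensure_default` function are hypothetical names, and nothing here calls the Microsoft engine.

```python
from dataclasses import dataclass


@dataclass
class ProfileList:
    """Hypothetical stand-in for the engine's configuration file list."""
    names: list
    default: str


def ensure_default(profiles, wanted):
    """Make `wanted` the default; return True only if a change was made."""
    if wanted not in profiles.names:
        raise KeyError(f"no trained configuration file named {wanted!r}")
    if profiles.default == wanted:
        return False  # already the current user's configuration, nothing to do
    profiles.default = wanted
    return True


profiles = ProfileList(names=["user 1", "user k", "user n"], default="user 1")
changed_first = ensure_default(profiles, "user k")   # default switches to user k
changed_again = ensure_default(profiles, "user k")   # no change the second time
```

Returning whether a change happened makes the operation safe to repeat each time a user starts speaking.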
Referring to Fig. 2, which is a sequence diagram of the functional operation logic of the system, the specific flow is as follows:
Create a correspondence table file between the identity information and the configuration files of all users who use the same computer for speech recognition.
1) the user speaks into the microphone;
2) the timbre of the speaker's voice is recognized, and the speaker's identity information is output;
3) the speaker's configuration file name is matched according to the identity information output by recognition;
4) once the configuration file name has been matched, the configuration file is switched automatically;
5) the user continues speaking into the microphone;
6) speech recognition proceeds.
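The six-step sequence above can be sketched as a single function whose collaborators are passed in as callables. Every name here (`run_session`, `identify`, `set_default`, `recognize`) is a hypothetical placeholder for the corresponding module, and the lambdas below are stubs rather than real recognizers.

```python
def run_session(utterances, identify, table, set_default, recognize):
    """Run the Fig. 2 sequence over an iterable of audio chunks.

    The first utterance is used only to identify the speaker and switch the
    configuration file (steps 1-4); later ones are recognized (steps 5-6).
    """
    chunks = iter(utterances)
    first = next(chunks)            # 1) the user speaks into the microphone
    identity = identify(first)      # 2) speaker recognition outputs an identity
    profile = table[identity]       # 3) match the configuration file name
    set_default(profile)            # 4) switch the configuration file
    return [recognize(chunk) for chunk in chunks]  # 5) and 6)


switched = []
texts = run_session(
    utterances=["hello-audio", "dictation-1", "dictation-2"],
    identify=lambda audio: "bob",                        # stub speaker recognizer
    table={"alice": "Alice Profile", "bob": "Bob Profile"},
    set_default=switched.append,                         # stub for the switch
    recognize=lambda audio: f"text({audio})",            # stub speech recognizer
)
```

Passing the modules in as parameters mirrors the component split in the description and keeps the sequence testable without a microphone.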
Correspondingly, the automatic switching system for Microsoft speech recognition configuration files according to the present invention comprises a microphone location module, a speaker recognition module, a speaker-identity/configuration-file correspondence table, the configuration file list of the Microsoft speech recognition engine, the Helper functions of the Microsoft SAPI library, and an automatic switching module;
The microphone location module opens the microphone, collects the acoustic signal from the user's environment, and outputs it to the speaker recognition module;
The speaker recognition module analyzes the timbre of the speaker's voice from the collected speech signal and outputs the speaker's identity information to the automatic switching module;
The automatic switching module automatically changes the default configuration file to this user's configuration file, without cumbersome manual operation;
The speaker-identity/configuration-file correspondence table is queried by the automatic switching module to obtain the configuration file name corresponding to the speaker;
The configuration file list of the Microsoft speech recognition engine holds the file names of the voice training feature data that the engine stores locally for each user; the switching module traverses this list when setting the default configuration file;
The Helper functions of the Microsoft SAPI library provide the switching module with the API for modifying the default configuration file.
The speaker recognition module builds a library of each person's timbre features from spectral features of the voice such as the fundamental frequency (pitch) and the formants, uses it to recognize the speaker, and thereby locates the speaker's configuration file accurately.
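As an illustration of matching against a timbre feature library, the sketch below picks the enrolled speaker whose stored feature vector is nearest to the measured one. Nearest-neighbor matching by Euclidean distance is an assumption made here for illustration; the description does not say which matching rule the speaker recognition engine uses, and the feature values are invented.

```python
import math


def identify_speaker(features, library):
    """Return the enrolled identity whose template is closest to `features`."""
    return min(library, key=lambda name: math.dist(features, library[name]))


# Hypothetical templates: (mean pitch, first formant, second formant), all in Hz.
library = {
    "alice": (210.0, 730.0, 1850.0),
    "bob": (120.0, 650.0, 1700.0),
}
who = identify_speaker((125.0, 640.0, 1720.0), library)
```

A practical system would extract these features from audio and normalize them so that no single feature dominates the distance; both steps are omitted here.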
Although errors in speaker recognition affect the accuracy of the identity output, setting the wrong configuration file as a result does not cause a drop in speech recognition accuracy. A speaker recognition error means that the timbre features of the two people are similar, so their configuration files are also similar, and using each other's configuration file therefore does not reduce speech recognition accuracy.
The automatic switching module automatically changes the default configuration file to the current user's configuration file, without cumbersome manual operation.
The implementation logic of automatic switching is as follows: according to the "user" information output by the speaker recognition module, the "configuration file" name corresponding to the "user" is looked up in the correspondence table file, and the interface functions of the Helper part of the SAPI library provided by Microsoft are used to change the default configuration file, thereby switching between different configuration files. The SAPI interface for setting the default configuration file is called as follows:
// enumerate all configuration files in the profile list
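The source reproduces only the first comment of this listing. As a stand-in, the enumerate-then-set pattern it begins to describe might be sketched as below. This is Python rather than a SAPI program, and `enumerate_profiles`, `set_default`, and the token dictionaries are hypothetical placeholders for whatever the SAPI Helper functions provide; no real SAPI call is shown.

```python
def switch_profile(enumerate_profiles, set_default, wanted_name):
    """Find the configuration file named `wanted_name` and make it the default."""
    # Enumerate all configuration files in the profile list.
    for token in enumerate_profiles():
        if token["name"] == wanted_name:
            set_default(token["id"])  # hypothetical wrapper around the real API
            return token["id"]
    raise LookupError(f"{wanted_name!r} is not in the profile list")


# Fake profile list standing in for the engine's enumeration.
fake_tokens = [
    {"id": "token-1", "name": "Alice Profile"},
    {"id": "token-2", "name": "Bob Profile"},
]
calls = []
chosen = switch_profile(lambda: iter(fake_tokens), calls.append, "Bob Profile")
```

Raising an error when the name is absent surfaces a stale correspondence table early instead of silently leaving the previous user's configuration file active.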
Referring to Fig. 3, which is a component diagram of the system, the details are as follows:
To switch configuration files automatically, the system depends on the user/configuration-file-name correspondence table, the Helper interface functions of the Microsoft SAPI library, and a third-party speaker recognition engine.
The user/configuration-file-name correspondence table is used to look up the corresponding configuration file according to the user's identity information;
The Helper interface functions of the Microsoft SAPI library are used to change the default configuration file;
The third-party speaker recognition engine depends on the microphone location module, because it identifies the speaker from the timbre of the voice and needs the microphone input before it can perform identification.

Claims (4)

1. An automatic switching method for Microsoft speech recognition configuration files, characterized in that it comprises:
Step 1: in the system initialization phase, create a correspondence table between the identity information and the configuration files of the users who use the same computer for speech recognition;
Step 2: each time before a person uses speech recognition, the user opens the microphone and speaks into it; the computer identifies the speaker from the voice and outputs this user's identity information;
Step 3: the system then looks up, in the correspondence table file, the configuration file name corresponding to this user according to the user's identity information;
Step 4: the system sets the default configuration file according to the configuration file name obtained in the previous step, thereby switching to this user's configuration file, and speech recognition then begins.
2. The method according to claim 1, characterized in that: in step 2, the computer identifies the speaker by opening the microphone and analyzing the features of the input audio.
3. The method according to claim 1, characterized in that: in step 3, each configuration file in the speaker-identity/configuration-file correspondence table is represented by a string identical to its name in the speech recognition configuration file list, in one-to-one correspondence.
4. A system using the method according to claim 1, characterized in that it comprises a microphone location module, a speaker recognition module, a speaker-identity/configuration-file correspondence table, the configuration file list of the Microsoft speech recognition engine, the Helper functions of the Microsoft SAPI library, and an automatic switching module;
The microphone location module opens the microphone, collects the acoustic signal from the user's environment, and outputs it to the speaker recognition module;
The speaker recognition module analyzes the timbre of the speaker's voice from the collected speech signal and outputs the speaker's identity information to the automatic switching module;
The automatic switching module automatically changes the default configuration file to this user's configuration file, without cumbersome manual operation;
The speaker-identity/configuration-file correspondence table is queried by the automatic switching module to obtain the configuration file name corresponding to the speaker;
The configuration file list of the Microsoft speech recognition engine holds the file names of the voice training feature data that the engine stores locally for each user; the switching module traverses this list when setting the default configuration file;
The Helper functions of the Microsoft SAPI library provide the switching module with the API for modifying the default configuration file.
CN201410207282.2A 2014-05-15 2014-05-15 Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files Pending CN104021146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410207282.2A CN104021146A (en) 2014-05-15 2014-05-15 Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410207282.2A CN104021146A (en) 2014-05-15 2014-05-15 Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files

Publications (1)

Publication Number Publication Date
CN104021146A true CN104021146A (en) 2014-09-03

Family

ID=51437901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410207282.2A Pending CN104021146A (en) 2014-05-15 2014-05-15 Automatic switching method for Microsoft voice recognition configuration files and system of Microsoft voice recognition configuration files

Country Status (1)

Country Link
CN (1) CN104021146A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104867494A (en) * 2015-05-07 2015-08-26 广东欧珀移动通信有限公司 Naming and classification method and system of sound recording files

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101819758A (en) * 2009-12-22 2010-09-01 中兴通讯股份有限公司 System of controlling screen display by voice and implementation method
CN103607609A (en) * 2013-11-27 2014-02-26 Tcl集团股份有限公司 Voice switching method and device for TV set channels

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭宁宁: "《多媒体实用技术》", 30 September 2006 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104867494A (en) * 2015-05-07 2015-08-26 广东欧珀移动通信有限公司 Naming and classification method and system of sound recording files
CN104867494B (en) * 2015-05-07 2017-10-24 广东欧珀移动通信有限公司 The name sorting technique and system of a kind of recording file

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140903