CN106653023A

CN106653023A - Method and system for triggering image acquisition by virtue of voice signal

Info

Publication number: CN106653023A
Application number: CN201611265390.0A
Authority: CN
Inventors: 翁健二
Original assignee: Shenzhen Tinno Wireless Technology Co Ltd
Current assignee: Shenzhen Tinno Wireless Technology Co Ltd
Priority date: 2016-12-30
Filing date: 2016-12-30
Publication date: 2017-05-10

Abstract

The invention discloses a method for triggering image acquisition by means of a voice signal. The method comprises the following steps: receiving a voice command signal through a voice transceiver module; generating, by a processing module and in accordance with the voice command signal, an image acquisition command corresponding to the voice command signal; and performing image acquisition with an image acquisition module in accordance with the image acquisition command. With the method and system provided by the invention, image acquisition can be controlled by voice command and the user no longer needs to tap a function key. This avoids the out-of-focus images caused by shaking of a handheld device while the user operates the image acquisition function, and thereby improves the user experience.

Description

Method and system for triggering image capture with a voice signal

Technical field

The invention relates to a voice-operated electronic device, and more particularly to a function that lets a user trigger image capture on the electronic device with a voice signal command, either through a voice transceiver module worn by the user or through a voice transceiver module built into the electronic device. When the user uses the electronic device to capture an image, the capture can be controlled by a voice signal command, without tapping a function key or using another electronic control switch device, such as a Bluetooth-protocol headset, to capture the image. This avoids shaking the smartphone, tablet computer, or other smart mobile device while the user operates the image acquisition function, and so prevents out-of-focus images.

Background art

In the prior art, a user usually takes a photograph by tapping a touch screen or by pressing a physical button on a camera or mobile phone. In both cases the user must operate the device by hand. In certain application scenarios, however, hand operation is inconvenient; forcing the user to operate by hand results in a poor user experience and in mediocre photographs. Such operation is therefore impractical and degrades the user experience, which lowers the practicality of the product and in turn its competitiveness.

Summary of the invention

In view of the shortcomings of the prior art described above, an object of the invention is to provide a user operation scenario that is practical, more intuitive, and faster. It allows the user to operate a camera by means of a voice signal, so that the user no longer has to tap a touch screen or press a physical button on a camera or mobile phone to take a photograph. This reduces the complexity of user operation, makes the device easier to operate, and improves the user experience.

The object of the invention, and the solution to its technical problem, can be achieved by the following technical solutions.

A method for triggering image capture with a voice signal according to the invention is characterized by comprising the following steps: receiving a voice command signal through a voice transceiver module; generating, by a processing module and according to the voice command signal, an image capture command corresponding to the voice command signal; and causing an image acquisition module to capture an image according to the image capture command.

In one embodiment of the invention, the method for triggering image capture with a voice signal further comprises:

Using a Bluetooth headset, a smart watch, smart glasses, or a wearable device with a voice transceiver function as the voice transceiver module.

Using a storage module to store the voice command signal.

Connecting the voice transceiver module and the processing module by wireless transmission, or by an electrical connection.

In one embodiment of the invention, the method for triggering image capture with a voice signal further comprises: matching the voice command signal by means of an application program, and generating the image capture instruction.

The object of the invention, and the solution to its technical problem, can be further achieved by the following technical measures.

A system for triggering image capture with a voice signal according to the invention is characterized in that the system comprises:

A voice transceiver module, which receives a voice command signal; a processing module, which generates, according to the voice command signal, an image capture command corresponding to the voice command signal; and an image acquisition module, which captures an image according to the image capture command.

In one embodiment of the invention, in the system for triggering image capture with a voice signal, the voice transceiver module is a Bluetooth headset, a smart watch, smart glasses, or any wearable device with a voice transceiver function.

In one embodiment of the invention, the system for triggering image capture with a voice signal further comprises a storage module, which stores the voice command signal.

In one embodiment of the invention, in the system for triggering image capture with a voice signal, the voice transceiver module is connected to the processing module by wireless transmission, or by an electrical connection.

In one embodiment of the invention, the system for triggering image capture with a voice signal further comprises an application program, which matches the voice command signal and generates the image capture instruction.

The advantage of the method and system for triggering image capture with a voice signal according to the invention is as follows. The user triggers the image acquisition function of the electronic device by voice command through a voice transceiver module worn by the user. When the user uses the electronic device to capture an image, the capture can be controlled by a voice signal command, without tapping a function key or using another electronic control device, such as a Bluetooth-protocol headset, to capture the image. This avoids shaking the smartphone, tablet computer, or other smart mobile device while the user operates the image acquisition function, prevents out-of-focus images, and improves the user experience.

Description of the drawings

Fig. 1 is a block diagram of the system for triggering image capture with a voice signal according to the invention.

Fig. 2 is a flowchart of the method for triggering image capture with a voice signal according to the invention.

Figs. 3 and 4 are schematic diagrams of an embodiment of the method and system for triggering image capture with a voice signal according to the invention.

Detailed description of the embodiments

To further explain the technical means adopted by the method and system for triggering image capture with a voice signal to achieve the intended purpose of the invention, and the effects obtained, the specific embodiments, structure, features, and effects of the method and system are described in detail below with reference to the accompanying drawings and preferred embodiments.

Refer to Fig. 1, a block diagram of the system for triggering image capture with a voice signal according to the invention. As shown in Fig. 1, a system 1 for triggering image capture with a voice signal according to the invention is characterized in that the system 1 comprises:

A voice transceiver module 10, which receives the input of a voice command signal.

A processing module 20, which matches the voice command signal and generates an image capture instruction.

An image acquisition module 30, which performs the image capture action according to the image capture instruction.
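The division of labor among modules 10, 20, and 30 can be illustrated with a minimal Python sketch. All class names, method names, the instruction string `"CAPTURE"`, and the phrase `"take photo"` are hypothetical and do not come from the disclosure; the matching step is reduced to a plain string comparison, and text stands in for sampled audio.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class VoiceTransceiverModule:
    """Stands in for module 10: receives a voice command signal."""

    def receive(self, utterance: str) -> str:
        # A real module would deliver sampled audio; text keeps the sketch short.
        return utterance.strip().lower()


class ProcessingModule:
    """Stands in for module 20: matches the signal and issues an instruction."""

    def __init__(self, trigger_phrases: List[str]):
        self.trigger_phrases = set(trigger_phrases)

    def match(self, signal: str) -> Optional[str]:
        # Produce an image capture instruction only for a known phrase.
        return "CAPTURE" if signal in self.trigger_phrases else None


@dataclass
class ImageAcquisitionModule:
    """Stands in for module 30: captures an image when instructed."""

    captured: List[str] = field(default_factory=list)

    def execute(self, instruction: str) -> None:
        if instruction == "CAPTURE":
            self.captured.append(f"image_{len(self.captured) + 1}")


def run_system(utterance: str, voice: VoiceTransceiverModule,
               processor: ProcessingModule,
               camera: ImageAcquisitionModule) -> bool:
    """Pass one utterance through the three modules; True if an image was taken."""
    instruction = processor.match(voice.receive(utterance))
    if instruction is None:
        return False
    camera.execute(instruction)
    return True
```

Keeping the three roles in separate objects mirrors the text: the voice transceiver module may sit in a wearable device while the processing and image acquisition modules sit in the phone.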

In this embodiment, the voice transceiver module is a Bluetooth headset, a smart watch, smart glasses, or any wearable mobile device with a voice transceiver function.

In this embodiment, the voice transceiver module samples the command signal uttered by the user: the loudness of the sound is converted into a higher or lower voltage, producing a continuously varying voltage, and this analog signal is then converted into a digital signal.
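The sampling and analog-to-digital conversion described above can be sketched as follows. This is an illustration under assumptions rather than the disclosed circuit: the function name, the 8-bit resolution, the voltage range of plus or minus `v_max`, and the uniform quantizer are all choices made for the example.

```python
import math
from typing import Callable, List


def sample_and_quantize(analog: Callable[[float], float], n_samples: int,
                        sample_rate_hz: int, bits: int = 8,
                        v_max: float = 1.0) -> List[int]:
    """Sample analog(t), a voltage in [-v_max, v_max], and quantize each sample.

    Returns integer codes in the range 0 .. 2**bits - 1.
    """
    levels = 2 ** bits
    codes = []
    for n in range(n_samples):
        v = analog(n / sample_rate_hz)      # voltage at the n-th sampling instant
        v = max(-v_max, min(v_max, v))      # clip to the converter's input range
        # Map [-v_max, v_max] onto [0, levels - 1] and round to the nearest code.
        codes.append(int((v + v_max) / (2 * v_max) * (levels - 1) + 0.5))
    return codes


# Example: eight samples of a 440 Hz tone taken at 8 kHz.
tone_codes = sample_and_quantize(
    lambda t: math.sin(2 * math.pi * 440 * t), n_samples=8, sample_rate_hz=8000)
```

A louder sound gives a larger voltage swing and therefore codes further from the midpoint, which is the sense in which loudness is "converted into a higher or lower voltage".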

In this embodiment, the processing module further includes an application program that matches the voice command signal and generates the image capture instruction corresponding to the voice command signal.

In this embodiment, the processing module uses speech recognition to generate the image capture instruction.

In this embodiment, the main purpose of speech recognition is to let a computer understand human speech and then direct the computer to perform the corresponding task. Once the sound has been input into the computer through an analog-to-digital conversion device and stored in numerical form, the speech recognition program compares previously stored sound samples with the input test sound sample. When the comparison is complete, the program knows what the sound the user has just uttered means, and can direct the computer to act accordingly.
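The comparison of stored sound samples with a test sample can be sketched with dynamic time warping (DTW), a classic template-matching technique. The source does not say which comparison method is used, so DTW is a stand-in chosen for illustration; the function names, the one-dimensional feature sequences, and the rejection threshold are likewise assumptions.

```python
from typing import Dict, Optional, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two feature sequences."""
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]


def recognize(test: Sequence[float], templates: Dict[str, Sequence[float]],
              max_distance: float) -> Optional[str]:
    """Return the label of the closest stored template, or None if none is close."""
    best_label, best_distance = None, float("inf")
    for label, template in templates.items():
        distance = dtw_distance(test, template)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= max_distance else None
```

DTW tolerates a command spoken faster or slower than the stored sample, because one frame of either sequence may align with several frames of the other.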

The speech recognition adopted in this embodiment can be classified in the following ways:

1. By vocabulary size: small vocabulary (hundreds of words), medium vocabulary (thousands of words), and large vocabulary (tens of thousands of words).

2. By target speaker: speaker dependent and speaker independent.

3. By mode of use: discontinuous (isolated-word) speech recognition and continuous speech recognition.

In this embodiment, the application program further includes a large-vocabulary speech recognition system, which adopts statistical pattern recognition technology.

Refer to Fig. 2, a flowchart of the method for triggering image capture with a voice signal according to the invention. The method comprises the following steps:

Step S110: A voice command signal is received through a voice transceiver module.

In this embodiment, the voice transceiver module may be a Bluetooth headset, a smart watch, smart glasses, or any wearable mobile device with a voice transceiver function.

In this embodiment, the voice transceiver module samples the voice command signal uttered by the user: the loudness of the sound is converted into a higher or lower voltage, producing a continuously varying voltage, and this analog signal is then converted into a digital signal.

Step S120: A processing module generates, according to the voice command signal, an image capture command corresponding to the voice command signal.

In this embodiment, the processing module may further include an application program that generates the image capture command corresponding to the voice command signal.

The processing module uses speech recognition to generate the image capture instruction. The main purpose of speech recognition is to let a computer understand human speech and then direct the computer to perform the corresponding task. Once the sound has been input into the computer through an analog-to-digital conversion device and stored in numerical form, the speech recognition program compares previously stored sound samples with the input test sound sample. When the comparison is complete, the program knows what the sound the user has just uttered means, and can direct the computer to act accordingly.

The speech recognition that can be adopted in this embodiment is as follows:

The application program further includes a large-vocabulary speech recognition system, which adopts statistical pattern recognition technology.

A typical speech recognition system based on statistical pattern recognition consists of the following basic modules:

1. Signal processing and feature extraction module

2. Acoustic model

3. Pronunciation dictionary

4. Language model

5. Decoder

The main task of the signal processing and feature extraction module is to extract features from the input signal for the acoustic model to process. It generally also includes signal processing techniques that minimize the influence of factors such as environmental noise and the speaker on the features.
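As a rough illustration of feature extraction, the sketch below splits a digitized signal into frames and computes two simple per-frame features, mean energy and zero-crossing count. The source does not name the features it extracts; these two are assumptions chosen for brevity, and practical recognizers typically use richer features such as mel-frequency cepstral coefficients. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def frame_signal(samples: Sequence[float], frame_len: int,
                 hop: int) -> List[Sequence[float]]:
    """Split the signal into frames of frame_len samples, advancing by hop."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def frame_features(frame: Sequence[float]) -> Tuple[float, int]:
    """Return (mean energy, number of sign changes) for one frame."""
    energy = sum(x * x for x in frame) / len(frame)
    zero_crossings = sum(1 for a, b in zip(frame, frame[1:])
                         if (a >= 0) != (b >= 0))
    return energy, zero_crossings


def extract_features(samples: Sequence[float], frame_len: int,
                     hop: int) -> List[Tuple[float, int]]:
    """Feature sequence handed on to the acoustic model."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]
```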

The acoustic model is built on a first-order hidden Markov model (HMM).
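For a first-order hidden Markov model, the most probable hidden state sequence for a given observation sequence is commonly found with the Viterbi algorithm. The sketch below is a generic textbook implementation, not code from the disclosure; the two states (`sil`, `speech`), the three observation symbols, and every probability value are made-up toy numbers.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi(observations: Sequence[str], states: Sequence[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> Tuple[List[str], float]:
    """Return the most probable state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # First-order assumption: only the previous state matters.
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (p * emit_p[s][obs], arg)
        best.append(column)
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):       # follow predecessors backwards
        path.append(column[path[-1]][1])
    path.reverse()
    return path, prob


# Toy model (all values hypothetical).
STATES = ["sil", "speech"]
START = {"sil": 0.6, "speech": 0.4}
TRANS = {"sil": {"sil": 0.7, "speech": 0.3},
         "speech": {"sil": 0.4, "speech": 0.6}}
EMIT = {"sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
        "speech": {"low": 0.1, "mid": 0.3, "high": 0.6}}

path, prob = viterbi(["low", "mid", "high"], STATES, START, TRANS, EMIT)
# path is ["sil", "sil", "speech"]; prob is approximately 0.01512
```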

The pronunciation dictionary contains the vocabulary that the system can handle, together with the pronunciation of each word.

The pronunciation dictionary in effect provides the mapping between the modeling units of the acoustic model and the modeling units of the language model.

The language model models the language that the system targets. In theory, various language models, including regular languages and context-free grammars, can serve as the language model; the invention uses the statistics-based N-gram model and its variants.
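To make the N-gram idea concrete, here is a minimal bigram (N = 2) sketch that estimates probabilities by maximum likelihood from a tiny corpus. The training sentences, the function names, and the `<s>` and `</s>` boundary markers are illustrative assumptions; the source does not state the order of the N-gram or any smoothing scheme, and none is applied here.

```python
from collections import Counter
from typing import Iterable, Tuple


def train_bigram(sentences: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count bigrams and their left contexts, with sentence boundary markers."""
    context_counts: Counter = Counter()
    bigram_counts: Counter = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts


def sentence_probability(sentence: str, context_counts: Counter,
                         bigram_counts: Counter) -> float:
    """P(sentence) as a product of unsmoothed bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0                      # unseen context: probability zero
        prob *= bigram_counts[(prev, cur)] / context_counts[prev]
    return prob
```

Without smoothing, any word sequence containing an unseen bigram receives probability zero, which is one reason the variants of the N-gram model mentioned in the text are used in practice.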

The decoder is one of the core components of the speech recognition system. Its task is, for an input signal, to find the word string most likely to have produced that signal, according to the acoustic model, the language model, and the dictionary.
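The decoder's task is conventionally written as a maximization. The formulation below is the standard statistical one and is not quoted from the source; the symbols are assumptions introduced here, with X the feature sequence extracted from the input signal and W a candidate word string.

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \frac{P(X \mid W)\, P(W)}{P(X)}
        = \arg\max_{W} P(X \mid W)\, P(W)
```

The term P(X | W) is supplied by the acoustic model together with the pronunciation dictionary, and P(W) by the language model. P(X) does not depend on W, so it can be dropped from the maximization.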

Step S130: An image acquisition module captures an image according to the image capture command.

In this embodiment, the image acquisition module may be the camera module of a mobile phone, the camera module of a tablet computer, or a camera.

Figs. 3 and 4 are schematic diagrams of an embodiment of the system and method of the invention for triggering image capture with a voice signal.

In Fig. 3, the user holds a mobile phone 70 and prepares to take a self-portrait.

In Fig. 4, the user speaks the photo-taking command phrase into a Bluetooth earphone 60, and the mobile phone 70 takes the photograph after receiving the command phrase.

In this embodiment, the mobile phone may be equivalently replaced by the camera module of a mobile phone, the camera module of a tablet computer, or a camera.

In this embodiment, the Bluetooth earphone may be equivalently replaced by a smart watch, smart glasses, or any wearable mobile device with a voice transceiver function.

The above describes only preferred embodiments of the invention and does not limit the invention in any form. Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the technical content disclosed above to make minor changes or modifications that amount to equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments in accordance with the technical essence of the invention, and without departing from the content of the technical solution of the invention, still falls within the scope of the technical solution of the invention.

Claims

1. A method for triggering image capture with a voice signal, applicable to a mobile device, characterized by comprising the following steps:

receiving a voice command signal through a voice transceiver module;

generating, by a processing module and according to the voice command signal, an image capture command corresponding to the voice command signal; and

capturing an image with an image acquisition module according to the image capture command.

2. The method for triggering image capture with a voice signal according to claim 1, characterized by further comprising:

3. The method for triggering image capture with a voice signal according to claim 1, characterized by further comprising:

storing the voice command signal using a storage module.

4. The method for triggering image capture with a voice signal according to claim 1, characterized by further comprising: connecting the voice transceiver module and the processing module by wireless transmission, or by an electrical connection.

5. The method for triggering image capture with a voice signal according to claim 1, characterized by further comprising:

matching the voice command signal by means of an application program, and generating the image capture command.

6. A system for triggering image capture with a voice signal, characterized in that the system comprises:

a voice transceiver module, which receives a voice command signal;

a processing module, which generates, according to the voice command signal, an image capture command corresponding to the voice command signal; and

an image acquisition module, which captures an image according to the image capture command.

7. The system for triggering image capture with a voice signal according to claim 6, characterized in that the voice transceiver module is a Bluetooth headset, a smart watch, smart glasses, or a wearable mobile device with a voice transceiver function.

8. The system for triggering image capture with a voice signal according to claim 6, characterized by further comprising: a storage module, which stores the voice command signal.

9. The system for triggering image capture with a voice signal according to claim 6, characterized in that the voice transceiver module is connected to the processing module by wireless transmission, or by an electrical connection.

10. The system for triggering image capture with a voice signal according to claim 6, characterized by further comprising: an application program, which matches the voice command signal and generates the image capture command.