CN101742110A - Video camera set by speech recognition system - Google Patents

Publication number
CN101742110A
Authority
CN
China
Prior art keywords
video camera
speech
cpu
voice
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810152901A
Other languages
Chinese (zh)
Inventor
李妮
郑龙周
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Samsung Electronics Co Ltd
Original Assignee
Tianjin Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Samsung Electronics Co Ltd
Priority to CN200810152901A
Publication of CN101742110A

Landscapes

  • Details Of Cameras Including Film Mechanisms (AREA)

Abstract

The invention relates to a video camera set by a speech recognition system, comprising a CPU (central processing unit) circuit, a coding and decoding circuit, and a speech module. The CPU circuit is connected to the electronic circuit of the video camera, and the speech module is connected to both the CPU circuit and the coding and decoding circuit. The CPU circuit stores a control program with the following steps: after the video camera is powered on and started, the system presents a first selection, namely whether to enter speech control mode; if that mode is selected, the speech control module is activated and the video camera enters the speech control start state; keyword recognition is then carried out, where the keywords that can be operated by speech are "video recording", "stop", "video playback", and "shutdown". For example, after the user says "video recording", the system judges this input, sends the result of the judgment to the CPU, and the command is carried out. By using the intelligent speech system, the video camera is easier to operate and captures shots more quickly and effectively, and it overcomes the defect of missing the best picture while menus are being adjusted.

Description

Video camera set by a speech recognition system
Technical field
The present invention relates to a video camera, and in particular to a video camera set by a speech recognition system.
Background art
At present, man-machine communication is realized mainly through manual operation, which limits the flexibility of interaction between people and computer and electromechanical systems. To make human-machine dialogue with digital appliances more flexible, and to serve the needs of special groups such as the elderly and the disabled, a better means of exchanging information is needed. Language is the main and most basic way humans communicate, and with the development of digital signal processing software and hardware, speech processing technology has matured and is now close to practical use.
Summary of the invention
In view of the deficiencies of the prior art, the invention provides a video camera set by a speech recognition system, whose operation can be controlled by voice input.
The technical scheme adopted by the invention to achieve the above purpose is: a video camera set by a speech recognition system, comprising a CPU circuit and a coding and decoding circuit connected to the video camera's electronic circuit, characterized in that it also comprises a voice module, the voice module being connected to the CPU circuit and to the coding and decoding circuit respectively, and the CPU circuit storing a control program whose control steps are:
1. After the power is switched on and the camera starts, the system enters a first selection, namely whether to enter voice control mode. This selection is set as an option presented automatically at startup. If NO is selected, the system enters manual mode, and the camera then functions the same as an ordinary video camera;
2. If YES is selected, this information is passed by the CPU to the voice control module, activating the voice control module, and the camera then enters the voice control start state;
3. Keyword recognition is then carried out. The keywords that can be operated by voice are: video recording, stop, video playback, and shutdown. For example, after the user says "video recording", the system judges this input and sends the result of the judgment to the CPU, which then executes it. Each time the user says a keyword the program makes one judgment, until the "shutdown" command is executed.
The beneficial effects of the invention are: voice technology can be used to operate the video camera, making operation more convenient; use of the intelligent voice system can improve the effect and speed of capturing shots, remedying the defect of losing the best picture because of menu adjustment; and as voice control devices are perfected, new products will also emerge in large numbers.
Description of drawings
Fig. 1 is a block diagram of the circuit connections of the present invention.
Fig. 2 is a control flow chart of the present invention.
Embodiment
As shown in Figs. 1 and 2, the video camera set by a speech recognition system comprises a CPU circuit and a coding and decoding circuit connected to the video camera's electronic circuit, and also comprises a voice module. The voice module is connected to the CPU circuit and to the coding and decoding circuit respectively. The CPU circuit stores a control program whose control steps are:
1. After the power is switched on and the camera starts, the system enters a first selection, namely whether to enter voice control mode. This option is also available in the menu, and the selection is set as an option presented automatically at startup. If NO is selected, the system enters manual mode, and the camera then functions the same as an ordinary video camera;
2. If YES is selected, this information is passed by the CPU to the voice control module, activating the voice control module, and the camera then enters the voice control start state;
3. Keyword recognition is then carried out. The keywords that can be operated by voice are: video recording, stop, video playback, and shutdown. For example, after the user says "video recording", the system judges this input and sends the result of the judgment to the CPU, which then executes it. Each time the user says a keyword the program makes one judgment, until the "shutdown" command is executed.
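The three control steps can be read as a small state machine. The following Python sketch is only an illustration under stated assumptions: the names `Mode`, `KEYWORDS`, and `run_camera` are hypothetical, the keyword strings are English renderings of the four keywords in the text, and already-recognized text stands in for the voice module's output. It is not the control program stored in the CPU circuit.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical camera states used only for this illustration."""
    MANUAL = "manual"
    VOICE = "voice"
    OFF = "off"


# English renderings of the four keywords named in the text (assumed spellings).
KEYWORDS = ("video recording", "stop", "video playback", "shutdown")


def run_camera(enter_voice_mode, utterances):
    """Simulate the three control steps; return (final mode, executed commands)."""
    # Step 1: the startup selection decides between manual and voice mode.
    if not enter_voice_mode:
        return Mode.MANUAL, []

    # Step 2: YES activates the voice control module.
    mode = Mode.VOICE
    executed = []

    # Step 3: one judgment per utterance, until "shutdown" is executed.
    for utterance in utterances:
        keyword = utterance.strip().lower()
        if keyword not in KEYWORDS:
            continue  # not a keyword, so nothing is passed on for execution
        executed.append(keyword)
        if keyword == "shutdown":
            mode = Mode.OFF
            break
    return mode, executed
```

In this sketch, utterances that are not keywords are simply ignored, and anything spoken after "shutdown" is never judged. Both behaviors are assumptions, since the text does not say how unrecognized speech is handled.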
The present invention comprises two parts, hardware design and software design. On the hardware side, a speech processing chip (RSC-364) is added to the video camera's electronic circuit; on the software side, code controlling speech processing is added.
The voice module (RSC-364) is a CMOS device built around an 8-bit MCU core that also integrates components such as ROM, RAM, A/D and D/A converters, a front-end amplifier, and a power amplifier. The RSC-364 offers accuracy, fast response, low cost, and versatility; with only a few external components added, it can form a speech recognition system. Its computing capability is 4 MIPS (million instructions per second); to improve computing capability, the chip additionally carries a 24-bit × 24-bit multiplier.
The RSC-364 uses a pre-trained artificial neural network to perform speaker-independent speech recognition; that is, it can recognize simple utterances such as "Yes", "No", and "OK" without training, and its data book claims a recognition rate above 97%.
The RSC-364 also provides speech synthesis at 5 to 15 kb/s. Its speech synthesis was specially designed by Sensory, and its sound quality is better than that of typical alternatives. It also provides an improved ADPCM (adaptive differential pulse-code modulation) speech encoder and decoder.
The RSC-364 design includes microphone signal amplification, data conversion, recognition and synthesis functions, on-chip ROM (only the RSC-364 chip has this), and a single-chip CPU core; the RSC-364 can therefore deliver 4 MIPS of integer performance at 14.32 MHz. This lets the consumer obtain the greatest performance at the lowest cost. The RSC-364 instruction set is very similar to that of the 8051 family of microprocessors. Its processor avoids the limitations of dedicated storage by providing fully symmetric sources and destinations for all instructions.
Most digital video cameras (DVCs) already carry a microphone, so the voice control device can be connected directly to the microphone in the DVC. Power is supplied by the DVC mainboard.
The workflow of the multi-engine recognizer is:
(1) The input speech is preprocessed, including segmentation of the speech signal and noise removal. The speech signal is segmented with an algorithm based on energy-window calculation, which makes the endpoints of the speech signal more accurate.
(2) Based on the physical length and other physical features of the input speech, the system predicts whether the input is isolated-word input or continuous speech input. If the speech signal is short, recognition engines 1 and 2 are used; if the signal is long, recognition engines 2 and 3 are used; if it cannot be determined whether the input is isolated or continuous speech, all three recognition engines are used simultaneously.
(3) The recognition results obtained from the different recognition engines are sent as candidate keywords (multiple candidates if the results differ) to the confirmation module for confirmation.
Because a recognizer based on multiple recognition engines starts two or three engines at the same time, the system's response time is inevitably affected. For this reason, parameter sharing is adopted in speech modeling, which reduces computational complexity and improves response time. Note also that for isolated speech, recognition engines 1 and 2 are fast, so the requirement for real-time response can be fully met; for continuous speech, recognition time is spent mainly in recognition engine 3, which is unavoidable, and the additional time the system introduces is very small, so the response speed of the system is essentially not reduced.
The multi-engine recognizer means that, whether the input is continuous speech or isolated speech, a suitable recognition engine can be used, so that the recognition rate of the system is greatly improved while the user remains free to speak naturally. In particular, when continuous speech input is not recognized correctly, the user can fall back to isolated-word input. On the one hand, this allows the appliance to be controlled correctly and to operate normally; on the other hand, through adaptation, the models of the different recognition engines are all characterized more accurately, gradually improving the system's recognition rate, so that the continuous speech recognition rate improves as well. In addition, a connected-word recognition engine is used in all cases, mainly because the speech of disabled users often carries common burst noise and modal particles (filler words); by modeling these separately, the influence of noise and filler words at the head and tail of the speech signal can be removed.
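The three workflow steps above can be sketched in Python as follows. This is a minimal illustration, not the patented recognizer. The function names, the window length, the energy threshold, and the `short_max` / `long_min` length limits are all assumptions introduced here, since the text gives no numeric values. The recognition engines themselves are not modeled; only endpoint trimming, engine selection, and candidate collection are shown.

```python
def frame_energies(samples, frame_len):
    """Energy (sum of squared samples) of each non-overlapping window."""
    return [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]


def trim_by_energy(samples, frame_len=4, threshold=1.0):
    """Step (1): keep the span from the first to the last window above threshold."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return []  # no window carried enough energy to count as speech
    start = active[0] * frame_len
    end = min((active[-1] + 1) * frame_len, len(samples))
    return samples[start:end]


def select_engines(n_samples, short_max, long_min):
    """Step (2): choose engines by signal length (both limits are assumed values)."""
    if n_samples <= short_max:
        return [1, 2]      # short signal: treated as isolated-word input
    if n_samples >= long_min:
        return [2, 3]      # long signal: treated as continuous speech
    return [1, 2, 3]       # undetermined: run all three engines


def candidates(results):
    """Step (3): distinct engine results, in order, for the confirmation module."""
    distinct = []
    for result in results:
        if result not in distinct:
            distinct.append(result)
    return distinct
```

For example, `trim_by_energy([0, 0, 0, 0, 0, 2, 2, 0, 3, 3, 0, 0, 0, 0, 0, 0])` drops the silent first and last windows, and the length of what remains is then passed to `select_engines`. A real implementation would use other physical features as well as length, as the text notes.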

Claims (1)

1. A video camera set by a speech recognition system, comprising a CPU circuit and a coding and decoding circuit connected to the video camera's electronic circuit, characterized in that it also comprises a voice module, the voice module being connected to the CPU circuit and to the coding and decoding circuit respectively, and the CPU circuit storing a control program whose steps are:
(1) After the power is switched on and the camera starts, the system enters a first selection, namely whether to enter voice control mode. This selection is set as an option presented automatically at startup. If NO is selected, the system enters manual mode, and the camera then functions the same as an ordinary video camera;
(2) If YES is selected, this information is passed by the CPU to the voice control module, activating the voice control module, and the camera then enters the voice control start state;
(3) The voice control module carries out keyword recognition. The keywords that can be operated by voice are: video recording, stop, video playback, and shutdown. After the user says "video recording", the system judges this input and sends the result of the judgment to the CPU, which then executes it. Each time the user says a keyword the program makes one judgment, until the "shutdown" command is executed.
CN200810152901A 2008-11-10 2008-11-10 Video camera set by speech recognition system Pending CN101742110A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810152901A CN101742110A (en) 2008-11-10 2008-11-10 Video camera set by speech recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810152901A CN101742110A (en) 2008-11-10 2008-11-10 Video camera set by speech recognition system

Publications (1)

Publication Number Publication Date
CN101742110A true CN101742110A (en) 2010-06-16

Family

ID=42464929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810152901A Pending CN101742110A (en) 2008-11-10 2008-11-10 Video camera set by speech recognition system

Country Status (1)

Country Link
CN (1) CN101742110A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110636163A (en) * 2012-11-20 2019-12-31 华为终端有限公司 Voice response method and mobile device
WO2014079324A1 (en) * 2012-11-26 2014-05-30 腾讯科技(深圳)有限公司 Voice interaction method and device
US9728192B2 (en) 2012-11-26 2017-08-08 Tencent Technology (Shenzhen) Company Limited Method and apparatus for voice interaction control of movement base on material movement
CN104469162A (en) * 2014-12-23 2015-03-25 天津天地伟业数码科技有限公司 Dome camera voice control method
CN105141919A (en) * 2015-09-01 2015-12-09 武汉同迅智能科技有限公司 Monitoring terminal device remotely controlled by voice
CN109963073A (en) * 2017-12-26 2019-07-02 浙江宇视科技有限公司 Video camera control method, device, system and PTZ camera
CN109951651A (en) * 2019-02-20 2019-06-28 浙江工业大学 A kind of collaboration method of audio broadcasting and video grabber

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20100616