WO2010020138A1

WO2010020138A1 - Control method and device for monitoring equipment

Info

Publication number: WO2010020138A1
Application number: PCT/CN2009/072503
Authority: WO
Inventors: 崔志伟
Original assignee: 中兴通讯股份有限公司
Priority date: 2008-08-22
Filing date: 2009-06-29
Publication date: 2010-02-25
Also published as: CN101345668A

Abstract

A control method and device for monitoring equipment. The method includes: extracting features from the digital speech signal corresponding to a received speech control instruction to obtain the speech recognition parameter sequence of that instruction; searching one or more preset speech recognition parameter sequences for the sequence that best matches the obtained sequence; and determining the operating instruction corresponding to the matched speech recognition parameter sequence and using that operating instruction to control the monitoring equipment.

Description

Monitoring device control method and device

The present invention relates to the field of communications, and in particular to a method and an apparatus for controlling a monitoring device.
BACKGROUND
With the development of network technologies and growing user awareness of security, video surveillance has become widely used for security protection. In video surveillance, a pan/tilt head is usually used to control the monitoring angle of the camera. The pan/tilt head is a camera mounting platform consisting of two AC motors. The operator drives the pan/tilt head horizontally or vertically, which changes the monitoring angle of the camera for video monitoring.
At present, the pan/tilt head is controlled as follows: the monitoring device client sends a control command to the monitoring device by means of a mouse or a keyboard. This control method is complicated to operate, so that only a professional can operate the monitoring device, and it is neither visual nor intuitive for the operator.
SUMMARY OF THE INVENTION
The present invention has been made in view of the complicated operation of monitoring device control methods in the related art. Its primary object is therefore to provide a control method and apparatus for a monitoring device that solve this problem.
According to one aspect of the invention, a method of controlling a monitoring device is provided.
The control method of the monitoring device according to the present invention comprises: performing feature extraction on the digital voice signal corresponding to a received voice control instruction to acquire the speech recognition parameter sequence of the voice control instruction; searching one or more preset speech recognition parameter sequences for the sequence that best matches the acquired speech recognition parameter sequence; and determining the operation instruction corresponding to the found speech recognition parameter sequence and using that operation instruction to control the monitoring device.
Before the voice control instruction is received, the method further includes: receiving one or more voice control instructions in advance; for each of these voice control instructions, performing feature extraction on the corresponding digital voice signal, and acquiring and saving the speech recognition parameter sequence of the instruction; and configuring the correspondence between the speech recognition parameter sequences and the operation instructions.
Further, the method also includes: saving the corresponding digital voice signal for each voice control instruction received in advance. Specifically, the corresponding digital voice signal of each voice control instruction received in advance may be compressed, and the compressed digital voice signal saved.
Preferably, one or more voice control instructions may be preset for each operation instruction.
Controlling the monitoring device by using the operation instruction may specifically be: sending the operation instruction to the target encoder, which controls the monitoring device according to the operation instruction.
Alternatively, controlling the monitoring device by using the operation instruction may specifically be: sending the operation instruction to a relay server, which forwards the operation instruction to the target encoder; the target encoder then controls the monitoring device according to the operation instruction.
Preferably, the characteristic parameters of the digital voice signal corresponding to the voice control instruction may be extracted by one of the following methods: formant extraction, endpoint detection extraction, linear prediction cepstral coefficient (LPCC) extraction, Mel-frequency cepstral coefficient (MFCC) extraction, and line spectral frequency (LSF) extraction.
Preferably, the search for the speech recognition parameter sequence that best matches the acquired speech recognition parameter sequence uses at least one of the following: a dynamic time warping (DTW) algorithm, a hidden Markov model (HMM).
According to another aspect of the present invention, a control device for a monitoring device is provided.
The control device of the monitoring device according to the present invention includes: an acquisition module, configured to perform feature extraction on the digital voice signal corresponding to a received voice control instruction to obtain the speech recognition parameter sequence of the voice control instruction; a matching module, configured to search one or more preset speech recognition parameter sequences for the sequence that best matches the acquired speech recognition parameter sequence; and a control module, configured to determine the operation instruction corresponding to the found speech recognition parameter sequence and to control the monitoring device by using that operation instruction.
Further, the device also includes: a receiving module, configured to receive one or more voice control instructions in advance; a saving module, configured to perform feature extraction on the digital voice signal corresponding to each voice control instruction, and to acquire and save the speech recognition parameter sequence of each voice control instruction; and a configuration module, configured to configure the correspondence between the speech recognition parameter sequences and the operation instructions.
The saving module is further configured to save the digital voice signals received in advance, or to save the compressed digital voice signal corresponding to each voice control instruction.
Through at least one of the above technical solutions of the present invention, the monitoring device is controlled by voice instructions: the operator can control the monitoring device by speaking a voice control command directly, which is simple and is more visual and intuitive for the operator.
The drawings are intended to provide a further understanding of the invention and form a part of the description. In the drawings:
FIG. 1 is a flowchart of a control method of a monitoring device according to a method embodiment of the present invention;
FIG. 2 is a detailed processing flowchart of a control method of a monitoring device according to a method embodiment of the present invention;
FIG. 4 is a structural block diagram of a control device of a monitoring device according to an apparatus embodiment of the present invention;
FIG. 5 is a detailed structural block diagram of a control device of a monitoring device according to an apparatus embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
As described above, controlling a monitoring device is at present complicated to operate.
To address this problem, the present invention proposes a control scheme in which the monitoring device is controlled by voice control commands. The scheme is simple to operate and is more visual and intuitive than the prior art; moreover, with the rapid development of technology, speech recognition has gradually become a key human-machine interface technology in information technology.
The invention is described in detail below with reference to the accompanying drawings. It should be noted that the embodiments in the present application, and the features in the embodiments, may be combined with each other provided they do not conflict.
Method Embodiments
According to an embodiment of the present invention, a control method of a monitoring device is provided.
In the embodiment of the present invention, the correspondence between speech recognition parameter sequences and operation instructions needs to be configured in advance. Specifically, one or more voice control instructions may be received in advance. For each voice control instruction, the analog voice signal is converted into a digital voice signal, and feature extraction is performed on that digital voice signal to obtain the speech recognition parameter sequence of the instruction. The features of the digital voice signal can be extracted in various ways, for example by formant extraction, endpoint detection extraction, linear prediction cepstral coefficient (LPCC) extraction, Mel-frequency cepstral coefficient (MFCC) extraction, line spectral frequency (LSF) extraction, PLCC extraction, EPOCH extraction, and the like. After the speech recognition parameter sequence is obtained, it may be saved as a file on the hard disk or in the memory of the computer.
In addition, the digital voice signal corresponding to each voice control instruction may be saved on the hard disk or in the memory of the computer. To save space, the digital voice signal can be compressed before it is saved.
Finally, the correspondence between the speech recognition parameter sequences and the operation instructions is configured. To improve the recognition rate of voice control instructions, one or more voice control instructions may be preset for each operation instruction. That is, several voice control commands may be input for each operation instruction, the speech recognition parameter sequence of each of them is obtained, and a correspondence is established between the operation instruction and each of these sequences, so that one operation instruction may correspond to multiple speech recognition parameter sequences.
FIG. 1 is a flowchart of a method for controlling a monitoring device according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step S102: perform feature extraction on the digital voice signal corresponding to a received voice control instruction to obtain the speech recognition parameter sequence of the voice control instruction.
Step S104: search the one or more preset speech recognition parameter sequences for the sequence that best matches the obtained speech recognition parameter sequence, for example by using a dynamic time warping (DTW) algorithm or a hidden Markov model (HMM).
Step S106: determine the operation instruction corresponding to the found speech recognition parameter sequence, and control the monitoring device according to the determined operation instruction.
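As an illustration of steps S104 and S106, the following minimal Python sketch matches an input sequence against a small template library using dynamic time warping, with several templates allowed per operation instruction. The class `TemplateLibrary`, the function names, and the use of Euclidean distance between feature vectors are assumptions made for this example and are not taken from the patent.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of feature vectors.

    Assumption: the local cost is the Euclidean distance between vectors.
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


class TemplateLibrary:
    """Hypothetical store of (operation instruction, template) pairs."""

    def __init__(self):
        self._templates = []

    def register(self, operation, sequence):
        # One operation may be registered with several templates.
        self._templates.append((operation, sequence))

    def recognize(self, sequence):
        """Return the operation whose template has the lowest DTW cost."""
        if not self._templates:
            raise ValueError("no templates registered")
        operation, _ = min(
            self._templates,
            key=lambda item: dtw_distance(sequence, item[1]),
        )
        return operation
```

A caller would register one or more templates per operation with `register("left", sequence)` and then pass each newly extracted sequence to `recognize`. Taking the minimum over all templates means that extra templates for an operation can only help that operation win.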
Specifically, the operation instruction may be sent to the target encoder, and the target encoder then controls the monitoring device according to the operation instruction. When the client and the monitoring device are not in the same network, the operation instruction may be sent to the target encoder through a relay server: the client sends the operation instruction to the relay server, the relay server forwards it to the target encoder, and the target encoder uses the operation instruction to control the monitoring device.
With the technical solution provided by the embodiment of the present invention, the monitoring device is controlled by voice instructions, so that the operator can control the monitoring device by speaking a voice control command directly, which is simple and is more visual and intuitive for the operator.
FIG. 2 is a detailed processing flowchart of a method for controlling a monitoring device according to a method embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
Step S201: different monitoring devices require different operation instructions. Corresponding voice control commands are preset for the different operation instructions, for example voice control commands that make the pan/tilt head rotate left, right, up, and down.
Step S202: for the voice control command of each operation instruction, the speech signal is sampled. Specifically, the sampling frequency may be 8 kHz, which is suitable for recognizing simple utterances; for complex utterances, a higher sampling frequency can be chosen. The sampling interval is 10 ms, so each interval contains 80 sample points, and a segment of speech contains multiple intervals. The short-term energy is summed over every 10 ms interval. When the short-term energy exceeds a certain threshold, the voice sample is considered to have begun; when the short-term energy falls below 1/20 of the average energy, the voice sample is considered to have ended. The voice sample signal obtained in this way is stored in pulse-code modulation (PCM) format (that is, as voice PCM code) for each operation instruction.
Step S203: the corresponding feature parameters are extracted from each voice sample signal obtained in step S202, and the speech recognition parameter sequence is determined. The feature parameters may be extracted by formant extraction, endpoint detection extraction, linear prediction cepstral coefficient extraction, MFCC extraction, LSF extraction, PLCC extraction, or EPOCH extraction. The linear prediction coding (LPC) algorithm is taken as an example here. Specifically, for each voice sample signal, the 12th-order linear prediction cepstral coefficients (LPCC) of the signal are calculated, and the resulting series of feature parameters is assembled into a feature vector sequence of the form A = {a1, a2, ..., ai}. This feature vector sequence is the speech recognition parameter sequence, that is, the speech parameter template.
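Steps S202 and S203 can be sketched as follows in Python, under several assumptions that the text leaves open: the "average energy" is taken to be the mean frame energy since the detected start of speech, LPC analysis reuses the 10 ms frames without windowing or overlap, and the LPC coefficients are obtained by the autocorrelation method with the Levinson-Durbin recursion. All function names are hypothetical.

```python
FRAME_LEN = 80   # 10 ms at 8 kHz, as in step S202
LPC_ORDER = 12   # 12th-order LPCC, as in step S203


def frame_energies(samples, frame_len=FRAME_LEN):
    """Short-term energy (sum of squares) of each full frame."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def detect_endpoints(samples, start_threshold, frame_len=FRAME_LEN):
    """Return (start, end) sample indices of the utterance, or None."""
    start, total, count = None, 0.0, 0
    for idx, energy in enumerate(frame_energies(samples, frame_len)):
        if start is None:
            if energy > start_threshold:        # speech has begun
                start, total, count = idx, energy, 1
            continue
        if energy < (total / count) / 20:       # below 1/20 of the average
            return start * frame_len, idx * frame_len
        total += energy
        count += 1
    if start is None:
        return None
    return start * frame_len, (start + count) * frame_len


def autocorrelation(frame, max_lag):
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], where x[n] ~ sum a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:                            # silent or degenerate frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a):
    """c[n] = a[n] + sum_{k=1}^{n-1} (k/n) * c[k] * a[n-k], n = 1..p."""
    p = len(a)
    c = [0.0] * p
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c


def lpcc_sequence(samples, order=LPC_ORDER, frame_len=FRAME_LEN):
    """One LPCC vector per frame: a speech recognition parameter sequence."""
    return [lpc_to_cepstrum(levinson_durbin(
                autocorrelation(samples[i:i + frame_len], order), order))
            for i in range(0, len(samples) - frame_len + 1, frame_len)]
```

In this sketch a template would be built by cutting the recording to the span returned by `detect_endpoints` and passing that span to `lpcc_sequence`. A practical recognizer would probably use longer, overlapping, windowed frames for the LPC analysis; the 10 ms frames are reused here only to stay close to the figures given in step S202.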
Step S204: the obtained speech recognition parameter sequence and the voice PCM code obtained in step S202 are saved as files on the hard disk or in the memory of the client computer, where each file name should correspond to the operation instruction. For example, if the operation instruction is to rotate the monitoring device to the right, the file name can be Template_Right. In addition, the saved files can be loaded once rather than read again each time a voice command is recognized, which saves time in the recognition process.
Through steps S201 to S204 above, the speech recognition parameter sequences corresponding to the operation instructions of the monitoring device are set up. The following steps use a voice control command to control the monitoring device.
Step S205: a voice control command is input and sampled. The implementation is the same as in step S202 and is not repeated here.
Step S206: the speech recognition parameter sequence S = {s1, s2, ..., sk} corresponding to the voice control command of step S205 is obtained. The implementation is the same as in step S203 and is not repeated here.
Step S207: the speech recognition parameter sequence S obtained in step S206 is matched against the preset speech recognition parameter sequences, the best-matching sequence is selected, and the monitoring device is controlled according to the operation instruction corresponding to that sequence. For example, suppose four speech recognition parameter sequences A = {a1, a2, ..., ai}, B = {b1, b2, ..., bj}, C = {c1, c2, ..., cm}, and D = {d1, d2, ..., dn} are preset locally, corresponding to the operations left (Template_Left), right (Template_Right), up (Template_Up), and down (Template_Down) respectively. Using the dynamic time warping (DTW) algorithm, the input speech recognition parameter sequence S is matched in turn against the reference templates A, B, C, and D stored in the template library. The reference template with the highest degree of matching is the recognition result, and the operation instruction it represents is determined from that result. If the speech recognition parameter sequence S best matches reference template A, the operation instruction to be executed on the monitoring device is the leftward operation (Template_Left) corresponding to reference template A.
Step S208: the client establishes a short TCP connection with the encoder and sends the monitoring device control request (that is, the operation instruction determined in step S207) to the remote encoder. For example, the monitoring device control request may be sent in XML format. The request message may include information such as the ID of the destination monitoring device, the control mode, the control direction, and the control step size. The client then receives the response message of the encoder, completing the control operation of the monitoring device.
FIG. 3 shows a specific environment in which the method may be implemented. As shown in FIG. 3, the client may send a monitoring device control message to the target encoder through the network. The encoder is a device that provides streaming media data and responds to monitoring device control requests; it is directly connected to the monitoring device and the camera.
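The request of step S208 could look roughly like the following Python sketch. The patent names the fields carried by the message but not its schema, so the element names (`ControlRequest`, `DeviceID`, `Mode`, `Direction`, `Step`), the function names, and the convention of half-closing the socket to mark the end of the request are all assumptions made for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def build_control_request(device_id, mode, direction, step):
    """Serialize a control request; all element names are hypothetical."""
    root = ET.Element("ControlRequest")
    ET.SubElement(root, "DeviceID").text = str(device_id)
    ET.SubElement(root, "Mode").text = mode
    ET.SubElement(root, "Direction").text = direction
    ET.SubElement(root, "Step").text = str(step)
    return ET.tostring(root, encoding="utf-8")


def send_control_request(host, port, payload, timeout=5.0):
    """Short TCP connection: connect, send one request, read one response.

    Assumption: the encoder treats the end of the client's write side as
    the end of the request and closes the connection after replying.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

A client would call `build_control_request("cam-01", "ptz", "left", 5)` after recognition and hand the resulting bytes to `send_control_request` together with the encoder's address. Opening a new connection per request mirrors the short-connection mode described in step S208.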
After receiving the monitoring device control request from the client, the encoder sends the corresponding control command to the monitoring device through its 485 port, completing the control operation of the monitoring device.
It should be noted that, in the embodiment of the present invention, a short TCP connection is used when the monitoring device control request is sent to the remote encoder, but the method is not limited thereto: depending on negotiation with the encoder, the client may use a long TCP connection or another connection method such as UDP, and the monitoring device control request may also be sent in formats other than XML. In addition, the embodiment of the present invention uses voice commands to rotate the monitoring device left, right, up, and down, but is not limited thereto. The technical solution provided by the present invention can support further operations for controlling the monitoring device and the camera, such as controlling camera zoom, adjusting brightness, and operating accessories of the monitoring device such as lights and wipers.
According to an embodiment of the present invention, there is also provided a computer-readable medium storing computer-executable instructions which, when executed by a computer or processor, cause the computer or processor to perform, for example, the processing of the steps shown in FIG. 1 and FIG. 2.
Apparatus Embodiment
According to an embodiment of the present invention, a control apparatus for a monitoring device is provided.
FIG. 4 is a structural block diagram of a control device of a monitoring device according to an embodiment of the present invention. As shown in FIG. 4, the device includes an acquisition module 10, a matching module 20, and a control module 30. These modules are described in detail below.
The acquisition module 10 is configured to perform feature extraction on the digital voice signal corresponding to a received voice control instruction and to obtain the speech recognition parameter sequence of the voice control instruction.
The matching module 20 is configured to search the one or more preset speech recognition parameter sequences for the sequence that best matches the acquired speech recognition parameter sequence. This module may be connected to the acquisition module 10.
The control module 30 is configured to determine the operation instruction corresponding to the found speech recognition parameter sequence and to control the monitoring device by using that operation instruction. This module may be connected to the matching module 20.
The control device provided by the embodiment of the present invention controls the monitoring device by voice instructions, so that the operator can control the monitoring device by speaking a voice control command directly, which is simple and is more visual and intuitive for the operator.
FIG. 5 is a detailed structural block diagram of a control device of a monitoring device according to an embodiment of the present invention. In addition to the modules of the apparatus shown in FIG. 4, the apparatus shown in FIG. 5 includes a receiving module 40, a saving module 50, and a configuration module 60. These modules are described in detail below.
The receiving module 40 is configured to receive one or more voice control instructions in advance.
The saving module 50 is configured to perform feature extraction on the digital voice signal corresponding to each voice control instruction, and to acquire and save the speech recognition parameter sequence of each voice control instruction. This module may be connected to the matching module 20 and the receiving module 40.
The configuration module 60 is configured to configure the correspondence between the speech recognition parameter sequences and the operation instructions. This module may be connected to the saving module 50.
The saving module 50 is further configured to save the digital voice signals received in advance, or to save the compressed digital voice signal corresponding to each voice control instruction.
As described above, with the control method and/or device provided by the present invention, the monitoring device is controlled by voice instructions, so that the operator can control the monitoring device by speaking a voice control command directly, which is simple and is more visual and intuitive for the operator.
The above is only the preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention is intended to be included within its scope.

Claims

1. A control method for a monitoring device, comprising:

Performing feature extraction on the digital voice signal corresponding to the received voice control instruction, and acquiring a voice recognition parameter sequence of the voice control instruction;

Searching one or more preset speech recognition parameter sequences for the speech recognition parameter sequence that best matches the acquired speech recognition parameter sequence;

Determining an operation instruction corresponding to the searched speech recognition parameter sequence, and using the operation instruction to control the monitoring device.

2. The method according to claim 1, wherein before the voice control instruction is received, the method further comprises:

Receiving one or more voice control commands in advance;

For each voice control instruction, performing feature extraction on the corresponding digital voice signal, and acquiring and storing the voice recognition parameter sequence of each voice control instruction; and configuring a correspondence between the voice recognition parameter sequences and the operation instructions.

3. The method according to claim 2, wherein the method further comprises:

For each of the voice control commands received in advance, the corresponding digital voice signal is saved.

4. The method according to claim 3, wherein saving the corresponding digital voice signal for each of the voice control instructions received in advance specifically comprises:

For each of the voice control instructions received in advance, compressing the corresponding digital voice signal and saving the compressed digital voice signal.

5. The method according to claim 2, wherein one or more voice control instructions are preset for each operation instruction.

6. The method according to claim 1, wherein controlling the monitoring device by using the operation instruction specifically comprises:

The operation instruction is sent to a target encoder, and the target encoder controls the monitoring device according to the operation instruction.

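7. The method according to claim 1, wherein controlling the monitoring device by using the operation instruction specifically comprises: sending the operation instruction to a relay server, the relay server forwarding the operation instruction to the target encoder, and the target encoder controlling the monitoring device according to the operation instruction.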

8. The method according to any one of claims 1 to 7, characterized in that the characteristic parameters of the digital voice signal corresponding to the voice control instruction are extracted by one of the following methods: formant extraction, endpoint detection extraction, linear prediction cepstral coefficient (LPCC) extraction, Mel-frequency cepstral coefficient (MFCC) extraction, and line spectral frequency (LSF) extraction.

9. The method according to any one of claims 1 to 7, characterized in that the manner of finding the speech recognition parameter sequence that best matches the acquired speech recognition parameter sequence comprises at least one of the following: a dynamic time warping (DTW) algorithm, a hidden Markov model (HMM).

10. A control device for a monitoring device, comprising:

an acquiring module, configured to perform feature extraction on the digital voice signal corresponding to a received voice control instruction, and to acquire the voice recognition parameter sequence of the voice control instruction;

a matching module, configured to search one or more preset speech recognition parameter sequences for the speech recognition parameter sequence that best matches the acquired speech recognition parameter sequence;

and a control module, configured to determine the operation instruction corresponding to the found speech recognition parameter sequence, and to use the operation instruction to control the monitoring device.

11. The device according to claim 10, wherein the device further comprises:

a receiving module, configured to receive one or more voice control instructions in advance;

a saving module, configured to perform feature extraction on the digital voice signal corresponding to each voice control instruction, and to acquire and save the voice recognition parameter sequence of each voice control instruction;

and a configuration module, configured to configure a correspondence between the speech recognition parameter sequences and the operation instructions.

12. The device according to claim 10 or 11, wherein the saving module is further configured to save the digital voice signals received in advance, or to save the compressed digital voice signal corresponding to each voice control instruction.