CN113096661A - Information acquisition device and voice control method thereof

Info

Publication number: CN113096661A
Application number: CN202110522966.1A
Authority: CN
Inventors: 陈明泰
Original assignee: Mitac Computer Kunshan Co Ltd; Getac Technology Corp
Current assignee: Mitac Computer Kunshan Co Ltd; Getac Technology Corp
Priority date: 2018-07-13
Filing date: 2018-07-13
Publication date: 2021-07-09
Also published as: CN110718213A

Abstract

Embodiments of the invention provide an information acquisition device and a voice control method thereof. The voice control method of the information acquisition device includes: receiving a sound signal; performing voice recognition on the sound signal to obtain actual voice content; confirming at least one instruction voice content according to the actual voice content; when the actual voice content corresponds to any instruction voice content, obtaining the operation instruction corresponding to that instruction voice content; and responding to the operation instruction to execute the action corresponding to the operation instruction. Because the operation instruction is obtained from the actual voice content recognized in the sound signal and the corresponding action is executed accordingly, information is captured in time.

Description

Information acquisition device and voice control method thereof

This application is a divisional application of the original application No. 201810766545.1, filed on July 13, 2018 and entitled "Information acquisition device and voice control method thereof".

Technical Field

The embodiment of the invention relates to the technical field of communication, in particular to an information acquisition device and a voice control method thereof.

Background

When police officers perform their duties, they often need to make recordings in order to collect and preserve relevant evidence. An officer on duty can therefore wear an information capturing device to acquire media data such as images and sound of the surrounding environment. The recorded media data assist the police work and also document the scene when an incident occurs, providing evidence and helping to clarify responsibility afterwards.

Currently, a user turns on a portable information capturing device to capture environmental data by operating a start switch on the device. In an emergency, however, the user often has no time to start capturing manually, or has already missed the moment to capture the image and/or sound of the critical situation by the time capturing starts. In addition, if the user wants to know device information of the information capturing device, such as the remaining power and/or capacity, the user likewise has to operate a function switch on the device to display that information.

Disclosure of Invention

Embodiments of the invention provide an information acquisition device and a voice control method thereof that enable timely information acquisition.

In a first aspect, an embodiment of the present invention provides a voice control method for an information capturing apparatus, including: receiving a sound signal; performing voice recognition on the sound signal to obtain actual voice content; confirming at least one instruction voice content according to the actual voice content; when the actual voice content corresponds to any instruction voice content, obtaining an operation instruction corresponding to the instruction voice content; and responding to the operation instruction to execute the action corresponding to the operation instruction.

Optionally, the step of responding to the operation instruction to execute the action corresponding to the operation instruction includes: responding to the operation instruction to read the device information corresponding to the operation instruction; and playing the response voice of the device information.

Optionally, the operation instruction is a recording start instruction, and the step of responding to the operation instruction to execute the action corresponding to the operation instruction includes: responding to the recording start instruction to control an audio/video recording unit to record audio/video so as to capture environment data.

Optionally, the operation instruction is a recording end instruction, and the step of responding to the operation instruction to execute the action corresponding to the operation instruction includes: responding to the recording end instruction to control the audio/video recording unit to end the recording so as to generate environment data.

Optionally, the method further includes: confirming the sound signal according to voiceprint data; when the sound signal matches the voiceprint data, performing the voice recognition on the sound signal; and when the sound signal does not match the voiceprint data, not performing the voice recognition on the sound signal and discarding the sound signal.

In a second aspect, an embodiment of the present invention provides an information capturing apparatus, including a microphone, a voice recognition unit, a control unit, and an audio/video recording unit. The microphone receives a voice to generate a corresponding sound signal. The voice recognition unit is coupled to the microphone and performs voice recognition on the sound signal to obtain actual voice content. The audio/video recording unit records audio/video to capture environment data. The control unit is coupled to the voice recognition unit and the audio/video recording unit; when the actual voice content corresponds to an instruction voice content, the control unit obtains the operation instruction corresponding to the instruction voice content and responds to the operation instruction to execute the action corresponding to the operation instruction.

Optionally, the information capturing device further includes a speaker. In executing the action corresponding to the operation instruction, the control unit responds to the operation instruction to read the device information corresponding to the operation instruction, and plays a response voice of the device information through the speaker.

Optionally, the operation instruction is a recording start instruction, and in executing the action corresponding to the operation instruction, the control unit responds to the recording start instruction to control the audio/video recording unit to record audio/video so as to capture the environment data.

Optionally, the operation instruction is a recording end instruction, and in executing the action corresponding to the operation instruction, the control unit responds to the recording end instruction to control the audio/video recording unit to end the recording so as to generate the environment data.

Optionally, the control unit further confirms the sound signal according to voiceprint data. When the sound signal matches the voiceprint data, the control unit performs the voice recognition on the sound signal; when the sound signal does not match the voiceprint data, the control unit does not perform voice recognition on the sound signal and discards the sound signal.

In summary, with the information capturing apparatus and the voice control method thereof according to the embodiments of the invention, the actual voice content is obtained by performing voice recognition on the sound signal, the corresponding operation instruction is then obtained, and the action corresponding to the operation instruction is executed accordingly.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The following drawings illustrate only some embodiments of the present invention and should therefore not be considered as limiting its scope. Those skilled in the art can obtain other related drawings from these drawings without inventive effort.

FIG. 1 is a block diagram of an information capturing device according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a voice control method of an information capturing device according to an embodiment of the present invention;

FIG. 3 is a block diagram of an information capturing device according to another embodiment of the present invention;

FIG. 4 is a flowchart illustrating a voice control method of an information capturing device according to another embodiment of the present invention;

FIG. 5 is a flowchart illustrating a voice control method of an information capturing device according to another embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, software implementations, hardware implementations, and so on.

Fig. 1 is a circuit block diagram of an information capturing apparatus according to an embodiment of the present invention. Fig. 2 is a flowchart of a voice control method of an information capturing device according to an embodiment of the present invention. Referring to fig. 1 and 2, the information capturing apparatus 100 includes a microphone 110, a voice recognition unit 120, and a control unit 130. The microphone 110 is coupled to the voice recognition unit 120, and the voice recognition unit 120 is coupled to the control unit 130.

As shown in fig. 2, the voice control method of the information capturing apparatus in this embodiment includes:

In step S01, a sound signal is received.

Specifically, the microphone 110 in this embodiment receives a voice from a user. The microphone 110 has a signal processing circuit (not shown in fig. 1) that generates a corresponding sound signal from the voice, so that a sound signal is received through the microphone. The voice picked up by the microphone is a physical sound wave, and the sound signal produced by the signal processing circuit may be a digital signal.
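As an illustration of how a sound wave might become a digital sound signal, the following minimal sketch quantizes normalized samples to signed 16-bit values. The embodiment does not specify a sample format, so the 16-bit depth, the normalized input range of -1.0 to 1.0, and the function name `quantize_pcm16` are assumptions made only for this example.

```python
def quantize_pcm16(samples):
    """Clamp each sample to [-1.0, 1.0] and scale it to a signed 16-bit integer.

    Assumption: samples are floats already sampled from the analog waveform.
    """
    digital = []
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))
        digital.append(int(round(clamped * 32767)))
    return digital
```

Out-of-range input is clipped rather than wrapped, which is a common choice for audio but is likewise an assumption here.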

In step S03, voice recognition of the sound signal is performed to obtain the actual voice content.

Specifically, the information capturing apparatus 100 of the present embodiment further includes a storage module 150, as shown in fig. 3, which is a circuit block diagram of another embodiment of the information capturing apparatus. The storage module 150 is coupled to the control unit 130 and stores a speech model database, which includes speech signals of a plurality of word strings made up of words, phrases, sentences and the like.

The voice recognition unit 120 obtains the sound signal generated by the microphone 110 and performs voice recognition on the sound signal to obtain the actual voice content. In this embodiment, the voice recognition unit 120 analyzes the sound signal to extract at least one feature of the sound signal, and identifies or compares that feature against the data in the speech model database to select or determine the text content of the sound signal, thereby obtaining the actual voice content that matches the features of the sound signal. Since the speech model database includes speech signals of a plurality of word strings formed by words, phrases, sentences, etc., the voice recognition unit 120 obtains the actual voice content by analyzing the features of the sound signal and comparing them with the features of the speech signals in the speech model database. This process involves known speech recognition technology; since the specific working principle of speech recognition is not the focus of the present application, it is not described in detail in this embodiment.
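The comparison against the speech model database can be pictured as a nearest-template search. The sketch below is only an illustration and not the recognition method of the embodiment: the three-element feature vectors, the cosine-similarity measure, the 0.8 threshold, and the names `SPEECH_MODEL_DB` and `recognize` are all assumptions made for the example.

```python
import math

# Hypothetical speech model database: word string -> reference feature vector.
SPEECH_MODEL_DB = {
    "camera start recording": [0.9, 0.1, 0.3],
    "camera recording end": [0.2, 0.8, 0.4],
    "battery life": [0.1, 0.3, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(features, threshold=0.8):
    """Return the word string whose reference features match best, or None."""
    best_text, best_score = None, threshold
    for text, reference in SPEECH_MODEL_DB.items():
        score = cosine_similarity(features, reference)
        if score >= best_score:
            best_text, best_score = text, score
    return best_text
```

Returning `None` when no entry reaches the threshold models the case in which the sound signal matches nothing in the database.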

In step S05, at least one instruction voice content is confirmed according to the actual voice content.

Specifically, the control unit 130 confirms at least one instruction voice content according to the actual voice content. In this embodiment, the storage module 150 further includes a lookup table (not drawn in fig. 3), and the lookup table includes the correspondence between actual voice content and instruction voice content. During execution, the control unit 130 traverses the lookup table according to the obtained actual voice content to confirm at least one instruction voice content that matches the actual voice content.

In step S07, when the actual voice content corresponds to any instruction voice content, an operation instruction corresponding to the instruction voice content is obtained.

The actual voice content corresponds to an instruction voice content when it matches the instruction voice content completely, or when it contains the instruction voice content together with other non-instruction voice content (e.g., environmental voice content). In that case, the control unit 130 obtains the operation instruction corresponding to the matched instruction voice content. In this embodiment, the actual voice content corresponding to an instruction voice content may be identical to the instruction voice content; or more than a certain proportion of it may be identical to the instruction voice content; or it may include content identical to the instruction voice content together with other, different content (e.g., ambient sound content).
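The three kinds of correspondence can be sketched as a single predicate. This is a hedged illustration only: the word-level comparison, the position-by-position ratio, the default `min_ratio` of 0.8, and the function name `corresponds` are assumptions, since the embodiment does not state how the proportion is measured.

```python
def corresponds(actual, instruction, min_ratio=0.8):
    """Check whether recognized text corresponds to an instruction phrase."""
    actual_words = actual.lower().split()
    instruction_words = instruction.lower().split()
    if not instruction_words:
        return False

    # Case 1: the actual content is identical to the instruction content.
    if actual_words == instruction_words:
        return True

    # Case 2: at least min_ratio of the instruction words match by position.
    count = len(instruction_words)
    same = sum(a == b for a, b in zip(actual_words, instruction_words))
    if same / count >= min_ratio:
        return True

    # Case 3: the instruction appears inside other (e.g. ambient) content.
    return any(
        actual_words[i:i + count] == instruction_words
        for i in range(len(actual_words) - count + 1)
    )
```

A lower `min_ratio` tolerates more recognition errors at the cost of more false triggers, so the value would need tuning for a real device.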

In step S09, the action corresponding to the operation instruction is executed in response to the operation instruction.

It should be noted that the lookup table includes not only the correspondence between actual voice content and instruction voice content, but also the correspondence between instruction voice content and operation instructions, so that the control unit 130 can obtain from the lookup table the operation instruction corresponding to the instruction voice content it has found, and then execute the corresponding action.
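Steps S05 to S09 can be sketched as a table lookup followed by a dispatch. The sketch is a minimal illustration under stated assumptions: the table contents, the substring test used for matching, and the names `Operation`, `LOOKUP_TABLE`, `find_operation`, and `handle` are hypothetical and do not come from the embodiment.

```python
from enum import Enum


class Operation(Enum):
    START_RECORDING = "start recording"
    END_RECORDING = "end recording"


# Hypothetical lookup table: instruction voice content -> operation instruction.
LOOKUP_TABLE = {
    "camera start recording": Operation.START_RECORDING,
    "camera recording end": Operation.END_RECORDING,
}


def find_operation(actual_voice_content):
    """Steps S05 and S07: traverse the table for a matching instruction."""
    text = actual_voice_content.strip().lower()
    for instruction, operation in LOOKUP_TABLE.items():
        if instruction in text:
            return operation
    return None


def handle(actual_voice_content, actions):
    """Step S09: run the action registered for the operation, if one matched."""
    operation = find_operation(actual_voice_content)
    if operation is None:
        return False
    actions[operation]()
    return True
```

Passing the actions in as a dictionary of callables keeps the lookup independent of the hardware being controlled, which is a design choice of this sketch rather than something the embodiment requires.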

Fig. 4 is a flowchart of a voice control method of an information capturing device according to another embodiment of the present invention. As shown in fig. 4, compared with the embodiment corresponding to fig. 2, this embodiment adds step S02 before step S03, in which the control unit 130 further confirms the sound signal according to voiceprint data. Steps S03 to S09 are substantially the same as described above.

In step S01, a sound signal is received.

In step S02, whether the sound signal matches the voiceprint data is confirmed. Step S03 is executed when the sound signal is determined to match the voiceprint data, and step S021 is executed when it is determined not to match.

The user may record each operation instruction in advance through the microphone 110 to create a preset spectrogram for each operation instruction. The storage module 150 of the information capturing apparatus 100 stores voiceprint data, which refers to the preset spectrogram corresponding to each operation instruction. The voiceprint data may also be preset spectrograms pre-recorded by one or more users for each operation instruction. The voice recognition unit 120 analyzes the sound signal to generate an input spectrogram, and compares the features of the input spectrogram with the features of the preset spectrograms of the voiceprint data to identify or verify the identity of the user, thereby determining whether the voice was uttered by the user.

When the voice signal matches the voiceprint data, that is, the characteristics of the input spectrogram match the characteristics of the preset spectrogram of the voiceprint data, the control unit 130 performs the voice recognition on the voice signal. Moreover, the information capturing apparatus 100 can continue to perform steps S05 to S09.

In step S021, the voice recognition of the sound signal is not performed and the sound signal is discarded.

When the voice signal does not match the voiceprint data, that is, the characteristics of the input spectrogram do not match the characteristics of the preset spectrogram of the voiceprint data, the control unit 130 does not perform the voice recognition on the voice signal and discards the voice signal.
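The gating performed in steps S02 and S021 can be sketched as follows. This is an illustration under assumptions, not the matching method of the embodiment: spectrograms are modeled as equally shaped lists of rows, the distance is a plain mean absolute difference, and the 0.1 tolerance and the names `spectrogram_distance`, `matches_voiceprint`, and `process` are hypothetical.

```python
def spectrogram_distance(a, b):
    """Mean absolute difference between two equally shaped spectrograms."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            count += 1
    return total / count if count else float("inf")


def matches_voiceprint(input_spec, preset_specs, tolerance=0.1):
    """Step S02: True if the input is close enough to any preset spectrogram."""
    return any(
        spectrogram_distance(input_spec, preset) <= tolerance
        for preset in preset_specs
    )


def process(sound_signal, input_spec, preset_specs, recognize):
    """Run recognition (S03) only when the voiceprint matches; else discard."""
    if not matches_voiceprint(input_spec, preset_specs):
        return None  # Step S021: the sound signal is discarded.
    return recognize(sound_signal)
```

Because recognition is skipped entirely on a mismatch, speech from other people near the device does not reach the lookup table at all.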

The information capturing apparatus 100 may further include an audio/video recording unit 140. The audio/video recording unit 140 is coupled to the control unit 130 and can record audio/video. When the operation instruction is a start recording instruction, the control unit 130 obtains the start recording instruction according to the instruction voice content corresponding to the actual voice content.

In step S09, an action corresponding to the operation instruction is executed in response to the operation instruction.

Optionally, the operation instruction is a recording start instruction, and the step of responding to the operation instruction to execute the action corresponding to the operation instruction includes: responding to the recording start instruction to control an audio/video recording unit to record audio/video so as to capture environment data.

The control unit 130 responds to the start recording instruction (i.e., responds to the operation instruction) by controlling the audio/video recording unit 140 to record audio/video and capture environment data, that is, to record images and/or sounds of the surrounding environment. The environment data refers to a media file including images and/or sounds of the environment, such as surrounding people, animals, or objects (e.g., passing vehicles and/or their horns, passers-by and/or their shouts, etc.). In some embodiments, the operation instruction may be any one of a "start recording" instruction, an "end recording" instruction, a "reply remaining recordable hours" instruction, a "store file and play prompt tone" instruction, a "reply remaining capacity" instruction, and a "reply resolution" instruction, but is not limited thereto.

In one implementation, referring to fig. 1 and 2 together, when the user says "Camera start recording" to the microphone 110, the microphone 110 receives a sound signal (step S01) and provides it to the voice recognition unit 120. The voice recognition unit 120 performs voice recognition on the sound signal and obtains "Camera start recording" as the actual voice content (step S03). According to this actual voice content, the control unit 130 checks the instruction voice contents recorded in the lookup table one by one (step S05) to find the instruction voice content corresponding to the actual voice content. When the corresponding instruction voice content is found, the control unit 130 obtains the corresponding operation instruction, the "start recording" instruction, from the lookup table (step S07). The control unit 130 then responds to the start recording instruction (i.e., responds to the operation instruction) by controlling the audio/video recording unit 140 to record audio/video and capture the environment data (step S09). In response to the start recording instruction, the control unit 130 may further control a light-emitting module (not shown in fig. 1 and 2) to emit light, so that the user knows that the audio/video recording unit 140 is recording.

In another embodiment, when the operation instruction is an end recording instruction, the control unit 130 obtains the end recording instruction according to the instruction voice content corresponding to the actual voice content (step S07), and responds to the end recording instruction (i.e., responds to the operation instruction) by controlling the audio/video recording unit 140 to end the recording and generate the environment data (step S09). In other words, the control unit 130 stores the environment data as a corresponding media file in response to the end recording instruction. In one embodiment, referring to fig. 1 and fig. 2, when the voice recognition unit 120 receives the sound signal (step S01) and performs voice recognition, the actual voice content obtained is "Camera recording end" (step S03). After the control unit 130 confirms the instruction voice content according to the actual voice content "Camera recording end" (step S05), the control unit 130 obtains the corresponding end recording instruction (operation instruction) according to the instruction voice content that corresponds to the actual voice content (step S07). The control unit 130 then responds to the end recording instruction (i.e., responds to the operation instruction) by controlling the audio/video recording unit 140 to end the recording and generate the environment data (step S09), and generates the corresponding media file from the environment data and stores it in the storage module. In response to the end recording instruction, the control unit 130 may further control a light-emitting module (not shown in fig. 1 and fig. 2) to turn off, so that the user knows that the audio/video recording unit 140 has ended the recording and the environment data has been generated.
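The start and end behavior, including the indicator light, can be modeled with a small state holder. This is a toy sketch under assumptions: the class name `Recorder`, the list of frames standing in for environment data, and the in-memory `media_files` list standing in for the storage module are all hypothetical.

```python
class Recorder:
    """Toy model of the audio/video recording unit and its indicator light."""

    def __init__(self):
        self.recording = False
        self.light_on = False
        self.frames = []
        self.media_files = []

    def start(self):
        """Start recording instruction: begin capturing and turn the light on."""
        if not self.recording:
            self.recording = True
            self.light_on = True
            self.frames = []

    def capture(self, frame):
        """Append one unit of environment data while recording is active."""
        if self.recording:
            self.frames.append(frame)

    def end(self):
        """End recording instruction: store a media file and turn the light off."""
        if not self.recording:
            return None
        self.recording = False
        self.light_on = False
        media_file = list(self.frames)
        self.media_files.append(media_file)
        return media_file
```

Ignoring a repeated start or a stray end keeps the model safe against the same phrase being recognized twice.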

Fig. 5 is a flowchart of a voice control method of an information capturing device according to another embodiment of the present invention. As shown in fig. 5, this embodiment details step S09 of the embodiment corresponding to fig. 2. Here step S09 includes steps S091 and S092. In step S091, the control unit 130 reads the device information corresponding to the operation instruction in response to the operation instruction. In step S092, the control unit 130 controls the speaker 160 to play a response voice of the device information. Steps S01 to S07 are substantially the same as described above.

In one embodiment, when the voice recognition unit 120 receives the sound signal (step S01) and performs voice recognition, the actual voice content obtained is "Battery Life" (step S03). After the control unit 130 confirms the instruction voice content according to the actual voice content "Battery Life" (step S05), the control unit 130 obtains the corresponding "reply remaining recordable hours" instruction (operation instruction) according to the instruction voice content that corresponds to the actual voice content (step S07). In response to this instruction (i.e., in response to the operation instruction), the control unit 130 reads the device information on the number of hours that can still be recorded (step S091). In one embodiment, the control unit 130 may time the current recording and determine the remaining recordable time according to the remaining power and/or capacity. To this end, the information capturing apparatus 100 may further include a timing module (not shown) coupled to the control unit 130. Next, the control unit 130 controls a speaker 160 to play a response voice stating the number of hours that can still be recorded (step S092). In one embodiment, the speaker 160 may be built into a display (not shown) coupled to the control unit 130. In this case, the control unit 130 may read the number of recordable hours in the device information in response to the "reply remaining recordable hours" instruction (step S091), and may control the display panel of the display to show this information on screen and the speaker 160 to play it as audio (step S092).
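One way the remaining recordable time might be derived from power and capacity is to take the smaller of a battery-limited and a storage-limited estimate. The embodiment gives no formula, so the calculation below, its parameters (battery energy in mWh, average power draw in mW, free storage in bytes, recording bitrate in bits per second), and the names `remaining_recording_hours` and `response_text` are assumptions made only for illustration.

```python
def remaining_recording_hours(battery_mwh, power_draw_mw, free_bytes, bitrate_bps):
    """Estimate recordable hours as the tighter of the battery and storage limits."""
    if power_draw_mw <= 0 or bitrate_bps <= 0:
        raise ValueError("power draw and bitrate must be positive")
    battery_hours = battery_mwh / power_draw_mw
    storage_hours = free_bytes * 8 / bitrate_bps / 3600
    return min(battery_hours, storage_hours)


def response_text(hours):
    """Build the sentence a speaker could play as the response voice."""
    return f"About {hours:.1f} hours of recording remain"
```

Taking the minimum reflects that recording stops as soon as either the battery or the storage runs out.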

In another embodiment, when the user says "event 1" to the microphone 110, the microphone 110 receives a sound signal (step S01) and provides it to the voice recognition unit 120. The voice recognition unit 120 performs voice recognition on the sound signal and obtains "event 1" as the actual voice content (step S03). According to this actual voice content, the control unit 130 checks the instruction voice contents recorded in the lookup table one by one (step S05) to find the instruction voice content corresponding to the actual voice content. When the corresponding instruction voice content is found, the control unit 130 obtains the corresponding operation instruction, the "store file and play prompt tone" instruction, from the lookup table (step S07). The control unit 130 responds to the "store file and play prompt tone" instruction (i.e., responds to the operation instruction) by storing the video file and playing the response voice.

In another embodiment, when the user says "reply remaining capacity" to the microphone 110, the microphone 110 receives a sound signal (step S01) and provides it to the voice recognition unit 120. The voice recognition unit 120 performs voice recognition on the sound signal and obtains "reply remaining capacity" as the actual voice content (step S03). According to this actual voice content, the control unit 130 checks the instruction voice contents recorded in the lookup table one by one (step S05) to find the instruction voice content corresponding to the actual voice content. When the corresponding instruction voice content is found, the control unit 130 obtains the corresponding operation instruction, the "reply remaining capacity" instruction, from the lookup table (step S07). The control unit 130 responds to the "reply remaining capacity" instruction (i.e., responds to the operation instruction) by reading the device information of the remaining capacity and playing the response voice of that device information.

In another embodiment, when the user says "reply resolution" to the microphone 110, the microphone 110 receives a sound signal (step S01) and provides it to the voice recognition unit 120. The voice recognition unit 120 performs voice recognition on the sound signal and obtains "reply resolution" as the actual voice content (step S03). According to this actual voice content, the control unit 130 checks the instruction voice contents recorded in the lookup table one by one (step S05) to find the instruction voice content corresponding to the actual voice content. When the corresponding instruction voice content is found, the control unit 130 obtains the corresponding operation instruction, the "reply resolution" instruction, from the lookup table (step S07). The control unit 130 responds to the "reply resolution" instruction (i.e., responds to the operation instruction) by reading the device information of the resolution and playing the response voice of that device information.
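The device information queries above follow the same two steps, reading the information (S091) and playing a response voice (S092), which can be sketched with one table-driven handler. The device values, the response wording, the exact-match lookup, and the names `DEVICE_INFO`, `QUERIES`, and `answer` are hypothetical; the `speak` callable stands in for the speaker.

```python
# Hypothetical device information, as the control unit might read it.
DEVICE_INFO = {"remaining_capacity_gb": 12.5, "resolution": "1920x1080"}

# Hypothetical table: instruction voice content -> (info key, response template).
QUERIES = {
    "reply remaining capacity": (
        "remaining_capacity_gb",
        "Remaining capacity is {} gigabytes",
    ),
    "reply resolution": ("resolution", "Resolution is {}"),
}


def answer(actual_voice_content, device_info, speak):
    """Read the requested device information and hand the response to speak()."""
    entry = QUERIES.get(actual_voice_content.strip().lower())
    if entry is None:
        return None
    key, template = entry
    message = template.format(device_info[key])  # Step S091: read the information.
    speak(message)  # Step S092: play the response voice.
    return message
```

Adding another query then only requires a new table row, with no change to the handler.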

In some embodiments, the audio/video recording unit 140 may be implemented by a camera lens and an image processing unit. In one embodiment, the image processing unit may be an Image Signal Processor (ISP). In another embodiment, the image processing unit and the control unit 130 are implemented by the same chip, but the invention is not limited thereto.

In some embodiments, control unit 130 may be implemented by one or more processing elements. Each processing element may be, but is not limited to, a microprocessor, microcontroller, digital signal processor, central processing unit, programmable logic controller, state machine, or any analog and/or digital device that manipulates signals based on operational instructions.

In some embodiments, the storage module 150 may be implemented by one or more storage elements. Each storage element may be, for example, a memory or a register, but is not limited thereto.

In some embodiments, the information capturing device 100 can be a portable camera device, such as a covert recorder, a wearable video camera, a portable evidence-gathering recorder, or a miniature video camera mounted on a cap or on clothing. In some embodiments, the information capturing device 100 may be a stationary camera device, such as a vehicle event data recorder installed on a vehicle.

Although the present invention has been described with reference to the preferred embodiments, it should be understood that various changes and modifications can be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A voice control method for an information capturing device, comprising:

receiving a sound signal;

performing voice recognition on the sound signal to obtain an actual voice content;

confirming at least one instruction voice content according to the actual voice content;

when the actual voice content corresponds to any instruction voice content, obtaining an operation instruction corresponding to the instruction voice content;

and responding to the operation instruction to execute the action corresponding to the operation instruction.

2. The voice-control method of claim 1, wherein the step of performing the action corresponding to the operation command in response to the operation command comprises:

responding to the operation instruction to read the device information corresponding to the operation instruction;

and playing the response voice of the device information.

3. The voice control method of claim 1, wherein the operation instruction is a start recording instruction, and the step of executing the action corresponding to the operation instruction in response to the operation instruction comprises: responding to the start recording instruction to control an audio/video recording unit to record audio/video so as to capture environment data.

4. The voice control method of claim 3, wherein the operation instruction is an end recording instruction, and the step of executing the action corresponding to the operation instruction in response to the operation instruction comprises: responding to the end recording instruction to control the audio/video recording unit to end the recording so as to generate the environment data.

5. The voice control method of an information capturing device as claimed in claim 1, further comprising:

confirming the sound signal according to voiceprint data;

when the sound signal matches the voiceprint data, performing the step of voice recognition of the sound signal;

when the sound signal does not match the voiceprint data, not performing the step of voice recognition of the sound signal and discarding the sound signal.

6. An information capturing apparatus, comprising:

a microphone for receiving a voice to generate a corresponding sound signal;

a voice recognition unit coupled to the microphone for performing voice recognition on the sound signal to obtain an actual voice content;

an audio/video recording unit for recording audio/video to capture environmental data;

and a control unit coupled to the voice recognition unit and the audio/video recording unit, wherein the control unit obtains an operation instruction corresponding to an instruction voice content when the actual voice content corresponds to the instruction voice content, and responds to the operation instruction to execute the action corresponding to the operation instruction.

7. The apparatus according to claim 6, further comprising a speaker, wherein in executing the action corresponding to the operation instruction in response to the operation instruction, the control unit reads the device information corresponding to the operation instruction in response to the operation instruction, and plays a response voice of the device information through the speaker.

8. The apparatus of claim 6, wherein the operation instruction is a start recording instruction, and in executing the action corresponding to the operation instruction, the control unit controls the audio/video recording unit to record audio/video in response to the start recording instruction so as to capture the environmental data.

9. The apparatus of claim 8, wherein the operation instruction is an end recording instruction, and in executing the action corresponding to the operation instruction, the control unit controls the audio/video recording unit to end the recording in response to the end recording instruction so as to generate the environmental data.

10. The apparatus according to claim 6, wherein the control unit further confirms the sound signal according to voiceprint data; wherein, when the sound signal matches the voiceprint data, the control unit performs the voice recognition of the sound signal; and when the sound signal does not match the voiceprint data, the control unit does not perform the voice recognition of the sound signal and discards the sound signal.