CN115439874A - Voice control method and device of equipment, equipment and storage medium - Google Patents

Info

Publication number
CN115439874A
CN115439874A (application CN202210292172.5A)
Authority
CN
China
Prior art keywords
voice
sound zone
information
target sound
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210292172.5A
Other languages
Chinese (zh)
Inventor
胡含
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing CHJ Automobile Technology Co Ltd
Original Assignee
Beijing CHJ Automobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing CHJ Automobile Technology Co Ltd filed Critical Beijing CHJ Automobile Technology Co Ltd
Priority to CN202210292172.5A
Publication of CN115439874A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/113 Recognition of static hand signs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command


Abstract

The disclosure relates to a voice control method and apparatus for a device, and a readable medium. The method includes: collecting voice information through a voice collecting device corresponding to an awakened target sound zone, and acquiring image information of the target sound zone; recognizing a gesture direction from the image information; determining a device on the path to which the gesture direction points; and controlling the device according to the voice information. By acquiring image information of the target sound zone to obtain the user's gesture direction, determining the device the user wants to control from that direction, and controlling the device according to the user's voice information, the disclosure combines voice with gestures, makes voice control more convenient for the user, and solves the problem that a user who does not know a device's name cannot control it by voice.

Description

Voice control method and device of equipment, equipment and storage medium
Technical Field
The present disclosure relates to the field of intelligent control, and in particular, to a method, an apparatus, a device, and a readable medium for controlling a device by voice.
Background
With the continuous development of the automobile industry, in-vehicle head-unit (car machine) systems are constantly being upgraded and their vehicle-control functions enhanced. Many car machines are equipped with voice assistants so that a user can control the vehicle by voice.
Currently, the user must explicitly name the device to be controlled and the function to be performed; for example, only after the user says "turn on the left reading light" can the voice assistant turn on the left reading light. This is unfriendly to users who are unfamiliar with device names, and the operation cannot be performed at all when the user does not know the name of the controllable device.
Disclosure of Invention
In order to solve the technical problem, the present disclosure provides a voice control method, apparatus, device and readable storage medium for a device.
In a first aspect, the present disclosure provides a method for controlling a device by voice, including:
acquiring voice information through a voice acquisition device corresponding to the awakened target sound zone, and acquiring image information of the target sound zone;
recognizing a gesture direction in the image information according to the image information;
determining equipment on a path pointed by the gesture direction according to the gesture direction;
and controlling the equipment according to the voice information.
Optionally, before the voice collecting device corresponding to the awakened target sound zone collects voice information and obtains image information of the target sound zone, the method further includes:
and responding to the awakening instruction, and determining the sound zone in which the awakening instruction is positioned as a target sound zone.
Optionally, the method further includes:
if a plurality of devices exist on the path pointed by the gesture direction, displaying the information of the plurality of devices and prompting to select the controlled device;
responding to a selection instruction of the controlled device, and determining a target controlled device;
and controlling the target controlled equipment according to the voice information.
Optionally, before the obtaining of the image information of the awakened target sound zone, the method further includes:
determining controlled equipment according to the voice information;
and if the controlled equipment cannot be determined, executing the step of acquiring the image information of the target sound zone and subsequent steps.
Optionally, after determining the device on the path to which the gesture direction points, the method further includes:
and continuously acquiring voice information through the voice acquisition device corresponding to the target sound zone, and controlling the equipment according to the acquired voice information.
Optionally, before the controlling the device according to the voice information, the method further includes:
judging whether the equipment is in an activated state;
if not, activating the equipment;
and if the equipment is in the activated state, controlling the equipment according to the voice information.
In a second aspect, the present disclosure provides a voice control apparatus for a device, comprising:
the acquisition module is used for acquiring voice information through a voice acquisition device corresponding to the awakened target sound zone and acquiring image information of the target sound zone;
the recognition module is used for recognizing the gesture direction in the image information according to the image information;
the determining module is used for determining equipment on a path pointed by the gesture direction according to the gesture direction;
and the control module is used for controlling the equipment according to the voice information.
Optionally, the determining module includes:
and the first determining unit is used for responding to the awakening instruction and determining the sound zone where the awakening instruction is located as the target sound zone before the voice acquisition device corresponding to the awakened target sound zone acquires voice information and acquires image information of the target sound zone.
Optionally, the determining module further includes:
the second determining unit is used for displaying information of a plurality of devices and prompting to select the controlled device when the plurality of devices exist on the path pointed by the gesture direction;
the third determining unit is used for responding to the selection instruction of the controlled equipment and determining the target controlled equipment;
the control module is further configured to control the target controlled device according to the voice information.
Optionally, before the image information of the awakened target sound zone is acquired,
the determining module is further used for determining the controlled equipment according to the voice information;
and if the controlled equipment cannot be determined, the device executes the step of acquiring the image information of the target sound zone and subsequent steps.
Optionally, after the device on the path to which the gesture direction points is determined, the obtaining module is further configured to continue to collect voice information through the voice collecting device corresponding to the target sound zone;
the control module is also used for controlling the equipment according to the collected voice information.
Optionally, before the controlling the device according to the voice information,
the determining module is further configured to determine whether the device is in an active state;
the control module is further configured to activate the device when the device is not in an activated state;
the control module is further configured to control the device according to the voice information when the device is in an activated state.
In a third aspect, the present disclosure provides a computer device comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of the first aspect.
In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of the first aspect.
According to the voice control method, apparatus, device and readable storage medium of the disclosure, voice information of the target sound zone is collected, image information of the target sound zone is obtained, the gesture direction is recognized from the image information, the device the user wants to control is determined based on the gesture direction, and the device is then controlled according to the voice information. The user can thus control the desired device through a voice instruction without speaking the device's name. This combines voice with gestures, makes voice control more convenient for the user, and solves the problem that a user who does not know a device's name cannot control it by voice.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Those of ordinary skill in the art can derive other drawings from these drawings without creative effort.
Fig. 1 is a flowchart of a voice control method of a device according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of an application scenario provided by the embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a voice control apparatus of a device according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a computer device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments of the present disclosure may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
At present, multiple sound zones can be provided in a vehicle to collect instructions from users in different positions; each user can wake up the voice assistant from the sound zone where that user is located and issue control instructions, and the voice assistant then helps the user control the vehicle's controllable devices. During use, however, the user must explicitly speak the name of the device to be controlled and the function to be performed, which is unfriendly to users unfamiliar with device names. A method that lets the user control a device by voice without speaking its name is therefore needed.
Embodiments of the present disclosure provide a method for controlling a device by using voice, and the method is described below with reference to specific embodiments.
Fig. 1 is a flowchart of a voice control method of a device according to an embodiment of the present disclosure. The method may be executed by a voice control apparatus of the device, which may be implemented in software and/or hardware and may be configured in a computer device such as a server or a terminal; specifically, it may be configured in a vehicle or in a vehicle's cloud server.
The voice control method shown in fig. 1 is described below with reference to the application scenario shown in fig. 2. For example, the car machine 201 in fig. 2 may execute the method; a voice assistant is disposed in the car machine 201, through which a user in sound zone 204 controls the controllable device 203. The method includes the following specific steps:
s101, voice information is collected through a voice collecting device corresponding to the awakened target sound zone, and image information of the target sound zone is obtained.
For example, take a user seated on the left side, whose position corresponds to sound zone 204. After the user in sound zone 204 wakes up the voice assistant disposed in the car machine 201, the car machine 201 collects the voice information uttered by the user through the voice collecting device 2041 (which may be, for example, a microphone) corresponding to sound zone 204, and obtains image information of the target sound zone 204 through the camera 202, supporting the subsequent steps of the method.
And S102, recognizing the gesture direction in the image information according to the image information.
After the car machine 201 acquires the image information of the target sound zone 204, it recognizes the gesture in the image information. Specifically, the image data may be analyzed using computer vision (CV) recognition technology to determine the direction in which the user's gesture points.
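The patent does not prescribe a particular CV pipeline for S102. As one illustrative sketch, assuming an upstream hand-keypoint detector supplies wrist and index-fingertip image coordinates (both hypothetical inputs, not named in the patent), the pointing direction can be reduced to a unit vector:

```python
import math

def gesture_direction(wrist, fingertip):
    """Return a unit 2-D pointing vector from wrist to index fingertip.

    `wrist` and `fingertip` are (x, y) image coordinates, assumed to come
    from an upstream hand-keypoint detector (not part of this sketch).
    """
    dx, dy = fingertip[0] - wrist[0], fingertip[1] - wrist[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("wrist and fingertip coincide; no direction")
    return (dx / norm, dy / norm)
```

A production system would estimate this in 3-D from calibrated camera geometry; the 2-D version only illustrates the data flow from keypoints to a direction.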
S103, determining equipment on a path pointed by the gesture direction according to the gesture direction.
After determining the gesture direction through CV recognition technology, the car machine 201 can determine the path to which the gesture points, and thus which devices in the cabin lie on that path. For example, if the device 203 lies on the path pointed to by the user's gesture, the car machine 201 may determine the device 203 to be the device the user wants to control.
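A minimal sketch of S103, under the assumption that the cabin devices have known 2-D positions in the same coordinate frame as the gesture estimate (the patent does not specify a geometry model; the device names, coordinates, and tolerance below are illustrative):

```python
def devices_on_path(origin, direction, devices, tolerance=0.2):
    """Return devices whose known cabin position lies close to the ray
    cast from `origin` along unit vector `direction`, nearest first.

    `devices` maps a device name to its (x, y) position in the same
    (hypothetical) cabin coordinate frame as the gesture estimate.
    """
    hits = []
    for name, (px, py) in devices.items():
        vx, vy = px - origin[0], py - origin[1]
        t = vx * direction[0] + vy * direction[1]  # projection onto the ray
        if t <= 0:
            continue  # device lies behind the pointing hand
        # perpendicular distance from the device position to the ray
        dist = abs(vx * direction[1] - vy * direction[0])
        if dist <= tolerance:
            hits.append((t, name))
    return [name for _, name in sorted(hits)]
```

Returning all hits rather than only the nearest matters for the later embodiment in which several devices lie on the pointed path and the user is prompted to choose.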
And S104, controlling the equipment according to the voice information.
After the car machine 201 determines, through the user's gesture, the device 203 that the user wants to control, the voice assistant in the car machine 201 controls the device 203 according to the voice information collected by the voice collecting device 2041 corresponding to sound zone 204. The voice information may describe the function the user wants to perform. For example, a user who wants to turn on the reading lamp but does not know its exact name may say "turn on the lamp" while pointing a hand at the reading lamp; the voice assistant recognizes the gesture and turns on the reading lamp.
In the embodiment of the disclosure, voice information of the target sound zone is collected, image information of the target sound zone is obtained, the gesture direction is recognized from the image information, the device the user wants to control is determined based on the gesture direction, and the device is then controlled according to the voice information. The user can thus control the desired device through a voice instruction without speaking the device's name. This combines voice with gestures, makes voice control more convenient for the user, and solves the problem that a user who does not know a device's name cannot control it by voice.
On the basis of the above embodiment, before acquiring the voice information through the voice acquisition device corresponding to the woken target sound zone and acquiring the image information of the target sound zone, the voice control method of the device further includes:
and responding to the awakening instruction, and determining the sound zone in which the awakening instruction is positioned as a target sound zone.
Before the voice assistant has been woken up, the user may wake it by issuing a wake-up instruction, which may be, for example, "Hello, XXX" ("XXX" being the name of the voice assistant). The car machine 201 acquires the wake-up instruction through the voice collecting device 2041, and the voice assistant in the car machine 201 responds to it, for example with "How can I help you?". Meanwhile, the car machine 201 may determine the position from which the wake-up instruction was issued through sound-source recognition technology, that is, determine that sound zone 204, where the user who issued the wake-up instruction is located, is the target sound zone.
In the embodiment of the disclosure, by responding to the wake-up instruction and determining the sound zone in which the wake-up instruction was collected as the target sound zone, the position of the user can be determined, which supports the subsequent acquisition of the user's image information and improves the user experience.
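Sound-source recognition itself is outside the patent's scope. As a stand-in sketch, assume each sound zone has its own microphone and the target zone is simply the one whose signal energy during the wake word is highest (real systems would use beamforming or direction-of-arrival estimation; the zone names and threshold are illustrative):

```python
def locate_wake_zone(zone_energies, threshold=0.5):
    """Pick the target sound zone as the zone whose microphone captured
    the wake-up utterance with the highest mean signal energy.

    `zone_energies` maps a zone id to that energy. Returns None when no
    zone clears `threshold`, i.e. no confident wake-up was detected.
    """
    zone, energy = max(zone_energies.items(), key=lambda kv: kv[1])
    return zone if energy >= threshold else None
```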
On the basis of the above embodiment, the voice control method of the device further includes:
and if a plurality of devices exist on the path pointed by the gesture direction, displaying the information of the plurality of devices and prompting to select the controlled device.
When determining the device on the path pointed to by the gesture direction, multiple devices may lie on that path, in which case the car machine 201 cannot determine from the gesture direction alone which device the user wants to control. Therefore, when multiple devices, such as a reading lamp and a sunshade, lie on the path pointed to by the gesture direction, the car machine 201 may display information about these devices and prompt the user to select the one to be controlled.
Optionally, the voice collecting device 2041 may be a microphone, and the car machine 201 may announce the names of the multiple devices by voice for sound zone 204; for example, the voice assistant may broadcast "Do you want to control the reading lamp or the sunshade?".
Optionally, the car machine 201 may also display information about the multiple devices on the touch screen 2042 and prompt the user to select which one to control; for example, icons of the reading lamp and the sunshade may be shown on the touch screen for the user to tap.
And determining a target controlled device in response to the selection instruction of the controlled device.
After the user issues a selection instruction, the car machine 201 responds to the selection instruction of the controlled device and determines the target controlled device accordingly.
Optionally, the user may issue the selection instruction by voice, for example by saying "turn on the reading lamp". After the car machine 201 displays the device information, the user learns the name of the device to be controlled and issues the selection instruction by speaking that name, whereupon the car machine 201 determines that the target controlled device is the reading lamp.
Optionally, when the car machine 201 displays information about the multiple devices on the touch screen 2042, the user may instead select the device to be controlled by tapping the touch screen 2042; for example, after the user taps the reading lamp's icon, the car machine 201 determines that the target controlled device is the reading lamp.
And controlling the target controlled equipment according to the voice information.
After the car machine 201 determines the target controlled device through the selection instruction, it may control the target controlled device according to the collected voice information.
In the embodiment of the disclosure, by displaying information about the multiple devices on the path pointed to by the gesture direction, prompting the user to select the controlled device, and controlling it according to the voice information, the situation in which the desired device cannot be determined because several devices lie on the pointed path is avoided. This improves the convenience of using the voice assistant and makes the voice control method more complete.
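The disambiguation flow above can be sketched as follows. The `select` callback stands in for showing options on the touch screen 2042 or announcing them by voice and collecting the user's reply; all names are illustrative:

```python
def resolve_target(candidates, select):
    """Disambiguate when several devices lie on the pointed path.

    `candidates` is the ordered list of devices on the gesture path;
    `select` takes that list, presents it to the user (screen or TTS),
    and returns the user's choice. Returns the target controlled device,
    or None when nothing was hit or the choice was invalid.
    """
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]          # unambiguous: no prompt needed
    choice = select(candidates)       # e.g. user says "the reading lamp"
    return choice if choice in candidates else None
```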
On the basis of the foregoing embodiment, before acquiring image information of an awakened target audio region, the voice control method of the device further includes:
determining controlled equipment according to the voice information; if the controlled device cannot be determined, the step of acquiring the image information of the target sound zone and subsequent steps S102, S103 and S104 thereof are performed.
For example, in some scenarios the user may speak the name of the device to be controlled, or the car machine 201 may determine the device to be controlled from a partial device name contained in the user's voice information. In that case the car machine 201 determines the controlled device from the voice information before acquiring image information of the woken target sound zone 204, and then performs the subsequent step of controlling it according to the voice information. Only when the controlled device cannot be determined from the voice information are the step of acquiring image information of the target sound zone and its subsequent steps performed.
In the embodiment of the disclosure, by first judging whether the controlled device can be determined from the voice information alone, and resorting to the user's gesture only when it cannot, unnecessary image analysis is avoided, which improves the efficiency and flexibility of the method.
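The fallback order described in this embodiment (resolve the device from the utterance first, and run the image pipeline only when that fails) can be sketched as follows; the substring match and the `gesture_lookup` callable are illustrative assumptions, not the patent's method:

```python
def pick_controlled_device(utterance, known_devices, gesture_lookup):
    """Resolve the controlled device from the utterance when possible;
    otherwise fall back to the gesture pipeline.

    `gesture_lookup` is a zero-argument callable standing in for the
    image-acquisition and gesture-recognition steps (S101-S103).
    """
    text = utterance.lower()
    for name in known_devices:
        if name.lower() in text:
            return name          # name resolved; skip image analysis
    return gesture_lookup()      # fall back to the pointing gesture
```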
On the basis of the above embodiment, after determining the device on the path to which the gesture direction points, the voice control method of the device further includes: continuing to collect voice information through the voice collecting device corresponding to the target sound zone, and controlling the device according to the collected voice information.
After the car machine 201 determines the device the user wants to control, the user may continue to issue other control instructions. The car machine 201 can then keep collecting voice information through the voice collecting device 2041 corresponding to the target sound zone and control the device 203 according to it; for example, the user may subsequently say "project its content onto another screen", "make it brighter", or "stow it", which makes these devices more convenient to use.
In the embodiment of the disclosure, after the device on the path to which the gesture direction points is determined, the voice information of the target sound zone is continuously acquired, and the device is controlled according to the voice information, so that the subsequent control instruction of the user can be prevented from being missed, and convenience is provided for the user to use.
On the basis of the above embodiment, before controlling the device according to the voice information, the voice control method of the device further includes: judging whether the equipment is in an activated state; if not, activating the equipment; and if the equipment is in the activated state, controlling the equipment according to the voice information.
Illustratively, when the user has the car machine 201 control a device through the voice assistant, the car machine 201 first confirms whether the device is in an activated state, and activates the device when it is not, so that the device can respond quickly to subsequent control instructions.
In the embodiment of the disclosure, by judging whether the controlled device is activated and activating it when it is not, the controlled device can respond quickly to subsequent control instructions, which improves the responsiveness of the voice control method.
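A toy sketch of this activation check; the `Device` class and its fields are invented for illustration and do not appear in the patent:

```python
class Device:
    """Toy controllable device used to illustrate the activation check."""

    def __init__(self):
        self.active = False
        self.last_command = None

    def activate(self):
        self.active = True


def control(device, command):
    """Activate the device first if needed, then apply the voice command."""
    if not device.active:
        device.activate()
    device.last_command = command
```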
Fig. 3 is a schematic structural diagram of a voice control apparatus of a device according to an embodiment of the present disclosure. The apparatus 300 may be a terminal as described in the above embodiments, or the apparatus 300 may be a component or assembly in the terminal. The voice control apparatus 300 of the device provided in the embodiment of the present disclosure may execute the processing procedure provided in the embodiment of the voice control method of the device, as shown in fig. 3, including:
the acquisition module 301 is configured to acquire voice information through a voice acquisition device corresponding to the awakened target sound zone, and acquire image information of the target sound zone;
the recognition module 302 is configured to recognize a gesture direction in the image information according to the image information;
a determining module 303, configured to determine, according to the gesture direction, a device on a path to which the gesture direction points;
and the control module 304 is used for controlling the equipment according to the voice information.
Optionally, the determining module 303 includes:
the first determining unit 3031 is configured to respond to the wake-up instruction and determine the sound zone where the wake-up instruction is located as the target sound zone before acquiring the voice information through the voice acquisition device corresponding to the woken-up target sound zone and acquiring the image information of the target sound zone.
Optionally, the determining module 303 further includes:
a second determining unit 3032, configured to show information of multiple devices when multiple devices exist on a path to which the gesture direction points, and prompt to select a controlled device;
a third determining unit 3033, configured to determine a target controlled device in response to a selection instruction of the controlled device;
the control module 304 is further configured to control the target controlled device according to the voice information.
Optionally, before acquiring the image information of the awakened target sound zone,
the determining module 303 is further configured to determine a controlled device according to the voice information;
if the controlled device cannot be determined, the apparatus 300 performs the step of acquiring the image information of the target sound zone and the subsequent steps.
Optionally, after determining the device on the path to which the gesture direction points, the obtaining module 301 is further configured to continue to collect voice information through the voice collecting device corresponding to the target sound zone;
the control module 304 is further configured to control the device according to the collected voice information.
Optionally, before controlling the device in accordance with the voice information,
the determining module 303 is further configured to determine whether the device is in an active state;
the control module 304 is further configured to activate the device when the device is not in the activated state;
the control module 304 is further configured to control the device according to the voice information when the device is in the activated state.
The voice control apparatus of the device in the embodiment shown in fig. 3 can be used to implement the technical solution of the above method embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
Fig. 4 is a schematic structural diagram of a computer device in an embodiment of the present disclosure. Referring now in particular to fig. 4, there is shown a schematic block diagram of a computer device 400 suitable for use in implementing embodiments of the present disclosure. The computer device shown in fig. 4 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the computer device 400 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 401 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403, so as to implement the voice control method of the embodiments described in the present disclosure. The RAM 403 also stores various programs and data necessary for the operation of the computer device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the computer device 400 to communicate with other devices, either wirelessly or by wire, to exchange data. While fig. 4 illustrates a computer device 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart, thereby implementing the voice control method as described above. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 401.
It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, by contrast, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the computer device; or may exist alone without being installed in the computer device.
The computer readable medium carries one or more programs which, when executed by the computer device, cause the computer device to:
acquiring voice information through a voice acquisition device corresponding to the awakened target sound zone, and acquiring image information of the target sound zone;
recognizing the gesture direction in the image information according to the image information;
determining equipment on a path pointed by the gesture direction according to the gesture direction;
and controlling the equipment according to the voice information.
Optionally, when the one or more programs are executed by the computer device, the computer device may also perform other steps described in the above embodiments.
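Of the operations listed above, determining the equipment on the path pointed to by the gesture direction is essentially a ray-casting test against known device positions. A hedged 2-D sketch follows; the coordinate model, the perpendicular-distance threshold, and all names are illustrative assumptions rather than details from the disclosure:

```python
import math

def device_on_path(origin, direction, devices, max_offset=0.5):
    """Pick the device whose position lies closest along the ray cast from
    `origin` in `direction` (the recognized gesture direction).

    `devices` maps device names to (x, y) positions. A device counts as
    "on the path" only if its perpendicular distance to the ray is at
    most `max_offset` (here in notional metres); among those, the device
    nearest along the ray is returned, or None if the ray hits nothing.
    """
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm  # unit ray direction
    best, best_t = None, math.inf
    for name, (px, py) in devices.items():
        dx, dy = px - origin[0], py - origin[1]
        t = dx * ux + dy * uy               # distance along the ray
        if t < 0:
            continue                        # device is behind the pointing hand
        offset = abs(dx * uy - dy * ux)     # perpendicular distance to the ray
        if offset <= max_offset and t < best_t:
            best, best_t = name, t
    return best
```

If several devices fall within the offset threshold, a real implementation would return all of them and defer to the selection prompt described in the optional embodiments.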
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the name of an element does not in some cases constitute a limitation on the element itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is merely illustrative of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to the particular combination of features described above, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, a technical solution formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A method for voice control of a device, comprising:
acquiring voice information through a voice acquisition device corresponding to the awakened target sound zone, and acquiring image information of the target sound zone;
recognizing a gesture direction in the image information according to the image information;
determining equipment on a path pointed by the gesture direction according to the gesture direction;
and controlling the equipment according to the voice information.
2. The method according to claim 1, wherein before the voice collecting device corresponding to the awakened target sound zone collects voice information and obtains image information of the target sound zone, the method further comprises:
and responding to the awakening instruction, and determining the sound zone in which the awakening instruction is positioned as a target sound zone.
3. The method of claim 1, further comprising:
if a plurality of devices exist on the path pointed by the gesture direction, displaying the information of the plurality of devices and prompting to select the controlled device;
responding to a selection instruction of the controlled device, and determining a target controlled device;
and controlling the target controlled equipment according to the voice information.
4. The method of claim 1, wherein prior to obtaining the image information of the awakened target sound zone, the method further comprises:
determining controlled equipment according to the voice information;
and if the controlled equipment cannot be determined, executing the step of acquiring the image information of the target sound zone and subsequent steps.
5. The method of claim 1, wherein after the determining the device on the path to which the gesture direction is directed, the method further comprises:
and continuously acquiring voice information through the voice acquisition device corresponding to the target sound zone, and controlling the equipment according to the acquired voice information.
6. The method of claim 1, wherein prior to said controlling the device according to the voice information, the method further comprises:
judging whether the equipment is in an activated state;
if not, activating the equipment;
and if the equipment is in the activated state, controlling the equipment according to the voice information.
7. A voice control apparatus for a device, comprising:
the acquisition module is used for acquiring voice information through a voice acquisition device corresponding to the awakened target sound zone and acquiring image information of the target sound zone;
the recognition module is used for recognizing the gesture direction in the image information according to the image information;
the determining module is used for determining equipment on a path pointed by the gesture direction according to the gesture direction;
and the control module is used for controlling the equipment according to the voice information.
8. The apparatus of claim 7, wherein the determining module comprises:
and the first determining unit is used for responding to the awakening instruction and determining the sound zone where the awakening instruction is located as the target sound zone before the voice acquisition device corresponding to the awakened target sound zone acquires voice information and acquires image information of the target sound zone.
9. The apparatus of claim 7, wherein the determining module further comprises:
the second determining unit is used for displaying the information of the plurality of devices and prompting to select the controlled device when the plurality of devices exist on the path pointed by the gesture direction;
the third determining unit is used for responding to the selection instruction of the controlled equipment and determining the target controlled equipment;
the control module is further configured to control the target controlled device according to the voice information.
10. The apparatus of claim 7, wherein prior to obtaining the image information of the awakened target sound zone,
the determining module is further used for determining the controlled equipment according to the voice information;
and if the controlled equipment cannot be determined, the device executes the step of acquiring the image information of the target sound zone and the subsequent steps.
11. The apparatus according to claim 7, wherein after the device on the path pointed by the gesture direction is determined, the obtaining module is further configured to continue to obtain voice information through a voice collecting apparatus corresponding to the target sound zone;
the control module is also used for controlling the equipment according to the collected voice information.
12. The apparatus of claim 7, wherein prior to said controlling the device based on the voice information,
the determining module is further configured to determine whether the device is in an active state;
the control module is further configured to activate the device when the device is not in an activated state;
the control module is further configured to control the device according to the voice information when the device is in an activated state.
13. A computer device, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-6.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202210292172.5A 2022-03-23 2022-03-23 Voice control method and device of equipment, equipment and storage medium Pending CN115439874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210292172.5A CN115439874A (en) 2022-03-23 2022-03-23 Voice control method and device of equipment, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115439874A true CN115439874A (en) 2022-12-06

Family

ID=84241173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210292172.5A Pending CN115439874A (en) 2022-03-23 2022-03-23 Voice control method and device of equipment, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115439874A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117316158A (en) * 2023-11-28 2023-12-29 科大讯飞股份有限公司 Interaction method, device, control equipment and storage medium
CN117316158B (en) * 2023-11-28 2024-04-12 科大讯飞股份有限公司 Interaction method, device, control equipment and storage medium

Similar Documents

Publication Publication Date Title
US11164570B2 (en) Voice assistant tracking and activation
US20190311712A1 (en) Selection system and method
US8989773B2 (en) Sharing location information among devices
CN110851863B (en) Application program authority control method and device and electronic equipment
US9997160B2 (en) Systems and methods for dynamic download of embedded voice components
CN109729004A (en) Conversation message top set treating method and apparatus
CN109408481B (en) Log collection rule updating method and device, electronic equipment and readable medium
EP2999200B1 (en) Device and method for providing service via application
CN110865846B (en) Application management method, device, terminal, system and storage medium
CN112634872A (en) Voice equipment awakening method and device
US20170006108A1 (en) Navigation method, smart terminal device and wearable device
CN111488185A (en) Page data processing method and device, electronic equipment and readable medium
CN115439874A (en) Voice control method and device of equipment, equipment and storage medium
CN112242143B (en) Voice interaction method and device, terminal equipment and storage medium
CN112162666A (en) Terminal control method and device, terminal and storage medium
CN113721811B (en) Popup window sending method, popup window sending device, electronic equipment and computer readable medium
CN112767565B (en) Method and device for OBU issuing and activating based on vehicle machine and electronic equipment
KR101569116B1 (en) Method and apparatus for notifying schedule
CN114827704A (en) Vehicle-mounted system interaction method with vehicle, storage medium and mobile terminal
CN113564865A (en) Remote control method and device for washing machine, electronic equipment and storage medium
CN115440209A (en) Voice control method, device, equipment and computer readable storage medium
CN115576458A (en) Application window display method, device, equipment and medium
KR20160033579A (en) System and method for providing service via application
CN111240718A (en) Theme updating method and device, electronic equipment and medium
CN113076053A (en) Cursor remote control method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination