WO2023169123A1 - Device control method and apparatus, and electronic device and medium - Google Patents

Device control method and apparatus, and electronic device and medium

Info

Publication number
WO2023169123A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
adjustment
target device
speed
dimension parameter
Prior art date
Application number
PCT/CN2023/074997
Other languages
French (fr)
Chinese (zh)
Inventor
徐亮
徐梓宁
沈丽娜
王锐
武锐
牛建伟
Original Assignee
深圳地平线机器人科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳地平线机器人科技有限公司
Publication of WO2023169123A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • The present disclosure relates to artificial intelligence technology, and in particular to a device control method and apparatus, an electronic device, and a medium.
  • Human-computer interaction refers to the information exchange process between humans and machines using a certain dialogue language and a certain interactive method to complete certain tasks.
  • Traditional human-computer interaction is mainly achieved through input and output devices such as keyboards, mice, and monitors.
  • With advances in technologies such as artificial intelligence, humans and machines can now interact in a manner similar to natural language.
  • Embodiments of the present disclosure provide a device control method and device, electronic devices, and media.
  • a device control method including:
  • an equipment control device including:
  • a voice recognition module configured to perform voice recognition on the voice control instruction in response to receiving the voice control instruction, and obtain a first voice recognition result
  • a determination module configured to determine the target device corresponding to the voice control instruction based on the first voice recognition result obtained by the voice recognition module
  • a detection module configured to detect a preset dynamic gesture
  • an adjustment module configured to continuously adjust the state of the target device based on the continuous action of the dynamic gesture in response to the detection module detecting the preset dynamic gesture.
  • a computer-readable storage medium stores a computer program, and the computer program is used to execute the device control method described in any of the above embodiments of the present disclosure.
  • an electronic device including:
  • memory for storing instructions executable by the processor
  • the processor is configured to read the executable instructions from the memory and execute the instructions to implement the device control method described in any of the above embodiments of the present disclosure.
  • The first voice recognition result is obtained by performing voice recognition on the voice control instruction; the target device corresponding to the voice control instruction is then determined based on the first voice recognition result; and when a preset dynamic gesture is detected, the state of the target device is continuously adjusted based on the continuous action of the dynamic gesture.
  • Embodiments of the present disclosure can determine the target device that needs to be adjusted based on voice control instructions, without manual selection, which improves the efficiency and convenience of selecting the target device and effectively avoids the inconvenience of selecting it manually.
  • In addition, continuously adjusting the status of the target device based on the continuous action of the dynamic gesture achieves continuous operational control of the target device, making the adjustment of its status more flexible, fine-grained, and precise, thereby improving the control effect on the target device.
  • Embodiments of the present disclosure can be used to adjust the status of any equipment such as home appliances, vehicle-mounted equipment, and terminal equipment.
  • When the disclosed embodiments are applied to vehicles, they can improve the efficiency, convenience, and safety of selecting and operating vehicle-mounted equipment, and effectively avoid the inconvenience and safety risks of drivers manually operating vehicle-mounted equipment while driving.
  • In addition, continuous actions based on dynamic gestures realize continuous operation control of vehicle-mounted equipment, making the adjustment of its status more flexible, fine-grained, and precise, thus improving the control effect on vehicle-mounted equipment.
  • Figure 1 is a system diagram to which the present disclosure is applicable.
  • Figure 2 is a schematic flowchart of a device control method provided by an exemplary embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of drawing circles with one finger in an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of a device control method provided by another exemplary embodiment of the present disclosure.
  • Figure 5 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
  • Figure 6 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
  • FIG. 7 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
  • Figure 8 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
  • Figure 9 is a schematic structural diagram of an equipment control device provided by an exemplary embodiment of the present disclosure.
  • Figure 10 is a schematic structural diagram of an equipment control device provided by another exemplary embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure.
  • “Plural” may refer to two or more than two, and “at least one” may refer to one, two, or more than two.
  • Embodiments of the present disclosure may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which may operate with numerous other general or special purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with terminal devices, computer systems, servers, and other electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems.
  • AI: Artificial Intelligence
  • In existing approaches, vehicle-mounted equipment is controlled through voice commands.
  • With such commands, the opening range of the vehicle window can only be controlled according to the default settings and cannot be precisely controlled. If the opening range does not reach the level the user expects, the voice command “open window” must be issued multiple times to increase the opening range, which is inefficient; and if the opening range exceeds the user's expectation, it cannot be accurately reduced, thus failing to meet user needs.
  • Embodiments of the present disclosure propose a device control method and apparatus, an electronic device, and a medium to improve the efficiency, convenience, and safety of selecting and operating vehicle-mounted equipment, while achieving continuous operation control of the target device.
  • The embodiments of the present disclosure determine the target device that needs to be adjusted through voice control instructions, and continuously adjust the status of the target device through the continuous action of dynamic gestures. This eliminates the need to manually select the target device, improves the efficiency and convenience of selecting the target device, and effectively avoids the inconvenience of manual selection. It also realizes continuous operation control of the target device, making the adjustment of its status more flexible, fine-grained, and precise, thus improving the control effect on the target device.
  • Embodiments of the present disclosure can be used to adjust the status of any equipment such as home appliances, vehicle-mounted equipment, and terminal equipment.
  • When the embodiments are applied to vehicles, the efficiency, convenience, and safety of selecting and operating vehicle-mounted equipment can be improved, and the inconvenience and safety risks of drivers manually operating vehicle-mounted equipment while driving can be effectively avoided. Moreover, continuous actions based on dynamic gestures realize continuous operation control of vehicle-mounted equipment, making the adjustment of its status more flexible, fine-grained, and precise, thus improving the control effect on vehicle-mounted equipment.
  • FIG. 1 is a system diagram to which the present disclosure is applicable.
  • The voice control command is collected through the audio collection module 102 (such as a microphone).
  • The voice control command, either as collected or after front-end signal processing, is input into the device control apparatus 104 of the embodiment of the present disclosure.
  • The device control apparatus 104 performs voice recognition on the received voice control instruction to obtain the first voice recognition result, determines the target device 106 corresponding to the voice control instruction based on the first voice recognition result, calls the image acquisition module 108 (such as a camera) to collect a video stream, and performs preset dynamic gesture detection on the video stream collected by the image acquisition module 108.
  • When the preset dynamic gesture is detected, the state of the target device 106 is continuously adjusted based on the continuous action of the dynamic gesture.
  • the embodiments of the present disclosure can be used to adjust the status of any device such as home appliances, vehicle-mounted equipment, terminal equipment, etc. That is, the target device 106 can be any device such as home appliances, vehicle-mounted equipment, terminal equipment, etc.
  • The embodiments of the present disclosure target various interaction scenarios in the cockpit and perform human-computer interaction based on a combination of voice and dynamic gestures: control rights over the device to be controlled are obtained by performing speech recognition on the voice control instruction, and dynamic gestures are then used to perform continuous operation control of that device.
  • The adjustment speed of the device to be controlled can also be controlled through the movement speed of the dynamic gesture. This improves the efficiency, convenience, and safety of selecting and operating vehicle-mounted equipment and effectively avoids the inconvenience and safety risks of drivers manually operating vehicle-mounted equipment while driving. In addition, continuous actions based on dynamic gestures realize continuous operation control of vehicle-mounted equipment, making the adjustment of its status more flexible, fine-grained, and precise, thus improving the control effect on vehicle-mounted equipment.
  • The disclosed embodiments make full use of the ability of voice control to obtain control rights over a device and the fine adjustment capability of dynamic gestures, and feature simple operation, good robustness, fine adjustment, high interaction efficiency, and a wide range of functions.
  • FIG. 2 is a schematic flowchart of a device control method provided by an exemplary embodiment of the present disclosure. This embodiment can be applied to electronic devices. As shown in Figure 2, the device control method of this embodiment includes the following steps:
  • Step 202 In response to receiving the voice control instruction, perform voice recognition on the voice control instruction to obtain a first voice recognition result.
  • The voice control instruction in the embodiments of the present disclosure may be the original voice control instruction collected directly through the audio collection module (such as a microphone), or the voice control instruction obtained by performing front-end signal processing on the original voice control instruction collected by the audio collection module; the embodiments of the present disclosure do not limit this.
  • Front-end signal processing can include, for example, but is not limited to: voice activity detection (VAD), noise reduction, acoustic echo cancellation (AEC), dereverberation, device control, and beamforming (BF).
  • Voice activity detection, also known as voice endpoint detection or voice boundary detection, refers to detecting the presence or absence of voice in an audio signal in a noisy environment and accurately detecting the starting position of the voice segment in the audio signal. It is usually used in speech processing systems such as speech coding and speech enhancement, where it helps reduce the speech coding rate, save communication bandwidth, reduce mobile device energy consumption, and improve the recognition rate.
  • The starting point of VAD is the transition from silence to speech, and the end point of VAD is the transition from speech to silence. Determining the end point of VAD requires a period of silence.
  • the speech obtained by the front-end signal processing of the original audio signal includes the speech from the starting point to the end point of the VAD. Therefore, the speech control instructions in the embodiments of the present disclosure may also include a period of silence after the speech segment.
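  • The start and end point behaviour described above can be illustrated with a minimal energy-threshold sketch. This is not the detector used in the disclosure, which does not specify one; the per-frame energies, the `energy_threshold` value, and the `min_silence_frames` count are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def detect_speech_segment(
    frame_energies: List[float],
    energy_threshold: float = 0.5,   # assumed threshold separating speech from silence
    min_silence_frames: int = 3,     # assumed length of silence needed to confirm the end
) -> Optional[Tuple[int, int, int]]:
    """Return (start, speech_end, confirmed_at) for the first speech segment.

    start        : first frame at or above the threshold (silence -> speech)
    speech_end   : index just past the last speech frame (speech -> silence)
    confirmed_at : index just past the silent frames needed to confirm the end
    Returns None if no segment has been both started and confirmed as ended.
    """
    start: Optional[int] = None
    silence_run = 0
    for i, energy in enumerate(frame_energies):
        is_speech = energy >= energy_threshold
        if start is None:
            if is_speech:
                start = i            # starting point: silence -> speech
        elif is_speech:
            silence_run = 0          # a short pause inside speech does not end the segment
        else:
            silence_run += 1
            if silence_run >= min_silence_frames:
                return (start, i - silence_run + 1, i + 1)
    return None
```

  • In this sketch the frames from `start` up to `confirmed_at` would form the voice control instruction, so the instruction carries the trailing silence that was needed to confirm the end point.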
  • Step 204 Based on the first voice recognition result, determine the target device corresponding to the voice control instruction.
  • the target device corresponding to the voice control instruction is the device whose status needs to be adjusted.
  • the target device can be any device such as home appliances, vehicle-mounted equipment, terminal equipment, etc.
  • Vehicle-mounted equipment is equipment on the vehicle. For example, it can include, but is not limited to, the following equipment on the vehicle: left rearview mirror, right rearview mirror, interior rearview mirror, windows, air conditioner, seats, stereo, lights, etc.
  • the embodiments of the present disclosure do not limit the scope of the target device and the specific scope of the vehicle-mounted device.
  • Step 206 In response to detecting the preset dynamic gesture, continuously adjust the state of the target device based on the continuous action of the dynamic gesture.
  • The user can continuously adjust the state of the target device by continuously making the dynamic gesture until the state of the target device reaches the effect the user expects (for example, the vehicle window is lowered to the height the user expects), and then stop the dynamic gesture to stop adjusting the state of the target device.
  • The target device that needs to be adjusted can be determined based on the voice control instruction without manual selection, which improves the efficiency and convenience of selecting the target device and effectively avoids the inconvenience of selecting it manually. In addition, continuously adjusting the status of the target device based on the continuous action of the dynamic gesture achieves continuous operational control of the target device, making the adjustment of its status more flexible, fine-grained, and precise, thus improving the control effect on the target device.
  • the preset dynamic gestures in the embodiments of the present disclosure can be designed with the following characteristics in mind: (1) consistent with natural habits and easy to make to improve the convenience of movements; (2) dynamic gestures, compared to static hand movements in a single frame image , good robustness; (3) Different from daily habitual actions, the probability of false positives for other actions is low; (4) It has different movement directions and can be reused.
  • The above-mentioned preset dynamic gesture can be, for example, drawing circles, that is, drawing circles in the air, which can include but is not limited to drawing circles with the left hand, with the right hand, with both hands, with any one or more fingers, with a fist, or with bent fingers.
  • Figure 3 is a schematic diagram of drawing circles with one finger.
  • the preset dynamic gestures in the embodiments of the present disclosure are not limited to this, and can be dynamic gestures with any of the above characteristics.
  • The preset dynamic gestures in this embodiment can satisfy all of the above characteristics at the same time: they are robust, few in number but precise, consistent with natural habits, recognized with high accuracy, and easy to reuse. This improves recognition stability and accuracy and facilitates continuous operational control of the device.
  • FIG. 4 is a schematic flowchart of a device control method provided by another exemplary embodiment of the present disclosure. As shown in Figure 4, based on the above embodiment shown in Figure 2, the device control method of this embodiment may also include the following steps:
  • Step 205 Determine the target dimension parameters to be adjusted of the target device.
  • The target dimension parameter is the dimension parameter of the target device whose status needs to be adjusted.
  • For example, when the target device is a window on the vehicle, the target dimension parameter can be the lifting dimension of the window; when the target device is a seat on the vehicle, the target dimension parameter can be the front and rear dimension, the height dimension, the backrest tilt angle dimension, etc. of the seat; when the target device is a light on the vehicle, the target dimension parameter can be the brightness dimension, color dimension, etc. of the light; when the target device is the left rearview mirror or right rearview mirror on the vehicle, the target dimension parameter can be the pitch angle dimension or the yaw angle dimension of that mirror.
  • the target dimension parameter when the target device is a home appliance such as a television, the target dimension parameter may be the channel dimension, volume dimension, brightness dimension, etc. of the TV.
  • the target dimension parameter to be adjusted of the target device may be any adjustable dimension parameter of the target device. The embodiment of the present disclosure does not place a limit on the adjustable dimension parameter.
  • step 206 may include:
  • Step 2062 In response to detecting the preset dynamic gesture, determine the movement direction of the dynamic gesture.
  • Step 2064 Based on the movement direction of the dynamic gesture, determine the target adjustment direction of the target device on the target dimension parameter.
  • the corresponding relationship between the movement direction of the dynamic gesture and the device, the dimensional parameters of the device, and the adjustment direction can be preset. After determining the movement direction of the dynamic gesture, the corresponding relationship can be queried to obtain the target adjustment direction based on the movement direction of the dynamic gesture, the target device, and the target dimension parameters.
  • Table 1 below is a partial example of the correspondence between the circle direction and the device, the device's dimensional parameters, and the adjustment direction when the dynamic gesture is a circle in an embodiment of the present disclosure.
  • The embodiments of the present disclosure do not limit the specific content of the correspondence between the movement direction of the dynamic gesture and the device, the dimension parameters of the device, and the adjustment direction.
  • Step 2066 Based on the continuous action of the dynamic gesture in the movement direction of the dynamic gesture, continuously adjust the target dimension parameters of the target device in the target adjustment direction.
  • After the target dimension parameter to be adjusted is determined, the target adjustment direction of the target device on the target dimension parameter can be determined from the movement direction of the dynamic gesture, so that both the target dimension parameter and the target adjustment direction are known. Then, based on the continuous action of the dynamic gesture in that movement direction, the target dimension parameter of the target device is continuously adjusted in the target adjustment direction, achieving continuous operational control of the target device on the target dimension parameter in the target adjustment direction.
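  • Steps 2062 to 2066 can be sketched as a table lookup followed by a per-frame adjustment loop. The sketch below is only an illustration: Table 1 is not reproduced in this text, so every entry in `CORRESPONDENCE`, the device and dimension names, the step size, and the value range are hypothetical assumptions.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical correspondence (not Table 1 from the disclosure):
# (device, dimension parameter, circle direction) -> signed adjustment direction.
CORRESPONDENCE: Dict[Tuple[str, str, str], int] = {
    ("window", "lift", "clockwise"): +1,
    ("window", "lift", "counterclockwise"): -1,
    ("light", "brightness", "clockwise"): +1,
    ("light", "brightness", "counterclockwise"): -1,
}


def target_adjustment_direction(device: str, dimension: str, movement_direction: str) -> int:
    """Step 2064: query the preset correspondence for the target adjustment direction."""
    return CORRESPONDENCE[(device, dimension, movement_direction)]


def adjust_continuously(
    value: float,
    device: str,
    dimension: str,
    gesture_frames: Iterable[Optional[str]],
    step: float = 5.0,     # assumed change per detected gesture frame
    lo: float = 0.0,       # assumed lower bound of the dimension parameter
    hi: float = 100.0,     # assumed upper bound of the dimension parameter
) -> float:
    """Step 2066: keep adjusting while the gesture continues; stop when it stops.

    gesture_frames yields the movement direction detected in each frame,
    or None once the dynamic gesture is no longer detected.
    """
    for movement_direction in gesture_frames:
        if movement_direction is None:
            break  # the user stopped the gesture, so the adjustment stops
        sign = target_adjustment_direction(device, dimension, movement_direction)
        value = min(hi, max(lo, value + sign * step))
    return value
```

  • Under these assumptions, three consecutive clockwise frames followed by a stop move a window at position 50.0 to 65.0; scaling `step` by the measured gesture speed would be one way to let the movement speed control the adjustment speed.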
  • The target device in the embodiment of the present disclosure may be a device whose status is determined based on one dimension parameter. That is, the device has only one adjustable dimension parameter, and each parameter value on that dimension parameter corresponds to a state of the device.
  • For example, a window on a vehicle is a device whose state is determined based on one dimension parameter, the lifting dimension. Each height value of the window in the lifting dimension corresponds to a state of the window.
  • The target device in the embodiment of the present disclosure may also be a device whose status is determined based on multiple dimension parameters. That is, the status of the device is jointly determined by the multiple dimension parameters, and the device has multiple adjustable dimension parameters.
  • Each set of parameter values on the multiple dimension parameters corresponds to a state of the device, and when the parameter value of any one or more of the dimension parameters changes, the state of the device changes.
  • For example, the left rearview mirror and the right rearview mirror on the vehicle are devices whose state is jointly determined by the pitch angle dimension and the yaw angle dimension. Each set of angle values in the pitch angle dimension and the yaw angle dimension corresponds to a state of the left or right rearview mirror, and when the angle value in either or both dimensions changes, the state of the mirror changes as well.
  • When the status of the target device is determined based on one dimension parameter, in step 205, that one dimension parameter of the target device can be directly determined as the target dimension parameter.
  • In this case, the user does not need to specify the dimension parameter to be adjusted, which helps improve the efficiency of determining the target dimension parameter and thereby the control efficiency of the target device.
  • When the status of the target device is determined based on multiple dimension parameters, the target dimension parameter may be determined based on the first speech recognition result.
  • The user can carry relevant information about the target dimension parameter to be adjusted directly in the voice control instruction. For example, the voice control instruction can be the voice “I want to adjust the front and rear of the main driver's seat”, “I want to adjust the main driver's seat, front and rear adjustment”, or “I want to adjust the main driver's seat forward”. The embodiments of the present disclosure do not limit the content or format in which the voice control instruction carries the relevant information about the target dimension parameter.
  • The first speech recognition result in text form obtained by performing speech recognition on the voice control instruction includes the relevant information of the target dimension parameter, and the target dimension parameter can be determined based on that information.
  • For example, the dimension parameters of each device can be preset. After voice recognition is performed on the voice control instruction to obtain the first voice recognition result, the dimension parameter of the target device that is associated with, or closest to, the relevant information of the target dimension parameter in the first voice recognition result is determined and used as the target dimension parameter to be adjusted.
  • For example, the relevant information of the target dimension parameter is “front and rear” in the first voice recognition result “I want to adjust the front and rear of the main driver's seat”, “front and rear adjustment” in “I want to adjust the main driver's seat, front and rear adjustment”, and “adjust forward” in “I want to adjust the main driver's seat forward”. For each of “front and rear”, “front and rear adjustment”, and “adjust forward”, the associated or closest dimension parameter is determined to be the front and rear dimension, which is used as the target dimension parameter to be adjusted for the main driver's seat.
  • A preset determination method may be used to determine, for the target device, the dimension parameter associated with or closest to the relevant information of the target dimension parameter in the first speech recognition result. For example, the dimension parameter of the target device whose name has the most characters in common with the relevant information of the target dimension parameter in the first speech recognition result can be determined as the associated or closest dimension parameter.
  • Alternatively, an information list can be preset that includes the relevant information that may correspond to each dimension parameter of each device. Then, based on the relevant information of the target dimension parameter in the first speech recognition result, the information list can be queried for the target device to obtain the matching dimension parameter as the associated or closest dimension parameter.
  • the embodiments of the present disclosure may also use other methods to determine the dimensional parameters associated with or the closest dimensional parameters to the relevant information of the target dimensional parameters in the first speech recognition result, and the embodiments of the present disclosure do not limit this.
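  • The two determination methods mentioned above (most characters in common, and a preset information list) can be sketched as follows. This is a hedged illustration only: the dimension names, the device key `driver_seat`, and the contents of `INFO_LIST` are hypothetical, and counting distinct shared characters is one possible reading of “most identical characters” that is better suited to Chinese text than to English letters.

```python
from typing import Dict, List, Optional


def closest_dimension_by_shared_chars(related_info: str, dimension_names: List[str]) -> Optional[str]:
    """Pick the dimension name sharing the most distinct characters with related_info."""
    info_chars = set(related_info.lower()) - {" "}
    best_name: Optional[str] = None
    best_score = 0
    for name in dimension_names:
        score = len(info_chars & (set(name.lower()) - {" "}))
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical information list: device -> dimension parameter -> phrases that may refer to it.
INFO_LIST: Dict[str, Dict[str, List[str]]] = {
    "driver_seat": {
        "front and rear": ["front and rear", "forward", "backward"],
        "height": ["height", "higher", "lower"],
    },
}


def dimension_from_info_list(device: str, related_info: str) -> Optional[str]:
    """Query the preset information list for the target device."""
    text = related_info.lower()
    for dimension, phrases in INFO_LIST.get(device, {}).items():
        if any(phrase in text for phrase in phrases):
            return dimension
    return None
```

  • With these assumed names, both “front and rear adjustment” (by shared characters) and “adjust forward” (by the information list) resolve to the front and rear dimension of the seat.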
  • When the status of the target device is determined based on multiple dimension parameters, the user can specify the target dimension parameter to be adjusted directly in the voice control instruction, without having to specify it separately, which helps improve the efficiency of determining the target dimension parameter and thereby the control efficiency of the target device.
  • In step 205, in response to receiving a dimension parameter voice command, voice recognition can be performed on the dimension parameter voice command to obtain a second speech recognition result, and the target dimension parameter is then determined based on the second speech recognition result.
  • The user can send the dimension parameter voice command directly after sending the voice control command. For example, after sending the voice control command “I want to adjust the driver's seat”, the user directly sends the dimension parameter voice command “Adjust forward and backward”.
  • Alternatively, the device that implements the embodiments of the present disclosure may output a dimension parameter inquiry voice after receiving the voice control instruction sent by the user, and then receive the dimension parameter voice instruction sent by the user in reply to the inquiry.
  • For example, the user sends the voice control command “I want to adjust the driver's seat”.
  • the device implementing the embodiment of the present disclosure outputs the dimensional parameter inquiry voice "Okay, what do you want?
  • the target dimensional parameter can be determined based on the second voice recognition result.
  • the dimension parameters of each device can be preset; based on the second speech recognition result obtained by performing speech recognition on the dimension parameter voice command, the dimension parameter of the target device that is associated with, or closest to, the second speech recognition result is determined as the target dimension parameter to be adjusted.
  • a preset determination method may be used to determine the dimensional parameters associated with the second speech recognition result or the closest dimensional parameters for the target device.
  • for the specific determination method, reference can be made to the implementation of determining the dimensional parameter associated with or closest to the relevant information of the target dimensional parameter in the first speech recognition result, which will not be described again here.
  • the user can specify the dimensional parameter that needs to be adjusted through a separate dimensional parameter voice command, thereby determining the target dimensional parameter that needs to be adjusted by the target device.
  • hand morphology information corresponding to the dynamic gesture can also be obtained, and the target dimension parameter is then determined based on the hand morphology information.
  • the hand shape information may include, for example, but is not limited to any of the following: finger extension form, number of fingers, single-hand information, etc.
  • the finger extension form may be, for example, straightening, bending, etc.; the number of fingers may be, for example, one, two, etc.; the single-hand information may be, for example, the left hand, the right hand, or both hands, etc.
  • the correspondence between the hand shape information, the device, and the dimensional parameters of the device can be preset. After the hand morphology information corresponding to the dynamic gesture is obtained in step 205, the dimensional parameter corresponding to the target device and the obtained hand morphology information is looked up in the above correspondence and used as the target dimension parameter.
  • an example of partial content of the correspondence between the hand morphology information, the vehicle-mounted device, and the dimensional parameters of the vehicle-mounted device, for the case where the hand morphology information is the number of fingers and the device is a vehicle-mounted device.
  • an example of partial content of the correspondence between the hand morphology information, the vehicle-mounted device, and the dimensional parameters of the vehicle-mounted device, for the case where the hand morphology information is single-hand information and the device is a vehicle-mounted device.
  • the above Table 2 and Table 3 only exemplarily show part of the correspondence between the hand morphology information and the device, as well as the dimensional parameters of the device.
  • if the hand morphology information is hand morphology information other than that in Table 2 and Table 3, or the device is a device other than those in Table 2 and Table 3 (such as other vehicle-mounted devices, home appliances, terminal devices, etc.), the content structure can follow Table 2 and Table 3, and is not repeated in the embodiments of this disclosure.
  • the target dimensional parameters that need to be adjusted for the target device can be determined through the user's hand shape information.
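  • as a minimal sketch of the preset correspondence lookup: the keys and dimension names below are hypothetical, and the entries are modelled loosely on the left rearview mirror scenario in this disclosure (one hand or finger count selects the up/down dimension, the other selects the inward/outward dimension).

```python
from typing import Optional

# Hypothetical preset correspondence:
# (device, morphology kind, morphology value) -> dimension parameter.
CORRESPONDENCE = {
    ("left_rearview_mirror", "single_hand", "right"): "up_down",
    ("left_rearview_mirror", "single_hand", "left"): "inward_outward",
    ("left_rearview_mirror", "finger_count", 1): "up_down",
    ("left_rearview_mirror", "finger_count", 2): "inward_outward",
}


def target_dimension(device: str, kind: str, value) -> Optional[str]:
    """Look up the dimension parameter for a device and hand morphology, or None."""
    return CORRESPONDENCE.get((device, kind, value))
```

  • a plain dictionary keeps the correspondence easy to extend to other devices; a missing entry returns None so the caller can fall back to another way of determining the target dimension parameter.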
  • FIG. 5 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in Figure 5, based on the above embodiment shown in Figure 4, step 2066 may include the following steps:
  • Step 20662 During the continuous action of the dynamic gesture, obtain the movement speed of the dynamic gesture in real time or according to a preset adjustment period.
  • the value of the preset adjustment period can be set to a small value, such as 0.01s. It can be preset according to the specific controlled device and the desired adjustment effect, and can be updated as needed.
  • Step 20664 Based on the movement speed of the dynamic gesture, determine the target adjustment speed of the target device on the target dimension parameter.
  • Step 20666 Adjust the target device in the target dimension parameter at the target adjustment speed in the target adjustment direction.
  • the target device can be adjusted in the target adjustment direction at the target adjustment speed within the state limit range of the target dimension parameter.
  • when the target device reaches the boundary of the state limit range on the target dimension parameter, for example when a window on the vehicle is lowered to its lowest position or raised to its highest position, the target device is no longer adjusted in the target adjustment direction on the target dimension parameter, to avoid damaging the target device.
  • the target adjustment speed of the target device on the target dimension parameter can be determined based on the movement speed of the dynamic gesture, and the target device can be adjusted in the target adjustment direction at the target adjustment speed.
  • the faster the movement speed of the dynamic gesture, the faster the device adjustment speed; conversely, the slower the movement speed of the dynamic gesture, the slower the device adjustment speed. This realizes dynamic control of the device adjustment speed based on the movement speed of the dynamic gesture, as well as visual control of the device adjustment speed.
  • the adjustment speed configuration information of the target device on the target dimension parameters can be obtained.
  • the adjustment speed configuration information is used to determine the relationship between the gesture movement speed and the device adjustment speed on each dimension parameter of the target device. For example, for windows on a vehicle, in the lifting dimension, the gesture movement speed and the window lifting speed can correspond linearly. Accordingly, in step 20664, based on the obtained adjustment speed configuration information, the device adjustment speed corresponding, on the target dimension parameter, to the movement speed of the dynamic gesture obtained in step 20662 may be determined as the target adjustment speed.
  • the target adjustment speed of the target device on the target dimension parameters can be determined objectively and accurately based on the movement speed of the dynamic gesture according to the preset adjustment speed configuration information, so as to achieve accurate control of the adjustment speed of the target device state.
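  • as a minimal sketch of steps 20664 and 20666 under a linear relationship: the function names, the units (turns per second for the gesture, centimeters for the window) and the numeric values are assumptions chosen for illustration only.

```python
def target_adjust_speed(gesture_speed, ratio):
    """Linear mapping: device adjustment speed = ratio * gesture movement speed."""
    return ratio * gesture_speed


def step_state(state, direction, speed, dt, lower, upper):
    """Advance the device state by one adjustment period.

    direction is +1 or -1; the result is kept inside the state limit range
    [lower, upper], so the device stops at the boundary instead of overshooting.
    """
    new_state = state + direction * speed * dt
    return max(lower, min(upper, new_state))
```

  • for example, with an assumed ratio of 15 cm per turn and a gesture speed of 0.5 turns per second, the window moves at 7.5 cm/s; with a 0.01s adjustment period each step moves it 0.075 cm until the limit is reached.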
  • the adjustment speed configuration information of the target device on the target dimension parameter can be obtained from the first speech recognition result.
  • the user can directly carry the speed adjustment configuration information through voice control instructions.
  • the voice control instruction can be "I want to adjust the main driver's window. Turn it three times to raise the entire window.", which includes the speed adjustment configuration information "Three turns can raise the entire window."
  • the embodiment of the present disclosure does not limit the content form and format of the speed adjustment configuration information carried in the voice control command. After voice recognition is performed on the voice control instruction to obtain the first voice recognition result, the adjustment speed configuration information of the target device on the target dimension parameter can be obtained from the first voice recognition result, thereby determining the relationship between the gesture movement speed and the device adjustment speed on the target dimension parameter of the target device.
  • the user can directly set the adjustment speed configuration information of the target device on the target dimension parameters through voice control instructions during the process of controlling the device, thereby realizing real-time and dynamic configuration of the adjustment speed configuration information in specific scenarios and achieving personalized configuration of device adjustment effects.
  • the following method can also be used to obtain the adjustment speed configuration information of the target device on the target dimension parameters: in response to receiving an adjustment speed configuration voice instruction, voice recognition is performed on the adjustment speed configuration voice instruction to obtain a third speech recognition result, and the adjustment speed configuration information of the target device on the target dimension parameter is then obtained from the third speech recognition result, thereby determining the relationship between the gesture movement speed and the device adjustment speed on the target dimension parameter of the target device.
  • the adjustment speed configuration voice instruction may be actively sent by the user. For example, the user actively sends the adjustment speed configuration voice instruction "Turn forward" after sending the voice control command "I want to move the driver's seat forward".
  • the adjustment speed configuration voice instruction may also be sent by the user in response to an adjustment speed prompt voice output by the device implementing the embodiment of the present disclosure. For example, after the user sends the voice control instruction "I want to adjust the driver's seat forward", the device outputs the adjustment speed prompt voice "Okay, what speed do you want to adjust at", and the user then sends the adjustment speed configuration voice instruction "Turn three turns and the entire window can be raised."
  • the embodiment of the present disclosure does not limit the manner and specific content of the user's sending voice instructions for speed adjustment and configuration.
  • the user can set the adjustment speed configuration information of the target device on the target dimension parameters through a separate instruction during the process of controlling the device, thereby realizing real-time and dynamic configuration of the adjustment speed configuration information in specific scenarios and achieving personalized configuration of the device adjustment effect.
  • the adjustment speed configuration information of the target device on the target dimension parameter can also be obtained from the preconfigured adjustment speed configuration information, thereby determining the relationship between the gesture movement speed and the device adjustment speed on the target dimension parameter of the target device.
  • the preconfigured adjustment speed configuration information may be preconfigured by the user.
  • for example, users of vehicle-mounted equipment can set or update the adjustment speed configuration information of each vehicle-mounted device through the speed adjustment configuration page provided by the vehicle's central control system, either by using the configuration options for each vehicle-mounted device on that page or by human-machine voice interaction on that page.
  • the user can also access the adjustment speed configuration permission provided by the central control system through human-computer voice interaction, and set the adjustment speed configuration information of each vehicle-mounted device through human-computer voice interaction.
  • the adjustment speed configuration information of each device can be set or updated through the adjustment speed configuration page provided by the control device that controls these devices, in a similar manner to vehicle-mounted equipment.
  • the preconfigured adjustment speed configuration information may also be factory-preset information of the central control system (for vehicle-mounted equipment) or of the control equipment (for home appliances, terminal equipment and other equipment).
  • the adjustment speed configuration information of the target device on the target dimension parameter can be obtained from the preconfigured adjustment speed configuration information to determine the adjustment speed configuration information for the current scene.
  • the target adjustment speed of the target device is not limited to the preconfigured adjustment speed configuration information.
  • the adjustment speed configuration information can be pre-configured in the following way:
  • the adjustment speed configuration request includes information about the device identification (ID), dimension parameter ID, gesture movement range (for example, one circle), and device adjustment range (for example, 0.5cm).
  • the device ID is used to uniquely identify a device
  • the dimension parameter ID is used to uniquely identify a dimension parameter of a device.
  • the configuration or update of the adjustment speed configuration information of the device on the dimensional parameters is implemented.
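  • as a minimal sketch of such a configuration request and its handling: the class names, field names and the in-memory store are hypothetical; only the four pieces of information (device ID, dimension parameter ID, gesture movement range, device adjustment range) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustSpeedConfigRequest:
    device_id: str        # uniquely identifies a device
    dimension_id: str     # uniquely identifies a dimension parameter of that device
    gesture_range: float  # gesture movement range, e.g. 1.0 circle
    device_range: float   # device adjustment range, e.g. 0.5 cm


class AdjustSpeedConfigStore:
    """Keeps one gesture-to-device ratio per (device, dimension parameter)."""

    def __init__(self):
        self._ratios = {}

    def apply(self, req: AdjustSpeedConfigRequest) -> None:
        """Configure or update the ratio: device range per unit of gesture range."""
        if req.gesture_range <= 0:
            raise ValueError("gesture_range must be positive")
        key = (req.device_id, req.dimension_id)
        self._ratios[key] = req.device_range / req.gesture_range

    def ratio(self, device_id: str, dimension_id: str) -> float:
        return self._ratios[(device_id, dimension_id)]
```

  • applying a second request for the same device and dimension parameter simply overwrites the stored ratio, which covers both the initial configuration and a later update.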
  • during step 206 or 2066, in response to receiving a speed adjustment voice instruction, voice recognition is performed on the speed adjustment voice instruction to obtain a fourth speech recognition result, and adjustment speed update configuration information is obtained from the fourth speech recognition result.
  • the adjustment speed update configuration information represents the updated relationship between the gesture movement speed and the device adjustment speed on the dimensional parameters of the target device. Then, during the subsequent continuous action of the dynamic gesture, the movement speed of the dynamic gesture is obtained in real time or according to the preset adjustment period; based on the adjustment speed update configuration information, the updated device adjustment speed corresponding to the movement speed of the dynamic gesture on the target dimension parameter is determined; and the target device is adjusted in the target adjustment direction on the target dimension parameter at the updated adjustment speed.
  • while adjusting the target device, the user may find that the adjustment speed is too fast or too slow. Based on this embodiment, the user can send an adjustment speed update voice command during the adjustment, according to the desired adjustment effect, to update the adjustment speed configuration information, thereby updating the adjustment speed of the target device in real time and further improving the adjustment efficiency and effect of the target device and the user's operating experience.
  • a step of detecting the preset dynamic gesture may also be included.
  • FIG. 6 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in Figure 6, in some implementations, preset dynamic gestures can be detected in the following ways:
  • Step 302 Determine the position of the sound source object that sends the voice control instruction.
  • the position of the sound source object that sends the voice control instruction can be determined through the sound zone positioning method.
  • Step 304 Based on the position of the sound source object, obtain an image sequence including the hand of the sound source object.
  • the image sequence includes multiple frames of images with a temporal relationship.
  • an image acquisition module (such as a camera) can be used to collect images of the sound source object, and hand detection and tracking can be performed on the collected images to obtain a video stream including the hand of the sound source object. Multiple frames of images with a temporal relationship are then selected from the video stream in a preset manner (such as continuous selection or every-other-frame selection) as the image sequence of the hand of the sound source object; alternatively, images of the same size containing the hand are further cropped from the selected frames to obtain the image sequence of the hand of the sound source object.
  • when the image sequence of the hand is cropped from the selected frames, the images contain less background information and less interference, so the accuracy of the gesture detection results can be improved.
  • a first neural network such as a convolutional neural network (CNN)
  • the first neural network can be obtained by pre-training a neural network model using sample images including hands.
  • Step 306 Perform hand key point detection on each frame image in the image sequence in sequence to obtain a hand key point sequence.
  • the hand key point sequence is formed based on the time series relationship of the hand key points in each frame image.
  • a second neural network such as CNN
  • the second neural network can be obtained by pre-training the neural network model using sample images marked with hand key point information.
  • Step 308 Perform preset dynamic gesture detection based on the hand key point sequence.
  • the hand key point sequence can be input into a third neural network, such as CNN, and a detection result indicating whether the gesture is the preset dynamic gesture is output through the third neural network.
  • the third neural network can be trained in advance using sample videos of preset dynamic gestures.
  • the detection of the preset dynamic gesture is implemented based on visual technology, so that when the preset dynamic gesture is detected, the adjustment of the state of the target device is triggered.
  • the movement direction of the dynamic gesture can be determined based on the hand key point sequence obtained in step 306 .
  • the movement direction of the dynamic gesture can be determined based on the direction corresponding to the trajectory of the hand key point sequence.
  • the dynamic gesture movement direction is determined based on visual technology.
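  • as a minimal sketch of determining the direction of a circle-drawing gesture from the trajectory of one hand key point: the signed-area (shoelace) test is a technique chosen here for illustration and is not stated in the disclosure; the input is assumed to be fingertip positions in image coordinates with y growing downward and an unmirrored camera view.

```python
def circle_direction(trajectory):
    """Classify a roughly closed trajectory as clockwise or counterclockwise.

    trajectory: list of (x, y) key point positions in image coordinates
    (y grows downward). Returns None if the direction cannot be decided.
    """
    if len(trajectory) < 3:
        return None
    twice_area = 0.0
    # Shoelace sum over consecutive points, closing the loop back to the start.
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:] + trajectory[:1]):
        twice_area += x0 * y1 - x1 * y0
    if twice_area == 0:
        return None
    # With y pointing down, a positive sum corresponds to clockwise on screen.
    return "clockwise" if twice_area > 0 else "counterclockwise"
```

  • the sign convention flips if the image is mirrored or if y points upward, so a real system would need to fix the camera geometry before mapping the result to an adjustment direction.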
  • the previous frame of image may be any frame of image located before the last frame of image in the image sequence. For example, it may be the frame adjacent to the last frame of image, or it may be a frame separated from the last frame of image by several frames; the embodiment of the present disclosure does not limit this.
  • the distance between the hand key points in the last frame of image and the hand key points in the previous frame of image can be the average of the distances between corresponding hand key points in the two frames, or it can be the distance between preset hand key points (such as fingertip key points) in the two frames, etc. This embodiment of the present disclosure does not limit this.
  • the hand key point sequence can be input into the above-mentioned third neural network, and the movement direction and movement amplitude (such as the circle angle) of the dynamic gesture corresponding to the hand key point sequence are output through the third neural network; the movement speed of the dynamic gesture can then be calculated based on the movement amplitude and the time corresponding to the image sequence.
  • the image sequence carrying the collection time information and labeled with the hand key points can also be input into the above-mentioned third neural network, and the movement direction, movement speed, etc. of the dynamic gesture corresponding to the hand key point sequence are output through the third neural network. The embodiments of the present disclosure do not limit this.
  • the movement speed of the dynamic gesture can be accurately determined through the key points of the hand corresponding to the two frames of images in the image sequence and the image collection time.
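  • as a minimal sketch of the two-frame speed calculation, using the average distance between corresponding hand key points: the function name and the units (pixels and seconds) are assumptions for illustration.

```python
import math


def gesture_speed(prev_points, last_points, t_prev, t_last):
    """Mean displacement of corresponding key points divided by elapsed time.

    prev_points / last_points: equally long lists of (x, y) key points from the
    previous frame and the last frame; t_prev / t_last: their collection times.
    """
    if not prev_points or len(prev_points) != len(last_points):
        raise ValueError("key point lists must be non-empty and of equal length")
    dt = t_last - t_prev
    if dt <= 0:
        raise ValueError("the last frame must be later than the previous frame")
    mean_dist = sum(
        math.dist(p, q) for p, q in zip(prev_points, last_points)
    ) / len(prev_points)
    return mean_dist / dt
```

  • to use a single preset key point such as a fingertip instead of the average, the same function can be called with one-element lists.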
  • FIG. 7 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in Figure 7, in other implementations, preset dynamic gestures can also be detected in the following ways:
  • Step 402 Determine the position of the sound source object that sends the voice control instruction.
  • the position of the sound source object that sends the voice control instruction can be determined through the sound zone positioning method.
  • Step 404 Based on the position of the sound source object, use an optical Time of Flight (ToF) sensor to measure the distance information between each point on the hand of the sound source object and the ToF sensor to obtain a set of distance information.
  • after determining the position of the sound source object, the ToF sensor can be used to measure the distance between each point on the hand of the sound source object and the ToF sensor. The set of distance information obtained at each measurement moment includes the distance information between each point of the hand of the sound source object and the ToF sensor at that measurement moment.
  • Step 406 Obtain a distance information sequence based on multiple sets of distance information with time series relationships.
  • Step 408 Perform preset dynamic gesture detection based on the distance information sequence.
  • from the distance information sequence, the change over time of the distance between each point on the hand of the sound source object and the ToF sensor can be learned, so that whether the hand of the sound source object makes the preset dynamic gesture can be determined based on whether the distance changes match the distance change pattern corresponding to the preset dynamic gesture.
  • three-dimensional (3D) modeling can be performed based on each set of distance information in the distance information sequence to obtain the corresponding hand posture, and whether the hand of the sound source object makes the preset dynamic gesture can be determined from the hand postures corresponding to the distance information sequence.
  • the ToF sensor is used to realize the detection of the preset dynamic gesture, so that when the preset dynamic gesture is detected, the adjustment of the state of the target device is triggered.
  • the movement direction of the dynamic gesture can be determined based on the distance information sequence obtained in step 406 .
  • the movement direction of the dynamic gesture corresponding to the distance information sequence obtained in step 406 can be determined based on the change pattern of the distance corresponding to the preset dynamic gesture in different movement directions.
  • the ToF sensor detects the change in distance from each point on the hand of the sound source object, thereby realizing the determination of the direction of dynamic gesture movement.
  • the movement speed of the dynamic gesture can be obtained based on the last set of distance information and a previous set of distance information in the distance information sequence obtained in step 406, together with the measurement times corresponding to these two sets of distance information.
  • the previous set of distance information may be any set of distance information located before the last set of distance information in the distance information sequence. For example, it may be the set adjacent to the last set of distance information, or it may be a set separated from the last set by several sets of distance information; this embodiment of the present disclosure does not limit this.
  • the distance change between the last set of distance information and the previous set of distance information can be the average of the distance changes of the hand points between the two sets, or it can be the distance change of preset hand points (such as fingertips) between the two sets, etc., which is not limited in this embodiment of the disclosure.
  • the movement speed of the dynamic gesture can be accurately determined.
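  • as a minimal sketch of the speed calculation from two sets of ToF distance information: the function name, the list layout (one distance per hand point, in meters) and the optional preset point index are assumptions for illustration.

```python
def tof_gesture_speed(prev_dists, last_dists, t_prev, t_last, point=None):
    """Rate of change of hand-to-sensor distance between two measurements.

    prev_dists / last_dists: distances of the same hand points at the previous
    and last measurement times. If `point` is given, only that preset hand
    point (e.g. a fingertip index) is used; otherwise the changes are averaged.
    """
    if not prev_dists or len(prev_dists) != len(last_dists):
        raise ValueError("distance sets must be non-empty and of equal length")
    dt = t_last - t_prev
    if dt <= 0:
        raise ValueError("the last measurement must be later than the previous one")
    changes = [abs(b - a) for a, b in zip(prev_dists, last_dists)]
    change = changes[point] if point is not None else sum(changes) / len(changes)
    return change / dt
```

  • averaging over all points smooths out noise on individual measurements, while a single preset point follows the part of the hand that actually performs the gesture.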
  • FIG 8 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in Figure 8, in some implementations, preset dynamic gestures can also be detected in the following ways:
  • Step 502 Determine the position of the sound source object that sends the voice control instruction.
  • the position of the sound source object that sends the voice control instruction can be determined through the sound zone positioning method.
  • Step 504 Based on the position of the sound source object, use the wearable device to obtain the positions of each point on the hand of the sound source object to obtain hand position information.
  • the hand position information includes position information of each point of the hand.
  • Wearable devices in embodiments of the present disclosure may be, for example, smart gloves, smart glasses and other smart devices.
  • the smart gloves can directly locate the positions of various points on the hand at any time, and the smart glasses can visually obtain the positions of various points on the hand.
  • the embodiments of the present disclosure do not limit the specific wearable device used and the manner in which it obtains the positions of various points on the hand of the sound source object.
  • Step 506 Determine the posture of the hand based on the hand position information.
  • Step 508 Determine hand movements based on hand postures at multiple moments.
  • Step 510 Confirm whether the hand movement is a preset dynamic gesture movement.
  • Step 512 In response to the hand movement being the preset dynamic gesture, confirm that the preset dynamic gesture is detected.
  • the wearable device can be used to directly obtain the position of each point of the hand of the sound source object, from which the posture of the hand is determined; the movement of the hand is then determined based on the postures of the hand at multiple moments, thereby confirming whether it is the preset dynamic gesture, so that the adjustment of the state of the target device is triggered when the preset dynamic gesture is detected.
  • the movement direction of the dynamic gesture can be determined based on the postures of the hand determined at multiple moments in step 506 .
  • the movement direction of a dynamic gesture can be determined based on changes in hand postures at multiple moments.
  • the movement direction of the dynamic gesture can also be directly determined based on the hand movement determined in step 508.
  • hand position information is obtained through the wearable device, and the direction of dynamic gesture movement is determined.
  • the motion speed of the dynamic gesture can be obtained based on the last moment and the previous moment among the multiple moments obtained in step 504, as well as the hand position information of the last moment and the hand position information of the previous moment.
  • the moments can be the information collection moments at which the wearable device obtains the position of each point on the hand of the sound source object. If the wearable device obtains the position of each point on the hand of the sound source object according to a preset information collection period (for example, 0.01s), then the time interval between two adjacent information collection moments is 0.01s.
  • the previous moment may be the moment immediately before the last moment, or may be a moment before the last moment and separated from it by a preset number of moments (for example, 2). The embodiment of the present disclosure does not limit this.
  • the movement speed of the dynamic gesture can be calculated based on the change between the hand position information at the last moment and the hand position information at the previous moment, and the time between the last moment and the previous moment.
  • the change between the hand position information at the last moment and the hand position information at the previous moment can be the average change in the distance between the corresponding points of the hand in the hand position information at the last moment and the previous moment, or it may be the change in the distance between the preset hand points (such as fingertips) in the hand position information at the last moment and the previous moment, etc., and the embodiment of the present disclosure does not limit this.
  • the movement speed of the dynamic gesture can be accurately determined using the hand position information of the sound source object obtained by the wearable device at different times.
  • the user sends a voice control instruction "I want to adjust the main driver's window with gestures."
  • the device implementing the embodiments of the present disclosure performs speech recognition, determines based on the obtained first speech recognition result that the target device is the main driver's window, and accesses the control authority of the main driver's window; the user draws a circle clockwise, and the main driver's window continuously rises. While the main driver's window is rising, the user sends the speed adjustment voice command "It's too slow. Three turns can raise the entire window." Based on this, the device implementing the embodiment of the present disclosure determines the updated device adjustment speed corresponding to the user's circle-drawing action, and in turn controls the main driver's window to rise at the updated adjustment speed. The user continues to draw circles until the main driver's window is adjusted to the height expected by the user.
  • Scenario 2 Adjust the front and rear seats on the vehicle:
  • the user sends a voice control instruction "I want to adjust the driver's seat forward, one centimeter forward per turn."
  • the device implementing the embodiment of the present disclosure performs speech recognition. Based on the obtained first voice recognition result, it determines that the target device is the main driving seat, the target dimension parameter is front and rear, and the adjustment speed configuration information is "one centimeter forward per turn", and accesses the control authority of the main driving seat; the user draws a circle clockwise, and the driver's seat moves forward continuously. The user continues to draw circles until the driver's seat is adjusted to the user's desired position.
  • Scenario 3 Adjust the left rearview mirror on the vehicle based on hand shape information:
  • the user sends a voice control instruction "I want to adjust the left rearview mirror with gestures.”
  • After receiving the voice control instruction sent by the user, the device implementing the embodiments of the present disclosure performs speech recognition, determines based on the obtained first speech recognition result that the target device is the left rearview mirror, and accesses the control authority of the left rearview mirror. When the user draws a circle counterclockwise with the right hand, the left rearview mirror continuously tilts downward; when the user draws a circle clockwise with the right hand, the left rearview mirror continuously tilts upward; when the user draws a circle counterclockwise with the left hand, the left rearview mirror continuously turns outward; when the user draws a circle clockwise with the left hand, the left rearview mirror continuously turns inward.
  • alternatively, when the user extends the index finger of the right hand and draws a circle counterclockwise, the left rearview mirror continuously tilts downward; when the user draws a circle clockwise with it, the left rearview mirror continuously tilts upward; when the user extends the index finger and middle finger of the right hand and draws a circle counterclockwise, the left rearview mirror continuously turns outward; when the user draws a circle clockwise with them, the left rearview mirror continuously turns inward.
  • the specific adjustment speed can be determined by obtaining the preconfigured adjustment speed configuration information, or you can also refer to the above scenario one and two to configure the adjustment speed configuration information through user voice commands. The user continues to move in circles until the left rearview mirror is adjusted to the direction expected by the user.
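One way to picture how hand form information selects the dimension parameter in Scenario 3 is a small lookup table. The sketch below assumes that drawing the circle in the opposite direction reverses the adjustment; the labels `pitch` and `yaw` and the names `MIRROR_GESTURE_MAP` and `resolve_mirror_adjustment` are illustrative and not part of the disclosure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping for the left rearview mirror:
# (hand, rotation) -> (dimension parameter, adjustment direction).
MIRROR_GESTURE_MAP: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("right", "counterclockwise"): ("pitch", "down"),
    ("right", "clockwise"): ("pitch", "up"),
    ("left", "counterclockwise"): ("yaw", "outward"),
    ("left", "clockwise"): ("yaw", "inward"),
}


def resolve_mirror_adjustment(hand: str, rotation: str) -> Optional[Tuple[str, str]]:
    """Return (dimension parameter, direction), or None for an unmapped gesture."""
    return MIRROR_GESTURE_MAP.get((hand, rotation))


print(resolve_mirror_adjustment("right", "counterclockwise"))  # ('pitch', 'down')
```

The same table shape would work if the key were the number of extended fingers instead of the hand used.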
  • The user sends the voice control instruction "I want to adjust the air conditioner's air volume with gestures."
  • After receiving the voice control instruction sent by the user, the device implementing the embodiments of the present disclosure performs speech recognition, determines based on the obtained first speech recognition result that the target device is the air conditioner and that the target dimension parameter is the air volume, and obtains control authority over the air conditioner. If the user draws a circle clockwise, the air volume of the air conditioner increases; if the user draws a circle counterclockwise, the air volume decreases. The specific adjustment speed can be determined from preconfigured adjustment speed configuration information.
  • The user sends an adjustment speed update voice command to update the adjustment speed of the air volume of the air conditioner. The user continues to draw circles until the air volume of the air conditioner reaches the level the user expects.
  • Any device control method provided by the embodiments of the present disclosure can be executed by any appropriate device with data processing capabilities, including but not limited to: terminal devices and servers.
  • any of the device control methods provided by the embodiments of the present disclosure can be executed by the processor.
  • the processor executes any of the device control methods mentioned in the embodiments of the present disclosure by calling corresponding instructions stored in the memory. No further details will be given below.
  • the equipment control device of the embodiment of the present disclosure can be used to implement the equipment control method of the above-mentioned embodiments of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an equipment control device provided by an exemplary embodiment of the present disclosure.
  • The equipment control device of this embodiment includes: a voice recognition module 602, a first determination module 604, a detection module 606 and an adjustment module 608. Specifically:
  • the voice recognition module 602 is configured to perform voice recognition on the voice control instruction in response to receiving the voice control instruction to obtain a first voice recognition result.
  • the first determination module 604 is configured to determine the target device corresponding to the voice control instruction based on the first voice recognition result obtained by the voice recognition module 602.
  • the target device corresponding to the voice control instruction is the device whose status needs to be adjusted.
  • the target device can be any device such as home appliances, vehicle-mounted equipment, terminal equipment, etc.
  • The vehicle-mounted equipment is equipment on the vehicle and can include, for example but not limited to: the left rearview mirror, the right rearview mirror, the interior rearview mirror, windows, the air conditioner, seats, the audio system, lights, etc.
  • the embodiments of the present disclosure do not limit the scope of the target device and the specific scope of the vehicle-mounted device.
  • the detection module 606 is used to detect preset dynamic gestures.
  • The preset dynamic gesture in the embodiments of the present disclosure may include, for example but not limited to, drawing circles.
  • the adjustment module 608 is configured to continuously adjust the state of the target device determined by the first determination module 604 based on the continuous action of the dynamic gesture in response to the detection module 606 detecting the preset dynamic gesture.
  • The target device that needs to be adjusted can be determined based on the voice control instruction without manually selecting it, which improves the efficiency and convenience of selecting the target device and avoids the inconvenience of manual selection. In addition, the state of the target device is adjusted continuously based on the continuous action of the dynamic gesture, which achieves continuous operational control of the target device and makes the adjustment of its state more flexible, fine, and precise, thus improving the control effect on the target device.
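The division of labor among modules 602 to 608 can be sketched as a small class. This is a toy stand-in under loud assumptions: speech recognition is replaced by an already-recognized text string, target selection is a simple substring match, and each detected gesture increment is passed in as a number. The class `DeviceController` and its methods are hypothetical names, not the disclosed implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeviceController:
    """Toy stand-in for modules 602-608; all names here are illustrative."""

    known_devices: List[str]
    state: Dict[str, float] = field(default_factory=dict)
    target: Optional[str] = None

    def on_voice_instruction(self, recognized_text: str) -> Optional[str]:
        # Stand-in for the first determination module: pick the first known
        # device whose name appears in the recognized text.
        lowered = recognized_text.lower()
        self.target = next((d for d in self.known_devices if d in lowered), None)
        return self.target

    def on_gesture_step(self, delta: float) -> Optional[float]:
        # Stand-in for the adjustment module: each detected gesture increment
        # nudges the state of the selected target device.
        if self.target is None:
            return None
        self.state[self.target] = self.state.get(self.target, 0.0) + delta
        return self.state[self.target]


controller = DeviceController(["window", "seat"])
controller.on_voice_instruction("Raise the driver's window")
for _ in range(3):
    controller.on_gesture_step(0.5)
print(controller.state)  # {'window': 1.5}
```

Gesture steps received before any target has been selected are ignored, which mirrors the order in the text: the voice instruction fixes the target first, and the gesture then drives the adjustment.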
  • Figure 10 is a schematic structural diagram of an equipment control device provided by another exemplary embodiment of the present disclosure. As shown in Figure 10, based on the embodiment shown in Figure 9, the equipment control device of this embodiment may further include: a second determination module 702, used to determine the target dimension parameters to be adjusted of the target equipment.
  • The adjustment module 608 may include: a first determination unit 6082, used to determine the movement direction of the dynamic gesture; a second determination unit 6084, used to determine the target adjustment direction of the target device on the target dimension parameter based on the movement direction of the dynamic gesture; and an adjustment unit 6086, used to continuously adjust the target dimension parameter of the target device in the target adjustment direction based on the continuous action of the dynamic gesture in the movement direction.
  • The state of the target device is determined based on one dimension parameter.
  • The second determination module 702 is specifically used to determine that one dimension parameter of the target device as the target dimension parameter.
  • the status of the target device is determined based on multiple dimensional parameters.
  • the second determination module 702 is specifically configured to determine the target dimension parameters based on the first speech recognition result.
  • the status of the target device is determined based on multiple dimensional parameters.
  • the voice recognition module 602 is also configured to perform voice recognition on the dimension parameter voice command in response to receiving the dimension parameter voice command, and obtain a second voice recognition result.
  • the second determination module 702 is specifically configured to determine the target dimension parameters based on the second speech recognition result.
  • The device control device of this embodiment may also include: a first acquisition module 704, specifically used to acquire hand form information corresponding to the dynamic gesture, where the hand form information may include, for example but not limited to, any of the following: finger extension form, number of fingers, single-hand information, etc.
  • The finger extension form may be, for example, straightened, bent, etc.; the number of fingers may be, for example, one, two, etc.; the single-hand information may be, for example, the left hand, the right hand, or both hands, etc.
  • The second determination module 702 is specifically configured to determine the target dimension parameter based on the hand form information obtained by the first acquisition module 704.
  • a second acquisition module 706 and a third determination module 708 may also be included.
  • the second acquisition module 706 is used to acquire the movement speed of the dynamic gesture in real time or according to a preset adjustment period during the continuous action of the dynamic gesture.
  • the third determination module 708 is configured to determine the target adjustment speed of the target device on the target dimension parameter based on the movement speed of the dynamic gesture.
  • the adjustment unit 6086 is specifically used to adjust the target device on the target dimension parameter at the target adjustment speed in the target adjustment direction.
  • The equipment control device may also include: a third acquisition module 710, used to obtain the adjustment speed configuration information of the target device on the target dimension parameter, where the adjustment speed configuration information represents the relationship between the gesture movement speed and the device adjustment speed on the target dimension parameter.
  • the third determination module 708 is specifically configured to determine, based on the adjustment speed configuration information, the device adjustment speed corresponding to the movement speed of the dynamic gesture on the target dimension parameter as the target adjustment speed.
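A minimal sketch of how the third determination module might turn a gesture movement speed into a target adjustment speed is shown below. It assumes the relationship stored in the adjustment speed configuration information is linear and that gesture speed is measured in circles per second; the disclosure does not fix either choice. `AdjustSpeedConfig` and `target_adjust_speed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustSpeedConfig:
    """Hypothetical linear relationship: device units moved per gesture circle."""

    units_per_circle: float


def target_adjust_speed(config: AdjustSpeedConfig, circles_per_second: float) -> float:
    """Return the device adjustment speed (units per second) for a gesture speed."""
    return config.units_per_circle * circles_per_second


# One centimeter per circle, drawn at two circles per second.
print(target_adjust_speed(AdjustSpeedConfig(units_per_circle=1.0), 2.0))  # 2.0
```

With a linear relationship, drawing circles faster speeds up the device proportionally, and stopping the gesture stops the adjustment.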
  • the third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from the first speech recognition result.
  • the third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from the preconfigured adjustment speed configuration information.
  • The voice recognition module 602 can also be used to, in response to receiving an adjustment speed configuration voice instruction, perform voice recognition on the adjustment speed configuration voice instruction and obtain a third voice recognition result.
  • The third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from the third speech recognition result.
  • the device control device may also include: a configuration module 712, configured to receive an adjustment speed configuration request through a setting interface.
  • the adjustment speed configuration request includes a device identification, a dimension parameter identification, and a gesture movement amplitude.
  • the device identifier is used to uniquely identify a device
  • the dimension parameter identifier is used to uniquely identify a dimension parameter
  • The relationship between the gesture movement speed and the device adjustment speed is determined. Based on the device identification, the dimension parameter identification, and the relationship between the gesture movement speed and the device adjustment speed, the adjustment speed configuration information of the device identified by the device identification on the dimension parameter identified by the dimension parameter identification is configured; or, based on the same information, the adjustment speed configuration information corresponding to the device identification and the dimension parameter identification in the preconfigured adjustment speed configuration information is updated.
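The configure-or-update behavior of the configuration module can be sketched as a table keyed by the device identification and the dimension parameter identification. The sketch assumes a linear relationship stored as a single ratio and assumes the request carries both a gesture amount and a corresponding device amount; `device_amount`, `AdjustSpeedTable`, and `apply_config_request` are hypothetical and are not named in the disclosure.

```python
from typing import Dict, Tuple

# Key: (device identification, dimension parameter identification).
# Value: device adjustment per unit of gesture movement (assumed linear).
AdjustSpeedTable = Dict[Tuple[str, str], float]


def apply_config_request(
    table: AdjustSpeedTable,
    device_id: str,
    dimension_id: str,
    gesture_amount: float,
    device_amount: float,
) -> float:
    """Configure or update one entry and return the stored ratio."""
    if gesture_amount <= 0:
        raise ValueError("gesture_amount must be positive")
    ratio = device_amount / gesture_amount
    # Inserts a new entry, or overwrites the existing one for this key.
    table[(device_id, dimension_id)] = ratio
    return ratio


table: AdjustSpeedTable = {}
apply_config_request(table, "seat_driver", "fore_aft", 1.0, 1.0)   # first configuration
apply_config_request(table, "seat_driver", "fore_aft", 3.0, 30.0)  # later update
print(table)  # {('seat_driver', 'fore_aft'): 10.0}
```

Because the key combines both identifiers, the same device can carry different ratios for different dimension parameters.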
  • The speech recognition module 602 is also used to, while the target dimension parameter of the target device is being continuously adjusted in the target adjustment direction based on the continuous action of the dynamic gesture in the movement direction, and in response to receiving a speed adjustment voice instruction, perform voice recognition on the speed adjustment voice instruction and obtain a fourth voice recognition result.
  • The third acquisition module 710 is also used to acquire the adjustment speed update configuration information from the fourth speech recognition result.
  • The adjustment speed update configuration information represents the updated relationship between the gesture movement speed and the device adjustment speed on each dimension parameter of the target device.
  • the second acquisition module 706 is also used to acquire the movement speed of the dynamic gesture in real time or according to a preset adjustment period during the subsequent continuous action of the dynamic gesture.
  • the third determination module 708 is also configured to determine the update device adjustment speed corresponding to the movement speed of the dynamic gesture on the target dimension parameter based on the adjustment speed update configuration information.
  • the adjustment unit 6086 is also used to adjust the target device on the target dimension parameters in the target adjustment direction at an updated adjustment speed.
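The mid-gesture speed update described above can be sketched as an adjuster whose ratio may be replaced while the gesture continues. The sketch assumes a linear units-per-circle ratio and assumes the device state is clamped to a fixed range (for example 0 to 100 percent for a window); the clamping and the names `ContinuousAdjuster`, `update_speed`, and `on_gesture` are illustrative and not from the disclosure.

```python
class ContinuousAdjuster:
    """Applies gesture increments to a device state with an updatable ratio."""

    def __init__(self, units_per_circle: float, position: float = 0.0,
                 lower: float = 0.0, upper: float = 100.0) -> None:
        self.units_per_circle = units_per_circle
        self.position = position
        self.lower = lower
        self.upper = upper

    def update_speed(self, units_per_circle: float) -> None:
        # Takes effect for all gesture increments received after this call.
        self.units_per_circle = units_per_circle

    def on_gesture(self, circles: float) -> float:
        # Positive circles raise the state, negative circles lower it;
        # the result is clamped to [lower, upper] (an assumption of this sketch).
        moved = self.position + circles * self.units_per_circle
        self.position = min(self.upper, max(self.lower, moved))
        return self.position


window = ContinuousAdjuster(units_per_circle=10.0)
window.on_gesture(1.0)            # slow initial ratio
window.update_speed(100.0 / 3.0)  # "three turns should raise the entire window"
window.on_gesture(3.0)            # reaches the upper limit and stays there
```

Only the increments drawn after the update use the new ratio, so the part of the adjustment already performed is not replayed.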
  • a fourth determination module 714 may be further included, configured to determine the position of the sound source object that sends the voice control instruction.
  • The detection module 606 is specifically configured to: based on the position of the sound source object, obtain an image sequence including the hand of the sound source object, where the image sequence includes multiple frames of images with a temporal relationship; perform hand key point detection on each frame image in the image sequence to obtain a hand key point sequence, where the hand key point sequence is formed from the hand key points in each frame image according to their time-series relationship; and perform preset dynamic gesture detection based on the hand key point sequence.
  • the first determining unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the hand key point sequence.
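One plausible way to derive the movement direction of a circle-drawing gesture from a key point sequence is the signed area (shoelace) formula applied to one tracked fingertip. The disclosure does not name this technique; it is swapped in here for illustration. The sketch assumes image coordinates with y growing downward and an unmirrored camera view, and the function name `rotation_direction` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rotation_direction(fingertip_track: List[Point]) -> str:
    """Classify a roughly closed fingertip track as clockwise or counterclockwise.

    Uses the shoelace (signed area) formula. Coordinates are assumed to be image
    coordinates with y growing downward, so a positive signed area corresponds
    to a clockwise path as seen in the image.
    """
    if len(fingertip_track) < 3:
        return "unknown"
    twice_area = 0.0
    closed = fingertip_track[1:] + fingertip_track[:1]
    for (x0, y0), (x1, y1) in zip(fingertip_track, closed):
        twice_area += x0 * y1 - x1 * y0
    if twice_area > 0:
        return "clockwise"
    if twice_area < 0:
        return "counterclockwise"
    return "unknown"


# Top-left -> top-right -> bottom-right -> bottom-left in image coordinates.
print(rotation_direction([(0, 0), (1, 0), (1, 1), (0, 1)]))  # clockwise
```

If the camera image is mirrored, the two labels swap, so the sign convention would need to be checked against the actual sensor setup.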
  • The second acquisition module 706 is specifically used to obtain the movement speed of the dynamic gesture based on the hand key points in the last frame image in the image sequence, the hand key points in the previous frame image, and the acquisition times of the last frame image and the previous frame image.
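The speed computation from the last two frames reduces to distance divided by elapsed time. The sketch below assumes a single tracked key point per frame (for example a fingertip) given in pixel coordinates and timestamps in seconds; the function name `movement_speed` and its parameters are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def movement_speed(prev_point: Point, prev_time_s: float,
                   last_point: Point, last_time_s: float) -> float:
    """Return the key point speed (pixels per second) between two frames."""
    dt = last_time_s - prev_time_s
    if dt <= 0:
        raise ValueError("last frame must be acquired after the previous frame")
    return math.dist(prev_point, last_point) / dt


# The fingertip moved 5 pixels in half a second.
print(movement_speed((0.0, 0.0), 0.0, (3.0, 4.0), 0.5))  # 10.0
```

The same formula applies to the ToF and wearable-device variants if the two points are replaced by the corresponding distance or position measurements.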
  • The detection module 606 is specifically used to: based on the position of the sound source object, use a ToF sensor to measure the distance between each point of the hand of the sound source object and the ToF sensor to obtain a set of distance information; obtain a distance information sequence based on multiple sets of distance information with a time-series relationship; and perform preset dynamic gesture detection based on the distance information sequence.
  • the first determining unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the distance information sequence.
  • The second acquisition module 706 is specifically configured to obtain the movement speed of the dynamic gesture based on the last set of distance information and the previous set of distance information in the distance information sequence, and the measurement moments corresponding to the last set and the previous set of distance information.
  • The detection module 606 is specifically used to: based on the position of the sound source object, use a wearable device to obtain the position of each point of the hand of the sound source object to obtain hand position information, where the hand position information includes position information of each point of the hand; determine the posture of the hand based on the acquired hand position information; determine the movement of the hand based on the posture of the hand at multiple moments; confirm whether the movement of the hand is the preset dynamic gesture action; and, in response to the movement of the hand being the preset dynamic gesture action, confirm that the preset dynamic gesture is detected.
  • The first determining unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the hand postures at the multiple moments.
  • The second acquisition module 706 is specifically configured to obtain the movement speed of the dynamic gesture based on the last moment and the previous moment among the multiple moments, as well as the hand position information at the last moment and the hand position information at the previous moment.
  • FIG. 11 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure. Next, an electronic device according to an embodiment of the present disclosure is described with reference to FIG. 11 .
  • The electronic device may be either or both of the first device 100 and the second device 200, or a stand-alone device independent of them; the stand-alone device may communicate with the first device and the second device to receive the collected input signals from them.
  • the electronic device includes one or more processors 802 and memory 804.
  • Processor 802 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
  • Memory 804 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache).
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 802 may execute the program instructions to implement the device control methods of the various embodiments of the present disclosure described above and/or other desired functionality.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the electronic device may also include an input device 806 and an output device 808, with these components interconnected by a bus system and/or other forms of connection mechanisms (not shown).
  • the input device 806 may be the above-mentioned microphone or microphone array, used to capture the input signal of the sound source.
  • the input device 806 may be a communication network connector for receiving the collected input signals from the first device 100 and the second device 200 .
  • the input device 806 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 808 can output various information to the outside, including determined distance information, direction information, etc.
  • the output devices 808 may include, for example, displays, speakers, printers, and communication networks and remote output devices to which they are connected, among others.
  • the electronic device may include any other suitable components depending on the specific application.
  • Embodiments of the present disclosure may also be a computer program product, which includes computer program instructions that, when executed by a processor, cause the processor to perform the steps in the device control method according to various embodiments of the present disclosure described in the "Exemplary Method" section above in this specification.
  • the methods and apparatus of the present disclosure may be implemented in many ways.
  • the methods and devices of the present disclosure may be implemented through software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above order for the steps of the methods is for illustration only, and the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated.
  • the present disclosure may also be implemented as programs recorded in recording media, and these programs include machine-readable instructions for implementing methods according to the present disclosure.
  • the present disclosure also covers recording media storing programs for executing methods according to the present disclosure.
  • each component or each step can be decomposed and/or recombined. These decompositions and/or recombinations should be considered equivalent versions of the present disclosure.

Abstract

Disclosed in the embodiments of the present disclosure are a device control method and apparatus, and an electronic device and a medium. The device control method comprises: in response to receiving a voice control instruction, performing voice recognition on the voice control instruction, so as to obtain a first voice recognition result; on the basis of the first voice recognition result, determining a target device corresponding to the voice control instruction; and in response to detecting a preset dynamic gesture, continuously adjusting the state of the target device on the basis of a continuous action of the dynamic gesture. By means of the embodiments of the present disclosure, the efficiency and convenience of selecting a target device can be improved, and continuous operation control of the target device can be realized, such that the state of the target device is adjusted more flexibly, finely and accurately.

Description

Device control method and apparatus, electronic device, and medium
This disclosure claims priority to Chinese patent application No. 202210242711.4, entitled "Device control method and apparatus, electronic device, and medium" and filed on March 11, 2022, the entire content of which is incorporated into this disclosure by reference.
Technical Field
The present disclosure relates to artificial intelligence technology, and in particular to a device control method and apparatus, an electronic device, and a medium.
Background
Human-computer interaction refers to the process of exchanging information between a human and a machine, using a certain dialogue language and a certain mode of interaction, in order to complete a given task. Traditional human-computer interaction is mainly achieved through input and output devices such as keyboards, mice, and monitors. With the development of technologies such as artificial intelligence, humans and machines can now interact in a manner similar to natural language.
With the popularity of smart vehicles, the number of on-board devices on smart vehicles is gradually increasing, and more and more auxiliary functions can be implemented. For a driver who is driving, manually operating on-board devices to implement the corresponding functions is inconvenient and unsafe in many respects.
Summary
The present disclosure is proposed in order to solve the above technical problems. Embodiments of the present disclosure provide a device control method and apparatus, an electronic device, and a medium.
According to one aspect of the embodiments of the present disclosure, a device control method is provided, including:
in response to receiving a voice control instruction, performing voice recognition on the voice control instruction to obtain a first voice recognition result;
based on the first voice recognition result, determining a target device corresponding to the voice control instruction;
in response to detecting a preset dynamic gesture, continuously adjusting the state of the target device based on the continuous action of the dynamic gesture.
According to another aspect of the embodiments of the present disclosure, a device control apparatus is provided, including:
a voice recognition module, configured to perform voice recognition on a voice control instruction in response to receiving the voice control instruction, to obtain a first voice recognition result;
a determination module, configured to determine a target device corresponding to the voice control instruction based on the first voice recognition result obtained by the voice recognition module;
a detection module, configured to detect a preset dynamic gesture;
an adjustment module, configured to continuously adjust the state of the target device based on the continuous action of the dynamic gesture in response to the detection module detecting the preset dynamic gesture.
According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, where the storage medium stores a computer program, and the computer program is used to execute the device control method described in any of the above embodiments of the present disclosure.
According to still another aspect of the embodiments of the present disclosure, an electronic device is provided, the electronic device including:
a processor;
a memory for storing instructions executable by the processor;
the processor being configured to read the executable instructions from the memory and execute the instructions to implement the device control method described in any of the above embodiments of the present disclosure.
Based on the device control method and apparatus, the electronic device, and the medium provided by the above embodiments of the present disclosure, when a voice control instruction is received, voice recognition is performed on the voice control instruction to obtain a first voice recognition result; the target device corresponding to the voice control instruction is then determined based on the first voice recognition result; and, when a preset dynamic gesture is detected, the state of the corresponding target device is continuously adjusted based on the continuous action of the dynamic gesture. Therefore, embodiments of the present disclosure can determine the target device to be adjusted based on the voice control instruction without manual selection, which improves the efficiency and convenience of selecting the target device and avoids the inconvenience of manual selection. In addition, continuously adjusting the state of the target device based on the continuous action of the dynamic gesture achieves continuous operational control of the target device and makes the adjustment of its state more flexible, fine, and precise, thereby improving the control effect on the target device.
Embodiments of the present disclosure can be used to adjust the state of any device, such as home appliances, vehicle-mounted devices, and terminal devices. When applied to vehicles, embodiments of the present disclosure can improve the efficiency, convenience, and safety of selecting and operating vehicle-mounted devices, and avoid the inconvenience and safety problems of a driver manually operating vehicle-mounted devices while driving. Moreover, the continuous action of the dynamic gesture achieves continuous operational control of the vehicle-mounted devices and makes the adjustment of their state more flexible, fine, and precise, thereby improving the control effect on the vehicle-mounted devices.
The technical solution of the present disclosure is described in further detail below with reference to the accompanying drawings and embodiments.
Brief Description of the Drawings
The above and other objects, features, and advantages of the present disclosure will become more apparent from a more detailed description of the embodiments of the present disclosure in conjunction with the accompanying drawings. The accompanying drawings provide further understanding of the embodiments of the present disclosure and constitute a part of the specification. Together with the embodiments, they serve to explain the present disclosure and do not limit it. In the drawings, like reference numbers generally represent like components or steps.
FIG. 1 is a diagram of a system to which the present disclosure is applicable.
FIG. 2 is a schematic flowchart of a device control method provided by an exemplary embodiment of the present disclosure.
FIG. 3 is a schematic diagram of drawing a circle with one finger in an embodiment of the present disclosure.
FIG. 4 is a schematic flowchart of a device control method provided by another exemplary embodiment of the present disclosure.
FIG. 5 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
FIG. 6 is a schematic flowchart of a device control method provided by still another exemplary embodiment of the present disclosure.
FIG. 7 is a schematic flowchart of a device control method provided by a further exemplary embodiment of the present disclosure.
FIG. 8 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure.
FIG. 9 is a schematic structural diagram of a device control apparatus provided by an exemplary embodiment of the present disclosure.
FIG. 10 is a schematic structural diagram of a device control apparatus provided by another exemplary embodiment of the present disclosure.
FIG. 11 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure, and it should be understood that the present disclosure is not limited by the example embodiments described here.
It should be noted that, unless otherwise specifically stated, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
Those skilled in the art will understand that terms such as "first" and "second" in the embodiments of the present disclosure are used only to distinguish different steps, devices, modules, and the like. They carry no specific technical meaning and do not imply any necessary logical order among the items they label.
It should also be understood that, in the embodiments of the present disclosure, "a plurality of" may mean two or more, and "at least one" may mean one, two, or more.
Embodiments of the present disclosure may be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with such electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
Application Overview
Artificial intelligence (AI) enables machines to perform complex tasks that usually require human intelligence. To carry out human instructions, efficient and accurate human-computer interaction is essential. In recent years, as AI technology has developed, the application of speech recognition in vehicle-mounted devices has attracted growing attention in the industry.
To avoid the inconvenience and safety risks of a driver manually operating vehicle-mounted devices while driving, the related art controls vehicle-mounted devices through voice commands.
However, the inventors found through research that controlling a vehicle-mounted device through voice commands cannot provide continuous operation control of the device, so the control effect is poor. For example, when the voice command "open window" is used to open a vehicle window, the opening amount can only follow the default setting and cannot be controlled precisely. If the window has not opened as far as the user expects, the user must repeat the voice command "open window" several times to open it further, which is inefficient. If the window has opened further than the user expects, the opening amount cannot be precisely reduced, so the user's needs cannot be met.
In view of this, embodiments of the present disclosure propose a device control method and apparatus, an electronic device, and a medium, to improve the efficiency, convenience, and safety of selecting and operating vehicle-mounted devices while providing continuous operation control of a target device.
Embodiments of the present disclosure determine the target device to be adjusted from a voice control instruction and continuously adjust the state of the target device through the sustained action of a dynamic gesture. The target device does not have to be selected manually, which improves the efficiency and convenience of target device selection and avoids the inconvenience of manual selection. Continuous operation control of the target device is also achieved, so the state of the target device can be adjusted more flexibly, finely, and precisely, which improves the control effect.
Embodiments of the present disclosure can be used to adjust the state of any device, such as a home appliance, a vehicle-mounted device, or a terminal device. When applied to a vehicle, they can improve the efficiency, convenience, and safety of selecting and operating vehicle-mounted devices, and avoid the inconvenience and safety risks of a driver manually operating vehicle-mounted devices while driving. In addition, the sustained action of the dynamic gesture provides continuous operation control of the vehicle-mounted device, so its state can be adjusted more flexibly, finely, and precisely, which improves the control effect.
Exemplary System
FIG. 1 is a diagram of a system to which the present disclosure is applicable. As shown in FIG. 1, a voice control instruction is captured by an audio acquisition module 102 (for example, a microphone). The voice control instruction, either as captured or after front-end signal processing, is input to a device control apparatus 104 of an embodiment of the present disclosure. The device control apparatus 104 performs speech recognition on the received voice control instruction to obtain a first speech recognition result, and determines, based on that result, the target device 106 corresponding to the voice control instruction. It then invokes an image acquisition module 108 (for example, a camera) to capture a video stream and performs preset dynamic gesture detection on that video stream. When the preset dynamic gesture is detected, the state of the target device 106 is continuously adjusted based on the sustained action of the dynamic gesture.
Embodiments of the present disclosure can be used to adjust the state of any device, such as a home appliance, a vehicle-mounted device, or a terminal device; that is, the target device 106 may be any such device. When the target device 106 is a vehicle-mounted device, the embodiments address the various interaction scenarios in the cabin by combining voice and dynamic gestures for human-computer interaction. Speech recognition of the voice control instruction grants control of the device to be operated, and the dynamic gesture then performs the possible continuous operations on that device. During continuous operation control, the movement speed of the dynamic gesture can also control the adjustment speed of the device. This improves the efficiency, convenience, and safety of selecting and operating vehicle-mounted devices, and avoids the inconvenience and safety risks of a driver manually operating them while driving. In addition, the sustained action of the dynamic gesture provides continuous operation control of the vehicle-mounted device, so its state can be adjusted more flexibly, finely, and precisely, which improves the control effect. The embodiments of the present disclosure make full use of the strength of voice control as an access interface and the fine adjustment capability of dynamic gestures, and offer simple operation, good robustness, fine adjustment, high interaction efficiency, and a wide range of functions.
Exemplary Method
FIG. 2 is a schematic flowchart of a device control method provided by an exemplary embodiment of the present disclosure. This embodiment can be applied to an electronic device. As shown in FIG. 2, the device control method of this embodiment includes the following steps:
Step 202: in response to receiving a voice control instruction, perform speech recognition on the voice control instruction to obtain a first speech recognition result.
The voice control instruction in the embodiments of the present disclosure may be the original voice control instruction captured directly by an audio acquisition module (for example, a microphone), or a voice control instruction obtained by applying front-end signal processing to the original voice control instruction captured by the audio acquisition module. The embodiments of the present disclosure do not limit this.
The front-end signal processing may include, but is not limited to: voice activity detection (VAD), noise reduction, acoustic echo cancellation (AEC), dereverberation, device control, and beamforming (BF).
Voice activity detection, also called voice endpoint detection or voice boundary detection, detects whether speech is present in an audio signal in a noisy environment and accurately locates the start position of the speech segment in the audio signal. It is commonly used in speech processing systems such as speech coding and speech enhancement, where it helps lower the speech coding rate, save communication bandwidth, reduce the energy consumption of mobile devices, and improve the recognition rate. The VAD start point is the transition from silence to speech, and the VAD end point is the transition from speech to silence; determining the end point requires a period of silence. The speech obtained by front-end signal processing of the original audio signal covers the span from the VAD start point to the VAD end point. Therefore, the voice control instruction in the embodiments of the present disclosure may also include a period of silence after the speech segment.
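The endpoint behaviour described above can be illustrated with a minimal sketch. It assumes a simple per-frame energy threshold, which is only one possible way to perform VAD and is not stated in the disclosure; the function name `detect_speech_segment` and the parameters `energy_threshold` and `min_trailing_silence` are hypothetical.

```python
from typing import List, Optional, Tuple


def detect_speech_segment(
    frame_energies: List[float],
    energy_threshold: float = 0.5,   # assumed threshold, illustration only
    min_trailing_silence: int = 3,   # silent frames needed to confirm the end point
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the first speech segment, or None.

    start is the first frame at or above the threshold (silence -> speech).
    end is the index just past the silent frames that confirmed the end point,
    so the returned span can include a short run of trailing silence.
    """
    start = None
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        is_speech = energy >= energy_threshold
        if start is None:
            if is_speech:
                start = i            # VAD start point
            continue
        if is_speech:
            silent_run = 0           # a short pause did not end the segment
        else:
            silent_run += 1
            if silent_run >= min_trailing_silence:
                return start, i + 1  # VAD end point confirmed by silence
    if start is None:
        return None
    # The audio ended before enough silence was observed.
    return start, len(frame_energies)
```

Because the end point is only confirmed after several silent frames, the returned span carries those frames with it, which matches the remark that the voice control instruction may include a period of silence after the speech segment.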
Step 204: based on the first speech recognition result, determine the target device corresponding to the voice control instruction.
The target device corresponding to the voice control instruction is the device whose state is to be adjusted. The target device may be any device, such as a home appliance, a vehicle-mounted device, or a terminal device. A vehicle-mounted device is a device on a vehicle and may include, but is not limited to, the following: the left rearview mirror, the right rearview mirror, the interior rearview mirror, the windows, the air conditioners, the seats, the audio system, and the lights. The embodiments of the present disclosure do not limit the range of target devices or the specific range of vehicle-mounted devices.
Step 206: in response to detecting a preset dynamic gesture, continuously adjust the state of the target device based on the sustained action of the dynamic gesture.
Through step 206, the user can continuously adjust the state of the target device by continuing to make the dynamic gesture until the target device reaches the state the user expects, for example until the vehicle window has been lowered to the desired height. Stopping the dynamic gesture stops the adjustment of the state of the target device.
In this embodiment, the target device to be adjusted is determined from the voice control instruction, so the target device does not have to be selected manually. This improves the efficiency and convenience of target device selection and avoids the inconvenience of manual selection. In addition, the state of the target device is continuously adjusted based on the sustained action of the dynamic gesture, which provides continuous operation control of the target device and allows its state to be adjusted more flexibly, finely, and precisely, improving the control effect.
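Steps 204 and 206 can be sketched as follows, under stated assumptions. The keyword table `DEVICE_KEYWORDS`, the function names, and the representation of the video stream as a sequence of booleans (one per frame, true while the preset dynamic gesture is detected) are all hypothetical simplifications; speech recognition and gesture detection themselves are outside the sketch.

```python
from typing import Iterable, Optional

# Hypothetical keyword-to-device table; the disclosure does not specify one.
DEVICE_KEYWORDS = {
    "window": "window",
    "seat": "driver_seat",
    "mirror": "left_rearview_mirror",
}


def determine_target_device(recognition_result: str) -> Optional[str]:
    """Step 204: pick the target device named in the recognized text."""
    text = recognition_result.lower()
    for keyword, device in DEVICE_KEYWORDS.items():
        if keyword in text:
            return device
    return None


def adjust_continuously(
    position: float,
    gesture_active_frames: Iterable[bool],
    step: float = 5.0,
    low: float = 0.0,
    high: float = 100.0,
) -> float:
    """Step 206: move the state one step per frame while the gesture lasts.

    Adjustment begins at the first frame where the gesture is detected and
    stops as soon as the gesture stops, even if later frames show it again.
    """
    started = False
    for active in gesture_active_frames:
        if active:
            started = True
            position = min(high, max(low, position + step))
        elif started:
            break
    return position
```

For example, `adjust_continuously(0.0, [False, True, True, True, False, True])` applies three steps and then stops, which mirrors the behaviour that ending the gesture ends the adjustment.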
The preset dynamic gesture in the embodiments of the present disclosure can be designed with the following characteristics in mind: (1) it matches natural habits and is easy to make, which improves convenience; (2) it is a dynamic gesture, which is more robust than a static hand pose in a single image frame; (3) it is distinct from everyday habitual movements, so the probability of it being confused with other movements is low; (4) it has different movement directions and can therefore be reused.
Based on these characteristics, in some implementations the preset dynamic gesture may be, for example, drawing a circle, that is, drawing a circle in the air. This may include, but is not limited to, drawing a circle with the left hand, the right hand, or both hands, with any one or more fingers, with a clenched fist, or with bent fingers. FIG. 3 is a schematic diagram of drawing a circle with one finger. The preset dynamic gesture of the embodiments of the present disclosure is not limited to this and may be any dynamic gesture having the above characteristics.
The preset dynamic gesture in this embodiment satisfies all of the above characteristics: it is robust, compact, natural, accurately recognizable, and easy to reuse. This improves recognition stability and accuracy and supports continuous operation control of the device.
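One way to obtain the movement direction of a circle-drawing gesture is the sign of the polygon area traced by the fingertip (the shoelace formula). This is an illustrative technique, not one named in the disclosure. The sketch assumes fingertip positions are already available as image coordinates with x to the right and y downward; the function name `circle_direction` is hypothetical. Whether "clockwise" in the image matches the user's own view depends on camera placement and mirroring, which the sketch does not address.

```python
from typing import List, Tuple


def circle_direction(points: List[Tuple[float, float]]) -> str:
    """Classify a closed fingertip trajectory as clockwise or counterclockwise.

    points are image coordinates (x right, y down). With y pointing down,
    a positive shoelace sum corresponds to clockwise motion as seen in the image.
    """
    if len(points) < 3:
        return "unknown"
    doubled_area = 0.0
    # Pair each point with its successor, wrapping around to close the loop.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        doubled_area += x1 * y2 - x2 * y1
    if doubled_area == 0:
        return "unknown"
    return "clockwise" if doubled_area > 0 else "counterclockwise"
```

A trajectory that visits the top, right, bottom, and left of a circle in that order is classified as clockwise; the reversed order gives counterclockwise.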
FIG. 4 is a schematic flowchart of a device control method provided by another exemplary embodiment of the present disclosure. As shown in FIG. 4, on the basis of the embodiment shown in FIG. 2, the device control method of this embodiment may further include the following step:
Step 205: determine the target dimension parameter of the target device to be adjusted.
The target dimension parameter is the dimension parameter along which the state of the target device is to be adjusted. For example, when the target device is a vehicle window, the target dimension parameter may be the lifting dimension of the window. When the target device is a vehicle seat, it may be the front-rear dimension, the height dimension, or the backrest recline angle dimension of the seat. When the target device is a vehicle light, it may be the brightness dimension or the color dimension of the light. When the target device is the left or right rearview mirror of the vehicle, it may be the pitch angle dimension or the yaw angle dimension of that mirror. As another example, when the target device is a home appliance such as a television, the target dimension parameter may be the channel dimension, the volume dimension, or the brightness dimension of the television. In the embodiments of the present disclosure, the target dimension parameter to be adjusted may be any adjustable dimension parameter of the target device, and the adjustable dimension parameters are not limited.
Correspondingly, in this embodiment, step 206 may include:
Step 2062: in response to detecting the preset dynamic gesture, determine the movement direction of the dynamic gesture.
Step 2064: based on the movement direction of the dynamic gesture, determine the target adjustment direction of the target device on the target dimension parameter.
Optionally, in some implementations, a correspondence among four items may be preset: the movement direction of the dynamic gesture, the device, the dimension parameter of the device, and the adjustment direction. After the movement direction of the dynamic gesture is determined, the correspondence can be queried with the movement direction, the target device, and the target dimension parameter to obtain the target adjustment direction. Table 1 below shows part of such a correspondence for the case where the dynamic gesture is drawing a circle, relating the circle direction, the device, the dimension parameter of the device, and the adjustment direction. It is an example only and does not limit the specific content of the correspondence among the movement direction of the dynamic gesture, the device, the dimension parameter of the device, and the adjustment direction in the embodiments of the present disclosure.
Table 1
Step 2066: based on the sustained action of the dynamic gesture in its movement direction, continuously adjust the target device on the target dimension parameter in the target adjustment direction.
In this embodiment, once the target dimension parameter to be adjusted has been determined, the movement direction of the dynamic gesture determines the target adjustment direction of the target device on that parameter. Both the target dimension parameter and the target adjustment direction are then known, so the sustained action of the dynamic gesture in its movement direction can drive continuous adjustment of the target device on the target dimension parameter in the target adjustment direction. This provides continuous operation control of the target device on the target dimension parameter in the target adjustment direction.
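The query in step 2064 can be sketched as a dictionary lookup. The entries of `ADJUSTMENT_TABLE` below are invented for illustration and are not the content of Table 1, which is not reproduced in this text; the device names, dimension names, and function name are likewise hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (gesture movement direction, device, dimension parameter).
# Value: adjustment direction. All entries are hypothetical examples.
ADJUSTMENT_TABLE: Dict[Tuple[str, str, str], str] = {
    ("clockwise", "window", "lift"): "up",
    ("counterclockwise", "window", "lift"): "down",
    ("clockwise", "driver_seat", "front_rear"): "forward",
    ("counterclockwise", "driver_seat", "front_rear"): "backward",
    ("clockwise", "driver_seat", "height"): "up",
    ("counterclockwise", "driver_seat", "height"): "down",
}


def target_adjustment_direction(
    gesture_direction: str, device: str, dimension: str
) -> Optional[str]:
    """Step 2064: look up the adjustment direction, or None if no entry exists."""
    return ADJUSTMENT_TABLE.get((gesture_direction, device, dimension))
```

Keying the table on all three inputs lets the same gesture direction be reused across devices and dimension parameters, for example mapping a clockwise circle to "up" for a window and to "forward" for a seat.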
The target device in the embodiments of the present disclosure may be a device whose state is determined by a single dimension parameter. Such a device has only one adjustable dimension parameter, and each value of that parameter corresponds to one state of the device. For example, a vehicle window is a device whose state is determined by the single lifting dimension: each height value of the window in the lifting dimension corresponds to one state of the window.
Alternatively, the target device in the embodiments of the present disclosure may be a device whose state is determined jointly by multiple dimension parameters. Such a device has multiple adjustable dimension parameters, and each combination of values of those parameters corresponds to one state of the device. When the value of any one of the dimension parameters changes, the state of the device changes. For example, the left and right rearview mirrors of a vehicle are devices whose state is determined jointly by two dimension parameters, the pitch angle dimension and the yaw angle dimension. Each pair (angle value in the pitch angle dimension, angle value in the yaw angle dimension) corresponds to one state of the mirror, and when the angle value in either or both of these dimensions changes, the state of the mirror changes.
In some implementations of the embodiments of the present disclosure, when the state of the target device is determined by a single dimension parameter, step 205 may directly take that dimension parameter as the target dimension parameter.
In this embodiment, when the state of the target device is determined by a single dimension parameter, the target device has only one adjustable dimension parameter. That parameter can be taken directly as the target dimension parameter without the user having to specify which parameter to adjust. This makes determining the target dimension parameter more efficient and therefore improves the efficiency of controlling the target device.
In some implementations of the embodiments of the present disclosure, when the state of the target device is determined by multiple dimension parameters, step 205 may determine the target dimension parameter based on the first speech recognition result.
In this embodiment, the voice control instruction itself can carry information about the target dimension parameter to be adjusted. For example, the voice control instruction may be "I want to adjust the front and rear of the driver's seat", "I want to adjust the driver's seat, front and rear adjustment", "I want to move the driver's seat forward", or "I want to adjust the pitch of the left rearview mirror". The embodiments of the present disclosure do not limit the form or format in which the voice control instruction carries this information. The text-form first speech recognition result obtained by performing speech recognition on the voice control instruction then includes the information about the target dimension parameter, and the target dimension parameter can be determined from that information.
For example, in a specific implementation, the dimension parameters of each device may be preset. After speech recognition of the voice control instruction yields the first speech recognition result, the dimension parameter of the target device that is associated with, or closest to, the target-dimension information in the first speech recognition result is taken as the target dimension parameter to be adjusted. Take the driver's seat as the target device, with three dimensions: the front-rear dimension, the height dimension, and the backrest recline angle dimension. The first speech recognition result "I want to adjust the front and rear of the driver's seat" contains the information "front and rear"; "I want to adjust the driver's seat, front and rear adjustment" contains "front and rear adjustment"; and "I want to move the driver's seat forward" contains "move forward". In each case the associated or closest dimension parameter is the front-rear dimension, which is taken as the target dimension parameter of the driver's seat.
In a specific implementation, a preset method may be used to determine, for the target device, the dimension parameter associated with or closest to the target-dimension information in the first speech recognition result. For example, among the dimension parameter names of the target device, the one that shares the most characters with the target-dimension information in the first speech recognition result may be taken as the associated or closest dimension parameter. As another example, an information list may be preset that contains, for each dimension parameter of each device, the information that may correspond to it. The information list can then be queried for the target device using the target-dimension information in the first speech recognition result, and the matching dimension parameter is taken as the associated or closest dimension parameter. The embodiments of the present disclosure may also use other methods to determine this dimension parameter and do not limit the method used.
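The two matching approaches just described can be sketched as follows. The contents of `INFO_LIST`, the dimension names, and both function names are hypothetical. The character-overlap function counts distinct shared characters, which is one reading of "shares the most characters"; it is a crude heuristic, and on English text it is far less discriminating than on the short Chinese dimension names the description appears to have in mind.

```python
from typing import Dict, List, Optional

# Hypothetical information list: device -> dimension parameter -> related phrases.
INFO_LIST: Dict[str, Dict[str, List[str]]] = {
    "driver_seat": {
        "front_rear": ["front and rear", "forward", "backward"],
        "height": ["height", "higher", "lower"],
        "recline": ["backrest", "recline"],
    },
}


def match_by_info_list(device: str, recognition_result: str) -> Optional[str]:
    """Return the first dimension whose listed phrase occurs in the text."""
    text = recognition_result.lower()
    for dimension, phrases in INFO_LIST.get(device, {}).items():
        if any(phrase in text for phrase in phrases):
            return dimension
    return None


def match_by_shared_characters(info: str, dimension_names: List[str]) -> Optional[str]:
    """Return the dimension name sharing the most distinct characters with info."""
    info_chars = set(info.lower()) - {" "}
    best_name, best_score = None, 0
    for name in dimension_names:
        score = len(info_chars & (set(name.lower()) - {" "}))
        if score > best_score:   # ties keep the earlier name
            best_name, best_score = name, score
    return best_name
```

The information-list lookup is the more predictable of the two because its behaviour is fully determined by the preset phrases, while the character-overlap heuristic needs no preset phrases but can pick a wrong dimension when names share many characters.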
Based on this embodiment, when the state of the target device is determined by multiple dimension parameters, the user can specify the target dimension parameter to be adjusted directly in the voice control instruction, without having to specify it separately. This helps determine the target dimension parameter more efficiently and thus improves the efficiency of controlling the target device.
In other implementations, when the state of the target device is determined by multiple dimension parameters, in step 205, in response to receiving a dimension parameter voice instruction, speech recognition may be performed on that instruction to obtain a second speech recognition result, and the target dimension parameter is then determined based on the second speech recognition result.
In this embodiment, the user may send the dimension parameter voice instruction directly after sending the voice control instruction. For example, after sending the voice control instruction "I want to adjust the driver's seat", the user may directly send the dimension parameter voice instruction "adjust forward and backward". Alternatively, after receiving the voice control instruction sent by the user, the apparatus implementing the embodiments of the present disclosure may output a dimension parameter inquiry voice and receive the dimension parameter voice instruction that the user sends in reply. For example, the user sends the voice control instruction "I want to adjust the driver's seat"; after receiving it, the apparatus outputs the dimension parameter inquiry voice "OK, how would you like to adjust it?" and receives the user's dimension parameter voice instruction "adjust forward and backward". Speech recognition is then performed on the dimension parameter voice instruction, and once the second speech recognition result is obtained, the target dimension parameter can be determined from it.
In a specific implementation, the dimension parameters of each device may be preset. After the second speech recognition result is obtained by performing speech recognition on the dimension parameter voice instruction, the dimension parameter associated with or closest to the second speech recognition result is determined for the target device and used as the target dimension parameter to be adjusted.
In a specific implementation, a preset determination method may be used to determine, for the target device, the dimension parameter associated with or closest to the second speech recognition result. For the specific method, refer to the way the above embodiment determines the dimension parameter associated with or closest to the information about the target dimension parameter in the first speech recognition result; details are not repeated here.
Based on this embodiment, when the state of the target device is determined by multiple dimension parameters, the user can specify the dimension parameter to be adjusted through a separate dimension parameter voice instruction, from which the target dimension parameter of the target device can be determined.
In still other implementations, when the state of the target device is determined by multiple dimension parameters, in step 205, hand shape information corresponding to the dynamic gesture may also be obtained, and the target dimension parameter is then determined based on that hand shape information.
The hand shape information may include, for example but not limited to, any of the following: finger extension form, number of fingers, and single-hand or two-hand information. The finger extension form may be, for example, straight or bent; the number of fingers may be, for example, one or two; the single-hand or two-hand information may be, for example, left hand, right hand, or both hands.
Specifically, a correspondence among hand shape information, devices, and the dimension parameters of the devices may be preset. After the hand shape information corresponding to the dynamic gesture is obtained in step 205, the dimension parameter corresponding to the target device and the obtained hand shape information is retrieved from this correspondence and used as the target dimension parameter.
Table 2 below shows a partial example of the correspondence among hand shape information, vehicle-mounted devices, and the dimension parameters of the vehicle-mounted devices in an embodiment of the present disclosure, where the hand shape information is the number of fingers and the devices are vehicle-mounted devices.
Table 2
Table 3 below shows a partial example of the correspondence among hand shape information, vehicle-mounted devices, and the dimension parameters of the vehicle-mounted devices in an embodiment of the present disclosure, where the hand shape information is single-hand or two-hand information and the devices are vehicle-mounted devices.
Table 3
Tables 2 and 3 above show only part of the correspondence among hand shape information, devices, and the dimension parameters of the devices by way of example. Where the hand shape information is of a kind other than those in Tables 2 and 3, or the devices are other than those in Tables 2 and 3 (for example, other vehicle-mounted devices, household appliances, or terminal devices), the content structure may follow Tables 2 and 3; details are not repeated in the embodiments of the present disclosure.
Based on this embodiment, when the state of the target device is determined by multiple dimension parameters, the target dimension parameter of the target device to be adjusted can be determined from the user's hand shape information.
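A preset three-way correspondence of this kind can be held in a simple lookup keyed by device and hand shape. The sketch below only illustrates the structure: every entry is hypothetical and is not taken from Tables 2 and 3, and the identifiers are invented for the example.

```python
from typing import Dict, Hashable, Optional, Tuple

# Hypothetical entries only; they do not reproduce Tables 2 and 3.
# Key: (device, kind of hand shape information, value) -> dimension parameter.
HAND_SHAPE_TO_DIMENSION: Dict[Tuple[str, str, Hashable], str] = {
    ("driver_seat", "finger_count", 1): "fore_aft_position",
    ("driver_seat", "finger_count", 2): "backrest_angle",
    ("air_conditioner", "handedness", "left"): "temperature",
    ("air_conditioner", "handedness", "right"): "fan_speed",
}


def target_dimension(device: str, shape_kind: str,
                     shape_value: Hashable) -> Optional[str]:
    """Return the dimension parameter for the device and hand shape, if configured."""
    return HAND_SHAPE_TO_DIMENSION.get((device, shape_kind, shape_value))
```

Returning `None` for an unconfigured combination lets the caller fall back to another way of determining the target dimension parameter, such as a voice instruction.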
Figure 5 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in Figure 5, on the basis of the embodiment shown in Figure 4, step 2066 may include the following steps:
Step 20662: during the continuous action of the dynamic gesture, obtain the movement speed of the dynamic gesture in real time or according to a preset adjustment period.
To adjust the state of the target device in real time and dynamically, the preset adjustment period may be set to a small value, for example 0.01 s. It may be preset according to the specific device being controlled and the desired adjustment effect, and may be updated as needed.
Step 20664: based on the movement speed of the dynamic gesture, determine the target adjustment speed of the target device on the target dimension parameter.
Step 20666: adjust the target device on the target dimension parameter in the target adjustment direction at the target adjustment speed.
In step 20666, the target device may be adjusted in the target adjustment direction at the target adjustment speed within the state limit range of the target dimension parameter. When the target device reaches the boundary of the state limit range on the target dimension parameter, for example when a vehicle window is fully lowered or fully raised, the target device is no longer adjusted in the target adjustment direction on the target dimension parameter, so as to avoid damaging the target device.
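One adjustment period of step 20666, including the stop at the state limit boundary, might look like the sketch below. The function names, the normalized 0.0 to 1.0 window position, and the sign convention for the direction are assumptions made for illustration.

```python
def adjust_step(state: float, speed: float, direction: int,
                period: float, lower: float, upper: float) -> float:
    """Advance the state by speed * period in the given direction (+1 or -1),
    clamped so it never leaves the state limit range [lower, upper]."""
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    new_state = state + direction * speed * period
    return min(max(new_state, lower), upper)


def at_limit(state: float, direction: int, lower: float, upper: float) -> bool:
    """True when further adjustment in this direction should be skipped."""
    return (direction > 0 and state >= upper) or (direction < 0 and state <= lower)
```

For example, with a window position normalized to the range 0.0 (fully lowered) to 1.0 (fully raised), a caller would check `at_limit` before each period and call `adjust_step` only while it returns `False`.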
Based on this embodiment, the target adjustment speed of the target device on the target dimension parameter can be determined from the movement speed of the dynamic gesture, and the target device is adjusted in the target adjustment direction at that speed. The faster the dynamic gesture moves, the faster the device is adjusted; conversely, the slower the dynamic gesture moves, the slower the device is adjusted. This provides dynamic control of the device adjustment speed based on the movement speed of the dynamic gesture and visual control of the device adjustment speed, improving both the adjustment efficiency of the target device and the user's operating experience.
Optionally, in some implementations, adjustment speed configuration information of the target device on the target dimension parameter may be obtained. The adjustment speed configuration information is used to determine, for each dimension parameter of the target device, the relationship between gesture movement speed and device adjustment speed; for example, for a vehicle window in the lifting dimension, it gives the relationship between gesture movement speed and window lifting speed. The gesture movement speed and the device adjustment speed may be mapped linearly. Accordingly, in step 20664, based on the obtained adjustment speed configuration information, the device adjustment speed that corresponds, on the target dimension parameter, to the movement speed of the dynamic gesture obtained in step 20662 is determined as the target adjustment speed.
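A linear correspondence between gesture movement speed and device adjustment speed can be sketched as a single ratio per device and dimension parameter. The configuration layout, the identifiers, and the units (turns per second for the gesture, fraction of full travel per second for the window) are assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical configuration: (device, dimension) -> device travel per gesture unit.
# Here three turns of the gesture move the window through its full travel.
SPEED_CONFIG: Dict[Tuple[str, str], float] = {
    ("driver_window", "lift"): 1.0 / 3.0,
}


def target_adjustment_speed(device: str, dimension: str,
                            gesture_speed: float) -> float:
    """Linearly map gesture speed to device adjustment speed."""
    ratio = SPEED_CONFIG[(device, dimension)]
    return ratio * gesture_speed
```

With this configuration, a gesture turning at 1.5 turns per second yields a window speed of about half of the full travel per second.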
Based on this embodiment, the target adjustment speed of the target device on the target dimension parameter can be determined objectively and accurately from the movement speed of the dynamic gesture according to preset adjustment speed configuration information, so that the speed at which the state of the target device is adjusted can be controlled accurately.
In some specific implementations, the adjustment speed configuration information of the target device on the target dimension parameter may be obtained from the first speech recognition result.
In this embodiment, the user may carry the adjustment speed configuration information directly in the voice control instruction. For example, the voice control instruction may be "I want to adjust the driver's window; three turns raise the entire window", which includes the adjustment speed configuration information "three turns raise the entire window". The embodiments of the present disclosure do not limit the content form or format of the adjustment speed configuration information carried in the voice control instruction. After speech recognition is performed on the voice control instruction to obtain the first speech recognition result, the adjustment speed configuration information of the target device on the target dimension parameter can be obtained from the first speech recognition result, thereby determining the relationship between gesture movement speed and device adjustment speed on the target dimension parameter of the target device.
Based on this embodiment, while controlling a device, the user can set the adjustment speed configuration information of the target device on the target dimension parameter directly through the voice control instruction. This allows the adjustment speed configuration information to be configured in real time and dynamically in a specific scenario, and allows the device adjustment effect to be personalized.
Alternatively, in other specific implementations, the adjustment speed configuration information of the target device on the target dimension parameter may also be obtained as follows: in response to receiving an adjustment speed configuration voice instruction, speech recognition is performed on it to obtain a third speech recognition result, and the adjustment speed configuration information of the target device on the target dimension parameter is then obtained from the third speech recognition result, thereby determining the relationship between gesture movement speed and device adjustment speed on the target dimension parameter of the target device. The adjustment speed configuration voice instruction may be one that the user sends proactively; for example, after sending the voice control instruction "I want to move the driver's seat forward", the user proactively sends the adjustment speed configuration voice instruction "three turns can raise the entire window". Alternatively, it may be one that the user sends in response to an adjustment speed prompt voice output by the apparatus implementing the embodiments of the present disclosure; for example, after the user sends the voice control instruction "I want to move the driver's seat forward", the apparatus outputs the adjustment speed prompt voice "OK, at what speed would you like to adjust?", and the user sends the adjustment speed configuration voice instruction "three turns can raise the entire window". The embodiments of the present disclosure do not limit the manner in which the user sends the adjustment speed configuration voice instruction or its specific content.
Based on this embodiment, while controlling a device, the user can set the adjustment speed configuration information of the target device on the target dimension parameter through a separate instruction. This allows the adjustment speed configuration information to be configured in real time and dynamically in a specific scenario, and allows the device adjustment effect to be personalized.
Alternatively, in still other specific implementations, the adjustment speed configuration information of the target device on the target dimension parameter may also be obtained from preconfigured adjustment speed configuration information, thereby determining the relationship between gesture movement speed and device adjustment speed on the target dimension parameter of the target device.
The preconfigured adjustment speed configuration information may be preconfigured by the user. Taking vehicle-mounted devices as an example, the user may set or update the adjustment speed configuration information of each vehicle-mounted device through an adjustment speed configuration page provided by the vehicle's central control system, for example through the configuration options for each vehicle-mounted device on that page, or through human-machine voice interaction on that page. Alternatively, the user may obtain the adjustment speed configuration permission provided by the central control system through human-machine voice interaction, and set the adjustment speed configuration information of each vehicle-mounted device through human-machine voice interaction. For other devices (for example, household appliances and terminal devices), the adjustment speed configuration information of each device may be set or updated in a similar way through an adjustment speed configuration page provided by a control device that controls these devices in a unified manner.
When the user has not preconfigured adjustment speed configuration information, the factory preset information of the central control system (for vehicle-mounted devices) or of the control device (for household appliances, terminal devices, and other devices) may be obtained as the preconfigured adjustment speed configuration information.
Based on this embodiment, when the user has not set adjustment speed configuration information for the current scenario, the adjustment speed configuration information of the target device on the target dimension parameter can be obtained from the preconfigured adjustment speed configuration information and used to determine the target adjustment speed for the target device in the current scenario.
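The order of precedence described above (information set for the current scenario, then the user's preconfigured information, then the factory preset) can be sketched as a simple fallback chain. The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional, Tuple

Config = Dict[Tuple[str, str], float]  # (device, dimension) -> speed ratio


def resolve_speed_config(device: str, dimension: str,
                         scene_config: Optional[Config],
                         user_config: Optional[Config],
                         factory_config: Config) -> float:
    """Return the first configured ratio, preferring the most specific source."""
    key = (device, dimension)
    for source in (scene_config, user_config, factory_config):
        if source and key in source:
            return source[key]
    raise KeyError(key)
```

A caller would pass `None` or an empty dictionary for any source the user has not set, so that the factory preset is reached only when nothing more specific exists.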
For example, in a specific application, the adjustment speed configuration information may be preconfigured as follows:
Through a setting interface, for example an interface on the adjustment speed configuration page provided by the central control system (for vehicle-mounted devices) or the control device (for household appliances, terminal devices, and other devices), receive an adjustment speed configuration request sent by the user. The adjustment speed configuration request includes a device identifier (ID), a dimension parameter ID, a gesture movement amplitude (for example, one turn), and a device adjustment amplitude (for example, 0.5 cm), where the device ID uniquely identifies a device and the dimension parameter ID uniquely identifies a dimension parameter;
Based on the gesture movement amplitude and the device adjustment amplitude in the adjustment speed configuration request, determine the relationship between gesture movement speed and device adjustment speed;
Based on the device ID and the dimension parameter ID in the adjustment speed configuration request and the relationship between gesture movement speed and device adjustment speed, configure the adjustment speed configuration information of the device identified by the device ID on the dimension parameter identified by the dimension parameter ID; or, based on the same information, update the adjustment speed configuration information corresponding to the device ID and the dimension parameter ID in the preconfigured adjustment speed configuration information.
Based on this embodiment, the adjustment speed configuration information of a device on a dimension parameter can be configured or updated.
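The three configuration steps above can be sketched as follows. The request type, its field names, and the use of a plain dictionary as the configuration store are assumptions; the ratio of device adjustment amplitude to gesture movement amplitude stands in for "the relationship between gesture movement speed and device adjustment speed" under the linear correspondence mentioned earlier.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SpeedConfigRequest:
    device_id: str            # uniquely identifies a device
    dimension_id: str         # uniquely identifies a dimension parameter
    gesture_amplitude: float  # e.g. 1.0 turn
    device_amplitude: float   # e.g. 0.5 cm


def apply_request(config: Dict[Tuple[str, str], float],
                  request: SpeedConfigRequest) -> float:
    """Derive the device-per-gesture ratio and store (or overwrite) it."""
    if request.gesture_amplitude <= 0:
        raise ValueError("gesture amplitude must be positive")
    ratio = request.device_amplitude / request.gesture_amplitude
    config[(request.device_id, request.dimension_id)] = ratio
    return ratio
```

For the amplitudes given as examples in the text (one turn and 0.5 cm), the stored ratio is 0.5 cm per turn; sending a second request for the same device ID and dimension parameter ID overwrites the earlier entry, which covers the update case.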
In addition, in the above embodiments, during the execution of step 206 or 2066, in response to receiving an adjustment speed update voice instruction, speech recognition is performed on that instruction to obtain a fourth speech recognition result, and adjustment speed update configuration information is obtained from the fourth speech recognition result. The adjustment speed update configuration information represents, for each dimension parameter of the target device, the updated relationship between gesture movement speed and device adjustment speed. Then, during the subsequent continuous action of the dynamic gesture, the movement speed of the dynamic gesture is obtained in real time or according to the preset adjustment period; based on the adjustment speed update configuration information, the updated device adjustment speed corresponding to that movement speed on the target dimension parameter is determined; and the target device is adjusted on the target dimension parameter in the target adjustment direction at the updated adjustment speed.
While continuously adjusting the target device, the user may find that the adjustment is too fast or too slow. Based on this embodiment, the user can send an adjustment speed update voice instruction during the adjustment, according to the desired adjustment effect, to update the adjustment speed configuration information. The adjustment speed of the target device is thus updated in real time, which further improves the adjustment efficiency, the adjustment effect, and the user's operating experience.
In addition, the above embodiments of the present disclosure may further include a step of detecting a preset dynamic gesture.
Figure 6 is a schematic flowchart of a device control method provided by still another exemplary embodiment of the present disclosure. As shown in Figure 6, in some implementations, the preset dynamic gesture may be detected as follows:
Step 302: determine the position of the sound source object that sent the voice control instruction.
For example, the position of the sound source object that sent the voice control instruction may be determined by sound zone localization.
Step 304: based on the position of the sound source object, obtain an image sequence that includes the hand of the sound source object.
The image sequence includes multiple frames of images having a temporal relationship.
After the position of the sound source object is determined, an image acquisition module (for example, a camera) can be invoked to capture images of the sound source object, and hand detection and tracking are performed on the captured images to obtain a video stream that includes the hand of the sound source object. Multiple frames having a temporal relationship are selected from the video stream in a preset manner (for example, consecutive frames or every other frame) as the image sequence of the hand of the sound source object. Alternatively, images of a uniform size that contain the hand are further cropped from each of the selected frames to obtain the image sequence of the hand of the sound source object.
Compared with an image sequence of the whole sound source object, an image sequence of the hand cropped from the selected frames contains less background information and therefore less interference, which can improve the accuracy of the gesture detection result.
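Frame selection and uniform-size cropping around the hand can be sketched as below. The hand bounding box is assumed to come from a separate hand detector, which is not shown; the function names and the fixed square crop are illustrative choices.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def select_frames(frames: Sequence[Frame], stride: int = 1) -> List[Frame]:
    """Keep every `stride`-th frame (stride=1 keeps consecutive frames)."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(frames[::stride])


def crop_window(hand_box: Box, size: int,
                image_width: int, image_height: int) -> Box:
    """Return a size x size window centred on the hand box and shifted,
    where necessary, so that it stays inside the image."""
    if size > image_width or size > image_height:
        raise ValueError("crop size exceeds image size")
    x0, y0, x1, y1 = hand_box
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    left = min(max(cx - size // 2, 0), image_width - size)
    top = min(max(cy - size // 2, 0), image_height - size)
    return left, top, left + size, top + size
```

Shifting the window rather than shrinking it keeps every crop the same size, which is what a downstream network with a fixed input size would need.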
In a specific implementation, a first neural network, for example a convolutional neural network (CNN), may be used to perform hand detection and tracking on the captured images to obtain the video stream that includes the hand of the sound source object. The first neural network may be obtained in advance by training a neural network model with sample images that include hands.
Step 306: perform hand key point detection on each frame of the image sequence in turn to obtain a hand key point sequence.
The hand key point sequence is formed from the hand key points in each frame according to the temporal relationship.
In a specific implementation, a second neural network, for example a CNN, may be used to perform hand key point detection on each frame to obtain the hand key points. The second neural network may be obtained in advance by training a neural network model with sample images annotated with hand key point information.
Step 308: perform preset dynamic gesture detection based on the hand key point sequence.
In a specific implementation, the hand key point sequence may be input into a third neural network, for example a CNN, which outputs a preset gesture detection result indicating whether the gesture is the preset dynamic gesture. The third neural network may be obtained in advance by training with sample videos in which the preset dynamic gesture is made.
Based on this embodiment, by obtaining an image sequence that includes the hand of the sound source object, the preset dynamic gesture is detected using vision technology, so that adjustment of the state of the target device is triggered when the preset dynamic gesture is detected.
Accordingly, on the basis of the embodiment shown in Figure 6, the movement direction of the dynamic gesture may be determined based on the hand key point sequence obtained in step 306. For example, the movement direction of the dynamic gesture may be determined from the direction of the trajectory traced by the hand key point sequence.
Based on this embodiment, the movement direction of the dynamic gesture is determined using vision technology, from the hand key point sequence corresponding to the image sequence.
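For a circling gesture, one way to read the direction from a key point trajectory is the sign of the signed area enclosed by the trajectory (the shoelace formula). This is only an illustrative technique, not one named in the disclosure; the sketch assumes a single tracked key point such as a fingertip and image coordinates in which y grows downward.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward


def rotation_direction(trajectory: List[Point]) -> str:
    """Classify a key point trajectory as 'clockwise', 'counterclockwise' or 'none'.

    Uses twice the signed area of the polygon formed by closing the trajectory.
    With y pointing downward, a positive value is clockwise as seen on screen.
    """
    if len(trajectory) < 3:
        return "none"
    signed = 0.0
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:] + trajectory[:1]):
        signed += x0 * y1 - x1 * y0
    if signed > 0:
        return "clockwise"
    if signed < 0:
        return "counterclockwise"
    return "none"
```

The resulting direction could then be mapped to the target adjustment direction, for example clockwise for raising and counterclockwise for lowering; that mapping is likewise an assumption.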
In addition, the movement speed of the dynamic gesture may be obtained based on the hand key points in the last frame and in a previous frame of the image sequence obtained in step 304, together with the capture times of the last frame and of that previous frame. The previous frame may be any frame that precedes the last frame in the image sequence; for example, it may be the frame immediately preceding the last frame, or a frame separated from the last frame by several frames. The embodiments of the present disclosure do not limit this.
For example, the movement speed of the dynamic gesture may be calculated from the distance between the hand key points in the last frame and those in the previous frame, and the time elapsed between the capture times of the two frames. The distance between the hand key points in the last frame and those in the previous frame may be the average of the distances between corresponding hand key points in the two frames, or the distance between a preset hand key point (for example, a fingertip key point) in the two frames, and so on. The embodiments of the present disclosure do not limit this.
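The averaged-distance variant of this calculation can be sketched as follows. Key points are assumed to be 2D pixel coordinates listed in the same order in both frames, and capture times are in seconds, so the result is in pixels per second; the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def gesture_speed(prev_points: Sequence[Point], last_points: Sequence[Point],
                  prev_time: float, last_time: float) -> float:
    """Mean displacement of corresponding key points divided by elapsed time."""
    if not prev_points or len(prev_points) != len(last_points):
        raise ValueError("key point lists must be non-empty and of equal length")
    elapsed = last_time - prev_time
    if elapsed <= 0:
        raise ValueError("the last frame must be captured after the previous frame")
    total = sum(math.dist(p, q) for p, q in zip(prev_points, last_points))
    return total / len(prev_points) / elapsed
```

Passing a single key point in each list gives the preset key point variant, for example when only a fingertip is tracked.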
在具体实现中,可以将手部关键点序列输入上述第三神经网络,经该第三神经网络输出该手部关键点序列对应的动态手势的运动方向和运动幅度(例如转圈角度),然后基于该运动幅度和图像序列对应的时间,可以计算得到动态手势的运动速度。或者,也可以将携带有采集时刻信息并标注手部关键点的图像序列输入上述第三神经网络,经该第三神经网络输出该手部关键点序列对应的动态手势的运动方向和运动速度, 等等。本公开实施例对此不做限制。In a specific implementation, the hand key point sequence can be input into the above-mentioned third neural network, and the movement direction and movement amplitude (such as the circle angle) of the dynamic gesture corresponding to the hand key point sequence are output through the third neural network, and then based on The movement amplitude and the time corresponding to the image sequence can be used to calculate the movement speed of the dynamic gesture. Alternatively, the image sequence carrying the collection time information and labeling the hand key points can also be input into the above-mentioned third neural network, and the movement direction and movement speed of the dynamic gesture corresponding to the hand key point sequence are output through the third neural network. etc. The embodiments of the present disclosure do not limit this.
基于本实施例,通过图像序列中两帧图像对应的手部关键点和图像采集时刻,可以精确确定动态手势的运动速度。Based on this embodiment, the movement speed of the dynamic gesture can be accurately determined through the key points of the hand corresponding to the two frames of images in the image sequence and the image collection time.
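As an illustrative sketch only, the two-frame speed calculation described above could look as follows in Python. The function name `keypoint_speed`, the 2D `(x, y)` key point format, and the `preset_index` parameter are assumptions made for this example and do not appear in the disclosure.

```python
import math

def keypoint_speed(prev_pts, last_pts, t_prev, t_last, preset_index=None):
    """Estimate gesture speed from hand key points of two frames.

    prev_pts / last_pts: lists of (x, y) key points in the same order.
    t_prev / t_last: acquisition moments of the two frames, in seconds.
    preset_index: if given, only that key point (e.g. a fingertip) is used;
    otherwise the distances of all corresponding key points are averaged.
    """
    dt = t_last - t_prev
    if dt <= 0:
        raise ValueError("the last frame must be acquired after the previous frame")
    if preset_index is not None:
        pairs = [(prev_pts[preset_index], last_pts[preset_index])]
    else:
        pairs = list(zip(prev_pts, last_pts))
    # Average Euclidean displacement of the selected key points.
    mean_dist = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    return mean_dist / dt
```

The result is expressed in key point units per second (for example, pixels per second); converting it to a physical speed would require calibration that is outside this sketch.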
FIG. 7 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in FIG. 7, in other implementations, the preset dynamic gesture can also be detected as follows:
Step 402: determine the position of the sound source object that issued the voice control instruction.
For example, the position of the sound source object that issued the voice control instruction can be determined by sound zone localization.
Step 404: based on the position of the sound source object, use an optical time-of-flight (ToF) sensor to measure the distance between each point on the sound source object's hand and the ToF sensor, obtaining a set of distance information.
Once the position of the sound source object is determined, the ToF sensor can measure the distance between each point on that object's hand and the sensor. The set of distance information obtained at each measurement moment comprises the distances between the points of the hand and the ToF sensor at that moment.
Step 406: obtain a distance information sequence from multiple sets of distance information that have a temporal order.
Step 408: perform preset dynamic gesture detection based on the distance information sequence.
Optionally, in some implementations, the distance information sequence shows how the distances between the points of the sound source object's hand and the ToF sensor change over time, so whether the hand has made the preset dynamic gesture can be determined by checking whether this change matches the distance change pattern corresponding to the preset dynamic gesture.
Alternatively, in other implementations, three-dimensional (3D) modeling can be performed on each set of distance information in the sequence to obtain the corresponding hand posture, and whether the hand has made the preset dynamic gesture can be determined from the hand postures corresponding to the distance information sequence.
Based on this embodiment, the preset dynamic gesture is detected by means of a ToF sensor, so that adjustment of the state of the target device is triggered when the preset dynamic gesture is detected.
Correspondingly, on the basis of the embodiment shown in FIG. 7, the movement direction of the dynamic gesture can be determined from the distance information sequence obtained in step 406. For example, it can be determined according to the distance change patterns that the preset dynamic gesture exhibits in different movement directions.
Based on this embodiment, the movement direction of the dynamic gesture is determined from the changes, detected by the ToF sensor, in the distances to the points of the sound source object's hand.
In addition, the movement speed of the dynamic gesture can be obtained from the last set of distance information and a previous set of distance information in the distance information sequence obtained in step 406, together with the measurement moments corresponding to those two sets. The previous set may be any set located before the last set in the sequence: for example, the set immediately preceding the last set, or a set separated from the last set by several sets of distance information. The embodiments of the present disclosure do not limit this.
For example, the movement speed can be calculated from the distance change between the last set and the previous set of distance information, and the time between their corresponding measurement moments. The distance change may be the average of the distance changes of the corresponding hand points in the two sets, or the distance change of a preset hand point (for example, a fingertip) between the two sets, and so on; the embodiments of the present disclosure do not limit this.
Based on this embodiment, the movement speed of the dynamic gesture can be accurately determined from two sets of distance information in the distance information sequence and their measurement moments.
FIG. 8 is a schematic flowchart of a device control method provided by yet another exemplary embodiment of the present disclosure. As shown in FIG. 8, in still other implementations, the preset dynamic gesture can also be detected as follows:
Step 502: determine the position of the sound source object that issued the voice control instruction.
For example, the position of the sound source object that issued the voice control instruction can be determined by sound zone localization.
Step 504: based on the position of the sound source object, use a wearable device to obtain the positions of the points on the sound source object's hand, yielding hand position information.
The hand position information includes the position information of each point of the hand.
The wearable device in the embodiments of the present disclosure may be, for example, a smart device such as smart gloves or smart glasses. Smart gloves can directly locate each point of the hand at any moment, while smart glasses can obtain the positions of the points of the hand visually. The embodiments of the present disclosure do not limit the specific wearable device used or the manner in which it obtains the positions of the points on the sound source object's hand.
Step 506: determine the posture of the hand based on the hand position information.
Step 508: determine the motion of the hand based on the hand postures at multiple moments.
Step 510: determine whether the motion of the hand is the motion of the preset dynamic gesture.
Step 512: in response to the motion of the hand being the motion of the preset dynamic gesture, confirm that the preset dynamic gesture is detected.
Otherwise, in response to the motion of the hand not being the motion of the preset dynamic gesture, confirm that the preset dynamic gesture is not detected.
Based on this embodiment, the wearable device directly obtains the positions of the points on the sound source object's hand, from which the hand posture is determined; the hand motion is then determined from the hand postures at multiple moments, so that it can be confirmed whether the motion is the preset dynamic gesture and adjustment of the state of the target device can be triggered when the preset dynamic gesture is detected.
Correspondingly, on the basis of the embodiment shown in FIG. 8, the movement direction of the dynamic gesture can be determined from the hand postures at multiple moments determined in step 506, for example from the change in hand posture across those moments. Alternatively, the movement direction of the dynamic gesture can be determined directly from the hand motion determined in step 508.
Based on this embodiment, the movement direction of the dynamic gesture is determined from the hand position information obtained by the wearable device.
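The disclosure does not say how the movement direction is derived from the sequence of hand positions. One common technique for a circling gesture, shown here purely as an assumption, is the sign of the shoelace (signed area) sum over the fingertip trajectory. The function name and the convention that the y axis points upward are choices made for this sketch.

```python
def circling_direction(points):
    """Classify a roughly closed fingertip trajectory as clockwise or not.

    points: fingertip (x, y) positions in time order, with the y axis
    pointing upward. In image coordinates (y pointing down) the two
    labels are swapped.
    """
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the loop
        twice_area += x1 * y2 - x2 * y1
    if twice_area > 0:
        return "counterclockwise"
    if twice_area < 0:
        return "clockwise"
    return "undetermined"
```

A practical detector would also need to confirm that the trajectory actually forms a loop before trusting the sign.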
In addition, the movement speed of the dynamic gesture can be obtained from the last moment and a previous moment among the multiple moments of step 504, together with the hand position information at those two moments. A moment here may be an information acquisition moment at which the wearable device obtains the positions of the points on the sound source object's hand. The wearable device may obtain these positions at a preset acquisition period (for example, 0.01 s), in which case two adjacent acquisition moments are 0.01 s apart. The previous moment may be the moment immediately before the last moment, or a moment that precedes the last moment by a preset number of moments (for example, 2). The embodiments of the present disclosure do not limit this.
For example, the movement speed can be calculated from the change between the hand position information at the last moment and that at the previous moment, and the time between those two moments. The change may be the average of the distance changes of the corresponding hand points in the hand position information at the two moments, or the distance change of a preset hand point (for example, a fingertip) between the two moments, and so on; the embodiments of the present disclosure do not limit this.
Based on this embodiment, the movement speed of the dynamic gesture can be accurately determined from the hand position information of the sound source object obtained by the wearable device at different moments.
The following are several exemplary application scenarios of the embodiments of the present disclosure:
Scenario 1: adjusting a vehicle window:
The user issues the voice control instruction "I want to adjust the driver's window by gesture". After receiving the instruction, the apparatus implementing the embodiments of the present disclosure performs speech recognition, determines from the resulting first speech recognition result that the target device is the driver's window, and obtains control authority over the driver's window. The user draws circles clockwise, and the driver's window rises continuously. While the window is rising, the user issues the adjustment speed update voice instruction "Too slow; three circles should raise the whole window". From this, the apparatus determines the updated device adjustment speed corresponding to the user's circling motion and controls the driver's window to rise at that updated adjustment speed. The user keeps drawing circles until the driver's window reaches the desired height.
Scenario 2: adjusting the fore-aft position of a vehicle seat:
The user issues the voice control instruction "I want to move the driver's seat forward, one centimeter forward per circle". After receiving the instruction, the apparatus implementing the embodiments of the present disclosure performs speech recognition, determines from the resulting first speech recognition result that the target device is the driver's seat, that the target dimension parameter is the fore-aft position, and that the adjustment speed configuration information is "one centimeter forward per circle", and obtains control authority over the driver's seat. The user draws circles clockwise, and the driver's seat moves forward continuously. The user keeps drawing circles until the driver's seat reaches the desired position.
Scenario 3: adjusting the vehicle's left rearview mirror based on hand form information:
The user issues the voice control instruction "I want to adjust the left rearview mirror by gesture". After receiving the instruction, the apparatus implementing the embodiments of the present disclosure performs speech recognition, determines from the resulting first speech recognition result that the target device is the left rearview mirror, and obtains control authority over the left rearview mirror. When the user draws circles counterclockwise with the right hand, the left rearview mirror tilts down continuously; when the user draws circles clockwise with the right hand, it tilts up continuously. When the user draws circles counterclockwise with the left hand, the mirror turns outward continuously; when the user draws circles clockwise with the left hand, it turns inward continuously. Alternatively, when the user extends the right index finger and draws circles counterclockwise, the mirror tilts down continuously, and when the user draws circles clockwise, it tilts up continuously; when the user extends both the right index finger and middle finger and draws circles counterclockwise, the mirror turns outward continuously, and when the user draws circles clockwise, it turns inward continuously. The specific adjustment speed can be determined from preconfigured adjustment speed configuration information, or, as in Scenarios 1 and 2 above, the adjustment speed configuration information can be set by a user voice command. The user keeps drawing circles until the left rearview mirror faces the desired direction.
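The first mapping in Scenario 3 (left or right hand plus circling direction) can be written as a lookup table. The sketch below is illustrative only; the dimension names `"pitch"` and `"yaw"` and the function `resolve_mirror_gesture` are hypothetical labels, not terms from the disclosure.

```python
# (hand, circling direction) -> (target dimension parameter, adjustment direction)
MIRROR_GESTURES = {
    ("right", "counterclockwise"): ("pitch", "down"),
    ("right", "clockwise"): ("pitch", "up"),
    ("left", "counterclockwise"): ("yaw", "outward"),
    ("left", "clockwise"): ("yaw", "inward"),
}

def resolve_mirror_gesture(hand, direction):
    """Return (dimension, adjustment direction), or None if unmapped."""
    return MIRROR_GESTURES.get((hand, direction))
```

The finger-count variant in the same scenario could use an analogous table keyed by the number of extended fingers instead of the hand.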
Scenario 4: adjusting the airflow of the air conditioner:
The user issues the voice control instruction "I want to adjust the air conditioner's airflow by gesture". After receiving the instruction, the apparatus implementing the embodiments of the present disclosure performs speech recognition, determines from the resulting first speech recognition result that the target device is the air conditioner and that the target dimension parameter is the airflow, and obtains control authority over the air conditioner. When the user draws circles clockwise, the airflow increases; when the user draws circles counterclockwise, the airflow decreases. The specific adjustment speed can be determined from preconfigured adjustment speed configuration information. During the adjustment, the user may issue an adjustment speed update voice instruction to change the speed at which the airflow is adjusted. The user keeps drawing circles until the airflow reaches the desired level.
The user can adjust the temperature, direction, and other settings of the air conditioner in a similar way, which is not described again here.
Any of the device control methods provided by the embodiments of the present disclosure may be executed by any suitable device with data processing capability, including but not limited to terminal devices and servers. Alternatively, any of these methods may be executed by a processor, for example by the processor invoking corresponding instructions stored in a memory. This is not repeated below.
Exemplary apparatus
The device control apparatus of the embodiments of the present disclosure can be used to implement the device control methods of the above embodiments of the present disclosure.
FIG. 9 is a schematic structural diagram of a device control apparatus provided by an exemplary embodiment of the present disclosure. As shown in FIG. 9, the device control apparatus of this embodiment includes a speech recognition module 602, a first determination module 604, a detection module 606, and an adjustment module 608, where:
The speech recognition module 602 is configured to, in response to receiving a voice control instruction, perform speech recognition on the voice control instruction to obtain a first speech recognition result.
The first determination module 604 is configured to determine the target device corresponding to the voice control instruction based on the first speech recognition result obtained by the speech recognition module 602.
The target device corresponding to the voice control instruction is the device whose state is to be adjusted. It may be any device, such as a home appliance, an in-vehicle device, or a terminal device. In-vehicle devices are devices on a vehicle and may include, without limitation, the left rearview mirror, the right rearview mirror, the interior rearview mirror, the windows, the air conditioners, the seats, the audio system, the lights, and so on. The embodiments of the present disclosure do not limit the range of target devices or the specific range of in-vehicle devices.
The detection module 606 is configured to detect a preset dynamic gesture.
The preset dynamic gesture in the embodiments of the present disclosure may include, for example but without limitation, drawing a circle.
The adjustment module 608 is configured to, in response to the detection module 606 detecting the preset dynamic gesture, continuously adjust the state of the target device determined by the first determination module 604 based on the sustained motion of the dynamic gesture.
Based on this embodiment, the target device to be adjusted is determined from the voice control instruction, with no need to select it manually, which makes target device selection more efficient and convenient and avoids the inconvenience of manual selection. In addition, the state of the target device is adjusted continuously based on the sustained motion of the dynamic gesture, which provides continuous operational control of the target device and makes the adjustment of its state more flexible, fine-grained, and precise, thereby improving the control of the target device.
FIG. 10 is a schematic structural diagram of a device control apparatus provided by another exemplary embodiment of the present disclosure. As shown in FIG. 10, on the basis of the embodiment shown in FIG. 9, the device control apparatus of this embodiment may further include a second determination module 702, configured to determine the target dimension parameter of the target device that is to be adjusted.
Correspondingly, the adjustment module 608 may include: a first determination unit 6082, configured to determine the movement direction of the dynamic gesture; a second determination unit 6084, configured to determine, based on the movement direction of the dynamic gesture, the target adjustment direction of the target device on the target dimension parameter; and an adjustment unit 6086, configured to continuously adjust the target device on the target dimension parameter in the target adjustment direction, based on the sustained motion of the dynamic gesture in the movement direction.
Optionally, in some implementations, the state of the target device is determined by a single dimension parameter. Correspondingly, in this embodiment, the second determination module 702 is specifically configured to determine that single dimension parameter of the target device as the target dimension parameter.
Optionally, in other implementations, the state of the target device is determined by multiple dimension parameters. Correspondingly, in this embodiment, the second determination module 702 is specifically configured to determine the target dimension parameter based on the first speech recognition result.
Optionally, in still other implementations, the state of the target device is determined by multiple dimension parameters. Correspondingly, in this embodiment, the speech recognition module 602 is further configured to, in response to receiving a dimension parameter voice instruction, perform speech recognition on the dimension parameter voice instruction to obtain a second speech recognition result. The second determination module 702 is specifically configured to determine the target dimension parameter based on the second speech recognition result.
Optionally, in yet other implementations, the state of the target device is determined by multiple dimension parameters. Correspondingly, referring again to FIG. 10, the device control apparatus of this embodiment may further include a first acquisition module 704, specifically configured to acquire hand form information corresponding to the dynamic gesture. The hand form information may include, for example but without limitation, any of the following: the form in which fingers are extended, the number of fingers, and which hand or hands are used. The form in which fingers are extended may be, for example, straight or bent; the number of fingers may be, for example, one or two; and the hand information may be, for example, the left hand, the right hand, or both hands. Correspondingly, the second determination module 702 is specifically configured to determine the target dimension parameter based on the hand form information acquired by the first acquisition module 704.
Referring again to FIG. 10, the device control apparatus of yet another embodiment may further include a second acquisition module 706 and a third determination module 708. The second acquisition module 706 is configured to acquire the movement speed of the dynamic gesture, in real time or at a preset adjustment period, while the dynamic gesture continues. The third determination module 708 is configured to determine, based on the movement speed of the dynamic gesture, the target adjustment speed of the target device on the target dimension parameter. Correspondingly, in this embodiment, the adjustment unit 6086 is specifically configured to adjust the target device on the target dimension parameter in the target adjustment direction at the target adjustment speed.
Referring again to FIG. 10, the device control apparatus of yet another embodiment may further include a third acquisition module 710, configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter. The adjustment speed configuration information represents, for each dimension parameter of the target device, the relationship between gesture movement speed and device adjustment speed. Correspondingly, in this embodiment, the third determination module 708 is specifically configured to determine, based on the adjustment speed configuration information, the device adjustment speed that corresponds to the movement speed of the dynamic gesture on the target dimension parameter, and to use it as the target adjustment speed.
Optionally, in some implementations, the third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from the first speech recognition result.
Alternatively, in other implementations, the third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from preconfigured adjustment speed configuration information.
Alternatively, referring again to FIG. 10, in still other implementations, the speech recognition module 602 may further be configured to, in response to receiving an adjustment speed configuration voice instruction, perform speech recognition on that instruction to obtain a third speech recognition result. Correspondingly, in this embodiment, the third acquisition module 710 is specifically configured to acquire the adjustment speed configuration information of the target device on the target dimension parameter from the third speech recognition result.
Referring again to FIG. 10, the device control apparatus of yet another embodiment may further include a configuration module 712, configured to: receive an adjustment speed configuration request through a settings interface, the request including a device identifier, a dimension parameter identifier, and gesture movement amplitude and device adjustment amplitude information, where the device identifier uniquely identifies a device and the dimension parameter identifier uniquely identifies a dimension parameter; determine the relationship between gesture movement speed and device adjustment speed based on the gesture movement amplitude and device adjustment amplitude information; and, based on the device identifier, the dimension parameter identifier, and that relationship, either configure the adjustment speed configuration information of the identified device on the identified dimension parameter, or update the adjustment speed configuration information that corresponds to the device identifier and the dimension parameter identifier in the preconfigured adjustment speed configuration information.
Optionally, in some of these implementations, the speech recognition module 602 is further configured to, while the target device is being continuously adjusted on the target dimension parameter in the target adjustment direction based on the continued action of the dynamic gesture in the movement direction, and in response to receiving an adjustment speed update voice instruction, perform speech recognition on that instruction to obtain a fourth speech recognition result. Accordingly, in this embodiment, the third acquisition module 710 is further configured to acquire adjustment speed update configuration information from the fourth speech recognition result, where the adjustment speed update configuration information represents the updated relationship between gesture movement speed and device adjustment speed on each dimension parameter of the target device. The second acquisition module 706 is further configured to acquire the movement speed of the dynamic gesture, in real time or according to a preset adjustment period, during the subsequent continued action of the dynamic gesture. The third determination module 708 is further configured to determine, based on the adjustment speed update configuration information, the updated device adjustment speed that corresponds to the movement speed of the dynamic gesture on the target dimension parameter. The adjustment unit 6086 is further configured to adjust the target device on the target dimension parameter in the target adjustment direction at the updated adjustment speed.
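A minimal sketch of this mid-gesture update, under stated assumptions: the gesture is sampled once per adjustment period, the speed relationship is a linear gain, and the parameter is clamped to a range. None of these details are fixed by the disclosure, and the function name `run_adjustment` and its parameters are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# One sample per adjustment period:
# (gesture movement speed, period length in seconds, new gain or None)
Sample = Tuple[float, float, Optional[float]]


def run_adjustment(value: float, direction: int, gain: float,
                   samples: Iterable[Sample],
                   lo: float = 0.0, hi: float = 100.0) -> float:
    """Continuously adjust one dimension parameter while the gesture persists.

    direction is +1 or -1 (the target adjustment direction). A sample whose
    third field is not None models an adjustment speed update voice
    instruction arriving during the gesture; the new gain applies from that
    period onward.
    """
    for gesture_speed, dt, new_gain in samples:
        if new_gain is not None:
            gain = new_gain                      # updated configuration
        step = direction * gain * gesture_speed * dt
        value = max(lo, min(hi, value + step))   # assumed clamping to a range
    return value
```

For example, starting from 50.0 with gain 1.0, two one-second periods at gesture speed 2.0, the second of which halves the gain, move the parameter by 2.0 and then by 1.0.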
Referring again to FIG. 10, the device control apparatus of yet another embodiment may further include a fourth determination module 714, configured to determine the position of the sound source object that issued the voice control instruction.
Accordingly, in some of these implementations, the detection module 606 is specifically configured to: acquire, based on the position of the sound source object, an image sequence that includes the hand of the sound source object, the image sequence comprising multiple frames of images having a temporal relationship; perform hand keypoint detection on each frame of the image sequence in turn to obtain a hand keypoint sequence, which is formed from the hand keypoints in each frame according to the temporal relationship; and perform preset dynamic gesture detection based on the hand keypoint sequence.
Accordingly, in this embodiment, the first determination unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the hand keypoint sequence.
Accordingly, in this embodiment, the second acquisition module 706 is specifically configured to obtain the movement speed of the dynamic gesture based on the hand keypoints in the last frame of the image sequence and those in the preceding frame, together with the capture times of those two frames.
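The direction and speed computation described for the keypoint-based implementation can be sketched as below. This is an illustration under assumptions rather than the disclosed method: the hand position in each frame is taken as the centroid of its 2D keypoints, the direction is reduced to four image-plane labels, and image y is assumed to grow downward. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    """Mean position of the hand keypoints in one frame."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def gesture_motion(prev_kps: List[Point], prev_t: float,
                   last_kps: List[Point], last_t: float) -> Tuple[str, float]:
    """Estimate direction and speed from the last two frames.

    Timestamps are capture times in seconds; speed is in pixels per second.
    """
    dt = last_t - prev_t
    if dt <= 0:
        raise ValueError("frames must be in increasing time order")
    (x0, y0), (x1, y1) = centroid(prev_kps), centroid(last_kps)
    dx, dy = x1 - x0, y1 - y0
    speed = math.hypot(dx, dy) / dt
    if dx == 0 and dy == 0:
        return "none", 0.0
    if abs(dx) >= abs(dy):
        direction = "right" if dx > 0 else "left"
    else:
        direction = "down" if dy > 0 else "up"  # image y grows downward
    return direction, speed
```

Using only the last two frames keeps the estimate responsive, at the cost of sensitivity to keypoint jitter; a real system might smooth over more frames.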
Accordingly, in other implementations, the detection module 606 is specifically configured to: based on the position of the sound source object, use a ToF sensor to measure the distance between each point of the sound source object's hand and the ToF sensor, obtaining one set of distance information; obtain a distance information sequence from multiple sets of distance information having a temporal relationship; and perform preset dynamic gesture detection based on the distance information sequence.
Accordingly, in this embodiment, the first determination unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the distance information sequence.
Accordingly, in this embodiment, the second acquisition module 706 is specifically configured to obtain the movement speed of the dynamic gesture based on the last set of distance information and the preceding set in the distance information sequence, together with the measurement times corresponding to those two sets.
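For the ToF variant, a comparable sketch is shown below. It assumes that each set of distance information is a list of per-point distances in metres and that only motion toward or away from the sensor is of interest; the disclosure does not restrict the direction in this way. The function name `tof_motion` is hypothetical.

```python
from statistics import fmean
from typing import Sequence, Tuple


def tof_motion(prev_dists: Sequence[float], prev_t: float,
               last_dists: Sequence[float], last_t: float) -> Tuple[str, float]:
    """Estimate radial direction and speed from the last two ToF measurements.

    Times are measurement times in seconds; speed is in metres per second.
    """
    dt = last_t - prev_t
    if dt <= 0:
        raise ValueError("measurements must be in increasing time order")
    # Change in the mean hand-to-sensor distance between the two sets.
    delta = fmean(last_dists) - fmean(prev_dists)
    if delta > 0:
        direction = "away"
    elif delta < 0:
        direction = "toward"
    else:
        direction = "none"
    return direction, abs(delta) / dt
```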
Accordingly, in still other implementations, the detection module 606 is specifically configured to: based on the position of the sound source object, use a wearable device to acquire the position of each point of the sound source object's hand, obtaining hand position information that includes the position information of each point of the hand; determine the posture of the hand based on the acquired hand position information; determine the action of the hand based on the postures of the hand at multiple moments; and determine whether the action of the hand is the action of the preset dynamic gesture and, in response to the action of the hand being the action of the preset dynamic gesture, confirm that the preset dynamic gesture has been detected.
Accordingly, in this embodiment, the first determination unit 6082 is specifically configured to determine the movement direction of the dynamic gesture based on the postures of the hand at the multiple moments.
Accordingly, in this embodiment, the second acquisition module 706 is specifically configured to obtain the movement speed of the dynamic gesture based on the last moment and the preceding moment among the multiple moments, together with the hand position information at those two moments.
Exemplary Electronic Device
FIG. 11 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure. An electronic device according to an embodiment of the present disclosure is described below with reference to FIG. 11. The electronic device may be either or both of the first device 100 and the second device 200, or a stand-alone device independent of them; the stand-alone device may communicate with the first device and the second device to receive the acquired input signals from them.
As shown in FIG. 11, the electronic device includes one or more processors 802 and a memory 804.
The processor 802 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device to perform desired functions.
The memory 804 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 802 may run the program instructions to implement the device control methods of the embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device may further include an input device 806 and an output device 808, these components being interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, when the electronic device is the first device 100 or the second device 200, the input device 806 may be the microphone or microphone array described above, used to capture the input signal of a sound source. When the electronic device is a stand-alone device, the input device 806 may be a communication network connector used to receive the acquired input signals from the first device 100 and the second device 200.
In addition, the input device 806 may also include, for example, a keyboard and a mouse.
The output device 808 may output various information to the outside, including the determined distance information and direction information. The output device 808 may include, for example, a display, a speaker, a printer, and a communication network together with the remote output devices connected to it.
Of course, for simplicity, FIG. 11 shows only some of the components of the electronic device that are relevant to the present disclosure, and components such as buses and input/output interfaces are omitted. In addition, the electronic device may include any other appropriate components depending on the specific application.
Exemplary Computer Program Product and Computer-Readable Storage Medium
In addition to the methods and devices described above, an embodiment of the present disclosure may also be a computer program product that includes computer program instructions which, when run by a processor, cause the processor to perform the steps of the device control methods according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
The basic principles of the present disclosure have been described above in conjunction with specific embodiments. It should be noted, however, that the advantages, benefits, and effects mentioned in the present disclosure are only examples and not limitations, and they should not be regarded as required of every embodiment of the present disclosure. In addition, the specific details disclosed above are provided only for illustration and ease of understanding, not for limitation; they do not restrict the present disclosure to being implemented with those specific details.
The embodiments in this specification are described in a progressive manner, each focusing on its differences from the others; for parts that are the same or similar across embodiments, the embodiments may be referred to one another. Because the system embodiments basically correspond to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
The block diagrams of components, apparatuses, devices, and systems in the present disclosure are illustrative examples only and are not intended to require or imply that connections, arrangements, or configurations must follow the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems may be connected, arranged, and configured in any manner.
The methods and apparatuses of the present disclosure may be implemented in many ways, for example through software, hardware, firmware, or any combination of software, hardware, and firmware. The order of the method steps given above is for illustration only, and the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise stated. Furthermore, in some embodiments the present disclosure may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. The present disclosure therefore also covers recording media that store programs for executing the methods according to the present disclosure.
It should also be noted that in the apparatuses, devices, and methods of the present disclosure, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.

Claims (17)

  1. A device control method, comprising:
    in response to receiving a voice control instruction, performing speech recognition on the voice control instruction to obtain a first speech recognition result;
    determining, based on the first speech recognition result, a target device corresponding to the voice control instruction; and
    in response to detecting a preset dynamic gesture, continuously adjusting the state of the target device based on the continued action of the dynamic gesture.
  2. The method according to claim 1, further comprising:
    determining a target dimension parameter of the target device to be adjusted;
    wherein the continuously adjusting the state of the target device based on the continued action of the dynamic gesture comprises:
    determining a movement direction of the dynamic gesture;
    determining, based on the movement direction of the dynamic gesture, a target adjustment direction of the target device on the target dimension parameter; and
    continuously adjusting the target device on the target dimension parameter in the target adjustment direction based on the continued action of the dynamic gesture in the movement direction.
  3. The method according to claim 2, wherein the state of the target device is determined based on one dimension parameter;
    the determining a target dimension parameter of the target device to be adjusted comprises:
    determining the one dimension parameter of the target device as the target dimension parameter.
  4. The method according to claim 2, wherein the state of the target device is determined based on multiple dimension parameters;
    the determining a target dimension parameter of the target device to be adjusted comprises:
    determining the target dimension parameter based on the first speech recognition result.
  5. The method according to claim 2, wherein the state of the target device is determined based on multiple dimension parameters;
    the determining a target dimension parameter of the target device to be adjusted comprises:
    in response to receiving a dimension parameter voice instruction, performing speech recognition on the dimension parameter voice instruction to obtain a second speech recognition result; and
    determining the target dimension parameter based on the second speech recognition result.
  6. The method according to claim 2, wherein the state of the target device is determined based on multiple dimension parameters;
    the determining a target dimension parameter of the target device to be adjusted comprises:
    acquiring hand form information corresponding to the dynamic gesture, the hand form information including any one of the following: finger extension form, number of fingers, and one-hand or two-hand information; and
    determining the target dimension parameter based on the hand form information.
  7. The method according to any one of claims 2-6, wherein the continuously adjusting the target device on the target dimension parameter in the target adjustment direction based on the continued action of the dynamic gesture in the movement direction comprises:
    acquiring the movement speed of the dynamic gesture, in real time or according to a preset adjustment period, during the continued action of the dynamic gesture;
    determining, based on the movement speed of the dynamic gesture, a target adjustment speed of the target device on the target dimension parameter; and
    adjusting the target device on the target dimension parameter in the target adjustment direction at the target adjustment speed.
  8. The method according to claim 7, further comprising:
    acquiring adjustment speed configuration information of the target device on the target dimension parameter, the adjustment speed configuration information representing the relationship between gesture movement speed and device adjustment speed on each dimension parameter of the target device;
    wherein the determining, based on the movement speed of the dynamic gesture, a target adjustment speed of the target device on the target dimension parameter comprises:
    determining, based on the adjustment speed configuration information, the device adjustment speed that corresponds to the movement speed of the dynamic gesture on the target dimension parameter as the target adjustment speed.
  9. The method according to claim 8, wherein the acquiring adjustment speed configuration information of the target device on the target dimension parameter comprises:
    acquiring the adjustment speed configuration information of the target device on the target dimension parameter from the first speech recognition result; or,
    in response to receiving an adjustment speed configuration voice instruction, performing speech recognition on the adjustment speed configuration voice instruction to obtain a third speech recognition result; and
    acquiring the adjustment speed configuration information of the target device on the target dimension parameter from the third speech recognition result; or,
    acquiring the adjustment speed configuration information of the target device on the target dimension parameter from preconfigured adjustment speed configuration information.
  10. The method according to claim 9, wherein preconfiguring the adjustment speed configuration information comprises:
    receiving an adjustment speed configuration request through a setting interface, the adjustment speed configuration request including a device identifier, a dimension parameter identifier, and gesture movement amplitude and device adjustment amplitude information, the device identifier uniquely identifying a device and the dimension parameter identifier uniquely identifying a dimension parameter;
    determining the relationship between gesture movement speed and device adjustment speed based on the gesture movement amplitude and the device adjustment amplitude information; and
    based on the device identifier, the dimension parameter identifier, and the relationship between gesture movement speed and device adjustment speed, configuring the adjustment speed configuration information of the device identified by the device identifier on the dimension parameter identified by the dimension parameter identifier; or, based on the device identifier, the dimension parameter identifier, and the relationship between gesture movement speed and device adjustment speed, updating the adjustment speed configuration information corresponding to the device identifier and the dimension parameter identifier in the preconfigured adjustment speed configuration information.
  11. The method according to any one of claims 7-10, further comprising:
    during the continuous adjustment of the target device on the target dimension parameter in the target adjustment direction based on the continued action of the dynamic gesture in the movement direction, in response to receiving an adjustment speed update voice instruction, performing speech recognition on the adjustment speed update voice instruction to obtain a fourth speech recognition result;
    acquiring adjustment speed update configuration information from the fourth speech recognition result, the adjustment speed update configuration information representing the updated relationship between gesture movement speed and device adjustment speed on each dimension parameter of the target device;
    acquiring the movement speed of the dynamic gesture, in real time or according to a preset adjustment period, during the subsequent continued action of the dynamic gesture;
    determining, based on the adjustment speed update configuration information, the updated device adjustment speed that corresponds to the movement speed of the dynamic gesture on the target dimension parameter; and
    adjusting the target device on the target dimension parameter in the target adjustment direction at the updated adjustment speed.
  12. The method according to any one of claims 7-11, further comprising:
    determining the position of the sound source object that issued the voice control instruction;
    acquiring, based on the position of the sound source object, an image sequence that includes the hand of the sound source object, the image sequence comprising multiple frames of images having a temporal relationship;
    performing hand keypoint detection on each frame of the image sequence in turn to obtain a hand keypoint sequence, the hand keypoint sequence being formed from the hand keypoints in each frame according to the temporal relationship; and
    performing preset dynamic gesture detection based on the hand keypoint sequence.
  13. The method according to claim 12, wherein the determining a movement direction of the dynamic gesture comprises:
    determining the movement direction of the dynamic gesture based on the hand keypoint sequence.
  14. The method according to claim 12 or 13, wherein the acquiring the movement speed of the dynamic gesture comprises:
    acquiring the movement speed of the dynamic gesture based on the hand keypoints in the last frame of the image sequence and the hand keypoints in the preceding frame, together with the capture time of the last frame and the capture time of the preceding frame.
  15. A device control apparatus, comprising:
    a speech recognition module, configured to, in response to receiving a voice control instruction, perform speech recognition on the voice control instruction to obtain a first speech recognition result;
    a determination module, configured to determine, based on the first speech recognition result obtained by the speech recognition module, a target device corresponding to the voice control instruction;
    a detection module, configured to detect a preset dynamic gesture; and
    an adjustment module, configured to, in response to the detection module detecting the preset dynamic gesture, continuously adjust the state of the target device based on the continued action of the dynamic gesture.
  16. A computer-readable storage medium storing a computer program, the computer program being used to execute the device control method according to any one of claims 1-14.
  17. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the device control method according to any one of claims 1-14.
PCT/CN2023/074997 2022-03-11 2023-02-08 Device control method and apparatus, and electronic device and medium WO2023169123A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210242711.4 2022-03-11
CN202210242711.4A CN114613362A (en) 2022-03-11 2022-03-11 Device control method and apparatus, electronic device, and medium

Publications (1)

Publication Number Publication Date
WO2023169123A1 true WO2023169123A1 (en) 2023-09-14

Family

ID=81863083

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074997 WO2023169123A1 (en) 2022-03-11 2023-02-08 Device control method and apparatus, and electronic device and medium

Country Status (2)

Country Link
CN (1) CN114613362A (en)
WO (1) WO2023169123A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117316158A (en) * 2023-11-28 2023-12-29 科大讯飞股份有限公司 Interaction method, device, control equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114613362A (en) * 2022-03-11 2022-06-10 深圳地平线机器人科技有限公司 Device control method and apparatus, electronic device, and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886070A (en) * 2018-12-24 2019-06-14 珠海格力电器股份有限公司 A kind of apparatus control method, device, storage medium and equipment
CN110770693A (en) * 2017-06-21 2020-02-07 三菱电机株式会社 Gesture operation device and gesture operation method
CN110936797A (en) * 2019-12-02 2020-03-31 恒大新能源汽车科技(广东)有限公司 Automobile skylight control method and electronic equipment
CN112487958A (en) * 2020-11-27 2021-03-12 苏州思必驰信息科技有限公司 Gesture control method and system
CN112545373A (en) * 2019-09-26 2021-03-26 珠海市一微半导体有限公司 Control method of sweeping robot, sweeping robot and medium
CN114613362A (en) * 2022-03-11 2022-06-10 深圳地平线机器人科技有限公司 Device control method and apparatus, electronic device, and medium

Also Published As

Publication number Publication date
CN114613362A (en) 2022-06-10

Similar Documents

Publication Publication Date Title
WO2023169123A1 (en) Device control method and apparatus, and electronic device and medium
US11017217B2 (en) System and method for controlling appliances using motion gestures
EP3656094B1 (en) Controlling a device based on processing of image data that captures the device and/or an installation environment of the device
EP3616034B1 (en) Generating and/or adapting automated assistant content according to a distance between user(s) and an automated assistant interface
CN103353935B (en) 3D dynamic gesture identification method for intelligent domestic system
JP2023517383A (en) Method and system for controlling devices using hand gestures in a multi-user environment
CN103970264B (en) Gesture recognition and control method and device
KR20230173211A (en) Adapting automated assistant based on detected mouth movement and/or gaze
CN107066085B (en) Method and device for controlling terminal based on eyeball tracking
EP4127879A1 (en) Method and device for adjusting the control-display gain of a gesture controlled electronic device
KR20220104304A (en) Transferring an automated assistant routine between client devices during execution of the routine
WO2022116656A1 (en) Methods and devices for hand-on-wheel gesture interaction for controls
KR20210011146A (en) Apparatus for providing a service based on a non-voice wake-up signal and method thereof
CN112083795A (en) Object control method and device, storage medium and electronic equipment
WO2022262538A1 (en) Vehicle control method and apparatus, electronic device, and storage medium
US10444831B2 (en) User-input apparatus, method and program for user-input
JP2023534589A (en) METHOD AND APPARATUS FOR NON-CONTACT OPERATION BY GUIDING OPERATING BODY
CN110858467A (en) Display screen control system and vehicle
CN109688512B (en) Pickup method and device
US9761009B2 (en) Motion tracking device control systems and methods
JP7152908B2 (en) Gesture control device and gesture control program
CN113448429A (en) Method and device for controlling electronic equipment based on gestures, storage medium and electronic equipment
KR102192051B1 (en) Device and method for recognizing motion using deep learning, recording medium for performing the method
KR102677096B1 (en) Adapting automated assistant based on detected mouth movement and/or gaze
CN115291721A (en) Intelligent equipment control method, device, storage medium and system

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23765701

Country of ref document: EP

Kind code of ref document: A1