CN115476366A - Control method, device, control equipment and storage medium for foot type robot - Google Patents

Control method, device, control equipment and storage medium for foot type robot

Info

Publication number
CN115476366A
CN115476366A (Application CN202110663189.2A)
Authority
CN
China
Prior art keywords: user, scene, control, information, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110663189.2A
Other languages
Chinese (zh)
Other versions
CN115476366B (en)
Inventor
舒梓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202110663189.2A
Publication of CN115476366A
Application granted
Publication of CN115476366B
Legal status: Active (granted)

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 11/00: Manipulators not otherwise provided for
    • B25J 9/00: Programme-controlled manipulators
    • B25J 9/08: Programme-controlled manipulators characterised by modular constructions
    • B25J 9/16: Programme controls
    • B25J 9/1602: Programme controls characterised by the control system, structure, architecture
    • B25J 9/1656: Programme controls characterised by programming, planning systems for manipulators
    • B25J 9/1661: Programme controls characterised by programming, planning systems for manipulators, characterised by task planning, object-oriented languages
    • B25J 9/1679: Programme controls characterised by the tasks executed
    • B25J 9/1694: Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors; perception control; multi-sensor controlled systems; sensor fusion
    • B25J 9/1697: Vision controlled systems
    • B62: LAND VEHICLES FOR TRAVELLING OTHERWISE THAN ON RAILS
    • B62D: MOTOR VEHICLES; TRAILERS
    • B62D 57/00: Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track
    • B62D 57/02: Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, with ground-engaging propulsion means, e.g. walking members
    • B62D 57/032: Vehicles with ground-engaging propulsion means, with alternately or sequentially lifted supporting base and legs; with alternately or sequentially lifted feet or skid

Abstract

The present application provides a control method, a device, control equipment and a storage medium for a legged robot. The method includes: obtaining a control intention of a user and scene information of the scene where the user is located; identifying a target object from the scene according to the control intention and the scene information; obtaining a target control instruction corresponding to the control intention; and using the target control instruction to control the legged robot to execute a target action on the target object. The present application effectively enriches the modes of interaction between people and the robot, expands the interaction dimensions and application scenes, and improves the human-robot interaction effect.

Description

Control method, device, control equipment and storage medium for foot type robot
Technical Field
The present application relates to the field of robotics, and in particular, to a method and an apparatus for controlling a foot robot, a control device, and a storage medium.
Background
With the rapid development of science and technology, more and more scenes use robots instead of manual operation, so robots with various functions, such as wheeled robots and legged robots, have emerged and can provide various services to users. Meanwhile, with the progress of artificial intelligence technology, interaction between humans and robots is becoming more intelligent and natural.
In the related art, the interaction modes between humans and robots are simple and the interaction dimensions are not rich enough, which restricts the expansion of interaction scenes and results in a poor interaction effect.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, the present application aims to provide a control method, a control device, control equipment and a storage medium for a legged robot, which can effectively enrich the modes of interaction between people and the robot, expand the interaction dimensions and application scenes, and improve the human-robot interaction effect.
In order to achieve the above object, a method for controlling a legged robot according to an embodiment of the first aspect of the present application includes: acquiring a control intention of a user and scene information of a scene where the user is located; identifying a target object from the scene according to the control intention and the scene information; acquiring a target control instruction corresponding to the control intention; and controlling the legged robot to execute a target action with respect to the target object by using the target control instruction.
In some embodiments of the present application, the obtaining scene information of a scene where the user is located includes:
and acquiring interactive scene information of multiple dimensions of the scene where the user is located, and taking the interactive scene information of the multiple dimensions as the scene information.
In some embodiments of the present application, the obtaining of interaction scene information of multiple dimensions of a scene where the user is located includes:
performing image recognition on the scene where the user is located so as to acquire interactive scene information of the visual dimension of the scene according to the recognized image; and/or,
and performing voice recognition on the scene where the user is located so as to acquire interactive scene information of voice dimension of the scene according to the recognized voice.
In some embodiments of the present application, the identifying of a target object from the scene according to the control intention and the scene information includes:
determining a target action characteristic corresponding to the control intention;
according to the interaction scene information, a plurality of candidate objects are identified from the scene, and the candidate objects respectively have a plurality of corresponding candidate action characteristics;
and taking the candidate object to which the candidate action characteristic matched with the target action characteristic belongs as the target object.
In some embodiments of the present application, the determining a target action characteristic corresponding to the control intent includes:
determining a control type corresponding to the control intention;
and determining a target action characteristic corresponding to the control type according to the context information carried by the control intention.
In some embodiments of the present application, the identifying of a target object from the scene according to the control intention and the scene information includes:
determining a user gesture feature corresponding to the control intent;
determining a target direction in the scene to which the user gesture features are mapped according to the scene information;
and identifying a target object from the scene according to the target direction.
In some embodiments of the present application, the obtaining of the control intention of the user includes:
acquiring the interactive information of the user and the foot type robot;
and analyzing the control intention of the user according to the interaction information.
In some embodiments of the present application, the legged robot is connected to an external input device, and the acquiring of the interaction information between the user and the legged robot includes:
acquiring interactive information directly input by the user through the legged robot; and/or,
and acquiring interactive information input to the legged robot by the user through the external input equipment.
In some embodiments of the present application, the acquiring interaction information directly input by the user via the legged robot includes:
and acquiring voice interaction information directly input by the user through the foot type robot.
In some embodiments of the present application, the acquiring interaction information input by the user to the legged robot via the external input device includes:
sending a plurality of candidate interaction information to the external input device;
and responding to a selection instruction of the user, and determining candidate interaction information matched with the selection instruction as the interaction information.
In some embodiments of the present application, before the sending the plurality of candidate interaction information to the external input device, the method further includes:
capturing a plurality of environmental video frames around the user and taking the plurality of environmental video frames as the plurality of candidate interaction information.
In some embodiments of the present application, after the target object is identified from the scene, the method further includes:
performing in vivo detection on the target object;
and then the controlling of the legged robot to perform the target action with respect to the target object includes:
controlling the legged robot to perform a target action with respect to the target object if it is determined that the target object is a living body.
According to the control method of the legged robot provided by the embodiment of the first aspect of the application, the control intention of the user and the scene information of the scene where the user is located are obtained, the target object is identified from the scene according to the control intention and the scene information, the target control instruction corresponding to the control intention is obtained, and the target control instruction is adopted to control the legged robot to execute the target action aiming at the target object, so that the interaction mode between a person and the robot can be effectively enriched, the interaction dimension and the application scene are expanded, and the man-machine interaction effect is improved.
In order to achieve the above object, a control device for a legged robot according to an embodiment of the second aspect of the present application includes: a first acquisition module, configured to acquire a control intention of a user and scene information of a scene where the user is located; an identification module, configured to identify a target object from the scene according to the control intention and the scene information; a second acquisition module, configured to acquire a target control instruction corresponding to the control intention; and a control module, configured to control the legged robot to execute a target action with respect to the target object by using the target control instruction.
In some embodiments of the present application, the first obtaining module is specifically configured to:
and acquiring interactive scene information of multiple dimensions of the scene where the user is located, and taking the interactive scene information of the multiple dimensions as the scene information.
In some embodiments of the application, the first obtaining module is specifically configured to:
carrying out image recognition on the scene where the user is located so as to obtain interactive scene information of the visual dimension of the scene according to the recognized image; and/or,
and performing voice recognition on the scene where the user is located so as to acquire interactive scene information of voice dimensionality of the scene according to the recognized voice.
In some embodiments of the present application, the identification module comprises:
the first determining submodule is used for determining a target action characteristic corresponding to the control intention;
the first identification submodule is used for identifying a plurality of candidate objects from the scene according to the interactive scene information, and the candidate objects respectively have a plurality of corresponding candidate action characteristics;
and the matching sub-module is used for taking the candidate object to which the candidate action characteristic matched with the target action characteristic belongs as the target object.
In some embodiments of the application, the first determining submodule is specifically configured to:
determining a control type corresponding to the control intention;
and determining a target action characteristic corresponding to the control type according to the context information carried by the control intention.
In some embodiments of the present application, the identification module comprises:
a second determination submodule for determining a user gesture feature corresponding to the control intention;
the third determining submodule is used for determining the target direction of the user gesture feature mapped in the scene according to the scene information;
and the second identification submodule is used for identifying a target object from the scene according to the target direction.
In some embodiments of the application, the first obtaining module is specifically configured to:
acquiring the interactive information of the user and the foot type robot;
and analyzing the control intention of the user according to the interaction information.
In some embodiments of the application, the legged robot is connected to an external input device, and the first obtaining module is specifically configured to:
acquiring interactive information directly input by the user through the legged robot; and/or,
and acquiring interactive information input to the legged robot by the user through the external input equipment.
In some embodiments of the application, the first obtaining module is specifically configured to:
and acquiring voice interaction information directly input by the user through the legged robot.
In some embodiments of the application, the first obtaining module is specifically configured to:
sending a plurality of candidate interaction information to the external input device;
and responding to a selection instruction of the user, and determining candidate interaction information matched with the selection instruction as the interaction information.
In some embodiments of the present application, the first obtaining module is specifically configured to:
capturing a plurality of environment video frames around the user before the sending of the plurality of candidate interaction information to the external input device, and taking the plurality of environment video frames as the plurality of candidate interaction information.
In some embodiments of the present application, further comprising:
the detection module is used for carrying out living body detection on the target object;
the control module is specifically configured to:
controlling the legged robot to perform a target action with respect to the target object if it is determined that the target object is a living body.
According to the control device of the legged robot provided by the embodiment of the second aspect of the application, the control intention of the user and the scene information of the scene where the user is located are obtained, the target object is identified from the scene according to the control intention and the scene information, the target control instruction corresponding to the control intention is obtained, and the target control instruction is adopted to control the legged robot to execute the target action aiming at the target object, so that the interaction mode between a person and the robot can be effectively enriched, the interaction dimension and the application scene are expanded, and the man-machine interaction effect is improved.
In order to achieve the above object, a control apparatus for a legged robot according to an embodiment of the third aspect of the present application includes: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the control method of the legged robot described in the foregoing embodiments.
According to the control device of the legged robot provided by the embodiment of the third aspect of the application, the control intention of the user and the scene information of the scene where the user is located are obtained, the target object is identified from the scene according to the control intention and the scene information, the target control instruction corresponding to the control intention is obtained, and the target control instruction is adopted to control the legged robot to execute the target action aiming at the target object, so that the interaction mode between the human and the robot can be effectively enriched, the interaction dimension and the application scene are expanded, and the human-computer interaction effect is improved.
In order to achieve the above object, an embodiment of the fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the control method of the legged robot as described above.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a control method of a legged robot according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a control method of a legged robot according to another embodiment of the present application;
fig. 3 is a schematic flowchart of a control method of a legged robot according to another embodiment of the present application;
fig. 4 is a schematic flowchart of a control method for a legged robot according to another embodiment of the present application;
fig. 5 is a schematic flowchart of a control method of a legged robot according to another embodiment of the present application;
fig. 6 is a schematic structural diagram of a control device of a legged robot according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a control device of a legged robot according to another embodiment of the present application;
fig. 8 is a schematic structural diagram of a control device of a legged robot according to another embodiment of the present application;
fig. 9 is a schematic structural diagram of a control apparatus of a foot robot according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative, are only intended to explain the present application, and are not to be construed as limiting the present application. On the contrary, the embodiments of the present application include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.
Fig. 1 is a flowchart illustrating a control method of a legged robot according to an embodiment of the present application.
This embodiment is described by taking as an example the case where the control method of the legged robot is configured in a control device of the legged robot, and the control device of the legged robot may be provided in the legged robot itself.
It should be noted that, in terms of hardware, the execution body of this embodiment may be, for example, a Central Processing Unit (CPU) in the legged robot, and in terms of software, it may be, for example, a related background service in the legged robot, which is not limited herein.
As shown in fig. 1, the method for controlling a legged robot includes:
S101: Acquiring the control intention of the user and scene information of the scene where the user is located.
The embodiment of the present application can be applied to an application scenario in which a person interacts with a legged robot. In such a scenario, for example, a user may issue an action instruction to the legged robot, and the legged robot recognizes the action instruction and executes an action matching it, which is not limited herein.
The control intention can be used to describe the user's control requirement for the legged robot. For example, if the user wants the legged robot to perform action A, the intention of "wanting the legged robot to perform action A" may be referred to as a control intention; if the user wants the legged robot to prevent object B in the scene from approaching, the intention of "wanting the legged robot to prevent object B in the scene from approaching" may be referred to as a control intention; and if the user wants the legged robot to come over and greet him, the intention of "wanting the legged robot to greet him" may be referred to as a control intention, which is not limited herein.
In the embodiment of the present application, the control intention of the user may be obtained in any feasible manner in the related art, for example, by using an artificial intelligence model or a mathematical operation, which is not limited herein.
The scene information of the scene where the user is located may specifically be a scene image, a scene video, a scene audio, and the like of the surrounding environment of the user, or may also be other environmental characteristics of the surrounding environment of the user, such as an environmental brightness, a geographic location corresponding to the environment, and the like, which is not limited thereto.
The above-mentioned obtaining of the scene information of the scene where the user is located may be turning on a camera device of the legged robot, and the camera device captures the scene information of the scene where the user is located, for example, a scene image, a scene depth map, depth information of a plurality of targets in the scene, and the like, which is not limited to this.
In some embodiments, the scene information of the scene where the user is located may be acquired by acquiring multi-dimensional interaction scene information of the scene and taking the multi-dimensional interaction scene information as the scene information. This enriches the dimensions of the scene information as much as possible, so that the interaction between the user and the legged robot can effectively simulate the way a human interacts with a pet dog, and so that the target object can subsequently be identified effectively and accurately.
The multiple dimensions include, without limitation, a visual dimension, a voice dimension, an auditory dimension, a haptic dimension, and the like.
The interaction scene information is used to describe the environment in which the user currently interacts with the legged robot, or to describe the interaction information between the user and the legged robot, and may also describe the user's control information on the environment state, and so on.
The interactive scene information includes, but is not limited to, an image, an audio/video, and tactile information of an interactive scene of the user with the legged robot.
In other embodiments, as shown in fig. 2, fig. 2 is a schematic flowchart of a control method of a legged robot according to another embodiment of the present application, where acquiring interaction scene information of multiple dimensions of a scene where a user is located includes:
S201: Performing image recognition on the scene where the user is located, so as to acquire interactive scene information of the visual dimension of the scene according to the recognized image.
For example, a camera of the legged robot may be turned on, and images of a scene where the user is located, such as a scene image and a scene depth map, are captured by the camera, so as to identify and analyze image features of the scene image and the scene depth map, and use the image features as interactive scene information of a visual dimension.
S202: Performing voice recognition on the scene where the user is located, so as to acquire interactive scene information of the voice dimension of the scene according to the recognized voice.
In some embodiments, when the interactive scene information of the visual dimension is acquired according to the recognized image, image analysis may be performed on the recognized image (the scene image and the scene depth map) to obtain features of the image such as its content, time and brightness, and these features are taken as the recognized interactive scene information. When the interactive scene information of the voice dimension is acquired according to the recognized voice, voice parsing may be performed on the recognized voice to obtain features such as the tone and duration of the voice, and these features are taken as the recognized interactive scene information.
The identified interactive scene information of the visual dimension and the interactive scene information of the voice dimension can be jointly used as scene information, and the interactive scene information of the visual dimension and the voice dimension can be used for assisting in identifying a target object from a scene subsequently.
In this way, the implementation is simple and convenient, and accurate interactive scene information can be acquired, so that the target object can subsequently be identified from the scene accurately.
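For illustration only, the following Python sketch shows one possible way of gathering interactive scene information of the visual and voice dimensions as described above; the camera, microphone and speech-recognizer interfaces used here are assumptions and are not defined by the present application.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SceneInfo:
    """Interaction scene information gathered over several dimensions."""
    visual: Dict[str, Any] = field(default_factory=dict)  # visual-dimension features
    speech: Dict[str, Any] = field(default_factory=dict)  # voice-dimension features


def collect_scene_info(camera, microphone, speech_recognizer) -> SceneInfo:
    """Collect visual- and voice-dimension interaction scene information (S201/S202)."""
    info = SceneInfo()

    # Visual dimension: capture a scene image and depth map and keep them as
    # the recognized image features of the scene.
    image, depth_map = camera.capture()
    info.visual = {"image": image, "depth_map": depth_map}

    # Voice dimension: record audio from the scene and keep the recognized
    # text together with a simple voice feature such as the recording length.
    audio = microphone.record(seconds=3)
    info.speech = {"text": speech_recognizer.transcribe(audio), "duration_s": 3}

    return info
```

The caller supplies whatever concrete camera, microphone and recognizer objects the robot platform provides; only the two dimensions of the returned structure follow from the description above.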
S102: and identifying the target object from the scene according to the control intention and the scene information.
The target object may be, for example, an object on which the user wants the legged robot to perform a corresponding action according to the control intention of the user, and the object may be other users, objects, other pets, and the like in the scene, without limitation.
When the user wants the legged robot to prevent object B in the scene from approaching, object B recognized by the legged robot from the scene may be referred to as the target object; when the user wants the legged robot to come over and greet him, the user himself, recognized by the legged robot from the scene, may also be referred to as the target object, which is not limited herein.
In the embodiment of the application, the scene information may specifically be a scene image, a scene video, a scene audio and the like of the surrounding environment of the user, and the control intention of the user is obtained through pre-analysis, so that the target object can be identified from the scene by combining the control intention and the scene information.
Optionally, in some embodiments, after the target object is recognized from the scene, living body detection may further be performed on the target object, and the legged robot is then controlled to perform the target action on the target object only when the target object is determined to be a living body. This effectively ensures the accuracy of target object recognition, so that the legged robot can perform the target action accurately.
When living body detection is performed on the target object, living body detection can be performed on the target object by any possible means in the related art, such as an infrared detection method and the like.
S103: and acquiring a target control instruction corresponding to the control intention.
After the control intention of the user is obtained, semantic analysis can be performed on the control intention, so that a target control instruction corresponding to the control intention is obtained.
For example, assuming that the control intention is expressed by voice, the user speaks a piece of voice to the legged robot, for example calling the robot by its name and saying "block him". Semantic analysis can then be performed on this voice to obtain that the control intention of the user is "block him". Before the semantic analysis, the legged robot may perform voiceprint recognition on the voice to determine that the speaker is the user. The target control instruction corresponding to the control intention "block him" can then be determined based on processing logic preconfigured in the legged robot, and the target control instruction functions to: control the legged robot to move between the user and the target object and emit a blocking sound (barking) toward the target object, which is not limited herein.
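For illustration only, a minimal sketch of how a preconfigured mapping from control intentions to target control instructions could be combined with a voiceprint check is given below; the intention phrases, the instruction payloads and the `speaker_is_user` callable are illustrative assumptions, not elements fixed by the present application.

```python
from typing import Callable, Optional

# Illustrative preconfigured processing logic: control intention -> target control instruction.
INSTRUCTION_TABLE = {
    "block him": {"actions": ["move_between_user_and_target", "bark_at_target"]},
    "greet me": {"actions": ["move_to_user", "circle_user", "bark_friendly"]},
}


def target_instruction_for(utterance_text: str,
                           speaker_is_user: Callable[[], bool]) -> Optional[dict]:
    """Return the target control instruction for a spoken control intention.

    `speaker_is_user` stands in for the robot's voiceprint verification; only
    commands whose speaker is verified as the controlling user are honoured.
    """
    if not speaker_is_user():
        return None
    # Very small stand-in for semantic analysis: match known intention phrases.
    lowered = utterance_text.lower()
    for intention, instruction in INSTRUCTION_TABLE.items():
        if intention in lowered:
            return instruction
    return None
```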
S104: and controlling the foot type robot to execute a target action aiming at the target object by adopting a target control instruction.
After the target control instruction corresponding to the control intention is obtained, the target control instruction may be used directly to control the legged robot to perform the target action on the target object. For example, if the target control instruction is used to control the legged robot to move between the user and the target object and emit a blocking sound toward the target object, the legged robot is controlled to execute exactly this target action corresponding to the target control instruction.
For another example, after the legged robot recognizes the target control command corresponding to the control intention, the target object may be recognized from the scene based on scene information of the scene where the user is located, and then the target object is determined to be present and living, and the target action may be executed.
For another example, assume that the control intention indicates that the user wishes the legged robot to come over and greet him. The target object, which may be the user himself, can be identified from the scene by the legged robot; that is, if the legged robot identifies that the target object is the user himself and analyzes that the control intention of the user is "greet me", the target control instruction corresponding to this control intention can be determined based on the processing logic preconfigured in the legged robot. The target control instruction functions to control the legged robot to circle around the user and give a friendly bark (to show that it is greeting the user), so that the legged robot performs the corresponding target action: moving to the user's side, circling around the user and barking, which is not limited herein.
In the embodiment, the control intention of the user and the scene information of the scene where the user is located are obtained, the target object is identified from the scene according to the control intention and the scene information, the target control instruction corresponding to the control intention is obtained, and the target control instruction is adopted to control the foot type robot to execute the target action aiming at the target object, so that the interaction mode between the human and the robot can be effectively enriched, the interaction dimension and the application scene are expanded, and the man-machine interaction effect is improved.
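For illustration only, the following sketch combines steps S101 to S104 into one control step; every helper method on the `robot` object is a hypothetical placeholder and is not defined by the present application.

```python
def control_step(robot) -> None:
    """One pass of the control flow of fig. 1 (S101 to S104)."""
    # S101: obtain the user's control intention and the scene information.
    intention = robot.get_control_intention()
    scene_info = robot.get_scene_info()

    # S102: identify the target object from the scene.
    target = robot.identify_target(intention, scene_info)
    if target is None:
        return

    # Optional: act only when the target object is judged to be a living body.
    if not robot.is_living(target):
        return

    # S103: obtain the target control instruction corresponding to the intention.
    instruction = robot.instruction_for(intention)

    # S104: control the legged robot to execute the target action on the target object.
    robot.execute(instruction, target)
```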
Fig. 3 is a flowchart illustrating a control method of a legged robot according to another embodiment of the present application.
As shown in fig. 3, S102 in the above embodiment, identifying the target object from the scene according to the control intention and the scene information, may include:
S301: Determining the target action feature corresponding to the control intention.
The target action feature can be used to describe the feature of the target action that is carried in the control intention and that the user expects the legged robot to perform on the target object.
For example, if the control intention is "block him", the target action feature may specifically be "block"; if the control intention is "the user wants the legged robot to greet him", the target action feature may specifically be "greet". Of course, depending on the different control intentions the user may express, the target action feature identified from the control intention varies accordingly and may be any other possible content, which is not limited herein.
The above-described determination of the target motion characteristic corresponding to the control intention is to assist in accurately recognizing the target motion from the control intention.
Optionally, in some embodiments, the determining of the target action feature corresponding to the control intention may be determining a control type corresponding to the control intention, and determining the target action feature corresponding to the control type according to context information carried by the control intention.
The control type may be, for example, block, greet, help, invite, or the like. The semantic context information carried by the control intention can then be analyzed to determine the target action feature corresponding to the control type. For example, a correspondence between control types and action features may be configured in advance; the identified control type is matched against this correspondence, and the action feature obtained by the matching is taken as the target action feature, which is not limited herein.
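For illustration only, one possible form of the preconfigured correspondence between control types and action features is sketched below; the concrete entries and the context handling are assumptions.

```python
# Illustrative preconfigured correspondence between control types and action features.
ACTION_FEATURE_TABLE = {
    "block": "approaching",   # block -> act on the object approaching the user
    "greet": "user_self",     # greet -> act on the user himself
    "invite": "pointed_at",   # invite -> act on the object the user indicates
}


def target_action_feature(control_type: str, intention_context: dict) -> str:
    """Match the identified control type against the correspondence table.

    The context information carried by the control intention may refine the
    matched feature; in this sketch it may simply override it via an assumed key.
    """
    feature = ACTION_FEATURE_TABLE.get(control_type, "unknown")
    return intention_context.get("feature_override", feature)
```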
S302: according to the interaction scene information, a plurality of candidate objects are identified from the scene, and the candidate objects respectively have a plurality of corresponding candidate action characteristics.
That is, the embodiment of the present application is also applicable when a plurality of objects exist in the scene. Assuming that objects A, B and C exist in the scene, objects A, B and C may all become candidate objects. Different candidate objects may exhibit different or identical action features at the current moment: for example, candidate object A is currently not moving and remains still, candidate object B is currently approaching the user, and candidate object C is moving away from the user. If the determined target action feature is "block", it can be determined, based on common sense, that the control intention of the user at this moment is to block candidate object B.
The candidate action features may be, for example and without limitation: currently not moving and remaining still, currently approaching the user, and moving away from the user.
The method and the device are suitable for determining the target object from the candidate objects in the scene by combining the control intention of the user, and specifically, in the process of determining the target object, the target object is identified by combining the target action characteristic corresponding to the control intention and the candidate objects respectively having the corresponding candidate action characteristics.
S303: and taking the candidate object to which the candidate action characteristic matched with the target action characteristic belongs as the target object.
For example, if candidate object A is not moving and remains still, candidate object B is moving toward the user, candidate object C is moving away from the user, and the determined target action feature is "block", then it can be determined based on common sense that the control intention of the user is to block candidate object B, and candidate object B, to which the candidate action feature "moving toward the user" belongs, is determined to be the target object, which is not limited herein.
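For illustration only, the matching of candidate action features against the target action feature described above could be sketched as follows; the feature labels are assumptions.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    action_feature: str  # e.g. "stationary", "approaching", "leaving"


def pick_target(candidates: List[Candidate], target_feature: str) -> Optional[Candidate]:
    """Return the candidate whose action feature matches the target action feature."""
    for candidate in candidates:
        if candidate.action_feature == target_feature:
            return candidate
    return None


# The scenario described above: "block" maps to the feature "approaching",
# so candidate B, which is moving toward the user, is selected.
scene = [Candidate("A", "stationary"), Candidate("B", "approaching"), Candidate("C", "leaving")]
assert pick_target(scene, "approaching") == Candidate("B", "approaching")
```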
In the embodiment, the target action characteristic corresponding to the control intention is determined, the plurality of candidate objects are identified from the scene according to the interactive scene information, the plurality of candidate objects respectively have the plurality of corresponding candidate action characteristics, the candidate object to which the candidate action characteristic matched with the target action characteristic belongs is taken as the target object, and the determination of the target object from the plurality of candidate objects in the scene in combination with the control intention of the user is supported, so that the man-machine interaction function is richer, and the target object can be accurately identified when the plurality of candidate objects exist in the scene.
Fig. 4 is a flowchart illustrating a control method of a legged robot according to another embodiment of the present application.
As shown in fig. 4, S102 in the above embodiment, identifying the target object from the scene according to the control intention and the scene information, may include:
s401: a user gesture feature corresponding to the control intent is determined.
That is, in the embodiment of the present application, when the control intention of the user is captured, the user may be photographed to obtain a user image, image analysis may then be performed on the user image to obtain image features, and the user gesture feature is recognized from the image features, for example the user extending a hand forward in a "stop" gesture, or curling a finger downward, which is not limited herein.
S402: Determining, according to the scene information, the target direction in the scene to which the user gesture feature is mapped.
After determining the user gesture feature corresponding to the control intention, the target direction in the scene to which the user gesture feature is mapped may be determined according to the scene information.
For example, when the user extends a hand forward in a "stop" gesture, the direction ahead of the user's hand in the scene may be referred to as the target direction; when the user curls a finger downward, the direction below the user's finger in the scene may be referred to as the target direction, which is not limited herein.
S403: the target object is identified from the scene according to the target direction.
After determining that the gesture feature of the user is mapped to the target direction in the scene according to the scene information, an object indicated by the target direction in the scene may be used as the target object, which is not limited to this.
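For illustration only, a minimal sketch of mapping a recognized user gesture to a target direction and selecting the object lying in that direction is given below; the gesture labels, the direction vectors and the planar scene representation are assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

# Illustrative mapping from recognized gestures to directions in the scene plane.
GESTURE_DIRECTIONS: Dict[str, Tuple[float, float]] = {
    "stop_palm_forward": (1.0, 0.0),  # hand extended forward -> ahead of the user
    "point_down": (0.0, -1.0),        # finger curled downward -> below/near the user
}


def identify_by_gesture(gesture: str,
                        objects: List[Tuple[str, Tuple[float, float]]],
                        user_pos: Tuple[float, float] = (0.0, 0.0)) -> Optional[str]:
    """Return the name of the object whose bearing best matches the gesture direction."""
    direction = GESTURE_DIRECTIONS.get(gesture)
    if direction is None:
        return None
    best_name, best_score = None, -2.0
    for name, pos in objects:
        dx, dy = pos[0] - user_pos[0], pos[1] - user_pos[1]
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue
        # Cosine similarity between the gesture direction and the object's bearing.
        score = (dx * direction[0] + dy * direction[1]) / norm
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```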
For example, suppose the received control intention of the user is "block him" and the user gesture feature corresponding to the control intention is determined to be the user extending a hand forward in a "stop" gesture. The target direction to which the user gesture feature is mapped in the scene is then determined according to the scene information, and the target object is determined according to the target direction. After the target object pointed to by the target direction is judged to be a living body, the legged robot is controlled to execute the target action corresponding to the target control instruction, that is, to move between the user and the target object and emit a blocking sound ("barking") toward the target object, until a stop instruction issued by the user is detected, whereupon the target action is stopped.
In other embodiments, detecting the stop instruction issued by the user may include receiving a speech instruction spoken by the user, performing user identity authentication on the speech instruction, and performing keyword recognition on it; if the speech instruction contains the keyword "stop", the target action is stopped, which is not limited herein.
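For illustration only, the stop-instruction handling described above (identity authentication followed by keyword recognition) could be sketched as follows; the keyword and the verification callable are assumptions.

```python
from typing import Callable

STOP_KEYWORD = "stop"  # assumed keyword; the application does not fix a particular word


def should_stop(utterance_text: str, verify_user_identity: Callable[[], bool]) -> bool:
    """Return True when a verified user utterance contains the stop keyword."""
    if not verify_user_identity():
        return False
    return STOP_KEYWORD in utterance_text.lower()
```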
In the embodiment, the user gesture feature corresponding to the control intention is determined, the target direction of the user gesture feature mapped in the scene is determined according to the scene information, and the target object is identified from the scene according to the target direction, so that the target object can be accurately and quickly determined from the scene, the flexibility of a target object determination mode is improved, and the application scene of man-machine interaction is effectively enriched.
Alternatively, in some embodiments, the control intention of the user is obtained by obtaining the interaction information of the user and the legged robot and analyzing the control intention of the user according to the interaction information, so that another way of determining the control intention of the user is provided, and the accuracy and the adaptability of the control intention analysis can be guaranteed because the control intention is determined based on the interaction information of the user and the legged robot.
Fig. 5 is a flowchart illustrating a control method of a legged robot according to another embodiment of the present application.
As shown in fig. 5, the obtaining of the control intention of the user by S101 in the above embodiment may further include:
S501: Acquiring interaction information directly input by the user via the legged robot.
The interaction information may be, for example, text information input by the user directly to the legged robot, or voice information, that is, the legged robot may provide a text input interface or a microphone to capture the text information input by the user via the text input interface, or capture the voice information input by the user via the microphone, which is not limited thereto.
Optionally, in some embodiments, when the interaction information directly input by the user via the legged robot is acquired, voice interaction information directly input by the user via the legged robot may be acquired. The voice interaction information is, for example, the information contained in a segment of voice spoken by the user to the legged robot. In this way, triggering the control of the legged robot directly based on voice interaction information can be supported, and the control intention of the user is analyzed based on the voice interaction information, so that a better control intention recognition effect can be obtained.
S502: the method comprises the steps of acquiring interactive information input to the legged robot by a user through an external input device.
In the embodiment of the application, the legged robot is connected to the external input device in a wired communication manner, or in a wireless communication manner, which is not limited herein.
The external input device may be, for example, a mobile terminal used to control the legged robot; an application program may be configured on the mobile terminal in advance to assist the user in controlling the legged robot based on the application program, without limitation.
For example, the corresponding text information or voice information may be input by an application program in the external input device by the user, so that the text information or voice information is transmitted to the foot robot by the external input device.
The interactive information input by the user directly through the foot robot is obtained, and the interactive information input by the user to the foot robot through the external input device can also be obtained, that is, the interaction with the user directly is supported, and the interaction with the user through the external input device of a third party is also supported, so that the control intention of the user is obtained in an auxiliary manner, the obtaining mode of the control intention can be expanded, and the interactive scene is expanded in an auxiliary manner.
Optionally, in some embodiments, the acquiring of the interaction information input by the user to the legged robot via the external input device may be capturing a plurality of environment video frames around the user, taking the plurality of environment video frames as a plurality of candidate interaction information, sending the plurality of candidate interaction information to the external input device, and determining, in response to a selection instruction of the user, candidate interaction information matching the selection instruction as the interaction information.
For example, the legged robot may be controlled to capture a plurality of environmental video frames around the user, send the plurality of environmental video frames to the external input device, display the plurality of environmental video frames by the external input device, provide a selection interface for the environmental video frames, receive a selection instruction of the user through the selection interface, and send the environmental video frame selected by the user to the legged robot, so as to assist the legged robot in taking the environmental video frame selected by the user as candidate interaction information matched with the selection instruction.
Therefore, the confirmation of the user on the interactive information is supported, the acquisition and identification of the interactive information are more in line with the personalized interaction requirements of the user, and the personalized interaction effect is improved.
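For illustration only, the exchange with the external input device described above could be sketched as follows; the device interface (`show_candidates`, `wait_for_selection`) is an assumption and not part of the present application.

```python
from typing import Optional, Sequence


def interaction_info_from_device(frames: Sequence[bytes], device) -> Optional[bytes]:
    """Send candidate environment video frames to the external input device and
    return the frame the user selects as the interaction information."""
    # The captured environment video frames serve as the candidate interaction information.
    device.show_candidates(list(frames))
    # The device is assumed to return the index chosen through its selection
    # interface, or None if the user made no selection.
    chosen_index = device.wait_for_selection()
    if chosen_index is None or not 0 <= chosen_index < len(frames):
        return None
    return frames[chosen_index]
```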
S503: and analyzing the control intention of the user according to the interaction information.
After the interaction information, for example the text information, the voice information or the environment video frame, is obtained, it can be analyzed by any feasible data analysis method to determine the control intention of the user, which is not limited herein.
In the embodiment, by acquiring the interaction information directly input by the user through the foot robot, acquiring the interaction information input by the user to the foot robot through the external input device, and analyzing the control intention of the user according to the interaction information, not only can the interaction with the user directly be supported, but also the interaction with the user through the external input device of a third party is supported, so that the control intention of the user can be acquired in an auxiliary manner, the acquisition mode of the control intention can be expanded, and the interaction scene can be expanded in an auxiliary manner.
Fig. 6 is a schematic structural diagram of a control device of a legged robot according to an embodiment of the present application.
As shown in fig. 6, the control device 60 for the foot robot includes:
the first obtaining module 601 is configured to obtain a control intention of a user and scene information of a scene where the user is located.
An identifying module 602, configured to identify a target object from the scene according to the control intention and the scene information.
A second obtaining module 603, configured to obtain a target control instruction corresponding to the control intention.
And the control module 604 is configured to control the legged robot to execute a target action on a target object by using a target control instruction.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
acquiring multi-dimensional interactive scene information of a scene where a user is located, and taking the multi-dimensional interactive scene information as scene information.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
carrying out image recognition on a scene where a user is located so as to acquire interactive scene information of the visual dimension of the scene according to the recognized image; and/or,
and performing voice recognition on the scene where the user is located so as to acquire interactive scene information of voice dimensionality of the scene according to the recognized voice.
In some embodiments of the present application, as shown in fig. 7, fig. 7 is a schematic structural diagram of a control apparatus of a legged robot according to another embodiment of the present application, and the identification module 602 includes:
a first determination sub-module 6021 for determining a target action feature corresponding to the control intention;
the first identification submodule 6022 is configured to identify a plurality of candidate objects from a scene according to the interaction scene information, where the plurality of candidate objects have a plurality of corresponding candidate action features respectively;
the matching sub-module 6023 is configured to set the candidate object to which the candidate motion feature matching the target motion feature belongs as the target object.
In some embodiments of the present application, the first determination submodule 6021 is specifically configured to:
determining a control type corresponding to the control intention;
and determining the target action characteristics corresponding to the control type according to the context information carried by the control intention.
In some embodiments of the present application, as shown in fig. 8, fig. 8 is a schematic structural diagram of a control apparatus of a legged robot according to another embodiment of the present application, and the identification module 602 includes:
a second determination submodule 6024 for determining a user gesture feature corresponding to the control intention;
the third determining submodule 6025 is configured to determine, according to the scene information, a target direction in which the user gesture feature is mapped in the scene;
a second identifying submodule 6026 for identifying the target object from the scene according to the target direction.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
acquiring interaction information of a user and the foot type robot;
and analyzing the control intention of the user according to the interaction information.
In some embodiments of the present application, the legged robot is connected to an external input device, and the first obtaining module 601 is specifically configured to:
acquiring interactive information directly input by a user through the legged robot; and/or,
and acquiring interactive information input to the legged robot by a user through an external input device.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
and acquiring voice interaction information directly input by a user through the legged robot.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
sending a plurality of candidate interaction information to an external input device;
and responding to a selection instruction of a user, and determining candidate interaction information matched with the selection instruction as interaction information.
In some embodiments of the present application, the first obtaining module 601 is specifically configured to:
before sending the multiple candidate interaction information to the external input device, capturing multiple environment video frames around the user, and taking the multiple environment video frames as the multiple candidate interaction information.
In some embodiments of the present application, as shown in fig. 7, further comprising:
a detection module 605 for performing in vivo detection on the target object;
the control module 604 is specifically configured to:
and if the target object is determined to be a living body, controlling the legged robot to perform a target action with respect to the target object.
It should be noted that the explanation of the embodiment of the control method for the foot robot is also applicable to the control device for the foot robot in this embodiment, and the details are not repeated here.
In the embodiment, the control intention of the user and the scene information of the scene where the user is located are obtained, the target object is identified from the scene according to the control intention and the scene information, the target control instruction corresponding to the control intention is obtained, and the target control instruction is adopted to control the foot type robot to execute the target action aiming at the target object, so that the interaction mode between the human and the robot can be effectively enriched, the interaction dimension and the application scene are expanded, and the man-machine interaction effect is improved.
Fig. 9 is a schematic structural diagram of a control apparatus of a foot robot according to an embodiment of the present application.
The control apparatus of the legged robot includes:
a memory 901, a processor 902 and a computer program stored on the memory 901 and executable on the processor 902.
The processor 902, when executing the program, implements the control method of the legged robot provided in the above-described embodiments.
In one possible implementation, the control apparatus of the legged robot further includes:
a communication interface 903 for communication between the memory 901 and the processor 902.
A memory 901 for storing computer programs executable on the processor 902.
The memory 901 may comprise a high-speed RAM memory, and may also include a non-volatile memory, such as at least one disk memory.
The processor 902 is configured to implement the control method of the legged robot according to the above-described embodiment when executing a program.
If the memory 901, the processor 902, and the communication interface 903 are implemented independently, the communication interface 903, the memory 901, and the processor 902 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 9, but this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 901, the processor 902, and the communication interface 903 are integrated on a chip, the memory 901, the processor 902, and the communication interface 903 may complete mutual communication through an internal interface.
The processor 902 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the control method of a legged robot as described above.
In order to implement the above embodiments, the present application also proposes a computer program product, wherein when instructions in the computer program product are executed by a processor, the control method of the legged robot shown in the above embodiments is executed.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present application, the meaning of "a plurality" is two or more unless otherwise specified.
Any process or method description in the flow charts or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application also includes alternate implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium; when executed, the program performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware, or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (26)

1. A method of controlling a legged robot, the method comprising:
acquiring a control intention of a user and scene information of a scene where the user is located;
identifying a target object from the scene according to the control intention and the scene information;
acquiring a target control instruction corresponding to the control intention; and
and controlling the foot type robot to execute a target action aiming at the target object by adopting the target control instruction.
2. The method of claim 1, wherein obtaining scene information of a scene in which the user is located comprises:
and acquiring multi-dimensional interactive scene information of the scene where the user is located, and taking the multi-dimensional interactive scene information as the scene information.
3. The method of claim 2, wherein the obtaining interactive scene information for multiple dimensions of a scene in which the user is located comprises:
performing image recognition on the scene where the user is located, so as to acquire interactive scene information of a visual dimension of the scene according to the recognized image; and/or
performing voice recognition on the scene where the user is located, so as to acquire interactive scene information of a voice dimension of the scene according to the recognized voice.
4. The method of claim 1, wherein said identifying a target object from the scene according to the control intention and the scene information comprises:
determining a target action characteristic corresponding to the control intention;
identifying a plurality of candidate objects from the scene according to the interaction scene information, wherein the plurality of candidate objects respectively have a plurality of corresponding candidate action characteristics;
and taking the candidate object to which the candidate action characteristic matched with the target action characteristic belongs as the target object.
5. The method of claim 4, wherein the determining a target action characteristic corresponding to the control intent comprises:
determining a control type corresponding to the control intent;
and determining a target action characteristic corresponding to the control type according to the context information carried by the control intention.
6. The method of claim 1, wherein said identifying a target object from the scene according to the control intention and the scene information comprises:
determining a user gesture feature corresponding to the control intent;
determining a target direction in the scene to which the user gesture features are mapped according to the scene information;
and identifying a target object from the scene according to the target direction.
7. The method of claim 1, wherein the obtaining the control intent of the user comprises:
acquiring the interactive information of the user and the foot type robot;
and analyzing the control intention of the user according to the interaction information.
8. The method of claim 7, wherein the legged robot is connected to an external input device, and the obtaining of the interaction information of the user with the legged robot comprises:
acquiring interactive information directly input by the user through the legged robot; and/or
acquiring interactive information input to the legged robot by the user through the external input device.
9. The method of claim 8, wherein said obtaining interaction information input by said user directly via said legged robot comprises:
and acquiring voice interaction information directly input by the user through the foot type robot.
10. The method of claim 8, wherein said obtaining interaction information input by the user to the legged robot via the external input device comprises:
sending a plurality of candidate interaction information to the external input device;
and responding to a selection instruction of the user, and determining candidate interaction information matched with the selection instruction as the interaction information.
11. The method of claim 8, prior to said sending a plurality of candidate interaction information to the external input device, further comprising:
capturing a plurality of environmental video frames around the user and taking the plurality of environmental video frames as the plurality of candidate interaction information.
12. The method of claim 1, after said identifying a target object from the scene, further comprising:
performing living body detection on the target object;
then said controlling the legged robot to perform a target action for said target object comprises:
controlling the legged robot to perform a target action with respect to the target object if it is determined that the target object is a living body.
13. A control device for a legged robot, characterized in that the device comprises:
the first acquisition module is used for acquiring the control intention of a user and scene information of a scene where the user is located;
the identification module is used for identifying a target object from the scene according to the control intention and the scene information;
the second acquisition module is used for acquiring a target control instruction corresponding to the control intention; and
and the control module is used for controlling the foot type robot to execute a target action aiming at the target object by adopting the target control instruction.
14. The apparatus of claim 13, wherein the first obtaining module is specifically configured to:
and acquiring interactive scene information of multiple dimensions of the scene where the user is located, and taking the interactive scene information of the multiple dimensions as the scene information.
15. The apparatus of claim 14, wherein the first obtaining module is specifically configured to:
carrying out image recognition on the scene where the user is located, so as to obtain interactive scene information of a visual dimension of the scene according to the recognized image; and/or
performing voice recognition on the scene where the user is located, so as to acquire interactive scene information of a voice dimension of the scene according to the recognized voice.
16. The apparatus of claim 13, wherein the identification module comprises:
the first determining submodule is used for determining a target action characteristic corresponding to the control intention;
the first identification sub-module is used for identifying a plurality of candidate objects from the scene according to the interaction scene information, wherein the candidate objects respectively have a plurality of corresponding candidate action characteristics;
and the matching sub-module is used for taking the candidate object to which the candidate action characteristic matched with the target action characteristic belongs as the target object.
17. The apparatus of claim 16, wherein the first determination submodule is specifically configured to:
determining a control type corresponding to the control intention;
and determining the target action characteristics corresponding to the control type according to the context information carried by the control intention.
18. The apparatus of claim 13, wherein the identification module comprises:
a second determination submodule for determining a user gesture feature corresponding to the control intention;
the third determining submodule is used for determining the target direction of the user gesture feature mapped in the scene according to the scene information;
and the second identification submodule is used for identifying a target object from the scene according to the target direction.
19. The apparatus of claim 13, wherein the first obtaining module is specifically configured to:
acquiring the interactive information of the user and the foot type robot;
and analyzing the control intention of the user according to the interaction information.
20. The apparatus of claim 19, wherein the legged robot is coupled to an external input device, and wherein the first obtaining module is specifically configured to:
acquiring interactive information directly input by the user through the foot type robot; and/or
acquiring interactive information input to the legged robot by the user through the external input device.
21. The apparatus of claim 20, wherein the first obtaining module is specifically configured to:
and acquiring voice interaction information directly input by the user through the legged robot.
22. The apparatus of claim 20, wherein the first obtaining module is specifically configured to:
sending a plurality of candidate interaction information to the external input device;
and responding to a selection instruction of the user, and determining candidate interaction information matched with the selection instruction as the interaction information.
23. The apparatus of claim 20, wherein the first obtaining module is specifically configured to:
capturing a plurality of environmental video frames around the user before the sending of the plurality of candidate interaction information to the external input device, and taking the plurality of environmental video frames as the plurality of candidate interaction information.
24. The apparatus as recited in claim 13, further comprising:
the detection module is used for carrying out living body detection on the target object;
the control module is specifically configured to:
controlling the legged robot to perform a target action with respect to the target object if it is determined that the target object is a living body.
25. A control apparatus of a legged robot, characterized by comprising:
a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method of any one of claims 1-12.
26. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-12.
CN202110663189.2A 2021-06-15 2021-06-15 Control method, device, control equipment and storage medium for foot robot Active CN115476366B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110663189.2A CN115476366B (en) 2021-06-15 2021-06-15 Control method, device, control equipment and storage medium for foot robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110663189.2A CN115476366B (en) 2021-06-15 2021-06-15 Control method, device, control equipment and storage medium for foot robot

Publications (2)

Publication Number Publication Date
CN115476366A true CN115476366A (en) 2022-12-16
CN115476366B CN115476366B (en) 2024-01-09

Family

ID=84419329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110663189.2A Active CN115476366B (en) 2021-06-15 2021-06-15 Control method, device, control equipment and storage medium for foot robot

Country Status (1)

Country Link
CN (1) CN115476366B (en)

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002205289A (en) * 2000-12-28 2002-07-23 Sony Corp Action control method for robot device, program, recording medium and robot device
JP2004243475A (en) * 2003-02-14 2004-09-02 Seiko Epson Corp Robot and robot system
CN102073878A (en) * 2010-11-15 2011-05-25 上海大学 Non-wearable finger pointing gesture visual identification method
CN103064936A (en) * 2012-12-24 2013-04-24 北京百度网讯科技有限公司 Voice-input-based image information extraction analysis method and device
CN205068294U (en) * 2015-09-21 2016-03-02 广东省自动化研究所 Human -computer interaction of robot device
WO2017032187A1 (en) * 2015-08-27 2017-03-02 深圳市国华识别科技开发有限公司 Method and device for automatically capturing target object, and storage medium
CN106845624A (en) * 2016-12-16 2017-06-13 北京光年无限科技有限公司 The multi-modal exchange method relevant with the application program of intelligent robot and system
US20180053349A1 (en) * 2016-08-16 2018-02-22 Shanghai Zhangmen Science And Technology Co., Ltd. Running exercise equipment with associated virtual reality interaction method and non-volatile storage media
US20180229372A1 (en) * 2017-02-10 2018-08-16 JIBO, Inc. Maintaining attention and conveying believability via expression and goal-directed behavior with a social robot
CN108780361A (en) * 2018-02-05 2018-11-09 深圳前海达闼云端智能科技有限公司 Human-computer interaction method and device, robot and computer readable storage medium
WO2018219198A1 (en) * 2017-06-02 2018-12-06 腾讯科技(深圳)有限公司 Man-machine interaction method and apparatus, and man-machine interaction terminal
CN109732593A (en) * 2018-12-28 2019-05-10 深圳市越疆科技有限公司 A kind of far-end control method of robot, device and terminal device
CN110293554A (en) * 2018-03-21 2019-10-01 北京猎户星空科技有限公司 Control method, the device and system of robot
CN110727410A (en) * 2019-09-04 2020-01-24 上海博泰悦臻电子设备制造有限公司 Man-machine interaction method, terminal and computer readable storage medium
CN111863198A (en) * 2020-08-21 2020-10-30 华北科技学院 Rehabilitation robot interaction system and method based on virtual reality
CN111966212A (en) * 2020-06-29 2020-11-20 百度在线网络技术(北京)有限公司 Multi-mode-based interaction method and device, storage medium and smart screen device
CN111988493A (en) * 2019-05-21 2020-11-24 北京小米移动软件有限公司 Interaction processing method, device, equipment and storage medium
CN112223288A (en) * 2020-10-09 2021-01-15 南开大学 Visual fusion service robot control method
CN112581945A (en) * 2019-09-29 2021-03-30 百度在线网络技术(北京)有限公司 Voice control method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN115476366B (en) 2024-01-09

Similar Documents

Publication Publication Date Title
CN110475069B (en) Image shooting method and device
CN106558310B (en) Virtual reality voice control method and device
CN104049721B (en) Information processing method and electronic equipment
CN107336243B (en) Robot control system and control method based on intelligent mobile terminal
US20110273551A1 (en) Method to control media with face detection and hot spot motion
JP6551507B2 (en) Robot control device, robot, robot control method and program
US20120019684A1 (en) Method for controlling and requesting information from displaying multimedia
KR20100086262A (en) Robot and control method thereof
CN110313174B (en) Shooting control method and device, control equipment and shooting equipment
CN108647633B (en) Identification tracking method, identification tracking device and robot
KR20210011146A (en) Apparatus for providing a service based on a non-voice wake-up signal and method thereof
CN108037829B (en) Multi-mode interaction method and system based on holographic equipment
CN108388399B (en) Virtual idol state management method and system
CN111966321A (en) Volume adjusting method, AR device and storage medium
US20200090663A1 (en) Information processing apparatus and electronic device
WO2023231211A1 (en) Voice recognition method and apparatus, electronic device, storage medium, and product
CN115476366A (en) Control method, device, control equipment and storage medium for foot type robot
CN108628454B (en) Visual interaction method and system based on virtual human
CN115909505A (en) Control method and device of sign language recognition equipment, storage medium and electronic equipment
KR102265874B1 (en) Method and Apparatus for Distinguishing User based on Multimodal
CN107908385B (en) Holographic-based multi-mode interaction system and method
CN113949936A (en) Screen interaction method and device of electronic equipment
CN111369985A (en) Voice interaction method, device, equipment and medium
JP5709955B2 (en) Robot, voice recognition apparatus and program
CN111901552B (en) Multimedia data transmission method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant