CN109551489B - Control method and device for human body auxiliary robot


Info

Publication number: CN109551489B (application CN201811290891.3A)
Authority: CN (China)
Prior art keywords: interest, user, region, confirmed, selecting
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN109551489A
Inventor: 王天
Assignee (original and current): Hangzhou Chengtian Technology Development Co Ltd
Application filed by Hangzhou Chengtian Technology Development Co Ltd
Priority to: CN202010290992.1A (CN111515946B); CN201811290891.3A (CN109551489B)
Publication of CN109551489A; application granted; publication of CN109551489B

Classifications

    • B25J9/00 Programme-controlled manipulators; B25J9/16 Programme controls
        • B25J9/1602 Characterised by the control system, structure, architecture
        • B25J9/1656 Characterised by programming, planning systems for manipulators
            • B25J9/1661 Characterised by task planning, object-oriented languages
            • B25J9/1664 Characterised by motion, path, trajectory planning
                • B25J9/1666 Avoiding collision or forbidden zones
        • B25J9/1679 Characterised by the tasks executed
        • B25J9/1694 Characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
            • B25J9/1697 Vision controlled systems
    • B25J11/00 Manipulators not otherwise provided for
        • B25J11/0005 Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means

Abstract

The invention provides a control method and a control device for a human body auxiliary robot, and relates to the field of automatic control. The control method adopts an eye-movement control mode: first human eye visual angle data of the user and a corresponding environment image are acquired; then an object of interest of the user is selected from the environment image according to the first human eye visual angle data; and finally a robot control instruction is generated according to the position of the object of interest. The user can issue the corresponding control instructions simply by moving their eyes, which improves convenience of use.

Description

Control method and device for human body auxiliary robot
Technical Field
The invention relates to the field of automatic control, in particular to a control method and a control device for a human body auxiliary robot.
Background
With the advance of automatic control technology, human body auxiliary robots have been widely applied in many fields, most commonly machine manufacturing and nursing.
In the nursing field, such robots are mainly used to help the person being cared for perform certain actions, such as moving to a new position or grabbing an object.
In the related art, the person being cared for controls the robot by issuing control instructions through a handheld remote controller.
Disclosure of Invention
The invention aims to provide a control method of a human body auxiliary robot.
In a first aspect, an embodiment of the present invention provides a method for controlling a human body-assisted robot, including:
acquiring first human eye visual angle data of a user and a corresponding environment image;
selecting an object of interest of a user from the environment image according to the first human eye visual angle data;
generating a robot control instruction according to the position of the object of interest.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner of the first aspect, where the method is applied to a human-body-assisted robot, where the human-body-assisted robot includes an arm;
the step of generating robot control instructions based on the position of the object of interest comprises:
and generating an arm movement instruction according to the position of the object of interest and the position of the arm of the human body auxiliary robot.
With reference to the first aspect, embodiments of the present invention provide a second possible implementation manner of the first aspect, where the method is applied to a human-body-assisted robot, where the human-body-assisted robot includes a moving part; the step of generating robot control instructions based on the position of the object of interest comprises:
and generating an overall movement instruction according to the position of the object of interest and the position of the human body auxiliary robot.
With reference to the first aspect, an embodiment of the present invention provides a third possible implementation manner of the first aspect, where the step of selecting, according to the first human eye viewing angle data, an object of interest of the user from the environment image includes:
selecting a region of interest of the user from the environment image according to the first human eye visual angle data;
an object located in the region of interest is selected as the object of interest.
With reference to the first aspect, an embodiment of the present invention provides a fourth possible implementation manner of the first aspect, where the step of selecting an object located in the region of interest as the object of interest includes:
if a plurality of candidate objects exist in the region of interest, outputting the candidate objects existing in the region of interest;
and selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction given by a user aiming at the display screen.
With reference to the first aspect, an embodiment of the present invention provides a fifth possible implementation manner of the first aspect, where the step of outputting that a plurality of candidate objects exist in the region of interest includes:
displaying the magnified image of the region of interest on AR glasses;
the step of selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction issued by a user for the display screen comprises:
acquiring second human eye visual angle data generated when a user observes the AR glasses; the first selection instruction is second human eye visual angle data;
and selecting the specified object in the region of interest as the object of interest according to the second human eye visual angle data.
With reference to the first aspect, an embodiment of the present invention provides a sixth possible implementation manner of the first aspect, where the step of selecting an object located in the region of interest as the object of interest includes:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of which the similarity with the reference image meets a preset requirement, in the foreground object as an object of interest.
With reference to the first aspect, an embodiment of the present invention provides a seventh possible implementation manner of the first aspect, where the method further includes:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of the foreground objects, with the similarity to the reference image meeting the preset requirement as a candidate object.
With reference to the first aspect, an embodiment of the present invention provides an eighth possible implementation manner of the first aspect, where the method further includes:
selecting a target database from the candidate databases according to the acquired second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
With reference to the first aspect, an embodiment of the present invention provides a ninth possible implementation manner of the first aspect, where the method further includes:
acquiring current position information;
searching place information corresponding to the current position information;
and generating a second selection instruction according to the location information.
In combination with the first aspect, the present invention provides a tenth possible implementation manner of the first aspect, where,
the second selection instruction is a database selection instruction issued by the user.
With reference to the first aspect, an embodiment of the present invention provides an eleventh possible implementation manner of the first aspect, where the step of selecting, according to the first human eye viewing angle data, an object of interest of the user from the environment image includes:
selecting a first object to be confirmed from the environment image according to the first human eye visual angle data;
respectively outputting prompt information of each first object to be confirmed;
and if a confirmation instruction responding to the prompt information is acquired, taking the first object to be confirmed corresponding to the confirmation instruction as the object of interest.
With reference to the first aspect, an embodiment of the present invention provides a twelfth possible implementation manner of the first aspect, where the step of selecting, according to a first selection instruction issued by a user for a display screen, a specified candidate object in the region of interest as the object of interest includes:
selecting an object corresponding to the first selection instruction as a second object to be confirmed;
respectively outputting prompt information corresponding to each second object to be confirmed;
and if a confirmation instruction corresponding to the prompt information is acquired, taking the corresponding second object to be confirmed as the object of interest.
In combination with the first aspect, the present invention provides a thirteenth possible implementation manner of the first aspect, wherein,
the step of respectively outputting the prompt information corresponding to each first object to be confirmed comprises the following steps:
displaying image information corresponding to the first object to be confirmed on a display screen;
and/or playing voice information of the name of the first object to be confirmed;
the step of respectively outputting prompt information corresponding to each second object to be confirmed comprises the following steps:
displaying image information corresponding to the second object to be confirmed on the display screen;
and/or playing voice information of the name of the second object to be confirmed.
With reference to the first aspect, an embodiment of the present invention provides a fourteenth possible implementation manner of the first aspect, where after the step of outputting prompt information corresponding to the object to be confirmed, the method further includes:
acquiring user behaviors;
and if the user behavior meets the preset standard behavior requirement, determining to acquire a confirmation instruction corresponding to the prompt message.
With reference to the first aspect, an embodiment of the present invention provides a fifteenth possible implementation manner of the first aspect, where the standard behavior requirement includes: the user completes one or more actions specified as follows:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
In a second aspect, an embodiment of the present invention further provides a control device for a human body-assisted robot, including:
the first acquisition module is used for acquiring first human eye visual angle data of a user and a corresponding environment image;
the first selection module is used for selecting an object of interest of the user from the environment image according to the first human eye visual angle data;
a first generating module for generating robot control instructions depending on the position of the object of interest.
With reference to the second aspect, the present invention provides a first possible implementation manner of the second aspect, wherein the apparatus acts on a human body-assisted robot, and the human body-assisted robot includes an arm;
the first generation module comprises:
and the first generation unit is used for generating an arm movement instruction according to the position of the object of interest and the position of the arm of the human body auxiliary robot.
With reference to the second aspect, the present invention provides a second possible implementation manner of the second aspect, wherein the apparatus acts on a human body-assisted robot, and the human body-assisted robot includes a mobile part; the first generation module comprises:
and the second generation unit is used for generating an overall movement instruction according to the position of the object of interest and the position of the human body auxiliary robot.
With reference to the second aspect, an embodiment of the present invention provides a third possible implementation manner of the second aspect, where the first selecting module includes:
a first selection unit, configured to select a region of interest of a user from an environment image according to first human eye viewing angle data;
a second selection unit for selecting an object located in the region of interest as the object of interest.
With reference to the second aspect, an embodiment of the present invention provides a fourth possible implementation manner of the second aspect, where the second selecting unit includes:
the first output subunit is used for outputting a plurality of candidate objects in the region of interest if the plurality of candidate objects exist in the region of interest;
and the first selection subunit is used for selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction issued by the user for the display screen.
With reference to the second aspect, an embodiment of the present invention provides a fifth possible implementation manner of the second aspect, where the first output subunit is further configured to: displaying the magnified image of the region of interest on AR glasses;
the first selection subunit is further to: acquiring second human eye visual angle data generated when a user observes the AR glasses; the first selection instruction is second human eye visual angle data; and selecting a specified object in the region of interest as the object of interest according to the second human eye perspective data.
With reference to the second aspect, an embodiment of the present invention provides a sixth possible implementation manner of the second aspect, where the second selecting unit includes:
the first extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a second extraction subunit, configured to extract a reference image from the target database;
and the first operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets a preset requirement as an object of interest.
With reference to the second aspect, an embodiment of the present invention provides a seventh possible implementation manner of the second aspect, where the method further includes:
the third extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a fourth extraction subunit, configured to extract the reference image from the target database;
and the second operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets the preset requirement as a candidate object.
With reference to the second aspect, an embodiment of the present invention provides an eighth possible implementation manner of the second aspect, where the method further includes:
the second selection module is used for selecting a target database from the candidate databases according to the obtained second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
With reference to the second aspect, an embodiment of the present invention provides a ninth possible implementation manner of the second aspect, where the method further includes:
the second acquisition module is used for acquiring current position information;
the first searching module is used for searching the place information corresponding to the current position information;
and the second generation module is used for generating a second selection instruction according to the location information.
In combination with the second aspect, the embodiments of the present invention provide a tenth possible implementation manner of the second aspect, wherein,
the second selection instruction is a database selection instruction issued by the user.
With reference to the second aspect, an embodiment of the present invention provides an eleventh possible implementation manner of the second aspect, where the first selecting module includes:
a third selecting unit configured to select a first object to be confirmed from the environment image based on the first human eye viewing angle data;
the first output unit is used for respectively outputting the prompt information of each first object to be confirmed;
and the operation unit is used for taking the first object to be confirmed corresponding to the confirmation instruction as the object of interest if the confirmation instruction responding to the prompt information is acquired.
With reference to the second aspect, an embodiment of the present invention provides a twelfth possible implementation manner of the second aspect, where the first selecting subunit includes:
the second selection subunit is used for selecting the object corresponding to the first selection instruction as a second object to be confirmed;
the second output subunit is used for respectively outputting the prompt information of each second object to be confirmed;
and the third operation subunit is used for taking the second object to be confirmed corresponding to the confirmation instruction as the object of interest if the confirmation instruction responding to the prompt information is acquired.
With reference to the second aspect, the present invention provides a thirteenth possible implementation manner of the second aspect, wherein the first output unit is further configured to display image information corresponding to the first object to be confirmed on the display screen;
and/or play voice information of the name of the first object to be confirmed;
the second output subunit is further used for displaying image information corresponding to the second object to be confirmed on the display screen;
and/or playing voice information of the name of the second object to be confirmed.
With reference to the second aspect, an embodiment of the present invention provides a fourteenth possible implementation manner of the second aspect, where the method further includes:
the third acquisition module is used for acquiring user behaviors;
and the first determining module is used for determining to acquire a confirmation instruction corresponding to the prompt message if the user behavior meets the preset standard behavior requirement.
In combination with the second aspect, an embodiment of the present invention provides a fifteenth possible implementation manner of the second aspect, where the standard behavior requirement is that a user performs one or more of the following actions:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
In a third aspect, an embodiment of the present invention further provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to execute any one of the methods in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computing device, including: a processor, a memory and a bus; the memory stores execution instructions, and when the computing device runs, the processor and the memory communicate via the bus, and the processor executes the execution instructions stored in the memory to perform the method of any one of the first aspect.
The control method of the human body auxiliary robot provided by the embodiment of the invention adopts an eye-movement control mode: first human eye visual angle data of the user and a corresponding environment image are acquired; then an object of interest of the user is selected from the environment image according to the first human eye visual angle data; and finally a robot control instruction is generated according to the position of the object of interest. The user can issue the corresponding control instructions simply by moving their eyes, which improves convenience of use.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 shows a basic flowchart of a control method of a human body-assisted robot according to an embodiment of the present invention;
FIG. 2 illustrates a schematic diagram of viewing a real environment image through AR glasses provided by an embodiment of the present invention;
FIG. 3 is a diagram illustrating a first scenario of displaying an image of an environment on a display provided by an embodiment of the present invention;
FIG. 4 is a diagram illustrating a second scenario of displaying an image of an environment on a display provided by an embodiment of the present invention;
FIG. 5 illustrates a schematic diagram of viewing a real environment image through AR glasses provided by an embodiment of the present invention;
fig. 6 illustrates a schematic diagram of a first computing device provided by an embodiment of the application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
In the related art, human body auxiliary robots are mainly used by injured patients and by users who need assistance with certain operations (the technical field of the method provided by the present application can therefore also be understood as robots that perform auxiliary operations for users). Injured patients are restricted by their own condition (for example, hemiplegic and paraplegic patients cannot move parts of their body), and it is inconvenient for them to perform some actions (for example, moving and grabbing). A conventional human body auxiliary robot should therefore generally be able to perform at least the tasks of moving and grabbing.
The precondition for the human body auxiliary robot to complete the grabbing and moving actions is that it receives an operation instruction issued by the user. Normally, the user issues the operation instruction through a handle controller. However, for some injured patients a handle controller is inconvenient to use, and it may even be operated by mistake, thereby causing danger.
In view of such a situation, the present application provides a method for controlling a human body-assisted robot, as shown in fig. 1, including:
S101, acquiring first human eye visual angle data of a user and a corresponding environment image;
S102, selecting an object of interest of the user from the environment image according to the first human eye visual angle data;
S103, generating a robot control instruction according to the position of the object of interest.
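As a purely illustrative aid to reading, the skeleton below sketches how steps S101 to S103 could be chained in code. The callables (gaze reader, camera capture, selection logic and robot interface) are hypothetical placeholders, not functions described by the patent, and are passed in as parameters so the sketch stays self-contained.

```python
# Illustrative sketch of steps S101-S103. All callables are hypothetical placeholders
# for the eye tracker, the camera, the selection logic and the robot interface.

def control_cycle(read_gaze_angles, capture_environment_image,
                  select_object_of_interest, send_robot_command):
    # S101: acquire the first human eye visual angle data and the matching environment image
    gaze = read_gaze_angles()                        # e.g. (yaw, pitch) of the eyeball
    environment_image = capture_environment_image()  # image roughly aligned with the user's view

    # S102: select the user's object of interest from the environment image
    object_of_interest = select_object_of_interest(environment_image, gaze)
    if object_of_interest is None:
        return                                       # nothing selected or confirmed this cycle

    # S103: generate a robot control instruction from the position of the object of interest
    send_robot_command({"type": "move_or_grasp", "target": object_of_interest["position"]})
```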
The first human eye visual angle data may be obtained by detecting the user's eyeballs through eye-tracking technology, and it at least reflects the angle at which the user's eyeballs are looking. The device for detecting the user's eyeballs can be an infrared device or a general image acquisition device, and the image acquisition device can be an ordinary computer camera or a camera on a mobile phone or another terminal. That is, in a specific implementation, step S101 may capture an eyeball image of the user through an image acquisition device to obtain an eye-movement image, and then analyze the eye-movement image to determine the first human eye visual angle data of the user. The image acquisition device can be any one of the following: an eye-movement sensor, a small camera, a computer camera, or a camera on an intelligent terminal (mobile phone, tablet computer). The subject that executes the step of analyzing the eye-movement image to determine the first human eye visual angle data may be a processor on the human body auxiliary robot, or the step may be executed independently of that processor.
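A minimal sketch of this acquisition step is given below, assuming OpenCV is available for frame capture only; the actual gaze analysis (pupil detection, model fitting, or an infrared sensor driver) is not specified by the patent and is therefore left as a clearly marked placeholder.

```python
import cv2  # OpenCV is used here only for frame capture


def estimate_gaze_angles(eye_image):
    """Placeholder for the actual gaze analysis (pupil detection, model fitting, ...).

    A real system would delegate this to an eye-tracking SDK, an infrared eye-movement
    sensor driver, or a trained model; none of these is prescribed by the patent.
    """
    raise NotImplementedError


def acquire_first_gaze_data(camera_index=0):
    """Capture one eyeball image and return the estimated viewing angles."""
    cap = cv2.VideoCapture(camera_index)   # computer camera, phone camera or small camera
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("failed to capture an eyeball image")
    return estimate_gaze_angles(frame)
```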
The environment image is associated with the human eye perspective data, and generally, the environment image is preferably acquired according to a user perspective (a shooting perspective of the environment image and an observation perspective of human eyes of the user should be substantially the same), and the environment image may be a real environment image acquired by a camera disposed on a head of the user (more specifically, may be disposed near eyes of the user), or may be a virtual/semi-virtual (e.g., AR) image shot and displayed on a display screen. That is, the environment image may be acquired by a camera provided on the head of the user, or the environment image may be an image displayed on the display screen (the image displayed on the display screen may be a real environment image displayed on the display screen or a virtual environment image displayed on the display screen). When the environment image is an image displayed on the display screen, the environment image can be obtained directly from the display screen or a processor for transmitting signals to the display screen instead of obtaining the environment image through the camera. Regardless of the manner in which the environment image is acquired, the environment image should be an image that can be directly viewed by the user with the naked eye.
When the environment image is an image displayed on a display screen, in the method provided by the present application, it is preferable that the environment image is displayed through lenses of glasses (AR glasses) worn on the head of the user (i.e., the display screen is a lens of the AR glasses). Of course, the environment image may also be displayed through other displays (e.g., a display provided on the human-assisted robot, or a display on a smart terminal such as a mobile phone or a tablet computer).
As shown in fig. 2, which is a schematic view of a real environment image viewed through AR glasses (or, equivalently, of the real environment viewed directly by the naked eye), the objects appearing in the user's line of sight are object A, object B and object C. As shown in fig. 5, which shows the result of observing the environment image through the AR glasses, object A is a chair, object B is a table, and object C is a computer.
In addition to directly observing objects in the real environment through the AR glasses or with the naked eye, the system may project directly onto the lenses of the AR glasses to display an image of the real environment (in which the position and size of each scene are the same as in the real environment), or display a virtual image simulating the real environment (in which the position and size of each scene can be set freely and need not match the real environment). When displaying a virtual image simulating the real environment, after the image of the real environment is acquired by video recording, foreground identification is used to identify foreground objects, such as tables and chairs, and the virtual image simulating the real environment is then composed of icons corresponding to the identified foreground objects.
As shown in fig. 3, a first case of displaying an environment image on a display (e.g., a display device such as a tablet computer, a mobile phone, etc.) is shown. That is, the environment image displayed on the display may be a real environment image obtained by video-shooting a real environment (similar to playing a video on the display after recording), or may be a virtual image that displays a simulated real environment. In the virtual image simulating the real environment, the distribution and size of each object displayed are the same as those of each object in the real environment.
As shown in fig. 4, a second case of displaying an environment image on a display (e.g., a display device such as a tablet computer, a mobile phone, etc.) is shown. In contrast to the first case of displaying an image of the environment on the display, it is clear that in the second case the image corresponding to the real environment is no longer displayed, but rather icons of objects in the real environment are listed (listed in an array) on the display.
That is, step S101 may be performed as follows:
acquiring an environment image through a camera arranged on the head of a user, wherein the environment image is any one of the following images:
an image of the real environment, an image of a simulated real environment displayed on the display screen (the relative size and relative positional relationship of each different object in the image of the simulated real environment are the same as those of the different objects in the real environment), an image composed of icons of target objects (the image presented in fig. 4); wherein the target object is an object appearing in the real environment.
Furthermore, when the environment image is an image that simulates a real environment displayed on the display screen, the method provided by the present application further includes:
an image simulating a real environment is displayed on a display screen.
When the environment image is an image composed of icons of the target objects, the method provided by the application further comprises:
displaying an image composed of icons of the target object on the display screen; the target object is an object appearing in the real environment. At this time, the environment image appearing in step S101 may be an image obtained by photographing the display screen, or may be obtained not by photographing, but by directly acquiring, from the data source, an image code corresponding to an image formed by icons of the target object displayed on the display screen, so that the system can clearly know what the specific image content is.
More preferably, the icons of the target objects displayed on the display screen are arranged in an array. The array shape here refers to a square array, a circular array, or an array of other shapes.
After the first-person eye viewing angle data and the environment image are acquired, which object (such as a table or a chair in the environment image) the user gazes at in the environment image can be known according to the first-person eye viewing angle data, that is, the object determined according to the first-person eye viewing angle data that the user gazes at is the object of interest of the user.
Finally, a robot control command is generated according to the position of the object of interest.
The robot control command here can be divided into two types: an overall movement command of the human body auxiliary robot (a command that drives the human body auxiliary robot to move toward the object of interest) and a grasping command (a command that drives the human body auxiliary robot to grasp the object of interest). Of course, the robot control command may also instruct the robot to first move to a designated position and then grasp the object.
When the robot control command is an overall movement command, the method provided by the application acts on the human body auxiliary robot, and the human body auxiliary robot comprises a moving part; the step of generating robot control instructions based on the position of the object of interest comprises:
and generating an overall movement instruction according to the position of the object of interest and the position of the human body auxiliary robot.
Further, after receiving the overall movement instruction, the human body auxiliary robot can move toward the object of interest by driving its moving part. Both the position of the object of interest and the position of the human body auxiliary robot can be understood as coordinate values in space. There are several ways of obtaining the coordinates (position) of the object of interest, a few of which are listed below:
first, this is achieved by a locator and a wireless signal transmitter arranged on the object of interest. When the method is specifically implemented, a locator on the object of interest can be driven to acquire the position signal first, and the position signal is sent to a system (an execution main body of the method provided by the application) through a wireless signal transmitter. That is, in the method provided by the present application, the position of the object of interest may be obtained by a locator disposed on the object of interest.
Secondly, the position of the object of interest is obtained by setting an external positioning device, for example, the position of the object of interest can be determined by positioning with an ultrasonic positioner, a wifi positioner, or the like, or further by performing auxiliary positioning with a picture of the actual environment.
Thirdly, the location of some objects of interest is determined, for example, when the user desires to go to a toilet or to the bedside, the location of the toilet and the bedside are relatively fixed in one room, and then the location information of the objects with relatively fixed locations can be pre-stored in the system, and then the locations can be directly retrieved when in use. In this case, in the method provided by the present application, the position of the object of interest may be determined as follows:
searching the position of the interested object from an object position list prestored in a position database; the object position list is recorded with the corresponding relation between the designated object and the position, and after the user determines the object of interest, the position of the object of interest can be inquired in a table look-up manner.
That is, when the object is one that does not usually move, such as a bed or a toilet, its position does not need to be located on the fly; instead, a position pre-stored in the system can be looked up at the time of use.
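For fixed objects such as a bed or a toilet, the look-up described above can be as simple as a pre-stored table. The sketch below is only an illustration; the object names and coordinates are invented placeholders, not values from the patent.

```python
# Hypothetical pre-stored object position list (object name -> room coordinates in metres).
OBJECT_POSITION_LIST = {
    "bed":    (1.2, 3.4),
    "toilet": (4.0, 0.8),
    "table":  (2.5, 2.0),
}

def look_up_object_position(object_name):
    """Return the pre-stored position of a non-moving object, or None if unknown."""
    return OBJECT_POSITION_LIST.get(object_name)
```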
After the position of the object of interest is determined, the body-assisted robot may be driven towards the object of interest. Of course, when the human body assists the robot to move, the problem of obstacle avoidance should be considered, for example, an ultrasonic sensor may be used to avoid colliding with an obstacle. When determining the robot control command, if the position of the human-body-assisted robot needs to be determined, the position of the human-body-assisted robot may also be determined by referring to the manner of obtaining the position of the object of interest as described above.
Specifically, in order to improve the accuracy of the movement, the robot control command generated in step S103 is preferably a navigation route toward the object of interest (a route that avoids obstacles and roads that are inconvenient to pass through).
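The patent does not prescribe a particular planner. As one possible illustration only, a coarse occupancy-grid breadth-first search could produce such an obstacle-avoiding route; the grid and its obstacle marks (e.g. from ultrasonic sensing) are assumptions of this sketch.

```python
from collections import deque

def plan_route(grid, start, goal):
    """Breadth-first search on an occupancy grid.

    grid[r][c] == 0 means free, 1 means obstacle (e.g. reported by ultrasonic sensors).
    Returns a list of grid cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 \
                    and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```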
Similarly, when the control command is an arm movement command (a grasping command), the method provided by the application acts on the human body auxiliary robot, and the human body auxiliary robot comprises an arm; the step of generating robot control instructions based on the position of the object of interest comprises:
and generating an arm movement instruction according to the position of the object of interest and the position of the arm of the human body auxiliary robot.
The position of the arm of the human body auxiliary robot mainly refers to the position of the structure of the robot that performs the grasping (for example, the gripper). When the control command is an arm movement command, the position of the object of interest may be obtained in the manner described above, which is not repeated here.
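As a simplified illustration (not the patented control law), an arm movement instruction can be reduced to the displacement between the gripper and the object of interest, with the low-level arm kinematics assumed to be handled elsewhere.

```python
def build_arm_move_command(object_position, gripper_position):
    """Return the displacement the gripper must travel to reach the object.

    Both positions are (x, y, z) coordinates in the same frame; inverse kinematics
    and joint control of the arm are assumed to be handled by a lower layer.
    """
    dx = object_position[0] - gripper_position[0]
    dy = object_position[1] - gripper_position[1]
    dz = object_position[2] - gripper_position[2]
    return {"type": "arm_move", "delta": (dx, dy, dz)}
```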
In the method provided by the present application, one of the main functions of the body-assisted robot is to carry the user's movement, that is, to carry the user's movement toward the object of interest while the body-assisted robot moves toward the object of interest according to the overall movement instruction. The method provided by the application is preferably applied indoors, for example, the method is preferably used in places with simple environments, such as the home of a user, a hospital and the like. Therefore, in the method provided by the application, the human body auxiliary robot can move with the user when moving according to the robot control instruction, but the human body auxiliary robot does not move away from the user. The method provided by the present application may also be understood as a method of controlling a body-assisted robot applied in a relatively closed/environmentally relatively fixed room.
The method provided by the application can determine the interested object of the user according to the first human eye visual angle data of the user, so that the instruction of moving towards the interested object or the instruction of grabbing the interested object is issued to the human body auxiliary robot, and the user can automatically finish the task of issuing the instruction.
In practical use, candidate objects are not necessarily evenly distributed in the user's line of sight: some areas may contain many objects that could be the object of interest while other areas contain few, and different objects may occlude or overlap one another.
That is, in the method provided by the present application, the step S102 of selecting the object of interest of the user from the environment image according to the first-person eye viewing angle data includes:
selecting a region of interest of the user from the environment image according to the first human eye visual angle data;
an object located in the region of interest is selected as the object of interest.
The region of interest refers to the region at which the user's eyes are gazing. When determining the region of interest, the following steps may be performed:
determining a fixation point of a user in the environment image according to the first human eye visual angle data;
and selecting a region with a distance from the fixation point smaller than a preset threshold value as an interested region by taking the fixation point as a reference point.
For example, a circle centered on the gaze point with a radius of 5 cm may be drawn on the environment image, and the area within the circle used as the region of interest. Similarly, a square may be drawn with the gaze point as its center, and the area within the square used as the region of interest.
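A minimal sketch of this region-of-interest selection follows, assuming the gaze point has already been projected into image pixel coordinates and using a square window for simplicity (the window size in pixels is an assumption, not a value from the patent).

```python
def select_region_of_interest(environment_image, gaze_point, radius_px=100):
    """Crop a square region around the gaze point, clamped to the image borders.

    environment_image is a NumPy/OpenCV image (rows x cols x channels);
    gaze_point is an (x, y) pixel coordinate.
    """
    h, w = environment_image.shape[:2]
    x, y = gaze_point
    x0, y0 = max(0, x - radius_px), max(0, y - radius_px)
    x1, y1 = min(w, x + radius_px), min(h, y + radius_px)
    return environment_image[y0:y1, x0:x1], (x0, y0, x1, y1)
```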
In addition to the manner of using the gazing point as the reference point to circle the region of interest, the method may also be performed in the following manner, in which the environment image is divided into a plurality of preset regions, and then, which region is the region where the gazing point falls is determined as the region of interest, that is, when the region of interest is determined:
dividing the environment image into a plurality of different candidate areas according to the density of the candidate objects in the environment image;
determining a fixation point of a user in the environment image according to the first human eye visual angle data;
and taking the candidate area where the fixation point is positioned as the interested area.
When the method is used, the environment image is divided into a plurality of candidate areas before the fixation point is determined, so that the influence of the position of the fixation point on the division can be reduced, and the environment image is divided more reasonably.
Generally, the more densely the candidate objects are distributed in the environment image, the more candidate areas should be divided, and conversely the fewer; in other words, the number of candidate objects is positively correlated with the number of candidate areas.
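One simple way to realise this correlation (purely illustrative; the patent leaves the division strategy open) is to let the grid resolution grow with the number of detected candidates and then return the cell that contains the gaze point.

```python
import math

def pick_candidate_area(image_size, candidate_count, gaze_point):
    """Divide the image into an n x n grid whose resolution grows with the candidate
    count, and return the bounds of the cell containing the gaze point."""
    w, h = image_size
    n = max(1, math.ceil(math.sqrt(candidate_count)))   # n x n candidate areas
    cell_w, cell_h = w / n, h / n
    col = min(n - 1, int(gaze_point[0] // cell_w))
    row = min(n - 1, int(gaze_point[1] // cell_h))
    return (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)
```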
After the region of interest is determined, the specified candidate object can be found from the region of interest as the object of interest.
The step of selecting an object located in the region of interest as the object of interest has the following four implementations:
a first way of selecting an object located in a region of interest as an object of interest:
if only one candidate object exists in the region of interest, the candidate object existing in the region of interest is taken as the object of interest.
This implementation is relatively simple, since there is only one candidate in the region of interest, and therefore the user may only select this candidate as the object of interest.
A second way of selecting an object located in the region of interest as the object of interest:
if a plurality of candidate objects exist in the region of interest, outputting the candidate objects existing in the region of interest;
and selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction given by a user aiming at the display screen.
That is, when there are multiple candidate objects in the region of interest, the system cannot directly determine which candidate object can be the object of interest, and at this time, only the multiple candidate objects existing in the region of interest can be output, and then it is determined which candidate object is the object of interest according to the first selection instruction issued by the user.
Specifically, there are several ways to output that a plurality of candidate objects exist in the region of interest:
amplifying the region of interest, and displaying the amplified region of interest on a display screen; the display screen refers to, for example, a display screen on a human body-assisted robot, or a display screen on some mobile terminals (a display screen on a mobile phone or a tablet computer) or a display screen on AR glasses;
displaying icons corresponding to the candidate objects in the region of interest on a display screen; the display screen refers to, for example, a display screen on a human body-assisted robot, or a display screen on some mobile terminals (a display screen on a mobile phone or a tablet computer) or a display screen on AR glasses;
and performing voice playing on the name corresponding to the candidate object located in the region of interest (namely, if the candidate object is a bed, the system can directly play the voice of the bed in a voice mode).
Correspondingly, the first selection instruction issued by the user can also take several specific forms: it may be a voice instruction, an instruction issued through a remote controller (e.g. a handle-type remote controller), or an instruction issued through second human eye visual angle data (for example, after the enlarged region of interest or the icons of the candidate objects are displayed on a display, the candidate object at which the user gazes is determined as the object of interest, where which object the user gazes at is determined from the second human eye visual angle data; the second human eye visual angle data is thus data that reflects that the user is gazing at a certain object).
That is, in a certain preferred embodiment, the step of outputting the presence of the plurality of candidate objects in the region of interest comprises:
displaying the amplified image of the region of interest on AR glasses/display screen;
the step of selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction issued by a user for the display screen comprises:
acquiring second human eye visual angle data generated when a user observes the AR glasses/display screen; the first selection instruction is second human eye visual angle data;
and selecting the specified object in the region of interest as the object of interest according to the second human eye visual angle data.
A third way of selecting an object located in the region of interest as the object of interest:
acquiring historical operation habits of a user;
and determining the candidate object which is positioned in the region of interest and is appointed as the object of interest according to the historical operation habit.
The historical operation habits generally include two behavior habits, namely a first behavior habit determined according to the corresponding relation between the behavior content of the user and the behavior occurrence time and a second behavior habit counted according to the front-back sequence of the user behavior.
In general, the first behavior habit reflects at what time a given behavior of the user tends to occur, or equivalently which behaviors the user tends to perform at different times. For example, the user often/always goes to the toilet at around eight in the morning, or often/always moves to the bed at around twelve o'clock; or the user moves to the bed at twelve o'clock and at half past nine in the evening, respectively. All three of these can be regarded as first behavior habits. The first behavior habit is usually obtained from statistics over a large number of historical behaviors, and may also be pre-recorded. After the first behavior habit is determined, knowing the current time is enough to estimate which candidate object the user is more likely to want to go to.
Typically, the second behavior habit reflects the precedence order between different behaviors of the user. For example, a user typically moves toward a chair after going to a toilet; for another example, the user typically moves toward the table after going to the kitchen. Since the second behavior habit reflects the sequence of different behaviors of the user, after the last behavior of the user is determined, the next object which the user desires to go can be determined more accurately.
Furthermore, in the specific implementation, the object of interest may be determined by using only the first behavior habit, may be determined by using only the second behavior habit, or may be determined by using both the first behavior habit and the second behavior habit.
Specifically, the step of determining the candidate object specified in the region of interest as the object of interest according to the historical operating habits may be performed as follows:
calculating a reference value of each candidate object located in the region of interest according to historical operating habits (first behavior habits and/or second behavior habits);
selecting the candidate object with the highest reference value as the object of interest, or selecting the candidate object with the reference value exceeding a preset value as the object of interest.
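A hedged sketch of such a scoring scheme follows, combining a time-of-day habit (first behavior habit) and a transition habit (second behavior habit). The equal weighting, the count-based statistics and the threshold are assumptions made only for illustration.

```python
def score_candidates(candidates, hour_counts, transition_counts, current_hour, last_object):
    """Compute a reference value for each candidate object.

    hour_counts[(obj, hour)]        -- how often the user went to obj at this hour (first habit)
    transition_counts[(prev, obj)]  -- how often obj followed prev (second habit)
    """
    scores = {}
    for obj in candidates:
        first_habit = hour_counts.get((obj, current_hour), 0)
        second_habit = transition_counts.get((last_object, obj), 0)
        scores[obj] = 0.5 * first_habit + 0.5 * second_habit   # equal weights: an assumption
    return scores

def pick_object_of_interest(scores, threshold=0.0):
    """Take the highest-scoring candidate, optionally requiring a preset threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None
```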
A fourth way of selecting an object located in the region of interest as the object of interest:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of which the similarity with the reference image meets a preset requirement, in the foreground object as an object of interest.
The foreground object is a foreground image obtained by performing foreground extraction on the region of interest, or is a part of the foreground image, if the foreground image is divided into a plurality of blocks (unconnected blocks), each block can be used as a foreground image, and of course, the mode for determining the foreground object can also be to compare the extracted image with a reference image in a target database, and determine the foreground object appearing in the region of interest according to the condition of the reference image.
The reference images are pre-stored in a target database (objects such as tables and chairs are stored in the target database according to the selection of a user), in order to ensure the accuracy of the calculation, a plurality of reference images with different viewing angles are stored in the target database for each candidate object, and during the calculation, the similarity calculation between the reference image with each viewing angle and the foreground object is performed, and the maximum value of the similarity is used as the similarity of the foreground object.
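One possible realisation of this fourth way is sketched below; the patent does not mandate GrabCut or histogram comparison, so the OpenCV operations chosen here are assumptions. The per-view similarity is computed against every stored reference view and the maximum is kept, as described above.

```python
import cv2
import numpy as np

def extract_foreground(region_of_interest):
    """Rough foreground extraction on the region of interest using GrabCut."""
    mask = np.zeros(region_of_interest.shape[:2], np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    h, w = region_of_interest.shape[:2]
    rect = (1, 1, w - 2, h - 2)   # assume the object does not touch the border
    cv2.grabCut(region_of_interest, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype("uint8")
    return region_of_interest * fg_mask[:, :, None]

def similarity(foreground, reference_views):
    """Compare colour histograms against every stored view and keep the maximum."""
    def hist(img):
        h = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        return cv2.normalize(h, h).flatten()
    fg_hist = hist(foreground)
    return max(cv2.compareHist(fg_hist, hist(view), cv2.HISTCMP_CORREL)
               for view in reference_views)
```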
In contrast to the first three ways, the fourth way may not be able to determine a unique object of interest: there may still be multiple candidate objects in the region of interest, and the fourth way alone cannot distinguish which of them the user is actually interested in. However, the fourth way still has an advantage over the first three, mainly because the candidate objects can be determined more accurately (comparison with the reference images pre-stored in the database improves the accuracy of the determination), and it can therefore be combined with the first three ways. That is, in the method provided by the present application, the candidate objects may be determined as follows:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of the foreground objects, with the similarity to the reference image meeting the preset requirement as a candidate object.
The foreground object is a foreground image of the region of interest, and if the foreground image is divided into a plurality of blocks (unconnected blocks), each block can be used as a foreground object, and of course, the mode of determining the foreground object may also be to compare the extracted image with a reference image in the target database, and determine the foreground object appearing in the region of interest according to the condition of the reference image.
The object whose similarity to the reference image meets the preset requirement refers to the object with the highest similarity to the reference image. When an object, of the foreground objects, whose similarity to the reference image meets a preset requirement is specifically executed as a candidate object, the similarity of the reference image corresponding to each foreground object (multiple similarities corresponding to the foreground objects) may be calculated first, and then the maximum value of the similarities is selected as the similarity of the foreground object. And then, taking the foreground object with the highest similarity as a candidate object.
In some of the above schemes, a target database is used, that is, the target database can determine a reference image, and therefore, different target databases can help to accurately determine different objects. According to the application of the scheme, the inventor considers that the database can be divided into the following types:
a home environment database, a medical environment database, and an outdoor environment database.
The reference images stored in the family environment database mainly include the following images:
a table image, a chair image, a toilet image.
The reference images stored in the medical environment database mainly include the following images:
images of each department, hospital bed images, and toilet images.
The reference images stored in the outdoor environment database mainly include the following images:
images of various buildings in the vicinity, images of major merchants.
Correspondingly, the method provided by the application further comprises the following steps:
selecting a target database from the candidate databases according to the acquired second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
Furthermore, after the target database is determined, the corresponding reference images can be extracted from it, and the recognition can then be completed more accurately using those reference images.
The second selection instruction may be issued by a user (an operator using the human-assisted robot), may be issued by a third-party user, or may be generated by the system in response to an external environment.
That is, when the second selection instruction is generated by the system in response to the external environment, the method provided by the present application further includes:
acquiring current position information;
searching place information corresponding to the current position information;
and generating a second selection instruction according to the location information.
The current position information reflects the current position of the human body-assisted robot; electronic map technology can then be used to look up the place information corresponding to that position, such as a hospital, a home, or a park. A second selection instruction is then generated according to the place information: if the place information indicates a hospital, the generated second selection instruction selects the medical environment database; if it indicates a home, the generated second selection instruction selects the home environment database.
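For illustration, a minimal sketch of how the system might generate the second selection instruction from the looked-up place information is given below. The place categories, the mapping, and the fallback choice are assumptions made for the example; only the three candidate databases come from the description above.

from enum import Enum

class CandidateDatabase(Enum):
    HOME = "home_environment"
    MEDICAL = "medical_environment"
    OUTDOOR = "outdoor_environment"

# Hypothetical mapping from place categories (returned by an electronic-map
# lookup of the current position) to the candidate databases.
PLACE_TO_DATABASE = {
    "hospital": CandidateDatabase.MEDICAL,
    "home": CandidateDatabase.HOME,
    "park": CandidateDatabase.OUTDOOR,
    "street": CandidateDatabase.OUTDOOR,
}

def generate_second_selection_instruction(place_info: str) -> CandidateDatabase:
    # Generate a second selection instruction (i.e. pick a target database)
    # from the place information; fall back to the outdoor database when the
    # place category is unknown.
    return PLACE_TO_DATABASE.get(place_info, CandidateDatabase.OUTDOOR)

# Example: the current position resolves to a hospital, so the medical
# environment database is selected.
assert generate_second_selection_instruction("hospital") is CandidateDatabase.MEDICAL

A real implementation would obtain the place information by reverse-geocoding the current position against an electronic map, as described above.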
Correspondingly, when the second selection instruction is issued by the user, the method provided by the application further includes:
and receiving a database selection instruction issued by a user.
The database selection instruction can be issued by the user by operating keys on a handheld remote controller or through a voice control command.
In the scheme provided by the present application, the system can automatically help the user select an object of interest. As noted above, when too many objects exist in a certain area of the environment image, the system may not be able to determine accurately which object the user is looking at; in that case it can use local magnification to help the user confirm. Alternatively, the system can confirm directly with the user: for example, if the system determines from the first human eye visual angle data that the user is observing area A, which currently contains three objects (a table, a vase, and a tablecloth), it can ask the user to confirm which of these objects is intended.
That is, in the method provided by the present application, step S102 may be performed as follows:
step 1021, selecting a first object to be confirmed from the environment image according to the first human eye visual angle data;
step 1022, respectively outputting prompt information of each first object to be confirmed;
in step 1023, if a confirmation instruction in response to the prompt information is acquired, the first object to be confirmed corresponding to the confirmation instruction is taken as the object of interest.
The object to be confirmed may be understood as an object that could be seen from the viewing angle corresponding to the first human eye visual angle data; for example, the user might be able to see both a table and a bed from a certain viewing angle. In step 1022, the output prompt information may then be the information corresponding to the table and the bed, and the output form can vary: it may be output as image information or as voice. Output as image information means displaying text or graphics for the table and the bed on the display screen so that the user can issue a confirmation instruction; output as voice means the system automatically plays the names "bed" and "table" so that the user can issue a confirmation instruction. That is, the step of outputting the prompt information corresponding to each first object to be confirmed includes:
displaying image information corresponding to the first object to be confirmed on a display screen; and/or playing voice information of the name of the first object to be confirmed.
More specifically, when outputting the prompt information, the system may display several objects on the display screen at the same time (for example, the words "bed" and "table" together), or it may display different objects in a cycle, for example showing the table during seconds 1-5 and 11-15 and the bed during seconds 6-10 and 16-20. The user then only needs to press a confirmation button, and the system determines which object the user wants to select according to the object being displayed when the button is pressed. For instance, if the user presses the confirmation button at second 8 (while the bed is displayed), the system determines that the user wants to select the bed as the object of interest.
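The timing logic of this cyclic prompt can be sketched as follows: the objects to be confirmed are shown one at a time in fixed windows, and the object on screen at the instant the confirmation button is pressed is taken as the user's choice. The window length and the helper name are assumptions made for illustration, not part of the original disclosure.

def displayed_object(objects, press_time_s, window_s=5.0):
    # Return the object being displayed when the confirmation button is
    # pressed. Objects cycle in fixed windows: with two objects and 5-second
    # windows, seconds 0-5 show objects[0], 5-10 show objects[1], 10-15 show
    # objects[0] again, and so on.
    slot = int(press_time_s // window_s) % len(objects)
    return objects[slot]

# Example from the description: "table" then "bed" in 5-second windows.
# A confirmation press at second 8 falls in the second window, so "bed"
# is taken as the object of interest.
assert displayed_object(["table", "bed"], press_time_s=8.0) == "bed"
assert displayed_object(["table", "bed"], press_time_s=3.0) == "table"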
Correspondingly, the step of selecting the specified object in the region of interest as the object of interest according to the first selection instruction issued by the user for the display screen can be realized as follows:
selecting an object corresponding to the first selection instruction as a second object to be confirmed;
respectively outputting prompt information corresponding to each second object to be confirmed;
and if a confirmation instruction corresponding to the prompt information is acquired, taking the corresponding second object to be confirmed as the object of interest.
The first selection instruction is issued by the user with respect to the display screen; after the user issues it, the system determines the corresponding second object to be confirmed. The step of outputting the prompt information corresponding to each second object to be confirmed is implemented in the same way as step 1022, and the step of taking the corresponding second object to be confirmed as the object of interest is implemented in the same way as step 1023, so the description is not repeated.
Similarly, the step of outputting the prompt information corresponding to each second object to be confirmed respectively includes:
displaying image information corresponding to the second object to be confirmed on the display screen;
and/or playing voice information of the name of the second object to be confirmed.
After the prompt information of the first object to be confirmed or the second object to be confirmed is output, the user can confirm in several different ways, for example by voice or by eye movement.
Furthermore, in the solution provided in the present application, after the step of outputting the prompt information corresponding to the object to be confirmed, the method further includes:
acquiring user behaviors;
and if the user behavior meets the preset standard behavior requirement, determining to acquire a confirmation instruction corresponding to the prompt message.
Wherein the standard behavior requirements include: the user completes one or more actions specified as follows:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
In a specific implementation, the standard behavior requirement means that the user performs one of the following actions:
blinking, mouth opening, tongue stretching, air blowing, head movement, voice behavior, eye movement behavior;
or at least two of the following actions are performed simultaneously:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
It should be noted that when the user is required to complete at least two behaviors at the same time, there are usually many objects to be confirmed, and in this case a prompt may be given to the user. Specifically, while step 1022 is executed, the following step may also be executed:
and respectively outputting standard behavior requirements corresponding to each first object to be confirmed.
The output standard behavior requirement describes the actions required of the user. For example, while "bed" (one kind of prompt information) is displayed on the display screen, the required actions, such as "blink and stretch the tongue", are displayed alongside it, meaning that the user must blink and stretch the tongue at the same time for the system to consider that the user wants to select the bed as the object of interest.
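One possible way to check whether an acquired user behavior meets a standard behavior requirement (either a single specified action or at least two actions performed simultaneously) is sketched below. The behavior-detection signals (blink, mouth open, and so on) are assumed to come from upstream sensing modules and are represented here as a plain set of strings; this representation is an assumption made for illustration only.

ALLOWED_ACTIONS = {
    "blink", "mouth_open", "tongue_stretch", "blow",
    "head_movement", "voice", "eye_movement",
}

def meets_standard_behavior(detected_actions, required_actions):
    # detected_actions: set of actions the user is performing simultaneously,
    # as reported by upstream sensing. required_actions: the standard behavior
    # requirement attached to a prompt, e.g. {"blink"} or
    # {"blink", "tongue_stretch"}. The requirement is met only when every
    # required action is among the simultaneously detected ones.
    required = set(required_actions) & ALLOWED_ACTIONS
    return bool(required) and required.issubset(detected_actions)

# Example: the prompt "bed" requires blinking and stretching the tongue at the
# same time; a confirmation instruction is registered only when both occur.
assert meets_standard_behavior({"blink", "tongue_stretch"}, {"blink", "tongue_stretch"})
assert not meets_standard_behavior({"blink"}, {"blink", "tongue_stretch"})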
Corresponding to the above method, the present application also provides a control device of a human body-assisted robot, comprising:
the first acquisition module is used for acquiring first human eye visual angle data of a user and a corresponding environment image;
the first selection module is used for selecting an interested object of a user from the environment image according to the first human eye visual angle data;
a first generating module for generating robot control instructions depending on the position of the object of interest.
Preferably, the device acts on a human body-assisted robot, the human body-assisted robot comprising an arm;
the first generation module comprises:
and the first generation unit is used for generating an arm movement instruction according to the position of the interested object and the position of the human body auxiliary robot arm.
Preferably, the device acts on a human body-assisted robot, the human body-assisted robot comprising a mobile part; the first generation module comprises:
and the second generation unit is used for generating a whole movement instruction according to the position of the interested object and the position of the human body auxiliary robot.
Preferably, the first selection module includes:
a first selection unit, configured to select a region of interest of a user from an environment image according to first human eye viewing angle data;
a second selection unit for selecting an object located in the region of interest as the object of interest.
Preferably, the second selection unit includes:
the first output subunit is used for outputting a plurality of candidate objects in the region of interest if the plurality of candidate objects exist in the region of interest;
and the first selection subunit is used for selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction given by the user for the display screen.
Preferably, the first output subunit is further configured to: displaying the magnified image of the region of interest on AR glasses;
the first selection subunit is further configured to: acquire second human eye visual angle data generated when the user observes the AR glasses, the first selection instruction being the second human eye visual angle data; and select the specified object in the region of interest as the object of interest according to the second human eye visual angle data.
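As a rough illustration of this AR-glasses flow (display a magnified image of the region of interest, acquire second human eye visual angle data while the user looks at it, and use the resulting gaze point as the first selection instruction), the sketch below maps a gaze point on the magnified image back to the candidate object whose bounding box contains it. The coordinate conventions, the gaze source, and all names are assumptions made for the example, not part of the original disclosure.

from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    # Bounding box in the original (unmagnified) region-of-interest image.
    x: int
    y: int
    w: int
    h: int

def select_by_second_gaze(gaze_xy, scale, candidates):
    # gaze_xy: gaze point (pixels) on the magnified image shown on the AR
    # glasses, derived from the second human eye visual angle data.
    # scale: magnification factor applied to the region of interest.
    # Returns the candidate whose bounding box (in original ROI coordinates)
    # contains the de-magnified gaze point, or None if none contains it.
    gx, gy = gaze_xy[0] / scale, gaze_xy[1] / scale
    for c in candidates:
        if c.x <= gx < c.x + c.w and c.y <= gy < c.y + c.h:
            return c
    return None

# Example: two candidates in the ROI, image magnified 3x on the AR glasses.
candidates = [Candidate("vase", 10, 20, 40, 60), Candidate("table", 80, 10, 120, 90)]
picked = select_by_second_gaze((330, 120), scale=3.0, candidates=candidates)
assert picked is not None and picked.name == "table"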
Preferably, the second selection unit includes:
the first extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a second extraction subunit, configured to extract a reference image from the target database;
and the first operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets a preset requirement as an object of interest.
Preferably, the method further comprises the following steps:
the third extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a fourth extraction subunit, configured to extract the reference image from the target database;
and the second operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets the preset requirement as a candidate object.
Preferably, the method further comprises the following steps:
the second selection module is used for selecting a target database from the candidate databases according to the obtained second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
Preferably, the method further comprises the following steps:
the second acquisition module is used for acquiring current position information;
the first searching module is used for searching the place information corresponding to the current position information;
and the second generation module is used for generating a second selection instruction according to the location information.
Preferably, the second selection instruction is a database selection instruction issued by the user.
Preferably, the first selection module includes:
a third selecting unit configured to select a first object to be confirmed from the environment image based on the first human eye viewing angle data;
the first output unit is used for respectively outputting prompt information corresponding to each first object to be confirmed;
and the operation unit is used for taking the corresponding first object to be confirmed as the object of interest if the confirmation instruction corresponding to the prompt information is acquired.
Preferably, the first selection subunit includes:
the second selection subunit is used for selecting the object corresponding to the first selection instruction as a second object to be confirmed;
the second output subunit is used for respectively outputting prompt information corresponding to each second object to be confirmed;
and the third operation subunit is used for taking the corresponding second object to be confirmed as the object of interest if the confirmation instruction corresponding to the prompt information is obtained.
Preferably, the first output unit is further configured to display image information corresponding to the first object to be confirmed on the display screen;
and/or play voice information of the name of the first object to be confirmed;
the second output subunit is further configured to display image information corresponding to the second object to be confirmed on the display screen;
and/or play voice information of the name of the second object to be confirmed.
Preferably, the method further comprises the following steps:
the third acquisition module is used for acquiring user behaviors;
and the first determining module is used for determining to acquire a confirmation instruction corresponding to the prompt message if the user behavior meets the preset standard behavior requirement.
Preferably, the standard behavior requirement is that the user performs one or more of the following actions:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
The present application also provides a computer-readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the above control method of the human body-assisted robot.
As shown in fig. 6, which is a schematic diagram of a first computing device provided in an embodiment of the present application, the first computing device 1000 includes: a processor 1001, a memory 1002, and a bus 1003. The memory 1002 stores execution instructions; when the first computing device runs, the processor 1001 and the memory 1002 communicate through the bus 1003, and the processor 1001 executes the instructions stored in the memory 1002 to perform the steps of the control method of the human body-assisted robot described above.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (30)

1. A method for controlling a human body-assisted robot, comprising:
acquiring first human eye visual angle data of a user and a corresponding environment image;
selecting an object of interest of a user from the environment image according to the first human eye visual angle data;
generating a robot control instruction according to the position of the object of interest;
the method acts on a body-assisted robot, the body-assisted robot comprising an arm;
the step of generating robot control instructions based on the position of the object of interest comprises:
and generating an arm movement instruction according to the position of the interested object and the position of the human body auxiliary robot arm.
2. The method of claim 1, wherein the step of selecting the user's object of interest from the environmental image based on the first eye view data comprises:
selecting an interested area of a user from the environment image according to the first human eye visual angle data;
an object located in the region of interest is selected as the object of interest.
3. The method of claim 2, wherein the step of selecting an object located in the region of interest as the object of interest comprises:
if a plurality of candidate objects exist in the region of interest, outputting the candidate objects existing in the region of interest;
and selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction given by a user aiming at the display screen.
4. The method of claim 3, wherein the step of outputting the presence of the plurality of candidate objects in the region of interest comprises:
displaying the magnified image of the region of interest on AR glasses;
the step of selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction issued by a user for the display screen comprises:
acquiring second human eye visual angle data generated when a user observes the AR glasses; the first selection instruction is second human eye visual angle data;
and selecting the specified object in the region of interest as the object of interest according to the second human eye visual angle data.
5. The method of claim 2, wherein the step of selecting an object located in the region of interest as the object of interest comprises:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of which the similarity with the reference image meets a preset requirement, in the foreground object as an object of interest.
6. The method of claim 3, further comprising:
performing foreground extraction on the region of interest to determine a foreground object;
extracting a reference image from a target database;
and taking an object, of the foreground objects, with the similarity to the reference image meeting the preset requirement as a candidate object.
7. The method according to any one of claims 5 or 6, further comprising:
selecting a target database from the candidate databases according to the acquired second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
8. The method of claim 7, further comprising:
acquiring current position information;
searching place information corresponding to the current position information;
and generating a second selection instruction according to the location information.
9. The method of claim 7,
the second selection instruction is a database selection instruction issued by the user.
10. The method of claim 1, wherein the step of selecting the user's object of interest from the environmental image based on the first eye view data comprises:
selecting a first object to be confirmed from the environment image according to the first human eye visual angle data;
respectively outputting prompt information of each first object to be confirmed;
and if a confirmation instruction responding to the prompt information is acquired, taking the first object to be confirmed corresponding to the confirmation instruction as the object of interest.
11. The method of claim 3, wherein the step of selecting the specified candidate object in the region of interest as the object of interest according to a first selection instruction issued by a user for the display screen comprises:
selecting an object corresponding to the first selection instruction as a second object to be confirmed;
respectively outputting prompt information corresponding to each second object to be confirmed;
and if a confirmation instruction corresponding to the prompt information is acquired, taking the corresponding second object to be confirmed as the object of interest.
12. The method according to any one of claims 10 or 11,
the step of respectively outputting each prompt message corresponding to the first object to be confirmed comprises the following steps:
displaying image information corresponding to the first object to be confirmed on a display screen;
and/or playing voice information of the name of the first object to be confirmed;
the step of respectively outputting prompt information corresponding to each second object to be confirmed comprises the following steps:
displaying image information corresponding to the second object to be confirmed on the display screen;
and/or playing voice information of the name of the second object to be confirmed.
13. The method according to any one of claims 10 or 11, further comprising, after the step of outputting the prompt information corresponding to the object to be confirmed:
acquiring user behaviors;
and if the user behavior meets the preset standard behavior requirement, determining to acquire a confirmation instruction corresponding to the prompt message.
14. The method of claim 13, wherein standard behavior requirements comprise: the user completes one or more actions specified as follows:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
15. A control device for a human body-assisted robot, comprising:
the first acquisition module is used for acquiring first human eye visual angle data of a user and a corresponding environment image;
the first selection module is used for selecting an interested object of a user from the environment image according to the first human eye visual angle data;
the first generation module is used for generating a robot control instruction according to the position of the interested object;
the device acts on a human body auxiliary robot, wherein the human body auxiliary robot comprises an arm;
the first generation module comprises:
and the first generation unit is used for generating an arm movement instruction according to the position of the interested object and the position of the human body auxiliary robot arm.
16. The apparatus of claim 15, wherein the first selection module comprises:
a first selection unit, configured to select a region of interest of a user from an environment image according to first human eye viewing angle data;
a second selection unit for selecting an object located in the region of interest as the object of interest.
17. The apparatus of claim 16, wherein the second selecting unit comprises:
the first output subunit is used for outputting a plurality of candidate objects in the region of interest if the plurality of candidate objects exist in the region of interest;
and the first selection subunit is used for selecting the specified candidate object in the interest region as the interest object according to a first selection instruction given by the user aiming at the display screen.
18. The apparatus of claim 17, wherein the first output subunit is further configured to: displaying the magnified image of the region of interest on AR glasses;
the first selection subunit is further to: acquiring second human eye visual angle data generated when a user observes the AR glasses; the first selection instruction is second human eye visual angle data; and selecting a specified object in the region of interest as the object of interest according to the second human eye perspective data.
19. The apparatus of claim 16, wherein the second selecting unit comprises:
the first extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a second extraction subunit, configured to extract a reference image from the target database;
and the first operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets a preset requirement as an object of interest.
20. The apparatus of claim 17, further comprising:
the third extraction subunit is used for performing foreground extraction on the region of interest to determine a foreground object;
a fourth extraction subunit, configured to extract the reference image from the target database;
and the second operation subunit is used for taking an object, of the foreground objects, of which the similarity with the reference image meets the preset requirement as a candidate object.
21. The apparatus of any one of claims 19 or 20, further comprising:
the second selection module is used for selecting a target database from the candidate databases according to the obtained second selection instruction; the candidate database includes a home environment database, a medical environment database, and an outdoor environment database.
22. The apparatus of claim 21, further comprising:
the second acquisition module is used for acquiring current position information;
the first searching module is used for searching the place information corresponding to the current position information;
and the second generation module is used for generating a second selection instruction according to the location information.
23. The apparatus of claim 21,
the second selection instruction is a database selection instruction issued by the user.
24. The apparatus of claim 15, wherein the first selection module comprises:
a third selecting unit configured to select a first object to be confirmed from the environment image based on the first human eye viewing angle data;
the first output unit is used for respectively outputting the prompt information of each first object to be confirmed;
and the operation unit is used for taking the first object to be confirmed corresponding to the confirmation instruction as the object of interest if the confirmation instruction responding to the prompt information is acquired.
25. The apparatus of claim 17, wherein the first selection subunit comprises:
the second selection subunit is used for selecting the object corresponding to the first selection instruction as a second object to be confirmed;
the second output subunit is used for respectively outputting prompt information corresponding to each second object to be confirmed;
and the third operation subunit is used for taking the corresponding second object to be confirmed as the object of interest if the confirmation instruction corresponding to the prompt information is obtained.
26. The apparatus according to any one of claims 24 or 25, wherein the first output unit is further configured to display image information corresponding to the first object to be confirmed on the display screen;
and/or play voice information of the name of the first object to be confirmed;
the second output subunit is further configured to display image information corresponding to the second object to be confirmed on the display screen;
and/or play voice information of the name of the second object to be confirmed.
27. The apparatus of any one of claims 24 or 25, further comprising:
the third acquisition module is used for acquiring user behaviors;
and the first determining module is used for determining to acquire a confirmation instruction corresponding to the prompt message if the user behavior meets the preset standard behavior requirement.
28. The apparatus of claim 27, wherein the standard behavior requirement is that a user perform one or more of the following actions:
blinking, mouth opening, tongue stretching, blowing, head movement, voice behavior, eye movement behavior.
29. A computer-readable medium having non-volatile program code executable by a processor, wherein the program code causes the processor to perform the method of any of claims 1-14.
30. A computing device comprising: a processor, a memory and a bus, the memory storing instructions for execution, the processor and the memory communicating via the bus when the computing device is operating, the processor executing the method of any of claims 1-14 stored in the memory.

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
PE01: Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: A control method and device for human assisted robot
Effective date of registration: 20220523
Granted publication date: 20200508
Pledgee: Bank of Jiangsu Limited by Share Ltd. Hangzhou branch
Pledgor: HANGZHOU CHENGTIAN TECHNOLOGY DEVELOPMENT Co.,Ltd.
Registration number: Y2022980006028