CN112223288B - Visual fusion service robot control method - Google Patents

Visual fusion service robot control method

Info

Publication number
CN112223288B
CN112223288B CN202011073216.2A
Authority
CN
China
Prior art keywords
service robot
user
target object
target
res
Prior art date
Legal status
Active
Application number
CN202011073216.2A
Other languages
Chinese (zh)
Other versions
CN112223288A (en)
Inventor
Duan Feng (段峰)
Zhang Lina (张丽娜)
Current Assignee
Nankai University
Original Assignee
Nankai University
Priority date
Filing date
Publication date
Application filed by Nankai University
Priority to CN202011073216.2A
Publication of CN112223288A
Application granted
Publication of CN112223288B

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1694 Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697 Vision controlled systems

Abstract

The invention belongs to the technical field of medical service robots and in particular relates to a vision-fused service robot control method. The method comprises the following steps: step S1, acquiring and analysing scene information around the service robot to obtain the potential target objects and their pixel positions; step S2, ranking the target objects by saliency, displaying the target object pictures on the human-computer interaction interface in that order, and asking the user through an inquiry window whether each object should be grasped; step S3, recognizing the user's electroencephalogram signals with a brain-computer interface to identify the user's intention and decide whether to grasp the object shown on the human-computer interaction interface; if yes, going to step S4, otherwise continuing with the next target object picture; step S4, moving the service robot to the vicinity of the target object selected by the user and controlling the mechanical arm of the service robot by visual servoing to complete the grasp. Controlling the service robot through electroencephalogram signals with visual assistance reduces the user's visual burden and fatigue, helps the user obtain desired items easily, and can improve the self-care level of patients with severe motor impairment.

Description

Visual fusion service robot control method
Technical Field
The invention belongs to the technical field of medical service robots, and particularly relates to a service robot control method with vision fusion.
Background
With global population aging, the number of patients suffering from stroke, Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, spinal cord injury, muscular dystrophy and similar conditions keeps growing, and these diseases bring great inconvenience and burden to patients' lives. For such severely motor-impaired patients, developing intelligent service devices to assist their daily lives is essential. A large proportion of elderly people cannot take care of themselves, so operating a complex system is difficult for them; speaking and gesturing may also be difficult. A brain-computer interface can build a bridge between human-brain intention and external devices, and is very likely to become the most suitable interaction mode between severely motor-impaired patients and service equipment. A brain-computer interface system based on electroencephalogram (EEG) signals can translate the conscious intent of the human brain into commands without manual operation or voice commands, enabling communication between the brain and external devices, and is therefore a promising way to realize such a service system.
In practical application scenarios, most existing brain-computer interface systems are not accurate enough, and the number of control commands they can realize is relatively limited. In addition, a brain-computer interface system cannot perceive spatial information about the environment, and users experience mental fatigue after long periods of operation, which degrades control performance. It is therefore difficult for a brain-computer interface system alone to operate service equipment completely, accurately and comfortably; the intelligence of such systems is insufficient to infer the user's operating intention, and an operating burden on the user is hard to avoid. The field of medical service robots urgently needs an intelligent method and system that helps infer the intention of severely motor-impaired patients, assists the medical service robot in completing device operation, and brings convenience to patients' lives.
Disclosure of Invention
Fusing computer vision into the control method is an effective way to assist the user in operating the system. Like a human facing a particular scene, regions of interest are processed automatically while regions of no interest are selectively ignored. Human vision can quickly search for and locate targets of interest; introducing this visual attention mechanism, i.e. visual saliency, into the service robot's vision greatly improves the visual information processing task. Visual servoing technology can also be introduced into the service robot. Visual servoing refers to automatically receiving and processing images through a visual sensor and using the information fed back from the images to further control or adaptively adjust the robot. Applying visual servoing to the service robot simplifies the user's control operations. Integrating computer vision technologies such as visual saliency detection and visual servoing into a service robot controlled by a brain-computer interface system optimizes the system, reduces the operating burden on disabled users, and is worth exploring.
In order to achieve the purpose, the invention adopts the following technical scheme:
a vision-fused service robot control method comprises the following steps:
step S1, acquiring and analysing scene information around the service robot to obtain potential target objects and their pixel positions;
step S2, ranking the target objects by saliency, displaying the target object pictures on the human-computer interaction interface in that order, and asking the user through an inquiry window whether to grasp each object;
step S3, recognizing the user's electroencephalogram signals with a brain-computer interface to identify the user's intention and decide whether to grasp the object shown on the human-computer interaction interface; if yes, going to step S4, and if not, continuing to display the next target object picture;
and step S4, moving the service robot to the vicinity of the target object selected by the user, and controlling the mechanical arm of the service robot by visual servoing to complete the grasp.
In a further optimization of the present technical solution, the step S1 specifically includes the following steps,
collecting image and depth information with a camera, performing target detection on the collected image with a neural network model, and using the n detected objects O1, O2, …, On and their corresponding quantity ratios ω1, ω2, …, ωn over the m scenes E1, E2, …, Em to identify the scene E in which the service robot is located; naive Bayes is used to train a scene recognition classifier; when the l-th feature is a continuous value, it is assumed to follow a Gaussian distribution, so that for a scene Ei, P(al|Ei) is the probability that feature al occurs:

P(al|Ei) = 1/(√(2π)·σEi,l) · exp(−(al − μEi,l)² / (2σ²Ei,l)),

where μEi,l denotes the mean of the l-th dimensional feature in the samples of class Ei, σ²Ei,l denotes the variance of the l-th dimensional feature in the samples of class Ei, and CNBC is the naive Bayes probability model:

CNBC = argmax over Ei of P(Ei) · ∏l P(al|Ei),
in a scene E, a first screening removes background objects and objects that the service robot cannot grasp; a second screening of the remaining objects selects the c target objects k1, k2, …, kc that the user is most likely to choose in scene E; res1 and res2 are the results of the first and second screening respectively: res1 screens the n objects On according to the screening condition S, and res2 screens the first-screening result res1 according to the scene E,

S(On) = 1 if On can be grasped by the service robot, and S(On) = 0 if On is a background object or cannot be grasped,
res1 = classifier1(On, S),
res2 = classifier2(res1, E),
the neural network model yields a rectangular recognition box (xk, yk, wk, hk) for each target object k, where x and y are the abscissa and ordinate of the pixel at the upper-left corner of the rectangular recognition box, and w and h are the width and height of the box.
In a further optimization of the present technical solution, the ranking of the target objects by saliency in step S2 includes the following steps:
ranking the target items using a two-dimensional Gaussian distribution combined with the saliency detection result, presenting them on the human-computer interaction interface in descending order of the saliency of their recognition boxes, and popping up an inquiry window asking whether to grasp; the saliency of a recognition box is given by:

G(i, j) = 1/(2π·σx·σy) · exp(−((i − xcenter)²/(2σx²) + (j − ycenter)²/(2σy²))), where σx and σy are the standard deviations of the two-dimensional Gaussian in the horizontal and vertical directions,
Obj(k) = Σ over the pixels (i, j) of box k of H(i, j)·G(i, j),
Result = rank[Obj(k)],

wherein i and j are the abscissa and ordinate of a pixel point on the image; xcenter = x + w/2 and ycenter = y + h/2 are the horizontal and vertical coordinates of the centre of the detected rectangular box; H(i, j) is the saliency-detection output value of the pixel; G(i, j) is the Gaussian distribution probability value of the pixel within the corresponding rectangular recognition box; multiplying H(i, j) and G(i, j) for every point of the rectangular recognition box and summing gives the saliency of the box.
In a further optimization of the present technical solution, in step S4 the mechanical arm is controlled by the visual servo module to grasp: from the rectangular recognition box (xk, yk, wk, hk) of the target item k, the point cloud data of all points are denoised and averaged to obtain the position of the target object relative to the depth camera, which is converted by a coordinate transformation into the position (p.x, p.y, p.z) relative to the mechanical-arm coordinate system; the mechanical arm consists of 5 revolute joints and connecting links, the first 4 revolute joints controlling the arm's posture and the last revolute joint controlling the gripper; so that the arm can grasp the target object stably, link l4 is held horizontal while grasping, i.e.:

θ2 + θ3 + θ4 = 0,

from the geometric analysis one obtains:

l1 + l2·cosθ2 + l3·cos(θ2 + θ3) = p.z,
[equation image: the corresponding constraint relating l2, l3, θ2, θ3 to the horizontal distance √(p.x² + p.y²)],

for convenience of presentation, a side length l and an angle are introduced:

[equation images: definitions of the auxiliary side length l and angle in terms of p.x, p.y, p.z and the link lengths],

the joint angles θ1, θ2, θ3, θ4 are then obtained from the formulas below; θ5 is the grasping angle, taken in the range 0.2-0.3 rad:

θ1 = arctan(p.x, p.y),
[equation images: closed-form expressions for θ2 and θ3 in terms of l, l2 and l3],
θ4 = −(θ2 + θ3).
Different from the prior art, the present technical solution has the following beneficial effects:
based on this service robot control method, the service robot is controlled through electroencephalogram signals with the assistance of computer vision, which reduces the user's visual burden and fatigue, helps the user easily obtain desired items, and can improve the self-care level of patients with severe motor impairment.
Drawings
FIG. 1 is a flow chart of the vision-fused service robot control method;
FIG. 2 is a schematic diagram of scene-based target determination and saliency ranking;
FIG. 3 is a kinematic analysis diagram of the mechanical arm;
FIG. 4 is a schematic view of the vision-fused service robot control system;
FIG. 5 is a schematic diagram of a human-machine interface.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Fig. 1 shows the flowchart of the vision-fused service robot control method. The invention provides a vision-fused service robot control method comprising the following specific steps:
and step S1, acquiring scene information of the service robot, and analyzing to obtain potential target objects and pixel positions thereof.
The service robot carries out target inference based on scenes and obtains potential target objects and pixel positions thereof.
The current image and depth information are acquired with the depth camera mounted on the service robot. Target detection is performed on the acquired image with a neural network model; the detection model used in this embodiment is YOLOv3. The backbone network for YOLOv3 feature extraction is Darknet-53, in which the fully connected layer is replaced by 52 convolutional layers and 1×1 convolutions. YOLOv3 uses multi-scale feature-fusion prediction to detect objects of different sizes. For an input image of size 416×416×3, feature maps at the scales 13×13, 26×26 and 52×52 are obtained for detecting large, medium and small targets respectively. Nine anchor boxes of different sizes, i.e. the prior boxes of the target-detection process, are defined in the original image pixels, so that the output of each scale is evenly assigned 3 anchor boxes. Each anchor box outputs 5 parameters, the target coordinates x and y, the target size w and h and the confidence, plus c class scores, giving 5 + c channels in total; the 3 anchor boxes of each scale therefore output (5 + c) × 3 channels.
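For illustration, a detection step of this kind can be sketched in Python with OpenCV's DNN module; the configuration/weight file names, the 0.5/0.4 thresholds and the post-processing below are assumptions for the sketch, not values taken from the embodiment.

import cv2
import numpy as np

# Assumed file names; any YOLOv3 cfg/weights pair trained on the relevant household classes would do.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getUnconnectedOutLayersNames()

def detect_objects(image_bgr, conf_thr=0.5, nms_thr=0.4):
    """Return a list of (class_id, confidence, (x, y, w, h)) detections in pixel coordinates."""
    h_img, w_img = image_bgr.shape[:2]
    blob = cv2.dnn.blobFromImage(image_bgr, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(layer_names)            # one output per scale (13x13, 26x26, 52x52)

    boxes, confidences, class_ids = [], [], []
    for out in outputs:
        for row in out:                           # row = [cx, cy, w, h, objectness, class scores...]
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id] * row[4])
            if confidence < conf_thr:
                continue
            cx, cy, bw, bh = row[0] * w_img, row[1] * h_img, row[2] * w_img, row[3] * h_img
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(confidence)
            class_ids.append(class_id)

    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_thr, nms_thr)
    return [(class_ids[i], confidences[i], tuple(boxes[i])) for i in np.array(keep).flatten()]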
The n detected objects O1, O2, …, On and their corresponding quantity ratios ω1, ω2, …, ωn over the m scenes E1, E2, …, Em are used to identify the scene E in which the service robot is located. A naive Bayes algorithm is used to train the scene-recognition classifier. Considering the non-linear variation of the features, when the l-th feature takes continuous values it is assumed to follow a Gaussian distribution; P(al|Ei) is the probability that feature al occurs, given by

P(al|Ei) = 1/(√(2π)·σEi,l) · exp(−(al − μEi,l)² / (2σ²Ei,l)),

where μEi,l denotes the mean of the l-th dimensional feature in the samples of class Ei, σ²Ei,l denotes the variance of the l-th dimensional feature in the samples of class Ei, and CNBC is the naive Bayes probability model:

CNBC = argmax over Ei of P(Ei) · ∏l P(al|Ei).
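A minimal Python sketch of such a Gaussian naive Bayes scene classifier, using the object quantity-ratio vector as the feature; the scene labels and training samples shown are hypothetical.

import numpy as np

class GaussianNaiveBayesScene:
    """Per-scene Gaussian model of the object quantity-ratio features omega_1 ... omega_n."""

    def fit(self, X, y, eps=1e-6):
        # X: (num_samples, n) quantity ratios, y: scene labels E_i
        self.classes = np.unique(y)
        self.prior = {c: float(np.mean(y == c)) for c in self.classes}        # P(E_i)
        self.mu = {c: X[y == c].mean(axis=0) for c in self.classes}           # mean of each feature
        self.var = {c: X[y == c].var(axis=0) + eps for c in self.classes}     # variance of each feature
        return self

    def predict(self, x):
        # C_NBC = argmax_i P(E_i) * prod_l P(a_l | E_i), computed in log space for numerical stability
        def log_posterior(c):
            log_lik = -0.5 * np.sum(np.log(2 * np.pi * self.var[c])
                                    + (x - self.mu[c]) ** 2 / self.var[c])
            return np.log(self.prior[c]) + log_lik
        return max(self.classes, key=log_posterior)

# Hypothetical training data: rows are ratio vectors for (cup, remote, sofa) in two scenes.
X = np.array([[0.5, 0.2, 0.3], [0.4, 0.3, 0.3], [0.1, 0.1, 0.8], [0.2, 0.0, 0.8]])
y = np.array(["living_room", "living_room", "bedroom", "bedroom"])
clf = GaussianNaiveBayesScene().fit(X, y)
print(clf.predict(np.array([0.45, 0.25, 0.30])))   # -> "living_room"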
In the scene E, a first screening is performed using the label S to remove background objects and objects that the service robot cannot grasp: in this screen, background items and items that the service robot cannot grasp are labelled 0, and items that the service robot can grasp are labelled 1. The remaining objects are screened a second time to remove items that are unlikely or extremely unlikely to appear in the user's scene E, and the c target objects k1, k2, …, kc most likely to be chosen are selected. In the formulas, res1 and res2 are the results of the first and second screening: res1 screens the n objects On according to the screening condition S, and res2 screens the first-screening result res1 according to the scene E.
S(On) = 1 if On can be grasped by the service robot, and S(On) = 0 if On is a background item or cannot be grasped,
res1 = classifier1(On, S),
res2 = classifier2(res1, E),
The neural network model yields a rectangular recognition box (xk, yk, wk, hk) for each target object k, where x and y are the abscissa and ordinate of the pixel at the upper-left corner of the rectangular recognition box, and w and h are the width and height of the box.
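The two-stage screening can be sketched as two simple filters; the graspability table standing in for the label S and the scene-conditional candidate sets standing in for classifier2 are illustrative assumptions.

# Hypothetical graspability table standing in for the label S:
# 1 = can be grasped by the service robot, 0 = background / cannot be grasped.
GRASPABLE = {"cup": 1, "remote": 1, "chips": 1, "vase": 1, "sofa": 0, "tea_table": 0, "tv": 0}

# Hypothetical scene-conditional candidate sets standing in for classifier2 trained on scene E.
LIKELY_IN_SCENE = {
    "living_room": {"remote", "cup", "chips", "vase"},
    "kitchen": {"cup", "chips"},
}

def screen(detections, scene, c=4):
    """detections: list of (label, box). Returns up to c candidate target items."""
    res1 = [d for d in detections if GRASPABLE.get(d[0], 0) == 1]           # first screening (condition S)
    res2 = [d for d in res1 if d[0] in LIKELY_IN_SCENE.get(scene, set())]   # second screening (scene E)
    return res2[:c]

# Example: in a living-room scene the sofa is dropped and the graspable items remain.
print(screen([("sofa", (0, 0, 300, 200)), ("remote", (40, 60, 30, 10)), ("cup", (90, 80, 25, 30))],
             "living_room"))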
Step S2: rank the target objects by saliency, display the target object pictures on the human-computer interaction interface in that order, and ask the user through the inquiry window whether to grasp each one.
Fig. 2 is a schematic diagram of scene-based target determination and saliency ranking. The saliency-detection model used in this embodiment is EML-Net. In the encoding stage, EML-Net trains two pre-trained deep networks, DenseNet and NASNet, independently; the output of each is taken from its last layer, with the final fully connected layer replaced by a 1×1 convolution. In the decoding stage, multi-level features are taken from the two trained networks, four layers from DenseNet and three layers from NASNet, seven in total; a 1×1 convolution is applied to each to obtain seven feature maps, these are upsampled to the size of the largest feature map, and a final 1×1 convolution produces the result; the output is normalized and used as the saliency value of each pixel.
The target items are ranked using a two-dimensional Gaussian distribution combined with the saliency-detection result, the saliency being computed inside the rectangular boxes obtained by YOLOv3. Arranging the target items in descending order of saliency reflects the likely intention of the user. The items are displayed on the human-computer interaction interface in descending order of the saliency of their recognition boxes, and an inquiry window asking whether to grasp pops up. The specific formulas are as follows:
G(i, j) = 1/(2π·σx·σy) · exp(−((i − xcenter)²/(2σx²) + (j − ycenter)²/(2σy²))), where σx and σy are the standard deviations of the two-dimensional Gaussian in the horizontal and vertical directions,
Obj(k) = Σ over the pixels (i, j) of box k of H(i, j)·G(i, j),
Result = rank[Obj(k)],
wherein i and j are the abscissa and ordinate of a pixel on the image; xcenter = x + w/2 and ycenter = y + h/2 are the horizontal and vertical coordinates of the centre of the detected rectangular box; H(i, j) is the saliency-detection output value of the pixel; G(i, j) is the Gaussian distribution probability value of the pixel within the corresponding rectangular recognition box; multiplying H(i, j) and G(i, j) for every point of the rectangular recognition box and summing gives the saliency Obj(k) of the box. Computing the saliency in this way concentrates the salient region on the centre of the item as much as possible and removes the influence of the size of the rectangular recognition box on the result.
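A sketch of this saliency-weighted ranking is given below; because the patent's Gaussian parameters are not reproduced here, the choice of standard deviation (a quarter of the box size) is an assumption, while normalizing the Gaussian weights to sum to one reflects the stated goal of removing the influence of box size.

import numpy as np

def gaussian_weight(w, h, sigma_scale=0.25):
    """Normalized 2D Gaussian G(i, j) over a w-by-h box, centred on the box centre.
    sigma = sigma_scale * box size is an assumed parametrization."""
    sx, sy = max(w * sigma_scale, 1e-6), max(h * sigma_scale, 1e-6)
    jj, ii = np.meshgrid(np.arange(w), np.arange(h))      # jj: column index (x), ii: row index (y)
    g = np.exp(-(((jj - w / 2.0) ** 2) / (2 * sx ** 2) + ((ii - h / 2.0) ** 2) / (2 * sy ** 2)))
    return g / g.sum()                                    # weights sum to 1, so box size cancels out

def box_saliency(H, box):
    """Obj(k) = sum over the box of H(i, j) * G(i, j); H is the normalized saliency map."""
    x, y, w, h = box
    patch = H[y:y + h, x:x + w]
    return float(np.sum(patch * gaussian_weight(patch.shape[1], patch.shape[0])))

def rank_targets(H, boxes):
    """Return box indices ordered from most to least salient (the Result = rank[Obj(k)] step)."""
    scores = [box_saliency(H, b) for b in boxes]
    return sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)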
Step S3: recognize the user's electroencephalogram signals with the brain-computer interface, identify the user's intention, and decide whether to grasp the object shown on the human-computer interaction interface; if yes, go to step S4, otherwise continue with the next target object picture. The brain-computer interface acquires the electroencephalogram signals generated by the user and processes and recognizes them as the selections corresponding to 'yes' and 'no'.
Electroencephalogram (EEG) signals usually show certain rhythmic and spatial-distribution characteristics; by extracting these characteristics with suitable computational methods and recognizing them, the underlying conscious state can be distinguished and converted into the thinking intention of the human brain, which is then used to control external devices. Because the EEG signal is a very weak, non-linear electrophysiological signal whose amplitude is at the millivolt level and whose frequency content lies between 0.5 and 50 Hz, signal amplification and denoising are required before feature extraction.
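A band-pass and notch filtering step of this kind can be sketched with SciPy as follows; the sampling rate, filter order and 50 Hz notch frequency are assumed values.

import numpy as np
from scipy import signal

def preprocess_eeg(eeg, fs=256.0, band=(0.5, 50.0), notch_hz=50.0):
    """Band-pass 0.5-50 Hz and notch out power-line interference.
    eeg: array of shape (channels, samples); fs and notch_hz are assumed values."""
    b, a = signal.butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="bandpass")
    eeg = signal.filtfilt(b, a, eeg, axis=-1)             # zero-phase band-pass
    bn, an = signal.iirnotch(notch_hz, Q=30.0, fs=fs)     # power-frequency notch
    return signal.filtfilt(bn, an, eeg, axis=-1)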
The brain-computer interface used in this embodiment is an exogenous brain-computer interface based on the steady-state visual evoked potential (SSVEP). The SSVEP is an evoked potential produced when an optical signal stimulates the visual system: when a visual stimulus flickers periodically at a specific frequency, the visual system produces an evoked response with a stable frequency signature. The SSVEP-based brain-computer interface flickers the 'yes' and 'no' options at different fixed frequencies, stimulating the occipital region of the user's brain to generate EEG signals at the corresponding frequencies. The acquired EEG signals are amplified by the amplifier, preprocessed to remove power-frequency interference and artifact signals, and recognized by canonical correlation analysis; the recognition result is mapped to the user's 'yes' or 'no' judgement. The 'yes' output signal controls the service robot to fetch the target object through visual servoing; the 'no' output signal switches the human-computer interaction interface to the next target object picture.
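A canonical-correlation-analysis decision of this kind can be sketched as follows; the window length, the use of two harmonics and scikit-learn's CCA are assumptions, while the 11 Hz and 13 Hz frequencies correspond to the 'yes' and 'no' flicker blocks of the embodiment described below.

import numpy as np
from sklearn.cross_decomposition import CCA

def cca_correlation(X, Y):
    """Largest canonical correlation between an EEG window X (samples, channels) and reference Y."""
    cca = CCA(n_components=1)
    u, v = cca.fit_transform(X, Y)
    return abs(np.corrcoef(u[:, 0], v[:, 0])[0, 1])

def classify_ssvep(eeg_window, fs, freqs=(11.0, 13.0), n_harmonics=2):
    """Return the stimulation frequency (e.g. the 'yes'/'no' flicker) with the highest correlation."""
    t = np.arange(eeg_window.shape[0]) / fs
    best_f, best_r = None, -1.0
    for f in freqs:
        refs = []
        for h in range(1, n_harmonics + 1):                       # sine/cosine references per harmonic
            refs += [np.sin(2 * np.pi * h * f * t), np.cos(2 * np.pi * h * f * t)]
        r = cca_correlation(eeg_window, np.column_stack(refs))
        if r > best_r:
            best_f, best_r = f, r
    return best_f, best_r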
Step S4: according to the target object selected by the user, move the service robot to the vicinity of the target object and control the mechanical arm of the service robot by visual servoing to complete the grasp. If the user selects 'yes', the service robot moves near the target object and the mechanical arm is controlled by visual servoing to complete the grasp; if the user selects 'no', the human-computer interaction interface displays the next target item picture and step S3 is repeated.
The specific procedure for visual-servo control of the mechanical arm to grasp the object is as follows. Using the depth information acquired by the depth camera, the point cloud data of all points inside the rectangular recognition box (xk, yk, wk, hk) of the target item k are denoised and averaged to obtain the position of the target object relative to the depth camera, which is converted by a coordinate transformation into the position (p.x, p.y, p.z) relative to the mechanical-arm coordinate system. The arm consists of 5 revolute joints and connecting links; the first 4 revolute joints control the arm's posture and the last revolute joint controls the gripper. First it is computed whether the target object lies within the arm's workspace. If it does, the servo grasping module controls the arm to grasp. If it lies outside the workspace, this is fed back to the movable platform so that the service robot moves until the object is inside the workspace, after which the grasping action is completed. To ensure that the arm grasps stably, the grasping posture of the arm is fixed; see Fig. 3, the kinematic analysis diagram of the arm. Link l4 is held horizontal while grasping, i.e.:

θ2 + θ3 + θ4 = 0,

from the geometric analysis one obtains:

l1 + l2·cosθ2 + l3·cos(θ2 + θ3) = p.z,
[equation image: the corresponding constraint relating l2, l3, θ2, θ3 to the horizontal distance √(p.x² + p.y²)],

for convenience of presentation, a side length l and an angle are introduced:

[equation images: definitions of the auxiliary side length l and angle in terms of p.x, p.y, p.z and the link lengths],

the joint angles θ1, θ2, θ3, θ4 are then obtained from the formulas below; θ5 is the grasping angle, taken as 0.2-0.3 rad according to actual needs:

θ1 = arctan(p.x, p.y),
[equation images: closed-form expressions for θ2 and θ3 in terms of l, l2 and l3],
θ4 = −(θ2 + θ3).
Each revolute joint is rotated in turn to the angle obtained from the kinematic analysis, completing the grasping action.
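Since the closed-form expressions for θ2 and θ3 appear only as images in the original, the following Python sketch solves the same grasp pose under stated assumptions: θ2 and θ3 are measured from the vertical so that l1 + l2·cosθ2 + l3·cos(θ2 + θ3) = p.z, the horizontal reach to the wrist is taken as √(p.x² + p.y²) − l4, and the elbow-up solution is chosen. It is one standard way to solve this geometry, not necessarily the patent's own formulas.

import numpy as np

def inverse_kinematics(p, l1, l2, l3, l4, theta5=0.25):
    """Sketch of the grasp-pose IK for the 5-joint arm with link l4 held horizontal.
    p = (p.x, p.y, p.z) is the target position in the arm frame; theta5 is the grasp
    angle, taken in the 0.2-0.3 rad range as in the embodiment."""
    px, py, pz = p
    theta1 = np.arctan2(px, py)                  # base yaw, written as theta1 = arctan(p.x, p.y)

    r = np.hypot(px, py) - l4                    # assumed horizontal distance to the wrist centre
    z = pz - l1                                  # height of the wrist centre above joint 2
    l = np.hypot(r, z)                           # the auxiliary side length l
    if l < 1e-9 or not (abs(l2 - l3) <= l <= l2 + l3):
        raise ValueError("target outside the arm workspace")

    phi = np.arctan2(r, z)                                        # angle of l measured from the vertical
    alpha = np.arccos((l**2 + l2**2 - l3**2) / (2 * l * l2))      # law of cosines at the shoulder
    beta = np.arccos((l2**2 + l3**2 - l**2) / (2 * l2 * l3))      # interior elbow angle

    theta2 = phi - alpha                         # shoulder pitch (elbow-up branch)
    theta3 = np.pi - beta                        # elbow pitch
    theta4 = -(theta2 + theta3)                  # keeps l4 horizontal: theta2 + theta3 + theta4 = 0
    return theta1, theta2, theta3, theta4, theta5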
Fig. 4 is a schematic diagram of the vision-fused service robot control system. The system consists of two main parts, an SSVEP-based brain-computer interface part and a service robot part. The SSVEP-based brain-computer interface part comprises the electroencephalogram acquisition equipment, the human-computer interaction interface and a first processing unit; the service robot part comprises the movable platform, the recognition and positioning module, the servo grasping module and a second processing unit.
The electroencephalogram acquisition equipment comprises a biological-signal amplifier with multiple sampling channels, a high-performance active-electrode system for recording non-invasive electrophysiological signals (the gamma box for short), an EEG cap and a number of EEG electrodes. The acquisition equipment used in this embodiment is from g.tec and comprises a g.GAMMAcap, a g.GAMMAsys gamma box and a g.USBamp amplifier.
Referring to Fig. 5, the schematic diagram of the human-computer interaction interface, the interface has two pages: the first page is the movement-navigation interface and the second page is the recognition-and-grasp interface. The display used in this embodiment is a 23.8-inch LCD screen. The movement-navigation interface controls the robot's movement and navigation and carries 5 flicker blocks at 6 Hz, 7 Hz, 8 Hz, 9 Hz and 10 Hz, corresponding to the commands 'forward', 'backward', 'left turn', 'right turn' and 'next page'; the first 4 allow navigation to any position, and 'next page' switches the flicker interface to the recognition-and-grasp interface. The recognition-and-grasp interface lets the user select, in order, which recognized target object to grasp. It has 3 flicker blocks at 9 Hz, 11 Hz and 13 Hz, corresponding to the commands 'previous page', 'yes' and 'no'; 'previous page' switches back to the movement-navigation interface, and after the scene-based target determination is completed the 'yes' and 'no' commands appear in a pop-up together with the picture of the target object, so that the user can choose whether to grasp it.
The first processing unit and the second processing unit exchange data between the SSVEP-based brain-computer interface and the service robot over a TCP/IP communication protocol.
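A minimal sketch of such a TCP/IP exchange is shown below; the address, port and JSON message format are assumptions.

import json
import socket

HOST, PORT = "192.168.1.10", 9000      # assumed address of the second processing unit

def send_decision(decision, target_id):
    """Send one BCI decision ('yes'/'no') and the currently displayed target id over TCP."""
    msg = json.dumps({"decision": decision, "target": target_id}).encode("utf-8")
    with socket.create_connection((HOST, PORT), timeout=2.0) as sock:
        sock.sendall(msg + b"\n")

# Example: the user accepted the first candidate item.
# send_decision("yes", 0)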
The service robot used in this embodiment is a TurtleBot comprising the movable platform, the recognition and positioning module and the servo grasping module. The movable platform is a Kobuki chassis, a two-wheel differential base on which the robot can move forward and backward and turn left and right stably. The visual sensor of the recognition and positioning module is a PrimeSense depth camera; when the PrimeSense acquires an image, the image is recognized with the scene-based target-determination method and the position information in the recognition result is passed to the servo grasping module. The main component of the servo grasping module is a Turtlebot_Arm robotic arm, which is responsible for automatically grasping the target object.
The working environment of the service robot in this embodiment is a home. The user sits about 70 cm in front of the display screen and wears the EEG cap; 9 electrode channels over the occipital lobe are selected as the signal sources for analysing the steady-state visual evoked potential response. The gamma box and the amplifier are connected through the electrode wires, and acquisition and recognition of the EEG signals are completed once they are connected to the first processing unit. The user controls the robot's movement and navigation by gazing at the 'forward', 'backward', 'left turn' and 'right turn' flicker blocks of the movement-navigation interface. When the user has navigated the service robot near the item to be fetched in the living room, gazing at the 'next page' flicker block switches the human-computer interaction interface to the recognition-and-grasp interface; the service robot recognizes several items with the object recognition and positioning module and performs scene analysis. After the environment is analysed as a living-room scene, objects that cannot be grasped, such as the sofa and the tea table, are removed, and, combined with the scene, the television remote control, tea cup, potato chips, vase and so on are taken as target objects in order of saliency. The television remote control is presented first on the recognition-and-grasp interface; if the user gazes at the 'yes' flicker block, the TurtleBot automatically adjusts the arm to the grasping posture and completes the grasp; if the user gazes at the 'no' flicker block, the interface presents the second-ranked target object, the tea cup; and so on, until the user gazes at the 'yes' flicker block, whereupon the TurtleBot adjusts the arm to the grasping posture and completes the grasp. After a successful grasp, the user returns to the movement-navigation interface by gazing at the 'previous page' flicker block and navigates the robot to the delivery destination. This completes one successful item-delivery task by the service robot.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article or terminal that comprises the element. Further, herein, "greater than", "less than", "more than" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include it.
Although the embodiments have been described, those skilled in the art can make other variations and modifications to these embodiments once they learn of the basic inventive concept. The above embodiments are therefore only examples of the present invention and are not intended to limit its scope; all equivalent structures or equivalent processes made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of the present invention.

Claims (3)

1. A vision-fused service robot control method is characterized by comprising the following steps:
step S1, acquiring and analysing scene information around the service robot to obtain potential target objects and their pixel positions;
step S2, ranking the target objects by saliency, displaying the target object pictures on the human-computer interaction interface in that order, and asking the user through an inquiry window whether to grasp each object;
the ranking of the target objects by saliency in step S2 includes the following steps:
ranking the target items using a two-dimensional Gaussian distribution combined with the saliency detection result, presenting them on the human-computer interaction interface in descending order of the saliency of their recognition boxes, and popping up an inquiry window asking whether to grasp; the saliency of a recognition box is given by the following formulas:
G(i, j) = 1/(2π·σx·σy) · exp(−((i − xcenter)²/(2σx²) + (j − ycenter)²/(2σy²))), where σx and σy are the standard deviations of the two-dimensional Gaussian in the horizontal and vertical directions,
Obj(k) = Σ over the pixels (i, j) of box k of H(i, j)·G(i, j),
Result = rank[Obj(k)],
wherein i and j are the abscissa and ordinate of a pixel point on the image; (xk, yk, wk, hk) is the rectangular recognition box of the potential target object k obtained in step S1, x and y being the abscissa and ordinate of the pixel at the upper-left corner of the rectangular recognition box and w and h its width and height; xcenter = x + w/2 and ycenter = y + h/2 are the horizontal and vertical coordinates of the centre of the detected rectangular box; H(i, j) is the saliency-detection output value of the pixel; G(i, j) is the Gaussian distribution probability value of the pixel within the corresponding rectangular recognition box; multiplying H(i, j) and G(i, j) for every point of the rectangular recognition box and summing gives the saliency Obj(k) of the box; rank is a ranking function;
step S3, recognizing the user's electroencephalogram signals with a brain-computer interface to identify the user's intention and decide whether to grasp the object shown on the human-computer interaction interface; if yes, going to step S4, and if not, continuing to display the next target object picture;
and step S4, moving the service robot to the vicinity of the target object selected by the user, and controlling the mechanical arm of the service robot by visual servoing to complete the grasp.
2. The vision-fused service robot control method according to claim 1, wherein the step S1 specifically includes the following steps,
collecting image and depth information with a camera, performing target detection on the collected image with a neural network model, and using the n detected objects O1, O2, …, On and their corresponding quantity ratios ω1, ω2, …, ωn over the m scenes E1, E2, …, Em to identify the scene E in which the service robot is located; naive Bayes is used to train a scene recognition classifier; when the l-th feature is a continuous value, it is assumed to follow a Gaussian distribution, so that for a scene Ei, P(al|Ei) is the probability that feature al occurs:

P(al|Ei) = 1/(√(2π)·σEi,l) · exp(−(al − μEi,l)² / (2σ²Ei,l)),

where μEi,l denotes the mean of the l-th dimensional feature in the samples of class Ei, σ²Ei,l denotes the variance of the l-th dimensional feature in the samples of class Ei, and CNBC is the naive Bayes probability model:

CNBC = argmax over Ei of P(Ei) · ∏l P(al|Ei),
in a scene E, a first screening removes background objects and objects that the service robot cannot grasp; a second screening of the remaining objects selects the c target objects k1, k2, …, kc that the user is most likely to choose in scene E; res1 and res2 are the results of the first and second screening respectively: res1 screens the n objects On according to the screening condition S, and res2 screens the first-screening result res1 according to the scene E,

S(On) = 1 if On can be grasped by the service robot, and S(On) = 0 if On is a background object or cannot be grasped,
res1 = classifier1(On, S),
res2 = classifier2(res1, E),
the neural network model yields a rectangular recognition box (xk, yk, wk, hk) for each target object k, where x and y are the abscissa and ordinate of the pixel at the upper-left corner of the rectangular recognition box, and w and h are the width and height of the box.
3. The vision-fused service robot control method according to claim 1, wherein in step S4 the visual servo module controls the mechanical arm to grasp: from the rectangular recognition box (xk, yk, wk, hk) of the target item k, the point cloud data of all points are denoised and averaged to obtain the position of the target object relative to the depth camera, which is converted by a coordinate transformation into the position (p.x, p.y, p.z) relative to the mechanical-arm coordinate system; the mechanical arm consists of 5 revolute joints and connecting links, the first 4 revolute joints controlling the arm's posture and the last revolute joint controlling the gripper; so that the arm can grasp the target object stably, link l4 is held horizontal while grasping, i.e.:
θ2 + θ3 + θ4 = 0,
from the geometric analysis one obtains:
l1 + l2·cosθ2 + l3·cos(θ2 + θ3) = p.z,
[equation image: the corresponding constraint relating l2, l3, θ2, θ3 to the horizontal distance √(p.x² + p.y²)],
for convenience of presentation, a side length l and an angle are introduced:
[equation images: definitions of the auxiliary side length l and angle in terms of p.x, p.y, p.z and the link lengths],
the joint angles θ1, θ2, θ3, θ4 are then obtained from the formulas below; θ5 is the grasping angle, taken in the range 0.2-0.3 rad:
θ1 = arctan(p.x, p.y),
[equation images: closed-form expressions for θ2 and θ3 in terms of l, l2 and l3],
θ4 = −(θ2 + θ3).
CN202011073216.2A 2020-10-09 2020-10-09 Visual fusion service robot control method Active CN112223288B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011073216.2A CN112223288B (en) 2020-10-09 2020-10-09 Visual fusion service robot control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011073216.2A CN112223288B (en) 2020-10-09 2020-10-09 Visual fusion service robot control method

Publications (2)

Publication Number Publication Date
CN112223288A CN112223288A (en) 2021-01-15
CN112223288B true CN112223288B (en) 2021-09-14

Family

ID=74120081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011073216.2A Active CN112223288B (en) 2020-10-09 2020-10-09 Visual fusion service robot control method

Country Status (1)

Country Link
CN (1) CN112223288B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113138668B (en) * 2021-04-25 2023-07-18 清华大学 Automatic driving wheelchair destination selection method, device and system
CN115476366B (en) * 2021-06-15 2024-01-09 北京小米移动软件有限公司 Control method, device, control equipment and storage medium for foot robot
CN113499138B (en) * 2021-07-07 2022-08-09 南开大学 Active navigation system for surgical operation and control method thereof
CN114146283A (en) * 2021-08-26 2022-03-08 上海大学 Attention training system and method based on target detection and SSVEP
CN115730236B (en) * 2022-11-25 2023-09-22 杭州电子科技大学 Medicine identification acquisition method, equipment and storage medium based on man-machine interaction

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824080A (en) * 2014-02-21 2014-05-28 北京化工大学 Robot SLAM object state detection method in dynamic sparse environment
CN107139179A (en) * 2017-05-26 2017-09-08 西安电子科技大学 A kind of intellect service robot and method of work
CN109015635A (en) * 2018-08-08 2018-12-18 西安科技大学 A kind of service robot control method based on brain-machine interaction
CN109176521A (en) * 2018-09-19 2019-01-11 北京因时机器人科技有限公司 A kind of mechanical arm and its crawl control method and system
CN109366508A (en) * 2018-09-25 2019-02-22 中国医学科学院生物医学工程研究所 A kind of advanced machine arm control system and its implementation based on BCI
CN109531584A (en) * 2019-01-31 2019-03-29 北京无线电测量研究所 A kind of Mechanical arm control method and device based on deep learning
CN109977970A (en) * 2019-03-27 2019-07-05 浙江水利水电学院 Character recognition method under water conservancy project complex scene based on saliency detection
CN111126335A (en) * 2019-12-31 2020-05-08 珠海大横琴科技发展有限公司 SAR ship identification method and system combining significance and neural network
CN111191650A (en) * 2019-12-30 2020-05-22 北京市新技术应用研究所 Object positioning method and system based on RGB-D image visual saliency
CN111515945A (en) * 2020-04-10 2020-08-11 广州大学 Control method, system and device for mechanical arm visual positioning sorting and grabbing
CN111640116A (en) * 2020-05-29 2020-09-08 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8965580B2 (en) * 2012-06-21 2015-02-24 Rethink Robotics, Inc. Training and operating industrial robots
US9616569B2 (en) * 2015-01-22 2017-04-11 GM Global Technology Operations LLC Method for calibrating an articulated end effector employing a remote digital camera
PT108690B (en) * 2015-07-13 2023-04-04 Fund D Anna Sommer Champalimaud E Dr Carlos Montez Champalimaud SYSTEM AND METHOD FOR BRAIN-MACHINE INTERFACE FOR OPERANT LEARNING

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824080A (en) * 2014-02-21 2014-05-28 北京化工大学 Robot SLAM object state detection method in dynamic sparse environment
CN107139179A (en) * 2017-05-26 2017-09-08 西安电子科技大学 A kind of intellect service robot and method of work
CN109015635A (en) * 2018-08-08 2018-12-18 西安科技大学 A kind of service robot control method based on brain-machine interaction
CN109176521A (en) * 2018-09-19 2019-01-11 北京因时机器人科技有限公司 A kind of mechanical arm and its crawl control method and system
CN109366508A (en) * 2018-09-25 2019-02-22 中国医学科学院生物医学工程研究所 A kind of advanced machine arm control system and its implementation based on BCI
CN109531584A (en) * 2019-01-31 2019-03-29 北京无线电测量研究所 A kind of Mechanical arm control method and device based on deep learning
CN109977970A (en) * 2019-03-27 2019-07-05 浙江水利水电学院 Character recognition method under water conservancy project complex scene based on saliency detection
CN111191650A (en) * 2019-12-30 2020-05-22 北京市新技术应用研究所 Object positioning method and system based on RGB-D image visual saliency
CN111126335A (en) * 2019-12-31 2020-05-08 珠海大横琴科技发展有限公司 SAR ship identification method and system combining significance and neural network
CN111515945A (en) * 2020-04-10 2020-08-11 广州大学 Control method, system and device for mechanical arm visual positioning sorting and grabbing
CN111640116A (en) * 2020-05-29 2020-09-08 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Heterogeneous Sensor Fusion Framework for Autonomous Mobile Robot Obstacle Avoidance;Ali Zia;《2010 10th International Conference on Intelligent Systems Design and Applications》;20101231;全文 *
On the Distribution of Salient Objects in Web Images and Its Influence on Salient Object Detection;Boris Schauerte;《PLOS ONE 》;20150722;全文 *
visual saliency detection by spatially weighted dissimilarity;Lijuan Duan;《Proceedings of IEEE Conference on Computer Vision and Pattern Recognition》;20111231;全文 *
Saliency detection based on convex-hull background prior and object prior; Hui Kai; China Master's Theses Full-text Database (Information Science and Technology); 20200215; full text *
Environment modeling for mobile robots based on visual saliency; Tang Lisha; Design and Analysis; 20190930; full text *
Object grasp detection with a multimodal convolutional neural network; Wei Yingzi; Journal of Shenyang Ligong University; 20190831; pp. 36-38 *
Target detection against a sea-sky background combined with visual saliency; Wang Guanjun; Computer and Multimedia Technology; 20200531; full text *
Correlation filter tracking algorithm fusing saliency and motion information; Zhang Weijun; Acta Automatica Sinica; 20200914; full text *

Also Published As

Publication number Publication date
CN112223288A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN112223288B (en) Visual fusion service robot control method
Bell et al. Control of a humanoid robot by a noninvasive brain–computer interface in humans
EP3721320B1 (en) Communication methods and systems
CN110666791B (en) RGBD robot nursing system and method based on deep learning
US9389685B1 (en) Vision based brain-computer interface systems for performing activities of daily living
CN108646915B (en) Method and system for controlling mechanical arm to grab object by combining three-dimensional sight tracking and brain-computer interface
Mao et al. A brain–robot interaction system by fusing human and machine intelligence
Pathirage et al. A vision based P300 brain computer interface for grasping using a wheelchair-mounted robotic arm
Zhong et al. A dynamic user interface based BCI environmental control system
CN111399652A (en) Multi-robot hybrid system based on layered SSVEP and visual assistance
CN110673721B (en) Robot nursing system based on vision and idea signal cooperative control
Vijayprasath et al. Experimental explorations on EOG signal processing for realtime applications in labview
Sasaki et al. Robot control system based on Electrooculography and Electromyogram
CN112464768A (en) Fatigue detection method based on self-attention multi-feature fusion
Li et al. An adaptive P300 model for controlling a humanoid robot with mind
CN115050104A (en) Continuous gesture action recognition method based on multichannel surface electromyographic signals
CN108415568B (en) Robot intelligent idea control method based on modal migration complex network
CN113887374B (en) Brain control water drinking system based on dynamic convergence differential neural network
CN112836549A (en) User information detection method and system and electronic equipment
CN112936259B (en) Man-machine cooperation method suitable for underwater robot
US11687074B2 (en) Method for controlling moving body based on collaboration between the moving body and human, and apparatus for controlling the moving body thereof
Naijian et al. Coordination control strategy between human vision and wheelchair manipulator based on BCI
Zhang et al. Mind control of a service robot with visual servoing
Nandikolla et al. Hybrid bci controller for a semi-autonomous wheelchair
CN113947815A (en) Man-machine gesture cooperative control method based on myoelectricity sensing and visual sensing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant