CN117961916A - Object grabbing performance judgment method, object grabbing device and object grabbing system

Publication number: CN117961916A
Application number: CN202410371148.XA
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: grabbing, current, image, target object, touch
Inventors: 郭旭峰, 许晋诚
Applicant / current assignee: Parsini Perception Technology Zhangjiagang Co., Ltd.
Priority: CN202410371148.XA
Legal status: Pending


Abstract

The embodiments of the present application belong to the technical field of robots and relate to a method for judging the grippability of a target object, comprising the steps of: obtaining a current grabbing image output by an image sensor in a current trial grabbing state; generating a grabbing position indication map based on the current grabbing image; obtaining a preset tactile model; obtaining a current tactile signal output by a force/touch sensor in the current trial grabbing state; and judging the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal. The embodiments of the present application also relate to a target object grabbing method, apparatus, system, controller, and storage medium. The technical solution of the present application can improve the success rate of subsequently grabbing the target object.

Description

Object grabbing performance judgment method, object grabbing device and object grabbing system
Technical Field
The present application relates to the field of robotics, and in particular, to a method, apparatus, and system for determining object grabbing capability and grabbing objects.
Background
With the development of science and technology, robots are applied in an increasingly wide range of fields, robot-based automatic grabbing of objects is used across many industries, and the level of artificial intelligence expected of robots keeps rising. As a result, robots are often required to grab objects made of different materials; for example, some materials have high rigidity, while others slip or break easily.
However, when a robot grabs objects made of special materials, such as objects with smooth surfaces or soft objects, the objects tend to drop or be damaged, which reduces the grabbing success rate.
Disclosure of Invention
The embodiment of the application aims to provide a target object grabbing performance judgment method, a target object grabbing device and a target object grabbing system so as to improve the success rate of grabbing a target object subsequently.
In a first aspect, an embodiment of the present application provides a method for determining object grabbing property, which adopts the following technical scheme:
a target object grippability judgment method, the method comprising the steps of:
acquiring a current grabbing image in a current trial grabbing state;
generating a grabbing position indication map based on the current grabbing image;
acquiring a preset tactile model;
acquiring a current tactile signal in the current trial grabbing state;
and judging the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal.
Further, in an embodiment, the generating of the grabbing position indication map based on the current grabbing image includes the following steps:
acquiring a target object image before the robot executes the current trial grabbing operation;
taking the target object image as input of a first feature extraction model to obtain a target object feature;
taking the current grabbing image as input of a second feature extraction model to obtain a current grabbing feature;
splicing the target object feature and the current grabbing feature to obtain a feature splice map;
and taking the feature splice map as input of a grabbing recognition model to obtain the grabbing position indication map comprising grabbing contacts.
Further, in one embodiment, the judging of the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal includes the following steps:
combining the tactile model and the grabbing position indication map to obtain a force feedback expression of the grabbing position;
combining the force feedback expression of the grabbing position and the current tactile signal to calculate a grippability evaluation result of the target object;
and judging the grippability of the target object based on the grippability evaluation result.
Further, in one embodiment, the calculating of the grippability evaluation result of the target object by combining the force feedback expression of the grabbing position and the current tactile signal includes the following steps:
taking the force feedback expression of the grabbing position as input of a third feature extraction model to obtain a force feedback feature;
taking the current tactile signal as input of a fourth feature extraction model to obtain a tactile feature;
and combining the force feedback feature and the tactile feature to obtain the grippability evaluation result of the target object.
Further, in one embodiment, the combining of the force feedback feature and the tactile feature to obtain the grippability evaluation result of the target object includes the following steps:
splitting the force feedback feature to obtain a split force feedback feature;
splitting the tactile feature to obtain a split tactile feature, and taking the split tactile feature as input of a self-attention model to obtain an associated split tactile feature;
and taking the split force feedback feature and the associated split tactile feature together as input of a cross-attention model to obtain the grippability evaluation result of the target object, which comprises a confidence coefficient and a risk coefficient of grabbing the target object.
Further, in one embodiment, before the step of acquiring the current grabbing image output by the image sensor in the current trial grabbing state, the method includes the following steps:
sending a current trial grabbing instruction to the robot so as to instruct the robot to contact the target object in the current trial grabbing state, wherein the current trial grabbing state comprises a trial grabbing posture of the grabbing actuator and an initial acting force.
In a second aspect, an embodiment of the present application provides a target object grabbing method, which includes the target object grippability judgment method described in any one of the above, and further includes the following steps:
if the target object is judged to be grippable, sending a grabbing instruction to instruct the robot to grab the target object based on the current trial grabbing state.
In a third aspect, an embodiment of the present application provides a target object grippability determination apparatus, including:
the image acquisition module is used for acquiring a current grabbing image in a current trial grabbing state;
the indication generation module is used for generating a grabbing position indication map based on the current grabbing image;
the model acquisition module is used for acquiring a preset tactile model;
the tactile acquisition module is used for acquiring a current tactile signal in the current trial grabbing state;
and the grabbing judgment module is used for judging the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal.
In a fourth aspect, an embodiment of the present application provides a target object grabbing apparatus, which includes the target object grippability judgment apparatus described above, and
an instruction sending module, used for sending a grabbing instruction to instruct the robot to grab the target object based on the current trial grabbing state if the target object is judged to be grippable.
In a fifth aspect, an embodiment of the present application provides a target object grabbing system, including an image sensor, a force/touch sensor, a robot, and a controller, wherein the image sensor and the force/touch sensor each have a preset calibration relation with the robot;
the controller is respectively in communication connection with the image sensor, the force/touch sensor, and the robot;
and the controller is configured to implement the target object grippability judgment method and/or the target object grabbing method described above.
In a sixth aspect, an embodiment of the present application provides a controller, including a memory and a processor, where the memory stores a computer program, and the processor executes the computer program to implement the steps of the target object grippability judgment method and/or the target object grabbing method described above.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the target object grippability judgment method and/or the target object grabbing method described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
According to the embodiments of the present application, a macroscopic grabbing position indication map is obtained based on the current grabbing image acquired while the robot trial-grabs the target object. The specific grabbing positions on the target object can be known from the grabbing position indication map, and the grippability of the target object is then judged by combining the grabbing position indication map with a preset tactile model and a current tactile signal. This improves the accuracy of the grippability judgment and thus the final success rate of subsequently grabbing the target object.
Drawings
In order to more clearly illustrate the solution of the present application, a brief description will be given below of the drawings required for the description of the embodiments of the present application, it being apparent that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without the exercise of inventive effort for a person of ordinary skill in the art.
Fig. 1 is an exemplary system architecture diagram of the object gripping system of the present application.
Fig. 2 is a flowchart of an embodiment of the target object grippability judgment method of the present application.
Fig. 3 is a flowchart of an embodiment of the target object grabbing method of the present application.
Fig. 4 is a schematic structural view of an embodiment of the target object grippability judgment device of the present application.
Fig. 5 is a schematic structural view of an embodiment of the target object grabbing device of the present application.
Fig. 6 is a schematic structural view of another embodiment of the target object grippability judgment device of the present application.
FIG. 7 is a schematic diagram of an embodiment of a computer device of the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, fig. 1 is an exemplary system architecture diagram to which the present application may be applied.
An embodiment of the present application provides a target object grabbing system 100, including: an image sensor 110, a force/touch sensor 120, a robot 130, and a controller 140.
The controller 140 is communicatively connected to the image sensor 110, the force/touch sensor 120, and the robot 130 by wired or wireless means, respectively.
It should be noted that the wireless connection may include, but is not limited to, a 3G/4G/5G connection, a Wi-Fi connection, a Bluetooth connection, a WiMAX connection, a ZigBee connection, a UWB (ultra-wideband) connection, and other wireless connections now known or developed in the future.
Image sensor
The image sensor 110 is used for acquiring the current grabbing image of the robot in the current trial grabbing state of grabbing the target object, and the like.
The image sensor may be, but is not limited to: a camera, a video camera, a scanner, or another device with the relevant function (a cell phone, a computer, etc.). Specifically, it may be a 2D image sensor or a 3D image sensor (such as a 3D laser sensor or a depth sensor) that exists now or is developed in the future.
It should be noted that the image sensor may be fixed to the robot (for example, to the distal shaft of the mechanical arm) or fixed at a preset position outside the robot (as shown in fig. 1), as required. The robot and the image sensor are calibrated in advance (for example, by hand-eye calibration), so that the coordinate conversion relation between the robot and the image sensor can be obtained.
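For illustration only, the following sketch shows how a hand-eye calibration result can be applied to map a point observed by the image sensor into the robot base frame; it is not part of the claimed solution, and the transform values and function name are placeholder assumptions.

```python
import numpy as np

# T_base_cam is assumed to be a 4x4 homogeneous transform obtained from a prior
# hand-eye calibration (placeholder values, not from this application).
T_base_cam = np.array([
    [0.0, -1.0, 0.0, 0.30],
    [1.0,  0.0, 0.0, 0.05],
    [0.0,  0.0, 1.0, 0.40],
    [0.0,  0.0, 0.0, 1.00],
])

def camera_point_to_robot_base(p_cam_xyz):
    """Convert a 3D point from the image-sensor frame to the robot base frame."""
    p_cam = np.append(np.asarray(p_cam_xyz, dtype=float), 1.0)  # homogeneous coordinates
    return (T_base_cam @ p_cam)[:3]

# e.g. a point detected by the image sensor at (0.1, 0.2, 0.5) m in the camera frame
print(camera_point_to_robot_base([0.1, 0.2, 0.5]))
```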
Force/touch sensor
Wherein the force/tactile sensor refers to a force sensor and/or a tactile sensor.
In particular, the force sensor may be, but is not limited to: a two-dimensional or multi-dimensional force sensor for measuring data of a two-dimensional or multi-dimensional force.
Specifically, the tactile sensor is a sensing device that can be placed on a grabbing actuator such as a dexterous hand and is used for measuring contact force information while cooperating with the dexterous hand to achieve the grabbing function. The tactile sensor can be used for grabbing objects of different shapes and softness, and its contact surface with the object is flexible and has good resilience. Contact force information includes, but is not limited to: array-type multi-dimensional contact force information, surface deformation information, temperature information, texture information, and the like. An implementation of the tactile sensor includes a flexible contact surface, a sensing circuit, a computing device, and a contact force information parsing algorithm. Compared with a conventional force sensor, the tactile sensor can sense forces more sensitively and in more dimensions; for example, dense tangential friction forces can be perceived.
In one embodiment, the following description mainly takes the force/touch sensor being an array sensor as an example. An array-type force/touch sensor is composed of a plurality of sensor units arranged in an array, so that multi-point force distribution information can be sensed.
In general, the force/touch sensor is provided on the grabbing actuator of the robot; when the grabbing actuator contacts the target object, a force is generated between the force/touch sensor and the target object, so that the tactile signal in the current trial grabbing state can be obtained. For ease of understanding, the signals output by the force sensors and/or the tactile sensors are collectively referred to as tactile signals in the embodiments of the present application.
The force/touch sensor is calibrated in advance by various methods which are now available or developed in the future, so that the conversion relation between the force/touch sensor coordinate system and the robot coordinate system or between the force/touch sensor coordinate system and the grabbing actuator coordinate system can be obtained.
For example, as shown in fig. 1, taking a humanoid robot whose grabbing actuator is a dexterous hand 131 as an example, a plurality of independent or integrated force/touch sensors 120 are generally fixed on the palm and the inner sides of the fingers of the dexterous hand 131 (in fig. 1 the force/touch sensors 120 are indicated by dotted lines because they are occluded). Based on the preset calibration result between the force/touch sensors and the robot and the hand-eye calibration result described in the above embodiment, the contact positions and contact forces between the dexterous hand and the target object can subsequently be determined.
Robot
The robot 130 is used for trial-grabbing or grabbing the target object and the like based on instructions from the controller.
In particular, the robot may be, but is not limited to, any existing or future developed automation device such as a humanoid robot, a mechanical arm, etc. that can implement the corresponding functions of the embodiments of the present application.
The end of the robot is typically provided with a grabbing actuator; for example, a dexterous hand is provided at the end of the mechanical arm, and the force/touch sensor is provided on the grabbing actuator. For ease of understanding, as shown in fig. 1, the following description mainly takes the robot being a humanoid robot 130 (only part of its structure is schematically shown in the figure) and the grabbing actuator being a dexterous hand 131 as an example.
Controller
The controller 140 is configured to execute the steps of the target object grippability judgment method and/or the target object grabbing method according to the embodiments of the present application.
The target object grabbing method provided by the embodiments of the present application may be applied to a personal computer (PC); an industrial personal computer (IPC); a mobile terminal; a server; a system comprising a terminal and a server, implemented through interaction between the terminal and the server; a programmable logic controller (PLC); a field-programmable gate array (FPGA); a digital signal processor (DSP); a microcontroller unit (MCU); or the like. The controller generates program instructions according to a pre-fixed program in combination with the data signals output by the external image sensor 110, the force/touch sensor 120, the robot 130, and the like. For specific limitations on the controller, reference may be made to the limitations of the target object grippability judgment method and/or the target object grabbing method in the following embodiments. In particular, it can be applied to the computer device shown in fig. 7.
Based on the target object grabbing system described in the above embodiments, an embodiment of the present application provides a target object grippability judgment method, which is generally executed by the controller 140; accordingly, the target object grippability judgment device described in the following embodiments is generally disposed in the controller 140.
As shown in fig. 2, fig. 2 is a flowchart of an embodiment of the target object grippability judgment method of the present application; the target object grippability judgment method may include the following method steps:
step 210 obtains a current captured image in a current capture state.
Step 220 generates a grip location indication map based on the current grip image.
Step 230 obtains a preset haptic model.
Step 240 obtains the current haptic signal in the current test grip state.
Step 250 determines the object's grippability based on the grip location indication map, the preset haptic model, and the current haptic signal.
According to the embodiments of the present application, a macroscopic grabbing position indication map is obtained based on the current grabbing image acquired while the robot trial-grabs the target object. The specific grabbing positions on the target object can be known from the grabbing position indication map, and the grippability of the target object is then judged by combining the grabbing position indication map with a preset tactile model and a current tactile signal. This improves the accuracy of the grippability judgment and thus the final success rate of subsequently grabbing the target object.
For ease of understanding, the method steps described above are described in further detail below.
Step 210 obtains a current grabbing image in a current trial grabbing state.
In one embodiment, in response to a touch trigger signal generated when the grabbing actuator contacts the target object in the current trial grabbing state, the controller obtains, from a preset address, the current grabbing image collected and output by the image sensor, or the current grabbing image after some preprocessing; such an image generally contains the grabbing actuator and the target object that are in contact in the current trial grabbing state.
In one embodiment, prior to step 210, the following method steps may be included:
Step 280 sends a current trial grabbing instruction to the robot to instruct the robot to contact the target object in the current trial grabbing state; the current trial grabbing state includes a trial grabbing posture of the grabbing actuator and an initial acting force.
The controller may instruct the robot to drive the grabbing actuator to approach the target object along a preset or real-time-generated trajectory (e.g., generated based on the target object image and a robot trajectory planning method) and to perform the trial grab with a certain trial grabbing posture and an initial acting force (i.e., the current trial grabbing state described in this embodiment). Since the grippability of the target object has usually not yet been judged at this point, the initial force may be given a relatively small value as required.
Specifically, based on the foregoing embodiments, the current grabbing image may be acquired by different image sensors, or may be preprocessed as needed into an RGB image, a point cloud, and/or a depth image, etc.; the embodiments of the present application are not limited in this respect.
Step 220 generates a grabbing position indication map based on the current grabbing image.
In one embodiment, the generating of the grabbing position indication map based on the current grabbing image in step 220 may comprise the following method steps:
Step 221 obtains a target object image before the robot performs the current trial grabbing operation.
In one embodiment, the controller obtains, from a preset address, the target object image collected and output by the image sensor before the robot performs the trial grabbing operation, or that image after some preprocessing; the target object image contains only the target object, or the grabbing actuator in the image is separated from the target object.
Step 222 takes the object image as input of the first feature extraction model, and obtains the object feature.
In one embodiment, the controller extracts the target feature of the target image through the first feature extraction model.
Specifically, the first feature extraction model may employ any CNN network model that exists now or is developed in the future, such as ResNet18 or MobileNet.
Step 223 takes the current captured image as input of the second feature extraction model, and obtains the current captured feature.
In one embodiment, the controller extracts the current capture feature of the current capture image via the second feature extraction model.
In particular, the second feature extraction model may employ, as needed, any CNN network model that exists now or is developed in the future and can perform feature extraction, such as ResNet18 or MobileNet.
It should be noted that the network structures of the first feature extraction model and the second feature extraction model may be the same or different, and the present application is not limited thereto.
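For illustration only, a minimal sketch of two such CNN backbones is given below; it is not the exact network of this application, and the ResNet18 truncation and the input sizes are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

def make_backbone():
    # ResNet18 (MobileNet would also fit the text); drop avgpool + fc so a feature map is returned
    resnet = models.resnet18(weights=None)
    return nn.Sequential(*list(resnet.children())[:-2])

first_extractor = make_backbone()    # first feature extraction model, for the target object image I0
second_extractor = make_backbone()   # second feature extraction model, for the current grabbing image I1

i0 = torch.randn(1, 3, 480, 640)     # target object image (N, C, H, W)
i1 = torch.randn(1, 3, 480, 640)     # current grabbing image
target_feat = first_extractor(i0)    # e.g. (1, 512, 15, 20)
grab_feat = second_extractor(i1)
print(target_feat.shape, grab_feat.shape)
```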
Step 224 splices the target object feature and the current grabbing feature to obtain a feature splice map.
In one embodiment, step 224 may comprise the following method steps:
taking the target object feature and the current grabbing feature as input of a feature splicing module to obtain a spliced feature splice map.
Illustratively, fig. 6 is a schematic structural view of another embodiment of the target object grippability judgment device of the present application. As shown in fig. 6, the target object image I0 and the current grabbing image I1 are each passed through a feature extraction model (e.g., A1), and the extracted features are respectively input into the splicing block A3 (such as a concat operation) for feature splicing. Specifically, the splicing may be performed at the feature-channel level or at the height-width level as needed; the present application is not limited in this respect.
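A minimal sketch of the two splicing options mentioned above is shown below for illustration; tensor shapes are placeholder assumptions.

```python
import torch

target_feat = torch.randn(1, 512, 15, 20)   # feature of the target object image
grab_feat = torch.randn(1, 512, 15, 20)     # feature of the current grabbing image

channel_splice = torch.cat([target_feat, grab_feat], dim=1)   # channel-level: (1, 1024, 15, 20)
spatial_splice = torch.cat([target_feat, grab_feat], dim=3)   # width-level:   (1, 512, 15, 40)
print(channel_splice.shape, spatial_splice.shape)
```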
Step 225 takes the feature splice map as input of the grabbing recognition model to obtain a grabbing position indication map comprising grabbing contacts.
As an example, continuing with fig. 6, the grabbing recognition model A5 may be a multi-layer perceptron (MLP), but other models with similar functions may be used as needed, such as an FCN (fully convolutional network), U-Net, or a deconvolution network. The MLP is composed of a plurality of fully connected layers, and the output of A5 is the grabbing position indication map A6, which is a contact map (for example, a heat map force_field). The contact map may be an expanded view of the outer surface of the three-dimensional object (e.g., 640 x 480), indicating which positions of the object and the grabbing actuator may be in contact in the current trial grabbing state.
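For illustration only, a hedged sketch of such an MLP-style grabbing recognition model is given below. The pooling size, hidden widths, and the coarse-map-then-upsample step are assumptions made to keep the example small, not values from this application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraspRecognitionMLP(nn.Module):
    """Maps the feature splice map to a 640 x 480 contact heat map (A6-like output)."""
    def __init__(self, in_channels=1024, coarse_hw=(60, 80), out_hw=(480, 640)):
        super().__init__()
        self.coarse_hw, self.out_hw = coarse_hw, out_hw
        self.pool = nn.AdaptiveAvgPool2d((4, 5))              # shrink the splice map
        self.mlp = nn.Sequential(                              # stack of fully connected layers
            nn.Linear(in_channels * 4 * 5, 512), nn.ReLU(),
            nn.Linear(512, coarse_hw[0] * coarse_hw[1]),
        )

    def forward(self, splice_feat):                            # (N, C, H, W)
        x = self.pool(splice_feat).flatten(start_dim=1)
        coarse = self.mlp(x).view(-1, 1, *self.coarse_hw)      # coarse contact map
        return F.interpolate(coarse, size=self.out_hw,
                             mode="bilinear", align_corners=False)

indication_map = GraspRecognitionMLP()(torch.randn(1, 1024, 15, 20))
print(indication_map.shape)   # torch.Size([1, 1, 480, 640])
```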
According to the embodiments of the present application, the target object feature and the current trial grabbing feature are extracted separately, the two features are spliced, and the grabbing position indication map including the grabbing contacts is then obtained through the grabbing recognition model, so that an indication map (such as a heat map) of the touched positions on the object surface is obtained explicitly. Because the surface of the object may consist of materials with different mechanical properties, the grabbing position indication map reveals which places on the object the hand touches, and the material at those places can then be known by combining the subsequent tactile model, so that a targeted evaluation can be given. Therefore, the method according to the embodiments of the present application can evaluate the grabbing success rate more accurately than other methods that treat the object surface as a uniform material.
It should be noted that, in addition to the method steps described in steps 221 to 225 of the foregoing embodiment, the embodiments of the present application may also generate the grabbing position indication map, which predicts the specific grabbing positions of the target object in the current trial grabbing state, based on various methods that exist now or are developed in the future. For example, the grabbing heat map may be generated based on a point-cloud approach. In particular, the heat map may be any map describing the surface form of the object, such as a 2D heat map in the camera coordinate system or a 3D heat map in the world coordinate system. In addition, the grabbing position indication map may be a picture of other contents than a heat map, or a descriptive text file such as XML or OBJ.
Step 230 obtains a preset tactile model.
In one embodiment, the controller may retrieve the pre-trained tactile model I2 from a memory or a server according to a preset address.
Further, in one embodiment, the tactile model I2 generally corresponds to the grabbing position indication map described above. For example, the tactile model may be a w x h x ch2 expanded view of the outer surface of the object, where w is the width of the expanded view, h is its height, and ch2 is the number of dimensions needed to describe the tactile response (e.g., if three components x, y, and z are used to describe the tactile response, ch2 equals 3). The value of ch2 is not limited to 3 if the tactile response is described with more dimensions. For example, based on the 640 x 480 size of the grabbing position indication map mentioned above, the tactile model may be 640 x 480 x ch2.
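For illustration only, the following sketch represents such a w x h x ch2 tactile model as an array; the values are placeholders (a real tactile model would be pre-trained or measured), and the "softer patch" assignment is purely an assumption.

```python
import numpy as np

w, h, ch2 = 640, 480, 3
tactile_model = np.zeros((h, w, ch2), dtype=np.float32)   # per-surface-point tactile response (x, y, z)

# e.g. mark a hypothetical softer patch of the unfolded surface with a different expected response
tactile_model[100:200, 300:400, :] = np.array([0.2, 0.2, 0.8], dtype=np.float32)
print(tactile_model.shape)   # (480, 640, 3)
```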
Step 240 obtains a current tactile signal in the current trial grabbing state.
In one embodiment, the controller obtains, from a preset address, the tactile signal measured and output by each force/touch sensor in the current trial grabbing state, or the tactile signal after some preprocessing.
Step 250 judges the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal.
In one embodiment, the judging of the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal in step 250 may include the following method steps:
Step 251 combines the tactile model and the grabbing position indication map to obtain a force feedback expression of the grabbing position.
For example, as further shown in fig. 6, the preset tactile model I2 may be combined with the grabbing position indication map A6 to generate an expression describing the functional relation of the force feedback at the grabbing position.
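For illustration only, one simple way to realize such a combination is sketched below: the per-point tactile response is weighted by the predicted contact probability, yielding a map of the force feedback expected at the grabbing positions. This weighting scheme is an assumption for illustration, not the exact formulation of this application.

```python
import numpy as np

def force_feedback_expression(tactile_model, indication_map):
    # tactile_model: (h, w, ch2); indication_map: (h, w) contact heat map in [0, 1]
    return tactile_model * indication_map[..., None]

tactile_model = np.random.rand(480, 640, 3).astype(np.float32)
indication_map = np.random.rand(480, 640).astype(np.float32)
expected_feedback = force_feedback_expression(tactile_model, indication_map)
print(expected_feedback.shape)   # (480, 640, 3)
```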
Step 252 combines the force feedback expression of the grabbing position and the current tactile signal to calculate a grippability evaluation result of the target object.
In one embodiment, the controller may calculate the corresponding grippability evaluation result of the target object by combining the force feedback expression of the grabbing position and the current tactile signal, based on any method that exists now or is developed in the future, for example, based on an LSTM, reinforcement learning, or equation solving.
Specifically, the grippability evaluation result of the target object may include a confidence coefficient, a risk coefficient, or the like.
The confidence coefficient may represent the success rate of the current trial grab; for example, the grab is judged unsuccessful when the confidence coefficient is smaller than the preset threshold range. The risk coefficient may represent the probability that the current trial grab damages the object; when the risk value is higher than a certain preset range, it indicates that this grab would damage the object.
Step 253 judges the grippability of the target object based on the grippability evaluation result.
In one embodiment, the controller may compare the grippability evaluation result with a preset threshold range to judge the grippability of the target object.
For example, continuing with the example in which the grippability evaluation result includes a confidence coefficient and a risk coefficient: when the confidence coefficient is smaller than the preset threshold range, the grab is deemed unsuccessful; when it is larger than the preset threshold range, the grab is deemed successful. When the risk coefficient is higher than a certain preset range, the grab would damage the target object, and only when the risk is smaller than or equal to the preset threshold range can the object be judged grippable. Therefore, only when the confidence coefficient and the risk coefficient both meet the grabbing conditions is the target object judged grippable in the current trial grabbing state; otherwise, it is judged not grippable in the current trial grabbing state.
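For illustration only, a minimal sketch of this two-condition check is shown below; the threshold values are placeholders.

```python
CONFIDENCE_THRESHOLD = 0.8   # placeholder value
RISK_THRESHOLD = 0.2         # placeholder value

def is_grippable(confidence: float, risk: float) -> bool:
    grab_likely_successful = confidence > CONFIDENCE_THRESHOLD
    grab_unlikely_to_damage = risk <= RISK_THRESHOLD
    return grab_likely_successful and grab_unlikely_to_damage

print(is_grippable(0.9, 0.1))   # True
print(is_grippable(0.9, 0.5))   # False: the grab would risk damaging the object
```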
According to the embodiments of the present application, the tactile model is combined with the grabbing position indication map, and the grippability evaluation result is calculated based on the combined model and the current tactile signal. The grabbing position indication map reveals which positions of the target object are touched by the grabbing actuator, the material at the touched positions can then be known by combining the tactile model, and a targeted grippability evaluation can be given. Therefore, compared with other approaches that treat the object surface as a uniform material, the method provided by the embodiments of the present application can evaluate the grippability of the target object more accurately.
Further, in one embodiment, step 252, i.e., calculating the grippability evaluation result of the target object by combining the force feedback expression of the grabbing position and the current tactile signal, may include the following method steps:
Step 2521 takes the force feedback expression of the grabbing position as input of a third feature extraction model to obtain a force feedback feature.
Specifically, the third feature extraction model may refer to the first feature extraction model and the second feature extraction model described in the above embodiments, and may adopt the same or a different network structure as required.
Step 2522 takes the current tactile signal as input of a fourth feature extraction model to obtain a tactile feature.
Specifically, the fourth feature extraction model may refer to the third feature extraction model, the first feature extraction model, and the second feature extraction model described in the above embodiments, and may adopt the same or a different network structure as required.
According to the embodiments of the present application, the information can be abstracted through the third feature extraction model and the fourth feature extraction model, and task-relevant features are extracted for fusion in the subsequent stage.
Step 2523, combining the force feedback characteristic and the tactile characteristic to obtain a target object grippability evaluation result.
Further, in one embodiment, step 2523 may include the method steps of:
Step 25231 splits the force feedback feature to obtain a split force feedback feature.
Illustratively, as further shown in fig. 6, the output of the combined model, after passing through the third feature extraction model C1, may be input to the splitting module D3 (tokenizer) to obtain the split force feedback feature.
Step 25232 splits the tactile feature to obtain a split tactile feature, which is used as input of the self-attention model to obtain an associated split tactile feature.
Illustratively, as further shown in fig. 6, the tactile feature D2 extracted from the current tactile signal I3 by the fourth feature extraction model D1 may be used as input of the tactile splitting module D3 (tokenizer) to obtain the split tactile feature; the split tactile feature is then used as input of the self-attention model D5 (self_attention) to obtain the associated split tactile feature.
The current tactile signal I3 is the feedback of the force/touch sensors at the time of the trial grab. In form, the readings of all force/touch sensors (which, based on the above embodiment, are distributed on the grabbing surface of the dexterous hand) are unfolded and then stitched onto a map of fixed size; based on the above embodiment, the size (w x h) of I3 generally corresponds to that of the grabbing position indication map A6 (for example, 640 x 480), and the dimension of I3 is w x h x ch2.
The D3 (tokenizer) module converts the tactile feature D2 into n_token tokens, denoted D4. Before D4 is fed to D5, one randomly initialized token is combined with it, i.e., the number of tokens output by D5 is n_token + 1.
Step 25233 takes the split force feedback feature and the associated split tactile feature together as input of a cross-attention module, so as to obtain the grippability evaluation result of the target object, which includes the confidence coefficient and the risk coefficient of grabbing the target object.
The cross-attention module is used to fuse the two signals; this fusion approach can combine global information with local information.
Illustratively, as further shown in fig. 6, the split force feedback feature and the associated split tactile feature may be used together as input of the cross-attention module D6 to obtain the confidence coefficient and the risk coefficient.
According to the embodiments of the present application, the information can be reduced and abstracted into task-relevant features through the third feature extraction model and the fourth feature extraction model, which makes subsequent computation more efficient; processing with the self-attention and cross-attention mechanisms allows the final result to fully combine global and local information, making the computation more efficient and accurate.
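For illustration only, a hedged sketch of this fusion stage (tokenization, self-attention with one extra token, cross-attention, and a readout into confidence and risk) is given below. The token dimension, number of heads, and the readout head are assumptions, not the exact configuration of this application.

```python
import torch
import torch.nn as nn

class GraspabilityFusion(nn.Module):
    def __init__(self, dim=128, n_token=64):
        super().__init__()
        self.extra_token = nn.Parameter(torch.randn(1, 1, dim))   # the added, randomly initialized token
        self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, 2)                              # -> (confidence, risk)

    def forward(self, tactile_tokens, feedback_tokens):
        # tactile_tokens:  (N, n_token, dim) split tactile feature
        # feedback_tokens: (N, m_token, dim) split force feedback feature
        n = tactile_tokens.size(0)
        q = torch.cat([self.extra_token.expand(n, -1, -1), tactile_tokens], dim=1)  # n_token + 1 tokens
        q, _ = self.self_attn(q, q, q)                             # associated split tactile feature
        fused, _ = self.cross_attn(q, feedback_tokens, feedback_tokens)
        return torch.sigmoid(self.head(fused[:, 0]))               # read out the extra token

model = GraspabilityFusion()
conf_risk = model(torch.randn(2, 64, 128), torch.randn(2, 64, 128))
print(conf_risk)   # (2, 2): per-sample confidence and risk in [0, 1]
```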
In one embodiment, the above step 2523 may also be implemented based on other method steps that exist now or are developed in the future, for example, by directly stitching the split force feedback feature and the split tactile feature and then inputting the stitched feature into an MLP model.
It should be noted that, in addition to the method steps described in the above embodiments, the embodiments of the present application may also judge the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal using various methods that exist now or are developed in the future.
Based on the target object grippability judgment method described in the above embodiments, an embodiment of the present application further provides a target object grabbing method, which is generally executed by the controller 140; accordingly, the target object grabbing device described in the following embodiments is generally disposed in the controller 140.
As shown in fig. 3, fig. 3 is a flowchart of an embodiment of the target object grabbing method of the present application; the target object grabbing method includes the target object grippability judgment method described in the above embodiments, and the following method steps:
In step 260, if it is judged that the target object is grippable, a grabbing instruction is sent to instruct the robot to grab the target object based on the current trial grabbing state.
In one embodiment, if the target object is judged grippable based on the method described in the above embodiments, the controller may send a grabbing instruction to the robot to instruct the robot to grab the target object based on the current trial grabbing state, for example, to grab the target object with the current trial grabbing posture and the currently applied acting force.
According to the embodiments of the present application, a macroscopic grabbing position indication map is obtained based on the current grabbing image acquired while the robot trial-grabs the target object. The specific grabbing positions on the target object can be known from the grabbing position indication map, and the grippability of the target object is then judged by combining the grabbing position indication map with a preset tactile model and a current tactile signal. This improves the accuracy of the grippability judgment and thus the final success rate of subsequently grabbing the target object.
In one embodiment, the target object grabbing method according to the embodiments of the present application may further include:
Step 270, if it is judged that the target object is not grippable, changes the current trial grabbing state.
Specifically, the current trial grabbing state may be changed in various ways as needed, for example, by changing the grabbing force and/or the grabbing posture.
Step 280 re-performs the target object grippability judgment based on the changed current trial grabbing state until it is judged that the target object is grippable.
In one embodiment, in the current trial grabbing posture, the acting force applied by the robot to the target object may be adjusted according to the grippability evaluation result. For example, based on the acting force dF currently applied by the robot, when the grippability evaluation result indicates that the grab would damage the object (e.g., the risk coefficient exceeds the preset threshold), dF can be reduced, and steps 240 to 270 are repeated until the target object is judged grippable in some current trial grabbing state; alternatively, the current trial grab may be judged finished directly, and after the trial grabbing posture is changed, steps 210 to 270 are re-executed until the target object is judged grippable in some current trial grabbing state.
In another embodiment, in the current trial grabbing posture, the acting force applied by the robot to the target object may be adjusted according to the grippability evaluation result, and steps 240 to 270 repeated; if a preset requirement is reached (for example, a preset number of attempts or a preset force limit) and the target object still cannot be judged grippable, the trial grabbing posture may be changed, i.e., the controller sends an instruction to instruct the robot to adjust the trial grabbing posture, and method steps 210 to 270 are then re-executed.
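For illustration only, the retry logic described above is sketched below. The function names (plan_initial_trial_grab, trial_grab, evaluate_grippability, adjust_force, adjust_posture) and the attempt limits are hypothetical placeholders for the controller's actual interfaces, not identifiers from this application.

```python
MAX_FORCE_ATTEMPTS = 5     # placeholder limit per posture
MAX_POSTURE_ATTEMPTS = 3   # placeholder limit on posture changes

def grab_with_retries(controller, robot, target):
    posture, force = controller.plan_initial_trial_grab(target)
    for _ in range(MAX_POSTURE_ATTEMPTS):
        for _ in range(MAX_FORCE_ATTEMPTS):
            robot.trial_grab(posture, force)                            # steps 210/240: trial grab, collect signals
            grippable, evaluation = controller.evaluate_grippability()  # step 250: judge grippability
            if grippable:
                robot.grab(posture, force)                              # step 260: execute the grab
                return True
            force = controller.adjust_force(force, evaluation)          # e.g. reduce dF if risk is too high
        posture = controller.adjust_posture(posture, evaluation)        # change the trial grabbing posture
    return False
```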
It should be noted that, in addition to the method steps described in the above embodiments, after the target object grippability is judged by the method of the embodiments of the present application, any other method that exists now or is developed in the future may be used to perform the subsequent grabbing of the target object, and this also falls within the scope of protection of the present application.
According to the embodiments of the present application, a macroscopic grabbing position indication map is obtained based on the current grabbing image acquired while the robot trial-grabs the target object. The specific grabbing positions on the target object can be known from the grabbing position indication map, and the grippability of the target object is then judged by combining the grabbing position indication map with a preset tactile model and a current tactile signal. This improves the accuracy of the grippability judgment and thus the final success rate of subsequently grabbing the target object.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored in a computer-readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 4, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a target object grippability judgment device; this device embodiment corresponds to the method embodiment shown in fig. 2, and the device may be specifically applied to various electronic devices.
As shown in fig. 4, the target object grippability judgment device 200 of this embodiment includes:
the image acquisition module 210, configured to acquire a current grabbing image in a current trial grabbing state;
the indication generation module 220, configured to generate a grabbing position indication map based on the current grabbing image;
the model acquisition module 230, configured to acquire a preset tactile model;
the tactile acquisition module 240, configured to acquire a current tactile signal in the current trial grabbing state;
and the grabbing judgment module 250, configured to judge the grippability of the target object based on the grabbing position indication map, the preset tactile model, and the current tactile signal.
In one embodiment, the indication generation module 220 may include:
the image acquisition sub-module is used for acquiring a target object image before the robot executes the current trial grabbing operation;
the feature solving sub-module is used for taking the target object image as input of the first feature extraction model to obtain a target object feature;
the current solving sub-module is used for taking the current grabbing image as input of the second feature extraction model to obtain a current grabbing feature;
the feature splicing sub-module is used for splicing the target object feature and the current grabbing feature to obtain a feature splice map;
and the grabbing indication sub-module is used for taking the feature splice map as input of the grabbing recognition model to obtain a grabbing position indication map comprising grabbing contacts.
In one embodiment, the grabbing determination module 250 may include:
the feedback solving sub-module is used for combining the tactile model and the grabbing position indication map to obtain a force feedback expression of the grabbing position;
the result evaluation sub-module is used for combining the force feedback expression of the grabbing position and the current tactile signal to calculate a grippability evaluation result of the target object;
and the grabbing judgment sub-module is used for judging the grippability of the target object based on the grippability evaluation result.
Further, in one embodiment, the result evaluation sub-module may include:
the feedback solving unit is used for taking the force feedback expression of the grabbing position as input of the third feature extraction model to obtain a force feedback feature;
the feature solving unit is used for taking the current tactile signal as input of the fourth feature extraction model to obtain a tactile feature;
and the result evaluation unit is used for combining the force feedback feature and the tactile feature to obtain the grippability evaluation result of the target object.
Further, in one embodiment, the result evaluation unit may include:
the feature splitting subunit is used for splitting the force feedback feature to obtain a split force feedback feature;
the feature association subunit is used for splitting the tactile feature to obtain a split tactile feature, and taking the split tactile feature as input of the self-attention model to obtain an associated split tactile feature;
and the result evaluation subunit is used for taking the split force feedback feature and the associated split tactile feature together as input of the cross-attention model to obtain a grippability evaluation result of the target object comprising the confidence coefficient and the risk coefficient of grabbing the target object.
In one embodiment, the target object grippability judgment device 200 may further include:
the trial grab sending module, used for sending a current trial grabbing instruction to the robot so as to instruct the robot to contact the target object in a current trial grabbing state, wherein the current trial grabbing state comprises a trial grabbing posture of the grabbing actuator and an initial acting force.
With further reference to fig. 5, as an implementation of the method shown in fig. 3, the present application provides an embodiment of a target gripping device, where the embodiment of the device corresponds to the embodiment of the method shown in fig. 3, and the device may be applied to various electronic devices.
As shown in fig. 5, the target object grabbing device 300 of this embodiment includes the target object grippability judgment device 200 of the above embodiment; and
the instruction sending module 260, configured to send a grabbing instruction to instruct the robot to grab the target object based on the current trial grabbing state if the target object is judged to be grippable.
In one embodiment, the object gripping apparatus 300 may further include:
the state changing module, used for changing the current trial grabbing state if the target object is judged not grippable;
and the re-judgment module, used for re-performing the target object grippability judgment based on the changed current trial grabbing state until the target object is judged to be grippable.
In order to solve the technical problems, the embodiment of the application also provides a controller. Referring specifically to fig. 7, fig. 7 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62, and a network interface 63 that are communicatively connected to each other via a system bus. It should be noted that only the computer device 6 having the components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components are required to be implemented, and more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 61 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or an internal memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 6. Of course, the memory 61 may also comprise both an internal storage unit and an external storage device of the computer device 6. In this embodiment, the memory 61 is generally used to store the operating system and various application software installed on the computer device 6, such as the program code for the target object grippability judgment method and/or the target object grabbing method. Further, the memory 61 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 62 may in some embodiments be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute the program code stored in the memory 61 or to process data, for example, to execute the program code of the target object grippability judgment method and/or the target object grabbing method.
The network interface 63 may comprise a wireless network interface or a wired network interface, which network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.
The present application also provides another embodiment, namely, a computer-readable storage medium storing a target object grippability judgment and/or target object grabbing program, where the program is executable by at least one processor, so that the at least one processor performs the steps of the target object grippability judgment method and/or the target object grabbing method described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some embodiments of the present application, not all of them, and the preferred embodiments of the present application are shown in the drawings, which do not limit the scope of the patent claims. This application may be embodied in many different forms; rather, these embodiments are provided so that the disclosure of the present application will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalents may be substituted for some of their features. All equivalent structures made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the application.

Claims (10)

1. An object grippability determination method, characterized by comprising the following steps:
acquiring a current grabbing image in a current trial grabbing state;
generating a grabbing part indication map based on the current grabbing image;
acquiring a preset touch model;
acquiring a current touch signal in the current trial grabbing state;
and judging the grippability of the target object based on the grabbing part indication map, the preset touch model and the current touch signal.
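The five steps of claim 1 can be pictured as a single decision function. The sketch below is a minimal, hypothetical Python arrangement; the sensor and model interfaces (capture(), read(), evaluate()) and the 0.5 threshold are assumptions for illustration only and are not taken from the specification.

```python
def judge_grippability(image_sensor, touch_sensor, touch_model, indication_model):
    # step 1: acquire the current grabbing image in the current trial grabbing state
    current_image = image_sensor.capture()       # assumed sensor interface

    # step 2: generate the grabbing part indication map from the current grabbing image
    indication_map = indication_model(current_image)

    # steps 3-4: the touch model is preset; read the current touch signal
    # output by the force/touch sensor in the same trial grabbing state
    current_touch = touch_sensor.read()          # assumed sensor interface

    # step 5: fuse the indication map, touch model and touch signal into a decision
    # (claims 3-5 refine how this evaluation is computed)
    score = touch_model.evaluate(indication_map, current_touch)
    return score > 0.5                           # assumed decision threshold
```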
2. The object grippability determination method according to claim 1, wherein the generating a grabbing part indication map based on the current grabbing image comprises the following steps:
acquiring a target object image captured before the robot executes the current trial grabbing operation;
taking the target object image as the input of a first feature extraction model to obtain target object features;
taking the current grabbing image as the input of a second feature extraction model to obtain current grabbing features;
concatenating the target object features and the current grabbing features to obtain a concatenated feature map;
and taking the concatenated feature map as the input of a grabbing recognition model to obtain the grabbing part indication map comprising grabbing contact points.
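A hedged PyTorch sketch of the claim-2 pipeline: two convolutional feature extractors, channel-wise concatenation of the two feature maps, and a recognition head that produces a per-pixel grabbing part indication map. The layer sizes, the single-channel output and the sigmoid activation are assumptions chosen only to make the example run.

```python
import torch
import torch.nn as nn

class IndicationMapNet(nn.Module):
    def __init__(self):
        super().__init__()
        # first feature extraction model: target object image (before the trial grab)
        self.obj_encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        # second feature extraction model: current grabbing image
        self.grab_encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        # grabbing recognition model: maps the concatenated features to a
        # per-pixel indication of grabbing contact points
        self.head = nn.Sequential(nn.Conv2d(32, 1, 1), nn.Sigmoid())

    def forward(self, target_img, current_grab_img):
        obj_feat = self.obj_encoder(target_img)
        grab_feat = self.grab_encoder(current_grab_img)
        fused = torch.cat([obj_feat, grab_feat], dim=1)  # feature concatenation
        return self.head(fused)

# usage on dummy images
net = IndicationMapNet()
indication_map = net(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(indication_map.shape)  # torch.Size([1, 1, 64, 64])
```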
3. The object grippability determination method according to claim 1 or 2, wherein the judging the grippability of the target object based on the grabbing part indication map, the preset touch model and the current touch signal comprises the following steps:
combining the preset touch model and the grabbing part indication map to obtain a force feedback expression of the grabbing part;
combining the force feedback expression of the grabbing part and the current touch signal to calculate a grippability evaluation result of the target object;
and judging the grippability of the target object based on the grippability evaluation result.
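Claim 3 does not fix concrete formulas, but a toy numpy version helps show the data flow: a preset touch model turns the indication map into expected contact forces (the force feedback expression), and the evaluation compares those expectations with the measured touch signal. The linear stiffness model, the exponential score and the 0.8 threshold below are all assumptions.

```python
import numpy as np

def force_feedback_expression(indication_map, stiffness=2.0):
    # assumed touch model: expected contact force proportional to contact likelihood
    return stiffness * indication_map

def grippability_score(expected_force, current_touch_signal):
    # assumed evaluation: similarity between expected and measured force fields
    err = np.abs(expected_force - current_touch_signal).mean()
    return float(np.exp(-err))  # 1.0 = perfect match, approaches 0 as mismatch grows

indication_map = np.random.rand(8, 8)                               # from claim 2
touch_signal = 2.0 * indication_map + 0.05 * np.random.randn(8, 8)  # measured signal
score = grippability_score(force_feedback_expression(indication_map), touch_signal)
print("grippable" if score > 0.8 else "not grippable", score)
```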
4. The object grippability determination method according to claim 3, wherein the combining the force feedback expression of the grabbing part and the current touch signal to calculate the grippability evaluation result of the target object comprises the following steps:
taking the force feedback expression of the grabbing part as the input of a third feature extraction model to obtain a force feedback feature;
taking the current touch signal as the input of a fourth feature extraction model to obtain a touch feature;
and combining the force feedback feature and the touch feature to obtain the grippability evaluation result of the target object.
5. The object grippability determination method according to claim 4, wherein the combining the force feedback feature and the touch feature to obtain the grippability evaluation result of the target object comprises the following steps:
splitting the force feedback feature to obtain split force feedback features;
splitting the touch feature to obtain split touch features, and taking the split touch features as the input of a self-attention model to obtain associated split touch features;
and taking the split force feedback features and the associated split touch features together as the input of a cross-attention model to obtain the grippability evaluation result comprising a grabbing confidence and a grabbing risk of the target object.
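Claims 4 and 5 describe an attention-based fusion of the two feature streams. The sketch below is one plausible PyTorch reading: encode the force feedback expression and the touch signal (third and fourth feature extraction models), split each into a token sequence, run self-attention over the touch tokens, and fuse both streams with cross-attention into a grabbing confidence and a grabbing risk. Input sizes, token counts and head counts are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class GrippabilityFusion(nn.Module):
    def __init__(self, dim=32, tokens=8):
        super().__init__()
        self.tokens, self.dim = tokens, dim
        # third / fourth feature extraction models (assumed flat 64-d inputs)
        self.force_encoder = nn.Linear(64, tokens * dim)
        self.touch_encoder = nn.Linear(64, tokens * dim)
        self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.out = nn.Linear(dim, 2)  # [grabbing confidence, grabbing risk]

    def forward(self, force_expr, touch_signal):
        # split each encoded feature into a sequence of tokens
        f = self.force_encoder(force_expr).view(-1, self.tokens, self.dim)
        t = self.touch_encoder(touch_signal).view(-1, self.tokens, self.dim)
        t, _ = self.self_attn(t, t, t)        # associated split touch features
        fused, _ = self.cross_attn(f, t, t)   # force tokens attend to touch tokens
        return torch.sigmoid(self.out(fused.mean(dim=1)))

model = GrippabilityFusion()
conf, risk = model(torch.rand(1, 64), torch.rand(1, 64))[0]
print(float(conf), float(risk))
```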
6. The object grippability determination method according to claim 1 or 2, characterized in that, before acquiring the current grabbing image output by the image sensor in the current trial grabbing state, the method comprises the following step:
sending a current trial grabbing instruction to the robot to instruct the robot to contact the target object in the current trial grabbing state, wherein the current trial grabbing state comprises a trial grabbing pose of the grabbing actuator and an initial acting force.
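The trial grabbing instruction of claim 6 essentially bundles a trial grabbing pose and an initial acting force. A small illustrative message is sketched below; the field names, units and the robot.send() transport are assumptions, since a real system would use the robot vendor's own command interface.

```python
from dataclasses import dataclass

@dataclass
class TrialGrabCommand:
    # trial grabbing pose of the grabbing actuator: (x, y, z, rx, ry, rz), assumed units
    gripper_pose: tuple
    # initial acting force applied on first contact, assumed to be in newtons
    initial_force_n: float

def send_trial_grab(robot, pose, initial_force_n=2.0):
    # instruct the robot to contact the target object in the current trial grabbing state
    robot.send(TrialGrabCommand(gripper_pose=pose, initial_force_n=initial_force_n))  # assumed interface
```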
7. An object grabbing method, characterized in that the method comprises the object grippability determination method according to any one of claims 1 to 6, and further comprises the following step:
if the target object is judged to be grippable, sending a grabbing instruction to instruct the robot to grab the target object based on the current trial grabbing state.
8. An object grippability determination device, characterized by comprising:
an image acquisition module, configured to acquire a current grabbing image in a current trial grabbing state;
an indication generation module, configured to generate a grabbing part indication map based on the current grabbing image;
a model acquisition module, configured to acquire a preset touch model;
a touch acquisition module, configured to acquire a current touch signal in the current trial grabbing state;
and a grippability judgment module, configured to judge the grippability of the target object based on the grabbing part indication map, the preset touch model and the current touch signal.
9. An object grabbing device, characterized in that the device comprises the object grippability determination device according to claim 8; and
an instruction sending module, configured to send a grabbing instruction to instruct the robot to grab the target object based on the current trial grabbing state if the target object is judged to be grippable.
10. An object grabbing system, characterized in that the system comprises: an image sensor, a force/touch sensor, a robot and a controller, wherein the image sensor and the force/touch sensor each have a preset calibration relationship with the robot;
the controller is in communication connection with the image sensor, the force/touch sensor and the robot respectively;
and the controller is configured to implement the object grippability determination method according to any one of claims 1 to 6 and/or the object grabbing method according to claim 7.
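For the system of claim 10, one illustrative way to hold the pieces together is a small controller object that stores the communication handles and the preset calibration relations as homogeneous transforms. Everything here (names, the 4x4 transform representation, the judge_fn callback) is an assumption, not the patent's implementation.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class ObjectGrabbingSystem:
    image_sensor: object        # communication handle to the image sensor
    touch_sensor: object        # communication handle to the force/touch sensor
    robot: object               # communication handle to the robot
    # preset calibration relations of each sensor with the robot (homogeneous transforms)
    cam_to_robot: np.ndarray = field(default_factory=lambda: np.eye(4))
    touch_to_robot: np.ndarray = field(default_factory=lambda: np.eye(4))

    def run_once(self, judge_fn):
        # controller role: run the grippability judgment, then command the grab
        if judge_fn(self.image_sensor, self.touch_sensor):
            self.robot.grab()   # assumed robot interface
```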
CN202410371148.XA 2024-03-29 2024-03-29 Object grabbing performance judgment method, object grabbing device and object grabbing system Pending CN117961916A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410371148.XA CN117961916A (en) 2024-03-29 2024-03-29 Object grabbing performance judgment method, object grabbing device and object grabbing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410371148.XA CN117961916A (en) 2024-03-29 2024-03-29 Object grabbing performance judgment method, object grabbing device and object grabbing system

Publications (1)

Publication Number Publication Date
CN117961916A true CN117961916A (en) 2024-05-03

Family

ID=90855853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410371148.XA Pending CN117961916A (en) 2024-03-29 2024-03-29 Object grabbing performance judgment method, object grabbing device and object grabbing system

Country Status (1)

Country Link
CN (1) CN117961916A (en)

Similar Documents

Publication Publication Date Title
CN107571260B (en) Method and device for controlling robot to grab object
CN110991319A (en) Hand key point detection method, gesture recognition method and related device
JP7458741B2 (en) Robot control device and its control method and program
US10444852B2 (en) Method and apparatus for monitoring in a monitoring space
Lee et al. Interaction force estimation using camera and electrical current without force/torque sensor
CN113119104B (en) Mechanical arm control method, mechanical arm control device, computing equipment and system
CN115847422A (en) Gesture recognition method, device and system for teleoperation
KR20220063847A (en) Electronic device for identifying human gait pattern and method there of
CN112904994B (en) Gesture recognition method, gesture recognition device, computer equipment and storage medium
Maeda et al. View-based teaching/playback for robotic manipulation
US20220184808A1 (en) Motion trajectory planning method for robotic manipulator, robotic manipulator and computer-readable storage medium
US20220355475A1 (en) Information processing device, control method, and storage medium
CN111590560A (en) Method for remotely operating manipulator through camera
KR102260882B1 (en) Method and apparatus for measuring interaction force based on sequential images
CN117961916A (en) Object grabbing performance judgment method, object grabbing device and object grabbing system
KR20230093191A (en) Method for recognizing joint by error type, server
CN113400353B (en) Digital twinning-based multi-finger smart hand state monitoring method, device and equipment
Shah et al. Gesture recognition technique: a review
KR20230100101A (en) Robot control system and method for robot setting and robot control using the same
CN113894779A (en) Multi-mode data processing method applied to robot interaction
CN110196630B (en) Instruction processing method, model training method, instruction processing device, model training device, computer equipment and storage medium
CN112917470A (en) Teaching method, device and system of manipulator, storage medium and equipment
CN116383667B (en) Model training and motion instruction prediction method, device and system
CN116442218A (en) Teleoperation tracking method, device, equipment, system and storage medium
JP2013109444A (en) Automatic control device

Legal Events

Date Code Title Description
PB01 Publication