CN116402956A - Intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium - Google Patents

Intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium

Info

Publication number
CN116402956A
CN116402956A
Authority
CN
China
Prior art keywords
component
interactable
point cloud
cloud data
dimensional object
Prior art date
Legal status
Granted
Application number
CN202310649977.5A
Other languages
Chinese (zh)
Other versions
CN116402956B (en)
Inventor
黄惠
闫子豪
胡瑞珍
张皓
Current Assignee
Shenzhen University
Original Assignee
Shenzhen University
Priority date
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN202310649977.5A priority Critical patent/CN116402956B/en
Publication of CN116402956A publication Critical patent/CN116402956A/en
Application granted granted Critical
Publication of CN116402956B publication Critical patent/CN116402956B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application relates to an intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium. The method comprises the following steps: performing interactive prediction on initial point cloud data of a target three-dimensional object to obtain an interactive prediction result, wherein the target three-dimensional object includes an interactable component and the interactive prediction result is used for characterizing interaction actions that can produce interaction with the target three-dimensional object; controlling an interaction tool to execute the interaction action so as to interact with the interactable component, the target three-dimensional object presenting its internal structure after the interaction with the interactable component; fitting the motion trajectory of the interaction tool and determining the component motion parameters of the interactable component; in the case that the interaction action is successfully executed, determining current point cloud data, which is used for characterizing the surface points of the target three-dimensional object when it presents the internal structure after interacting with the interactable component; and performing three-dimensional reconstruction according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result. The method can improve the accuracy of reconstruction.

Description

Intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium
Technical Field
The present disclosure relates to the field of computer graphics, and in particular, to an intelligent driven three-dimensional object interactive reconstruction method, apparatus, device, and medium.
Background
With the development of computer graphics technology, three-dimensional reconstruction technology has emerged, and by performing three-dimensional reconstruction on a three-dimensional object, a three-dimensional model suitable for computer representation and processing can be obtained. The three-dimensional model provides more spatial information and is more visualized than the two-dimensional model.
In traditional three-dimensional reconstruction technology, devices such as cameras and laser scanners can only acquire the surface of a three-dimensional object, so geometry-based or deep-learning-based methods can capture only the external surface of the object. The loss of the internal structure of the three-dimensional object therefore cannot be avoided, and the generated three-dimensional model is inaccurate.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an intelligent driven three-dimensional object interactive reconstruction method, apparatus, computer device, computer readable storage medium, and computer program product that can improve accuracy.
In a first aspect, the present application provides an intelligently driven three-dimensional object interactive reconstruction method. The method comprises the following steps:
Determining initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component;
performing interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactive prediction result is used for representing interactive actions which can generate interaction with the target three-dimensional object;
controlling the interactive tool to execute interactive actions aiming at the target three-dimensional object so as to interact with the interactable component; wherein, after interacting with the interactable component, the target three-dimensional object presents an internal structure;
acquiring a motion trail of the interactive tool when executing the interactive action;
fitting the motion trail and determining the component motion parameters of the interactable component;
under the condition that the interactive action is successfully executed, determining current point cloud data; the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents an internal structure after being interacted with the interactable component;
and carrying out three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
In a second aspect, the present application further provides an intelligently driven three-dimensional object interactive reconstruction device. The device comprises:
the first determining module is used for determining initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component;
The prediction module is used for carrying out interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactive prediction result is used for representing interactive actions which can generate interaction with the target three-dimensional object;
the interaction module is used for controlling the interaction tool to execute interaction actions aiming at the target three-dimensional object so as to interact with the interactable component; wherein, after interacting with the interactable component, the target three-dimensional object presents an internal structure;
the acquisition module is used for acquiring a motion trail when the interactive tool executes the interactive action;
the fitting module is used for fitting the motion trail and determining the component motion parameters of the interactable component;
the second determining module is used for determining current point cloud data under the condition that the interactive action is successfully executed; the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents an internal structure after being interacted with the interactable component;
and the reconstruction module is used for carrying out three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium. A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in the method described above.
In a fifth aspect, the present application also provides a computer program product. Computer program product comprising a computer program which, when executed by a processor, implements the steps of the method described above.
According to the above intelligent driven three-dimensional object interactive reconstruction method, apparatus, computer device, storage medium and computer program product, interactive prediction is performed on the initial point cloud data of the target three-dimensional object to obtain an interactive prediction result. The interactive prediction result is used for characterizing interaction actions that can produce interaction with the target three-dimensional object. The target three-dimensional object includes an interactable component, and the interaction tool is controlled to execute the predicted interaction action with respect to the target three-dimensional object so as to interact with the interactable component. By fitting the motion trajectory of the interaction tool when executing the interaction action, the component motion parameters of the interactable component can be accurately determined. In the case that the interaction is successfully executed, the current point cloud data when the target three-dimensional object presents its internal structure after interacting with the interactable component is determined. The three-dimensional object reconstruction result obtained by performing three-dimensional reconstruction according to the current point cloud data and the component motion parameters fuses the internal structure corresponding to the interactable component and the component motion parameters of the interactable component, so that the reconstruction result is more complete and the accuracy of three-dimensional reconstruction is improved.
Drawings
FIG. 1 is a diagram of an application environment for an intelligent driven three-dimensional object interactive reconstruction method in one embodiment;
FIG. 2 is a flow chart of an interactive reconstruction method of a three-dimensional object driven by intelligence in one embodiment;
FIG. 3 is a diagram of an application environment for an intelligent driven three-dimensional object interactive reconstruction method in another embodiment;
FIG. 4 is a schematic illustration of force resolution for a translational type of embodiment;
FIG. 5 is a schematic illustration of force resolution for one embodiment under a type of rotation;
FIG. 6 is a schematic diagram of an interactive predictive model to be trained in one embodiment;
FIG. 7 is a schematic diagram of a component segmentation model in one embodiment;
FIG. 8 is a schematic diagram of a completion model in one embodiment;
FIG. 9 is a schematic diagram of three-dimensional reconstruction of full point cloud data in one embodiment;
FIG. 10 is a schematic diagram of a fitted straight line in one embodiment;
FIG. 11 is a schematic diagram of a fitted circle in one embodiment;
FIG. 12 is a simplified flow diagram of an exemplary method for interactively reconstructing a three-dimensional object driven by intelligence;
FIG. 13 is a schematic diagram of an initial geometric model, an interactable probability map, a part segmentation result, and a three-dimensional object reconstruction result for a plurality of three-dimensional objects in one embodiment;
FIG. 14 is a schematic diagram of point cloud data for various interactable components in one embodiment;
FIG. 15 is a schematic diagram showing the comparison of the effects of a first intelligently driven three-dimensional object interactive reconstruction method and a second intelligently driven three-dimensional object interactive reconstruction method in one embodiment;
FIG. 16 is a block diagram of a three-dimensional object interactive reconstruction device, which is intelligently driven in one embodiment;
fig. 17 is an internal structural view of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The intelligent driven three-dimensional object interactive reconstruction method provided by the embodiments of the application can be applied to the application environment shown in FIG. 1. The intelligent robot 102 may determine initial point cloud data of the target three-dimensional object 104 to be reconstructed; the target three-dimensional object 104 includes an interactable component; the intelligent robot 102 may interact with the target three-dimensional object 104 to implement the intelligent driven three-dimensional object interactive reconstruction method provided in the embodiments of the present application. The intelligent robot 102 may be, but is not limited to, a fully autonomous robot, a semi-autonomous robot, or the like. The intelligent robot 102 may be implemented by a computer device. The computer device may include at least one of a terminal or a server. The terminal can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, Internet of Things devices and portable wearable devices; the Internet of Things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle-mounted devices and the like. The portable wearable device may be a smart watch, a smart bracelet, a headset, or the like. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In some embodiments, the intelligent robot is a mobile gripping robot with a built-in depth sensor. The RGB-D (RGB-Depth) sensor is used to capture depth maps of the target three-dimensional object.
In one embodiment, as shown in fig. 2, there is provided an intelligent driven three-dimensional object interactive reconstruction method, which is applied to the intelligent robot in fig. 1 for illustration, and includes the following steps:
s202, determining initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component.
Wherein the initial point cloud data is used to characterize the set of surface points when the target three-dimensional object presents its external structure. The target three-dimensional object comprises an interactable component and a non-interactable component. When an interaction is performed with the interactable component, the position of the interactable component relative to the target three-dimensional object changes, i.e. the relative position of the interactable component and the target three-dimensional object changes. It will be appreciated that as the relative position of the interactable component changes, the target three-dimensional object presents a new structure. The non-interactable component cannot be interacted with: whatever action is performed on it, its position relative to the target three-dimensional object does not change.
For example, the intelligent robot may acquire initial point cloud data of the target three-dimensional object in an initial state. The initial state refers to a state in which no interaction is generated by each interactable component of the target three-dimensional object. It will be appreciated that the initial state may specifically be the state when the target three-dimensional object exhibits a complete external structure. The initial point cloud data is used to characterize a set of exterior surface points of the target three-dimensional object. The outer surface points refer to surface points on the outer structure.
In some embodiments, the initial state may be that each of the interactable components of the target three-dimensional object is in a closed state. For example, for a storage compartment having a drawer and a door, the initial state of the storage compartment refers to the state in which both the door and the drawer are closed.
In some embodiments, for a target three-dimensional object, the intelligent robot may move around it and capture depth maps from a preset number of viewing angles with a built-in depth camera. Further, the intelligent robot can fuse the depth maps with the corresponding camera parameters to obtain point cloud data, and the normal direction of each surface point is calculated from the camera extrinsic parameters. It is understood that the point cloud data may be the initial point cloud data or the current point cloud data, etc.
In some embodiments, the predetermined plurality of viewing angles may be predetermined four viewing angles. The four views may include a front view, a right view, a rear view, and a left view.
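For illustration only (this code is not part of the original disclosure), the depth-map fusion described above can be sketched as follows; the use of Open3D, the camera-parameter handling and the four-view setup are assumptions.

```python
# Minimal sketch: back-project depth maps captured from the preset viewpoints and
# merge them into one point cloud in world coordinates, then estimate normals.
import numpy as np
import open3d as o3d

def depth_maps_to_point_cloud(depth_maps, intrinsics, cam_to_world_poses):
    """depth_maps: list of HxW float arrays (meters); intrinsics: PinholeCameraIntrinsic objects."""
    merged = o3d.geometry.PointCloud()
    for depth, K, T in zip(depth_maps, intrinsics, cam_to_world_poses):
        depth_img = o3d.geometry.Image(depth.astype(np.float32))
        pcd = o3d.geometry.PointCloud.create_from_depth_image(
            depth_img, K, extrinsic=np.linalg.inv(T), depth_scale=1.0)
        merged += pcd                                   # accumulate points in world coordinates
    # Per-point normals; their orientation can then be fixed using the known camera poses.
    merged.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
    return merged
```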
In some embodiments, the interactable component may be a moveable component in particular. For example, drawers and cabinet doors of the storage compartment are interactive components. The non-interactable component may be a non-moveable component. For example, the case of the storage case is a non-interactive component.
S204, performing interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactable prediction results are used to characterize interactions that may be generated with the target three-dimensional object.
Wherein the interaction is for changing a position of the interactable component relative to the target three-dimensional object. The elements constituting the interaction may include at least one of an interaction direction, an interaction position, and the like. The interaction direction refers to the direction of the interaction. The interactive position refers to a position to which an interactive action is applied.
For example, the intelligent robot may perform feature extraction on the initial point cloud data to obtain initial point cloud features. The initial point cloud features are used to characterize the interactability of the surface points on the target three-dimensional object. The intelligent robot can perform interactive prediction on the initial point cloud features to obtain an interactive prediction result.
In some embodiments, the interactable prediction result may comprise an interactable probability map. The interactable probability map is used to characterize the probability of each interaction being successful in performing the respective interaction upon each surface point of the target three-dimensional object.
In some embodiments, the intelligent robot may encode the initial point cloud data to obtain initial point cloud features. The intelligent robot can decode the initial point cloud characteristics to obtain an interactive prediction result.
In some embodiments, the interactive prediction result is an output of an interactive prediction model. The interactive prediction model includes an interactive encoder and an interactive decoder. The initial point cloud feature is the output of the interactive encoder. The interactive prediction result is the output of the interactive decoder. The interactive prediction model may enable prediction of the motion of an articulated three-dimensional object from pixel-level (point-level) input.
S206, controlling the interaction tool to execute interaction action aiming at the target three-dimensional object so as to interact with the interactable component; wherein the target three-dimensional object presents an internal structure upon interaction with the interactable component.
The interactive tool is a component part of the intelligent robot. An interactive tool refers to a tool used in performing an interactive action.
For example, the intelligent robot may determine the interaction action to be performed from the interactive prediction result. The intelligent robot may control the interaction tool to perform the interaction action to be performed with respect to the target three-dimensional object so as to interact with the interactable component. It will be appreciated that the interaction action to be performed does not necessarily interact with the interactable component. For example, the elements of the interaction action may include at least one of a direction or a position, etc. When the interaction position of the interaction action to be executed is not on the interactable component, or the interaction direction is not consistent with the movement direction of the interactable component, executing the interaction action may fail to interact with the interactable component.
In some embodiments, the target three-dimensional object may include a plurality of interactable components. The intelligent robot may perform multiple rounds of interaction on the target three-dimensional object, where the number of interaction rounds matches the number of interactable components. In each round of interaction, the intelligent robot can determine the interaction action of the round from the interactive prediction result. The interaction action of each round refers to the interaction action to be executed that is determined during that round of interaction. The intelligent robot may control the interaction tool to perform the interaction action of the current round with respect to the target three-dimensional object so as to interact with the interactable component. In the case that the interaction action of the round is successfully executed, the round of interaction is ended, and the intelligent robot can update the interactive prediction result to obtain the updated interactive prediction result of the round. The next round of interaction action is determined from the updated interactive prediction result of the current round.
In some embodiments, the number of interactive rounds corresponds to the number of interactable components.
In some embodiments, interaction with the interactable component is achieved by performing the interaction action, representing successful execution of the interaction action.
In some embodiments, the interaction actions performed before a successful interaction with an interactable component all belong to the same round of interaction. Each round of interaction may include at least one interaction action. The interaction actions of a round can comprise a plurality of interaction actions executed successively, and the last interaction action in the round is the action that successfully interacts with the interactable component. During each round of interaction, in the case that the execution of the previous interaction action fails, the intelligent robot can determine the current interaction action of the round from the interactive prediction result. In the case that the current interaction action of the round is successfully executed, the round of interaction is ended.
In some embodiments, the intelligent robot may determine candidate interactions from the interactable predictions. The interaction location of the candidate interaction is outside a preset range of interaction locations for which failed interactions were performed. The intelligent robot may determine an interaction to be performed from the candidate interactions.
In some embodiments, the interactable prediction result may comprise an interactable probability map. The interaction action is performed for the purpose of interacting with the interactable component. From the interactable probability map, the intelligent robot can select the interaction action with the highest probability as the interaction action to be executed. In the case that the interaction action to be performed fails to interact with any interactable component when it is executed, the intelligent robot may select another interaction action to be performed from the interactable probability map. In order to avoid the interaction positions of high-probability interaction actions being concentrated in a small area, the intelligent robot can mask the interaction actions at positions within a preset radius of the interaction position of the failed interaction action, and select the interaction action with the highest probability from the candidate interaction actions remaining after masking as another interaction action to be executed. The preset radius may be, but is not limited to, 0.05.
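An illustrative sketch of this selection-with-masking step follows (not part of the original disclosure); the per-point array layout of the interactable probability map is an assumption.

```python
# Pick the next interaction action: mask all positions within the preset radius of
# previously failed interaction positions, then take the highest remaining probability.
import numpy as np

def select_next_interaction(points, directions, scores, failed_positions, radius=0.05):
    """points: (N, 3); directions: (N, 3); scores: (N,); failed_positions: list of (3,) arrays."""
    valid = np.ones(len(points), dtype=bool)
    for p_fail in failed_positions:
        valid &= np.linalg.norm(points - p_fail, axis=1) > radius   # mask a ball of the preset radius
    masked = np.where(valid, scores, -np.inf)
    best = int(np.argmax(masked))                                   # highest remaining probability
    return points[best], directions[best], float(scores[best])
```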
In some embodiments, an interactable probability map is used to characterize the probability of successful performance of each interaction action at each external surface point on the target three-dimensional object. External surface points may characterize the interaction location. Included in the interactable probability map are a plurality of interactions predicted for each external surface point, and a probability of success of each interaction. The interaction is an action at an external surface point on the target three-dimensional object.
S208, acquiring a motion trail of the interactive tool when the interactive tool executes the interactive action.
For example, the intelligent robot may record the motion trajectory of the interactive tool during execution of the interactive action. Wherein the motion profile may be, but is not limited to, a movement profile.
In some embodiments, the intelligent robot may send an interaction signal to instruct the interaction tool to perform an interaction action with respect to the target three-dimensional object. During interaction of the interactive tool with the interactable component, a plurality of track positions are sampled along a motion track of the interactive tool. The plurality of track positions is used to characterize the motion track.
And S210, fitting the motion trail and determining the component motion parameters of the interactable component.
Wherein the component motion parameters are used to characterize the motion of the interactable component.
For example, the intelligent robot may sample a preset number of track positions in the motion track. Fitting the track positions of the preset number, and determining the component motion parameters of the interactable component.
In some embodiments, the intelligent robot may fit a preset number of track positions by a least square method to obtain a fitting result. And determining the component motion parameters of the interactable component according to the fitting result.
In some embodiments, the intelligent robot can perform circle or straight line fitting on a plurality of track positions through a least square method to obtain a fitting result.
In some embodiments, the geometric characteristics in the fit result can reflect the motion of the interactable component. The intelligent robot may determine the component motion parameters of the interactable component based upon the geometric characteristics in the fit result.
In some embodiments, the component motion parameters may include at least one of an axis of motion, a direction of motion, a type of motion, a range of motion, or the like.
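The least-squares fitting of the trajectory described above can be sketched as follows (not part of the original disclosure); the rule of comparing residuals to choose between the translation and rotation hypotheses is an assumption for illustration.

```python
# Fit the sampled tool positions with a straight line and with a circle, keep the
# hypothesis with the smaller residual, and read off the component motion parameters.
import numpy as np

def fit_motion(track):
    """track: (N, 3) positions sampled along the interaction tool's motion trajectory."""
    c = track.mean(axis=0)
    x = track - c
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    # Straight-line hypothesis (translation): principal direction of the centered samples.
    d = vt[0]
    line_res = np.linalg.norm(x - (x @ d)[:, None] * d, axis=1).mean()
    # Circle hypothesis (rotation): fit a plane, then a circle in that plane (Kasa fit).
    n, u, v = vt[2], vt[0], vt[1]
    p2 = np.stack([x @ u, x @ v], axis=1)
    A = np.hstack([2.0 * p2, np.ones((len(p2), 1))])
    (cx, cy, k), *_ = np.linalg.lstsq(A, (p2 ** 2).sum(axis=1), rcond=None)
    r = np.sqrt(k + cx ** 2 + cy ** 2)
    circ_res = np.abs(np.linalg.norm(p2 - np.array([cx, cy]), axis=1) - r).mean()
    if line_res <= circ_res:                          # smaller residual wins
        travel = float((track[-1] - track[0]) @ d)
        return {"type": "translation", "axis": d * np.sign(travel or 1.0), "range": abs(travel)}
    a0, a1 = p2[0] - np.array([cx, cy]), p2[-1] - np.array([cx, cy])
    angle = float(np.arctan2(a0[0] * a1[1] - a0[1] * a1[0], a0 @ a1))
    return {"type": "rotation", "axis": n, "pivot": c + cx * u + cy * v, "range": angle}
```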
S212, under the condition that the interactive action is successfully executed, determining current point cloud data; and the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents the internal structure after being interacted with the interactable component.
Under the condition that the interactive action is successfully executed, the position of the interactive component relative to the target three-dimensional object is changed, and the target three-dimensional object can present a new structure.
For example, if the relative position of the interactable component changes while the interaction action is performed, the interaction action is considered successfully executed once the relative position of the interactable component can no longer be changed. The intelligent robot can acquire point cloud data of the target three-dimensional object presenting the new structure to obtain current point cloud data. The point cloud data of the interactable component included in the current point cloud data is more comprehensive than that in the initial point cloud data.
And S214, carrying out three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
The three-dimensional object reconstruction result is used for expressing a target three-dimensional object in the objective world in a computer.
For example, the intelligent robot can perform three-dimensional reconstruction according to the current point cloud data to obtain a preliminary reconstruction result. And mapping the component motion parameters of the interactable component to the preliminary reconstruction result to obtain a three-dimensional reconstruction result of the interactable component. And carrying out three-dimensional reconstruction on the non-interactable component to obtain a three-dimensional reconstruction result of the non-interactable component. And carrying out fusion processing on the three-dimensional reconstruction result of the interactable part and the three-dimensional reconstruction result of the non-interactable part to obtain a three-dimensional object reconstruction result.
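A hedged sketch of how the final result can be assembled is given below (not part of the original disclosure); reconstruct_surface is a stand-in for any point-to-mesh method (for example Poisson reconstruction), and the dictionary layout of the result is an assumption.

```python
# Reconstruct each interactable part and the non-interactable remainder separately,
# attach the fitted motion parameters to the interactable parts, then merge.
import numpy as np

def assemble_reconstruction(current_points, part_masks, motion_params, reconstruct_surface):
    """part_masks: one boolean mask per interactable part over current_points."""
    static_mask = np.ones(len(current_points), dtype=bool)
    parts = []
    for mask, motion in zip(part_masks, motion_params):
        parts.append({
            "mesh": reconstruct_surface(current_points[mask]),   # interactable part geometry
            "motion": motion,                                    # axis / type / range from the fit
        })
        static_mask &= ~mask
    base = reconstruct_surface(current_points[static_mask])      # non-interactable part geometry
    return {"static": base, "parts": parts}
```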
In some embodiments, as shown in FIG. 3, another application environment is provided. The target three-dimensional object in the figure is a cabinet with three drawers. The three drawers are interactable components and the cabinet body is a non-interactable component. The intelligent robot may perform an interaction action with respect to the target three-dimensional object to interact with an interactable component so that the relative position of the interactable component changes. In the figure, the uppermost drawer is initially in a closed state and is gradually opened during the interaction; when the uppermost drawer is completely opened, the round of interaction is ended. When the uppermost drawer is opened, the cabinet presents the internal structure of the uppermost drawer and at least part of the internal structure of the cabinet body. The intelligent robot can acquire current point cloud data for the cabinet with the uppermost drawer completely opened. The intelligent robot can three-dimensionally reconstruct the cabinet to obtain the three-dimensional object reconstruction result of the cabinet. The three-dimensional object reconstruction result comprises the three-dimensional reconstruction results of the interactable components, namely the three-dimensional reconstruction results of the uppermost drawer, the middle drawer and the lowermost drawer.
It can be appreciated that the intelligent driven three-dimensional object interactive reconstruction method provided in this embodiment is an active three-dimensional reconstruction method. In the three-dimensional reconstruction method provided by the embodiment, the point cloud data of the external and internal structures of the target three-dimensional object are acquired through the interactive perception of the intelligent robot. Unlike the three-dimensional reconstruction task focused on optimizing the camera view angle to better shoot the object, the main feature of the intelligent-driven three-dimensional object interactive reconstruction method provided by the embodiment is to analyze the interactivity of each component of the target three-dimensional object, and operate and interact with the components by using the intelligent robot, so that the shielded internal structure becomes visible. And through interactive perception, the understanding of the motion condition of the components in the target three-dimensional object is also obtained on the basis of complete geometric reconstruction. The three-dimensional reconstruction is driven by the interactive perception of the intelligent robot, so that the accuracy of the three-dimensional reconstruction is ensured.
In the intelligent-driven three-dimensional object interactive reconstruction method, the initial point cloud data of the target three-dimensional object is subjected to interactive prediction, and an interactive prediction result is obtained. The interaction prediction results are used for representing interaction actions which can generate interaction with the target three-dimensional object. The target three-dimensional object includes an interactable component for performing a predicted interaction action with respect to the target three-dimensional object by controlling the interaction tool to interact with the interactable component. By fitting the motion trail of the interactive tool when executing the interactive action, the component motion parameters of the interactable component can be accurately determined. And under the condition that the interaction is successfully executed, determining the current point cloud data when the target three-dimensional object presents the internal structure after the interaction with the interactable component. And the internal structure corresponding to the interactable component and the component motion parameters of the interactable component are fused in the three-dimensional object reconstruction result obtained by performing three-dimensional reconstruction according to the current point cloud data and the component motion parameters, so that the three-dimensional object reconstruction method is more complete, and the accuracy of three-dimensional reconstruction is improved.
In some embodiments, the interactive prediction result is an output derived from the initial point cloud data as an input to the interactive prediction model; the method further comprises a training step of the interactive prediction model; the training step of the interactive prediction model comprises the following steps: determining a sample interactable component of a sample three-dimensional object; a plurality of corresponding sample interaction actions are preset for the sample interactable component; the elements of each sample interaction include a force acting on the sample interactable member; respectively decomposing acting forces corresponding to the interaction actions of a plurality of samples according to physical characteristics corresponding to the movement types of the sample interactable parts to obtain dynamic data and resistance data of each sample interaction action; determining the probability of successfully executing the sample interaction according to the difference between the dynamic data and the resistance data of the sample interaction; and taking a plurality of sample interaction actions corresponding to the sample interactable component and the probability of each sample interaction action as training data of the interaction prediction model to be trained so as to obtain the trained interaction prediction model.
Wherein the elements of each sample interaction comprise forces acting on the respective surface points of the sample interaction component.
Illustratively, the sample three-dimensional object has a sample part tag. The sample part labels are used to indicate sample interactable parts of a sample three-dimensional object. The computer device may randomly sample a plurality of sample interaction directions for each external surface point on the sample interactable component, resulting in interaction actions corresponding to the plurality of sample interaction directions, respectively. The elements of the sample interaction include a sample interaction direction. The direction of the force corresponding to the sample interaction is the sample interaction direction. The computer device may decompose the acting force corresponding to the sample interaction action according to the movement axis of the sample interactable component and the sample interaction direction corresponding to the sample interaction action, to obtain component force data.
The computer device may determine the dynamic data and the resistive data of the sample interactable component based upon the physical characteristics and the force component data corresponding to the type of movement of the sample interactable component. It will be appreciated that the interactivity of most articulated objects follows a certain rule. For example, the door is easier to open from the side remote from the hinge. The physical properties of these objects are used to support quantification of interactions with the interactable component through different interactions.
The computer device may normalize the difference between the kinetic data and the resistance data of the sample interaction to obtain a probability of successfully executing the sample interaction.
In some embodiments, the type of motion may include at least one of a translational type or a rotational type, etc. For the translation type, the power data includes a power magnitude and the resistance data includes a resistance magnitude. For the type of rotation, the power data includes a power moment and the resistance data includes a resistance moment.
In some embodiments, the magnitude of the force is the same for each sample interaction action applied at the same surface point. The external surface point is in fact the point of action of the force corresponding to the sample interaction action.
In some embodiments, in the case that the movement type of the sample interactable component is a translation type, the computer device may decompose the acting force corresponding to the sample interaction in the direction parallel to the movement axis and the direction perpendicular to the movement axis, to obtain component force data. The force component data includes a force in a direction parallel to the axis of motion and a force in a direction perpendicular to the axis of motion. The computer device may determine a coefficient of resistance corresponding to the sample interactable component. The computer device may weight the drag coefficient and the force in the direction perpendicular to the axis of motion to obtain the drag data. The computer device may take as the power data a force in a direction parallel to the axis of motion.
In one embodiment, in the case that the movement type of the sample interactable component is the rotation type, the computer device may decompose the acting force corresponding to the sample interaction into the direction parallel to the normal and the direction perpendicular to the normal to obtain component force data. The component force data includes a force in the direction parallel to the normal and a force in the direction perpendicular to the normal. The normal direction is perpendicular to both the power arm and the motion axis. The computer device may determine the power arm and the resistance arm, weight the power arm with the force parallel to the normal to obtain the power moment, and weight the resistance arm with the force perpendicular to the normal to obtain the resistance moment.
In some embodiments, as shown in FIG. 4, a schematic diagram of force resolution for the translation type is provided. For an interactable component of the translation type, the force corresponding to the interaction is resolved along two directions: the direction parallel to the motion axis, in which the power F_∥axis drives the interactable component to translate; and the direction perpendicular to the motion axis, in which a resistance force μF_⊥axis is generated during the movement of the interactable component, where μ is the drag coefficient. By normalizing the difference between the power and the resistance, the corresponding probability of success is obtained, i.e. Score_t = norm(F_∥axis - μF_⊥axis), where norm(·) denotes normalization and Score_t is the probability of success.
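For illustration only, the translation-type score above can be transcribed directly as follows; motion_axis is assumed to be a unit vector along the component's opening direction.

```python
# Score_t before normalization: power along the motion axis minus mu times the
# perpendicular force component that produces the resistance.
import numpy as np

def translation_score(force, motion_axis, mu):
    f_par = float(force @ motion_axis)                             # drives the translation
    f_perp = float(np.linalg.norm(force - f_par * motion_axis))    # produces the resistance mu * f_perp
    return f_par - mu * f_perp
```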
In some embodiments, as shown in FIG. 5, a schematic diagram of force resolution for the rotation type is provided. For an interactable component of the rotation type, the acting force is decomposed along the normal direction to obtain the power F_∥normal, which yields the power moment d × F_∥normal, where d is the power arm, i.e. the minimum distance from the force application position to the motion axis. By normalizing the difference between the power moment and the resistance moment, the corresponding probability of success is obtained, i.e. Score_r = norm(d × F_∥normal - μF_⊥normal), where norm(·) denotes normalization and Score_r is the probability of success. It will be appreciated that the distributions of the differences between the power data and the resistance data may differ between the translation and rotation types, so the difference between the power data and the resistance data is normalized per interactable component to generate training data. Specifically, for each point of the interactable component, 10 directions are randomly sampled in three-dimensional space, the 10 corresponding differences are then computed, and the differences are normalized to the range (0, 1) to obtain training data.
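A sketch of the rotation-type score and of generating the per-point training labels follows (for illustration only); min-max normalization into (0, 1) is an assumption, since the text above only states that the differences are normalized.

```python
# Score_r before normalization, and per-point labels from 10 random unit directions.
import numpy as np

def rotation_score(force, normal, power_arm, mu):
    """Power moment d x F_parallel minus the resistance term."""
    f_par = float(force @ normal)
    f_perp = float(np.linalg.norm(force - f_par * normal))
    return power_arm * f_par - mu * f_perp

def point_labels(score_fn, num_dirs=10, seed=0):
    """Sample random unit directions at one surface point and normalize their scores."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(num_dirs, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    raw = np.array([score_fn(d) for d in dirs])
    labels = (raw - raw.min()) / (raw.max() - raw.min() + 1e-8)    # map differences into (0, 1)
    return dirs, labels
```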
In some embodiments, the computer device may obtain training data in a physical simulation environment.
In some embodiments, the computer device may be an integral part of the intelligent robot or may be a stand alone device.
In some embodiments, a schematic structure diagram of the interactive prediction model to be trained is shown in FIG. 6. The interactive prediction model to be trained comprises an interactive encoder, an action prediction decoder, an action scoring decoder and an interactive decoder. The initial point cloud data C_0 is the input of the interactive encoder, and the output of the interactive encoder is the initial point cloud feature f_p. The input of the interactive decoder is the initial point cloud feature of the sample three-dimensional object, and its output is a first prediction result comprising an interactable probability map. The inputs of the action prediction decoder are a random vector z and the initial point cloud feature f_p; different random vectors are used to guide the action prediction decoder to predict interactions in different directions (the random vector is in fact a noise vector). The second prediction result a_dir output by the action prediction decoder is used to characterize the interaction actions at the external surface points of the sample three-dimensional object. The inputs of the action scoring decoder include the second prediction result and the initial point cloud feature. The third prediction result output by the action scoring decoder is used to characterize the probability that each interaction action is executed successfully.
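A hedged PyTorch sketch of the structure in FIG. 6 is given below (not part of the original disclosure). The per-point MLP backbone and the feature sizes are illustrative assumptions; only the four modules and their inputs and outputs follow the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractionPredictionModel(nn.Module):
    def __init__(self, feat_dim=128, z_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(6, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        self.action_decoder = nn.Sequential(nn.Linear(feat_dim + z_dim, 128), nn.ReLU(), nn.Linear(128, 3))
        self.score_decoder = nn.Sequential(nn.Linear(feat_dim + 3, 128), nn.ReLU(), nn.Linear(128, 1))
        self.interaction_decoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, points_with_normals, z):
        """points_with_normals: (N, 6) = C_0 with normals; z: (N, z_dim) random vectors."""
        f_p = self.encoder(points_with_normals)                                      # point cloud feature f_p
        a_dir = F.normalize(self.action_decoder(torch.cat([f_p, z], -1)), dim=-1)    # second prediction a_dir
        score = torch.sigmoid(self.score_decoder(torch.cat([f_p, a_dir], -1)))       # third prediction
        affordance = torch.sigmoid(self.interaction_decoder(f_p))                    # interactable probability map
        return affordance, a_dir, score
```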
In some embodiments, the trained interactive prediction model includes an interactive encoder and an interactive decoder.
In some embodiments, the second prediction is used to characterize the interactable direction of each surface point on the sample three-dimensional object. The third prediction is used to characterize the probability of successful interaction in each interactable direction. It will be appreciated that the interactable probability map in the first prediction result comprises the respective interactable direction at each surface point and the interaction success probability corresponding to the interactable direction.
In some embodiments, the output of the action prediction decoder is the interaction direction at an external surface point of the sample three-dimensional object. The loss for training the action prediction decoder is the cosine distance between the predicted interaction direction and the correct interaction direction. It will be appreciated that the correct interaction direction may be the power direction.
In some embodiments, the output of the action scoring decoder is the probability of success for each interaction direction, i.e. the result of scoring each interaction direction. The loss for training the action scoring decoder is the mean square error between the predicted probability of success of the sample interaction and the probability of success of the sample interaction in the training data.
In some embodiments, a plurality of random vectors are randomly sampled to produce a plurality of third prediction results. The average of the plurality of third prediction results is taken as the supervision data for the first prediction result. The loss for training the interactive decoder is the mean square error between the first prediction result and the supervision data.
In some embodiments, the loss of the interactive prediction model is a weighted sum of the loss of the interactive decoder, the loss of the action prediction decoder, and the loss of the action scoring decoder.
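For illustration only, the three training losses and their weighted sum can be sketched as follows; the loss weights w1, w2 and w3 are placeholders, not values from the original disclosure.

```python
import torch
import torch.nn.functional as F

def interaction_model_losses(pred_dir, gt_dir, pred_score, gt_score,
                             pred_affordance, third_preds_over_z, w1=1.0, w2=1.0, w3=1.0):
    loss_dir = 1.0 - F.cosine_similarity(pred_dir, gt_dir, dim=-1).mean()     # action prediction decoder
    loss_score = F.mse_loss(pred_score, gt_score)                             # action scoring decoder
    supervision = third_preds_over_z.mean(dim=0).detach()   # average third prediction over sampled z
    loss_afford = F.mse_loss(pred_affordance, supervision)                    # interactive decoder
    return w1 * loss_afford + w2 * loss_dir + w3 * loss_score
```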
In the embodiment, the acting forces corresponding to the interaction actions of the samples are respectively decomposed according to the physical characteristics corresponding to the movement types of the sample interactable parts, so that the dynamic data and the resistance data of each interaction action of the samples are obtained; according to the difference between the dynamic data and the resistance data of the sample interaction action, the probability of successfully executing the sample interaction action is determined, and the obtained training data follow the physical characteristics of the interactable component, so that the accuracy of the training interaction prediction model can be ensured.
In some embodiments, the interaction tool includes an adsorption tool and an adjustment tool; the interaction action is an action at an interaction position on the target three-dimensional object; controlling the interaction tool to perform the interaction action with respect to the target three-dimensional object to interact with the interactable component comprises: controlling the adsorption tool to maintain the adsorption state at the interaction position, and controlling the adjustment tool to apply a force in the adjusted interaction direction so as to ensure the continuous movement of the interactable component.
Illustratively, the elements of the interaction action include the interaction position represented by a surface point and the predicted interaction direction. It is understood that the interaction position refers to the surface point on which the force corresponding to the interaction acts, and the interaction direction refers to the direction of that force. The adsorption tool is connected with the adjustment tool. The intelligent robot can control the adsorption tool to keep the adsorption state at the surface point corresponding to the interaction. The intelligent robot may control the adjustment tool to apply force in the predicted interaction direction and adjust the interaction direction to apply force in the adjusted direction so as to ensure the continuous movement of the interactable component. It will be appreciated that for a translation-type interaction, keeping one interaction direction throughout enables continuous movement of the interactable component, whereas for a rotation-type interaction the interaction direction needs to be changed continuously in order to keep the interactable component moving.
In some embodiments, where the interactable component no longer moves with the interaction tool, the interactable component has reached a movement limit and the round of interaction is ended. The intelligent robot may control the interactive tool to release the adsorption state, after which the intelligent robot may begin the next round of interaction.
In some embodiments, to ensure that the adsorption state is stable, the intelligent robot may control the adsorption tool to move an adsorption distance along the normal direction of the interaction position. Then, the intelligent robot may send an adsorption signal to instruct the adsorption tool to adsorb onto the interaction position.
In some embodiments, the interaction tool may be a suction-cup hand. The adsorption tool may be a suction cup, and the adjustment tool may be an arm. The suction-cup hand adsorbs onto a surface point of the target three-dimensional object and can execute any interaction action without being given an action type such as pushing or pulling. For a given interaction action, the suction-cup hand first touches the interaction position and then moves forward a small distance along the normal direction of the surface where the interaction position is located. Then the suction-cup hand receives the adsorption signal and the arm starts to move. During operation, the intelligent robot may automatically adjust the interaction direction to ensure the continuous movement of the interactable component. Once the suction-cup hand can no longer move, this means that the interactable component has reached its movement limit, and the round of interaction is ended.
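A pseudocode-style sketch of one interaction round with the suction gripper follows (not part of the original disclosure); robot.*, the step sizes and the iteration limit are hypothetical placeholders rather than an API defined in this application.

```python
# Approach along the surface normal, adsorb, keep pushing/pulling while adjusting
# the direction, record the trajectory, and release once the part stops moving.
def execute_interaction(robot, contact_point, surface_normal, interaction_dir,
                        step=0.01, max_steps=200):
    robot.move_to(contact_point + step * surface_normal)    # approach along the surface normal
    robot.enable_suction()                                   # adsorb onto the interaction position
    trajectory = [robot.tool_position()]
    direction = interaction_dir
    for _ in range(max_steps):
        moved = robot.move_by(step * direction)              # apply force along the current direction
        if not moved:                                        # part has reached its motion limit
            break
        trajectory.append(robot.tool_position())
        direction = robot.estimate_next_direction()          # adjust to keep the part moving
    robot.release_suction()
    return trajectory
```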
In this embodiment, the adsorption tool is used to adsorb onto the surface of the target three-dimensional object, and the adjustment tool is used to adjust the interaction direction and apply the acting force, so as to ensure the continuous movement of the interactable component. Any interaction action can be realized without being given an action type such as pushing or pulling, thereby improving the adaptability.
In some embodiments, the three-dimensional reconstruction of the target three-dimensional object according to the current point cloud data and the component motion parameters, to obtain a three-dimensional object reconstruction result, includes: aiming at each interactable component, carrying out segmentation processing on the current point cloud data to obtain a component segmentation result corresponding to the interactable component; determining point cloud data of the interactable parts from the current point cloud data according to the part segmentation result corresponding to each interactable part; and carrying out three-dimensional reconstruction according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result of the target three-dimensional object.
The component segmentation result is used for indicating point cloud data of the interactable component in the point cloud data of the target three-dimensional object. The initial point cloud data and the current point cloud data are both point cloud data of the target three-dimensional object.
For example, the intelligent robot may perform fusion processing on the current point cloud data and the initial point cloud data for each interactable component, to obtain point cloud fusion data. The intelligent robot can segment the point cloud fusion data to obtain a part segmentation result corresponding to the interactable part. The part segmentation result may include a part segmentation mask.
The intelligent robot may determine point cloud data of the interactable component from the current point cloud data according to the component segmentation mask corresponding to each interactable component. The intelligent robot can reconstruct the point cloud data of the interactable component and the component motion parameters of the interactable component in three dimensions to obtain a three-dimensional reconstruction result of the interactable component. The intelligent robot can reconstruct the point cloud data of the non-interactive component in the current point cloud data in three dimensions to obtain a three-dimensional reconstruction result of the non-interactive component. And carrying out fusion processing on the three-dimensional reconstruction result of the interactable part and the three-dimensional reconstruction result of the non-interactable part to obtain a three-dimensional object reconstruction result.
In the embodiment, the current point cloud data is segmented for each interactable component to obtain a component segmentation result corresponding to the interactable component; determining point cloud data of the interactable parts from the current point cloud data according to the part segmentation result corresponding to each interactable part; and carrying out three-dimensional reconstruction according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result, reconstructing the interactable component and the non-interactable component, fusing the component motion parameters of the interactable component, and ensuring the integrity of the three-dimensional object reconstruction result.
In some embodiments, the interactable prediction results comprise an interactable probability map for characterizing a probability of successful execution of each interaction action; for each interactable component, carrying out segmentation processing on the current point cloud data to obtain a component segmentation result corresponding to the interactable component, wherein the segmentation result comprises the following steps: for each interactable component, carrying out segmentation processing on the initial point cloud data and the current point cloud data to obtain a first segmentation result corresponding to the interactable component in the initial point cloud data and a second segmentation result corresponding to the interactable component in the current point cloud data; determining the point cloud data of the interactable component from the current point cloud data according to the component segmentation result corresponding to each interactable component comprises the following steps: according to the second segmentation result, determining point cloud data of the interactable component from the current point cloud data; the method further comprises the steps of: for each interactable component, under the condition that the interactable component is successfully executed with the interaction action, updating the interactable probability map according to a first segmentation result corresponding to the interactable component to obtain an updated interactable probability map; the updated interactable probability map includes probabilities of successful execution of the interaction with the non-interacted component.
The first segmentation result is used for indicating point cloud data of the interactable component in the initial point cloud data. The second segmentation result is used for indicating the point cloud data of the interactable component in the current point cloud data.
For example, for each interactable component, the intelligent robot may combine the initial point cloud data and the current point cloud data to obtain point cloud fusion data, perform feature extraction on the point cloud fusion data to obtain point cloud fusion features, and decode the point cloud fusion features to obtain the component segmentation result. The first segmentation result and the second segmentation result are then split from the component segmentation result. The point cloud fusion features characterize the position distribution of the interactable component in the target three-dimensional object.
The first segmentation result may include a first segmentation mask, and the second segmentation result may include a second segmentation mask. The intelligent robot may determine the point cloud data of the interactable component indicated by the second segmentation mask from the current point cloud data. If the relative position of an interactable component changes when an interaction action is executed on the target three-dimensional object, the interaction action has been successfully executed on that interactable component. For each interactable component, under the condition that the interaction action is successfully executed on the interactable component, the intelligent robot may multiply the first segmentation result corresponding to the interactable component with the interactable probability map to obtain the updated interactable probability map. It will be appreciated that the updated interactable probability map is used for the next round of interaction, where multiple rounds of interaction refer to interactions with different interactable components.
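By way of a non-limiting illustration only, the probability map update described above can be sketched in Python as follows; the function name and the 0/1 mask convention (multiplying by the complement of the component mask so that the already-interacted component is suppressed in later rounds) are assumptions made for this sketch and are not part of the method itself:

import numpy as np

def update_probability_map(prob_map: np.ndarray, component_mask: np.ndarray) -> np.ndarray:
    """Suppress the success probabilities of points belonging to a component
    that has already been interacted with, so later rounds skip it.

    prob_map:       (N,) per-point probability that an interaction succeeds.
    component_mask: (N,) 0/1 mask from the first segmentation result, 1 for
                    points of the just-interacted component, 0 otherwise.
    """
    # Multiplying by the complement of the component mask keeps the probabilities
    # of points that have not been interacted with and zeroes out the rest.
    return prob_map * (1.0 - component_mask)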
In some embodiments, as shown in FIG. 7, a structural schematic of a component segmentation model is provided. The component segmentation model includes a component segmentation encoder and a component segmentation decoder. The initial point cloud data C_0 and the current point cloud data C_1 are superimposed to obtain the point cloud fusion data C_concat. The point cloud fusion data is taken as the input of the component segmentation encoder to obtain the point cloud fusion features output by the component segmentation encoder. The point cloud fusion features are taken as the input of the component segmentation decoder to obtain the component segmentation result P_concat output by the component segmentation decoder. The first segmentation result P_0 and the second segmentation result P_1 are split from the component segmentation result P_concat.
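By way of a non-limiting illustration, the encoder-decoder structure of FIG. 7 can be sketched as the following PyTorch-style skeleton; the layer widths, the shared per-point encoder and the per-point decoding head are assumptions of this sketch and need not match the actual component segmentation model:

import torch
import torch.nn as nn

class PartSegmentationModel(nn.Module):
    """Illustrative encoder-decoder over the concatenated point clouds C_concat."""

    def __init__(self, num_labels: int = 2):
        super().__init__()
        # Encoder: shared per-point MLP producing a point-wise feature.
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 256, 1), nn.ReLU(),
        )
        # Decoder: per-point head over [point feature, global feature].
        self.decoder = nn.Sequential(
            nn.Conv1d(256 + 256, 128, 1), nn.ReLU(),
            nn.Conv1d(128, num_labels, 1),
        )

    def forward(self, c_concat: torch.Tensor) -> torch.Tensor:
        # c_concat: (B, 3, N) -- C_0 and C_1 stacked along the point dimension.
        feat = self.encoder(c_concat)                        # (B, 256, N)
        global_feat = feat.max(dim=2, keepdim=True).values   # (B, 256, 1), position distribution
        fused = torch.cat([feat, global_feat.expand_as(feat)], dim=1)
        # Per-point labels P_concat; splitting into P_0 and P_1 simply undoes
        # the concatenation of C_0 and C_1 along the point dimension.
        return self.decoder(fused)                           # (B, num_labels, N)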
In some embodiments, a candidate interaction action is one whose probability of success is not less than a preset probability threshold, and its interaction position is a candidate interaction position; at least one candidate interaction position exists. Under the condition that the number of candidate interaction positions in the interactable probability map updated in the current round is smaller than a preset number threshold, the intelligent robot may stop executing interaction actions for the target three-dimensional object, i.e. the current round of interaction is the last round of interaction. It will be appreciated that after each round of interaction is completed, it is determined whether the current round of interaction is the last round. For example, the preset probability threshold is 0.8 and the preset number threshold is 30.
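A minimal sketch of this stopping rule, using the example threshold values above (the function name is illustrative):

import numpy as np

def should_stop(prob_map: np.ndarray,
                prob_threshold: float = 0.8,
                min_candidates: int = 30) -> bool:
    """Stop interacting when too few candidate interaction positions remain.

    A point is a candidate interaction position when its predicted success
    probability is at least prob_threshold.
    """
    num_candidates = int((prob_map >= prob_threshold).sum())
    return num_candidates < min_candidates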
In this embodiment, for each interactable component, the initial point cloud data and the current point cloud data are segmented to obtain the first segmentation result and the second segmentation result corresponding to the interactable component; the point cloud data of the interactable component is determined from the current point cloud data according to the second segmentation result, which realizes the segmentation of the component from the target three-dimensional object, so that three-dimensional reconstruction of the internal and external structures of the interactable component can be performed on the basis of its point cloud data to obtain an accurate reconstruction result. Under the condition that the interaction action is successfully executed on the interactable component, the interactable probability map is updated according to the first segmentation result corresponding to the interactable component to obtain the updated interactable probability map; the updated interactable probability map includes the probabilities of successfully executing interaction actions on components that have not yet been interacted with, so that the waste of computational resources caused by repeated interactions with the same component can be avoided.
In some embodiments, the component motion parameter comprises a motion axis of the interactable component; performing three-dimensional reconstruction according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result of the target three-dimensional object comprises: performing complement processing according to the motion axis of the interactable component and the point cloud data of the interactable component to obtain complement point cloud data of the interactable component; performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a three-dimensional reconstruction result of the interactable component; performing complement processing on the point cloud data of the non-interactable component in the current point cloud data to obtain complement point cloud data of the non-interactable component; performing three-dimensional reconstruction according to the complement point cloud data of the non-interactable component to obtain a three-dimensional reconstruction result of the non-interactable component; and carrying out fusion processing on the three-dimensional reconstruction result of the interactable component and the three-dimensional reconstruction result of the non-interactable component to obtain a three-dimensional object reconstruction result.
For example, the motion axis of the interactable component may guide the completion of the point cloud data of the interactable component. The motion axis is associated with the plane of motion and reflects the geometric characteristics of the interactable component to a certain extent. The intelligent robot may perform feature mapping on the motion axis of the interactable component to obtain the motion axis features of the interactable component; perform feature mapping on the point cloud data of the interactable component to obtain the point cloud features of the interactable component; and fuse the motion axis features and the point cloud features of the interactable component to obtain the complement point cloud data of the interactable component. The intelligent robot may then perform three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a preliminary reconstruction result of the interactable component, and map the component motion parameters of the interactable component onto the preliminary reconstruction result to obtain the three-dimensional reconstruction result of the interactable component.
The intelligent robot can perform point cloud completion on the point cloud data of the non-interactable component in the current point cloud data to obtain the complement point cloud data of the non-interactable component, perform three-dimensional reconstruction on the complement point cloud data of the non-interactable component to obtain the three-dimensional reconstruction result of the non-interactable component, and combine the three-dimensional reconstruction result of the interactable component with the three-dimensional reconstruction result of the non-interactable component to obtain the three-dimensional object reconstruction result.
In some embodiments, each current point cloud data includes point cloud data for at least a portion of the non-interactable components. The intelligent robot can fuse the point cloud data of the non-interactable component in the current point cloud data to obtain the complement point cloud data of the non-interactable component.
In some embodiments, each current point cloud data includes other point cloud data in addition to the point cloud data of the interactable component. The other point cloud data may include at least one of point cloud data of a non-interactable component or point cloud data of other interactable components other than the interactable component described above.
In some embodiments, the intelligent robot may first fuse the point cloud data of the non-interactable component from each round of current point cloud data, and then perform completion on the fused data to obtain the complement point cloud data of the non-interactable component.
In some embodiments, as shown in FIG. 8, a schematic diagram of a complement model is provided. The point cloud data and the motion axis of the interactable component are taken as inputs of the complement model, and the complement model outputs the complement point cloud data of the interactable component. The point cloud data of the interactable component is passed through a first combined multi-layer perceptron (CMLP1) and then max-pooled to obtain the point cloud features of the interactable component. The motion axis of the interactable component is passed through a second combined multi-layer perceptron (CMLP2) to obtain the motion axis features of the interactable component. The motion axis features and the point cloud features of the interactable component are superimposed and passed through a multi-layer perceptron (MLP), and the output of the MLP is then subjected to convolution processing to obtain the complement point cloud data of the interactable component.
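By way of a non-limiting illustration, the structure of FIG. 8 can be sketched as follows; the layer widths, the number of output points and the folding of the final convolution step into the output head are assumptions of this sketch rather than the actual complement model:

import torch
import torch.nn as nn

class CompletionModel(nn.Module):
    """Illustrative sketch: a point-cloud branch (CMLP1 + max pooling), a
    motion-axis branch (CMLP2), feature concatenation, and a head that
    regresses the completed point cloud."""

    def __init__(self, num_out_points: int = 2048):
        super().__init__()
        self.num_out_points = num_out_points
        # CMLP1: shared per-point MLP over the partial point cloud.
        self.point_branch = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 256, 1), nn.ReLU(),
        )
        # CMLP2: small MLP over the motion axis (direction + point on axis).
        self.axis_branch = nn.Sequential(
            nn.Linear(6, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        # Head over the concatenated features (MLP; the final convolution of
        # FIG. 8 is folded in here as a simplification).
        self.head = nn.Sequential(
            nn.Linear(256 + 128, 512), nn.ReLU(),
            nn.Linear(512, num_out_points * 3),
        )

    def forward(self, points: torch.Tensor, axis: torch.Tensor) -> torch.Tensor:
        # points: (B, 3, N) partial point cloud of the interactable component.
        # axis:   (B, 6) motion axis as [direction, point-on-axis].
        point_feat = self.point_branch(points).max(dim=2).values  # (B, 256)
        axis_feat = self.axis_branch(axis)                        # (B, 128)
        fused = torch.cat([point_feat, axis_feat], dim=1)         # (B, 384)
        out = self.head(fused)
        return out.view(-1, self.num_out_points, 3)  # completed point cloud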
In some embodiments, as shown in fig. 9, a schematic diagram of three-dimensional reconstruction from the complement point cloud data is provided. Because the interactable components are interacted with one by one during the whole active three-dimensional reconstruction process, the point cloud data of each interactable component is segmented and reconstructed separately, and the results are finally combined into a complete three-dimensional object reconstruction result. The point cloud data of interactable component 1 (the drawer in the figure) is determined from the first segmentation result 1 of interactable component 1. Point cloud completion is performed according to the point cloud data of interactable component 1 and motion axis 1 to obtain the complement point cloud data of interactable component 1. Mesh generation is performed on the complement point cloud data of interactable component 1 to obtain the mesh model M_1 of interactable component 1.
The point cloud data of the non-interactable component (the cabinet in the figure) is determined from at least part of the point cloud data of the non-interactable component in each piece of current point cloud data. Point cloud completion is performed on the point cloud data of the non-interactable component to obtain the complement point cloud data of the non-interactable component. Mesh generation is performed on the complement point cloud data of the non-interactable component to obtain the mesh model M_0 of the non-interactable component.
The point cloud data of interactable component 2 is determined from the first segmentation result 2 of interactable component 2. Point cloud completion is performed on the point cloud data of interactable component 2 to obtain the complement point cloud data of interactable component 2. Mesh generation is performed on the complement point cloud data of interactable component 2 to obtain the mesh model M_2 of interactable component 2. It will be appreciated that the mesh model carries finer information than the point cloud data and can be regarded as an upgraded form of the point cloud data; the three-dimensional reconstruction result is constructed with respect to the mesh model.
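By way of a non-limiting illustration, the final combination of the per-component mesh models M_0, M_1 and M_2 into one model can be sketched as a simple concatenation of vertex and face arrays; how each mesh is generated from its complement point cloud data is left to the mesh generation method actually used, and all names below are illustrative:

import numpy as np

def merge_meshes(meshes):
    """Concatenate a list of (vertices, faces) meshes into one model.

    vertices: (V_i, 3) float arrays; faces: (F_i, 3) integer index arrays.
    Face indices of each part are offset by the number of vertices already merged.
    """
    all_vertices, all_faces, offset = [], [], 0
    for vertices, faces in meshes:
        all_vertices.append(vertices)
        all_faces.append(faces + offset)
        offset += len(vertices)
    return np.vstack(all_vertices), np.vstack(all_faces)

# e.g. full_model = merge_meshes([m0_cabinet, m1_drawer, m2_door])  # hypothetical parts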
In some embodiments, after mesh generation is performed on the point cloud data to obtain a mesh model, the three-dimensional reconstruction result is generated from the mesh model. The mesh model may be the output of a mesh generation model, which is not limited in this embodiment.
In some embodiments, the mesh model of a component is used as the input of a reconstruction model, and the output of the reconstruction model is the three-dimensional reconstruction result of that component. The reconstruction model is not limited in this embodiment and may be any network model that realizes three-dimensional reconstruction.
In this embodiment, the motion axis can reflect the geometric feature of the interactable component, and the point cloud data of the interactable component is complemented by combining the motion axis, so that the accuracy of the complemented point cloud data of the interactable component can be ensured. And carrying out three-dimensional reconstruction on the complement point cloud data of the interactable part and the complement point cloud data of the non-interactable part, wherein the obtained three-dimensional object reconstruction result comprises the complete information of the non-interactable part and the interactable part in the target three-dimensional object, so that the accuracy of three-dimensional reconstruction is improved.
In some embodiments, the component motion parameters include the motion type, the motion axis and the motion range. Fitting the motion trail to determine the component motion parameters of the interactable component comprises: performing circle fitting on the motion trail to obtain a fitted circle; under the condition that the radius of the fitted circle is not larger than a preset type threshold, determining the motion type to be rotation, and determining the motion range and the motion axis according to the fitted circle; under the condition that the radius of the fitted circle is larger than the preset type threshold, determining the motion type to be translation, performing straight-line fitting on the motion trail to obtain a fitted line segment, and determining the motion range and the motion axis according to the fitted line segment. Performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain the three-dimensional reconstruction result of the interactable component comprises: performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a preliminary reconstruction result; and mapping the motion type, the motion axis and the motion range of the interactable component onto the preliminary reconstruction result to obtain the three-dimensional reconstruction result of the interactable component.
Illustratively, the motion trail includes a plurality of track positions, among which are a start track position and an end track position. The motion range may include at least one of a rotation range or a translation range. The intelligent robot may perform circle fitting on the plurality of track positions by least-squares optimization to obtain a fitted circle. In case the radius of the fitted circle is not larger than the preset type threshold, the motion type of the interactable component is rotation. In that case, the intelligent robot may determine the motion axis of the interactable component as the straight line through the centre of the fitted circle perpendicular to its plane, and may determine the arc length between the start track position and the end track position on the fitted circle as the rotation range.
In case the radius of the fitted circle is larger than the preset type threshold, the motion type of the interactable component is translation. The intelligent robot may perform straight-line fitting on the motion trail by least-squares optimization to obtain a fitted straight line. As shown in fig. 10, a schematic diagram of a fitted straight line is provided. The intelligent robot may determine the fitted straight line as the motion axis of the interactable component, and may determine the distance between the start track position and the end track position on the fitted straight line as the translation range.
The intelligent robot can perform three-dimensional reconstruction on the complement point cloud data of the interactable component through the reconstruction model to obtain the preliminary reconstruction result. It will be appreciated that the coordinate system used for fitting is consistent with the coordinate system used for three-dimensional reconstruction, so the component motion parameters and the preliminary reconstruction result lie in the same coordinate system. The intelligent robot can therefore directly map the motion type, the motion axis and the motion range of the interactable component onto the preliminary reconstruction result under the same coordinate system to obtain the three-dimensional reconstruction result of the interactable component.
In some embodiments, as shown in fig. 11, a schematic representation of a fitted circle is provided. The intelligent robot can draw a straight line from the initial track position to the circle center, draw a straight line from the end track position to the circle center, and obtain the central angle by calculating the included angle between the two straight lines. The intelligent robot can determine the arc length between the initial track position and the end track position according to the central angle and the radius of the fitting circle, and the rotation range is obtained.
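By way of a non-limiting illustration, the circle fitting, motion type decision and motion range computation described above can be sketched as follows for a trajectory already projected onto its plane of motion; the 2D simplification and the threshold value are assumptions of this sketch:

import numpy as np

def fit_motion_parameters(track: np.ndarray, type_threshold: float = 1.0):
    """Classify the motion of an interactable component from its trajectory.

    track: (T, 2) trajectory positions, assumed projected onto the plane of motion.
    Returns (motion_type, axis_description, motion_range).
    """
    x, y = track[:, 0], track[:, 1]
    # Algebraic least-squares circle fit: 2*cx*x + 2*cy*y + k = x^2 + y^2,
    # where k = r^2 - cx^2 - cy^2.
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, k), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = float(np.sqrt(k + cx ** 2 + cy ** 2))

    if radius <= type_threshold:
        # Rotation: the motion axis passes through the fitted circle's centre,
        # perpendicular to the motion plane; the motion range is the arc length
        # between the start and end track positions (central angle * radius).
        center = np.array([cx, cy])
        v0, v1 = track[0] - center, track[-1] - center
        cos_angle = np.dot(v0, v1) / (np.linalg.norm(v0) * np.linalg.norm(v1))
        central_angle = float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
        return "rotation", center, central_angle * radius

    # Translation: least-squares straight-line fit via the first principal
    # direction of the centred trajectory; the motion range is the distance
    # between the start and end positions along that direction.
    centered = track - track.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    translation_range = float(abs(np.dot(track[-1] - track[0], direction)))
    return "translation", direction, translation_range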
In this embodiment, the motion type of the interactable component is determined by performing circle fitting on the motion trail: for the rotation type, the motion axis and the motion range are determined directly from the fitted circle; for the translation type, straight-line fitting is performed on the motion trail, and the motion axis and the motion range are determined from the fitted straight line. The component motion parameters determined from the actual motion trail are more accurate, and mapping these accurate component motion parameters into the preliminary reconstruction result yields a more accurate three-dimensional reconstruction result.
In some embodiments, a simplified flow diagram of an intelligent driven three-dimensional object interactive reconstruction method is provided in fig. 12. The input is S_0, the target three-dimensional object in its initial state; S_0 is scanned to obtain C_0, the initial point cloud data. Because C_0, the initial point cloud data, does not include the internal structure information of the target three-dimensional object and the point cloud data of any component cannot be determined from it, three-dimensional reconstruction performed on C_0 yields an empty, component-free reconstruction result.
Interactable prediction is performed on C_0, the initial point cloud data, to obtain I_0, the initial interactable probability map. Interaction action selection is performed on I_0, and the selected interaction action is executed on the target three-dimensional object S_0 in the first round of interaction to obtain S_1, the target three-dimensional object after the first round of interaction. The motion axis axis1 of interactable component 1 and the motion range range1 of interactable component 1 can be determined from the motion trail of the executed interaction.
S_1, the target three-dimensional object after the first round of interaction, is scanned and segmented to obtain P_1, the second segmentation result of interactable component 1. P_1 may indicate the point cloud data of interactable component 1 and the point cloud data of the non-interactable component in C_1, the current point cloud data of the first round. Three-dimensional reconstruction is performed on the point cloud data of interactable component 1 to obtain M_1, the three-dimensional reconstruction result of interactable component 1.
The interactable probability map is updated according to P_1, the second segmentation result of interactable component 1, to obtain I_1, the interactable probability map updated in the first round. Interaction action selection is performed on I_1, and the selected interaction action is executed on the target three-dimensional object in the second round of interaction to obtain S_2, the target three-dimensional object after the second round of interaction. The motion axis axis2 of interactable component 2 and the motion range range2 of interactable component 2 can be determined from the motion trail of the executed interaction.
S_2, the target three-dimensional object after the second round of interaction, is scanned and segmented to obtain P_2, the second segmentation result of interactable component 2. P_2 may indicate the point cloud data of interactable component 2 and the point cloud data of the non-interactable component in C_2, the current point cloud data of the second round. Three-dimensional reconstruction is performed on the point cloud data of interactable component 2 to obtain M_2, the three-dimensional reconstruction result of interactable component 2.
The interactable probability map is updated according to P_2, the second segmentation result of interactable component 2, to obtain I_2, the interactable probability map updated in the second round. Interaction stops when the number of candidate interaction positions in I_2, the interactable probability map updated in the second round, is smaller than the preset number threshold. The point cloud data of the non-interactable component is determined from the point cloud data of the non-interactable component in C_1, the current point cloud data of the first round, and in C_2, the current point cloud data of the second round. Three-dimensional reconstruction is performed on the point cloud data of the non-interactable component to obtain M_0, the three-dimensional reconstruction result of the non-interactable component.
In fig. 12, the circular symbol containing M denotes the interaction operation, the circular symbol containing S denotes the scanning operation, the circular symbol containing SS denotes the scanning and segmentation operation, the circular symbol containing A denotes the interaction action selection operation, the circular symbol containing R denotes the three-dimensional reconstruction operation, and the hexagonal symbol containing STOP denotes the operation of stopping interaction.
In some embodiments, FIG. 13 provides a schematic representation of the initial geometric models, interactable probability maps, part segmentation results and three-dimensional object reconstruction results of a plurality of three-dimensional objects. FIG. 13 shows an initial geometric model of a wardrobe and the interactable probability map, part segmentation result and three-dimensional object reconstruction result of the wardrobe obtained by the method provided herein; an initial geometric model of a refrigerator and the interactable probability map, part segmentation result and three-dimensional object reconstruction result of the refrigerator obtained by the method provided herein; and an initial geometric model of a table and the interactable probability map, part segmentation result and three-dimensional object reconstruction result of the table obtained by the method provided herein.
In some embodiments, the performance of the method provided herein for three-dimensional reconstruction of three-dimensional objects in each category is quantified along multiple dimensions. M denotes the number of interactions performed by the intelligent robot, and k denotes the number of interactable components. In the most ideal case, M is equal to k, which means that every interactable component succeeds at the first interaction performed by the intelligent robot. The prediction accuracy A_action is calculated as A_action = k/M. The segmentation accuracy A_seg is defined as A_seg = P_correct/P_total, where P_correct is the number of correctly segmented points in the component segmentation result and P_total is the total number of points of the component. It will be appreciated that the component segmentation result is in fact a classification of each point as belonging to an interactable component or to a non-interactable component. For the complement model, the Earth Mover's Distance (EMD) between the generated complement point cloud data and the label point cloud data is measured as the complement error E_comp. For the reconstruction model, the chamfer distance of each component is calculated and reported as the reconstruction error E_recon. For the component motion parameters, which include the motion axis and the motion range, the axis direction error E_modir is measured by calculating the cosine angle between the predicted axis direction and the true axis direction, and the axis position error E_mopos is calculated as the L2 distance between the predicted axis position and the true axis position. It will be appreciated that for the translation motion type the position of the motion axis is not important, so only the translation axis direction error is counted. The prediction accuracy, segmentation accuracy, complement error, reconstruction error, axis direction error and axis position error are the quantitative results of the method provided herein. For example, the quantitative results for three-dimensional objects in the wardrobe category are a prediction accuracy of 0.17, a segmentation accuracy of 0.95, a complement error of 2.63, a reconstruction error of 0.59, an axis direction error of 4.07 and an axis position error of 1.17.
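By way of a non-limiting illustration, the simpler of these metrics can be written down directly; the sketch below covers the prediction accuracy, the segmentation accuracy and the two axis errors, while the EMD and chamfer distance computations are left to whichever point cloud library is used. The degree unit and the sign-invariant treatment of the axis direction are assumptions of this sketch:

import numpy as np

def action_accuracy(k: int, m: int) -> float:
    """A_action = k / M: number of interactable components over interactions performed."""
    return k / m

def segmentation_accuracy(pred_labels: np.ndarray, gt_labels: np.ndarray) -> float:
    """A_seg = P_correct / P_total over per-point interactable / non-interactable labels."""
    return float((pred_labels == gt_labels).mean())

def axis_direction_error(pred_dir: np.ndarray, gt_dir: np.ndarray) -> float:
    """E_modir from the cosine angle between predicted and true axis directions."""
    cos = np.dot(pred_dir, gt_dir) / (np.linalg.norm(pred_dir) * np.linalg.norm(gt_dir))
    # abs() treats opposite directions as equivalent; angle reported in degrees.
    return float(np.degrees(np.arccos(np.clip(abs(cos), 0.0, 1.0))))

def axis_position_error(pred_pos: np.ndarray, gt_pos: np.ndarray) -> float:
    """E_mopos: L2 distance between predicted and true axis positions."""
    return float(np.linalg.norm(pred_pos - gt_pos))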
In some embodiments, fig. 14 provides a schematic diagram of the point cloud data of several interactable components. The point cloud data of an interactable component obtained without using motion information contains more outliers and differs more from the point cloud data of the real interactable component, whereas the point cloud data of an interactable component obtained using motion information contains fewer outliers and is closer to the point cloud data of the real interactable component. It is understood that the point cloud data of the interactable component obtained using motion information may be the complement point cloud data output by the complement model.
In some embodiments, the first three-dimensional object reconstruction method and the second three-dimensional object reconstruction method as provided in fig. 15 are contrasted with each other. The second three-dimensional object reconstruction method is an intelligent driven three-dimensional object interactive reconstruction method provided by the application. The first three-dimensional object reconstruction method is another three-dimensional object reconstruction method. The second three-dimensional object reconstruction method is better in the two quantitative analysis dimensions of segmentation accuracy and reconstruction errors. And the three-dimensional reconstruction result obtained by adopting the second three-dimensional object reconstruction method is higher in accuracy than the three-dimensional reconstruction result obtained by adopting the first three-dimensional object reconstruction method.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily performed sequentially, but may be performed alternately or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides an intelligent driven three-dimensional object interactive reconstruction device for realizing the intelligent driven three-dimensional object interactive reconstruction method. The implementation scheme of the solution provided by the device is similar to the implementation scheme recorded in the method, so the specific limitation in the embodiment of the one or more intelligent driven three-dimensional object interactive reconstruction devices provided below can be referred to the limitation of the intelligent driven three-dimensional object interactive reconstruction method hereinabove, and the description is omitted here.
In one embodiment, as shown in FIG. 16, an intelligently driven three-dimensional object interactive reconstruction device 1600 is provided, comprising: a first determination module 1602, a prediction module 1604, an interaction module 1606, an acquisition module 1608, a fitting module 1610, a second determination module 1612, and a reconstruction module 1614, wherein:
a first determining module 1602, configured to determine initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component.
The prediction module 1604 is used for performing interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactable prediction results are used to characterize interactions that may be generated with the target three-dimensional object.
An interaction module 1606 for controlling the interaction tool to perform an interaction action with respect to the target three-dimensional object to interact with the interactable component; wherein the target three-dimensional object presents an internal structure upon interaction with the interactable component.
And the obtaining module 1608 is used for obtaining the motion trail of the interactive tool when the interactive tool executes the interactive action.
A fitting module 1610, configured to fit the motion trail and determine a component motion parameter of the interactable component.
A second determining module 1612, configured to determine current point cloud data if the interaction is performed successfully; and the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents the internal structure after being interacted with the interactable component.
And the reconstruction module 1614 is used for performing three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
In some embodiments, the interactive prediction result is an output derived from the initial point cloud data as an input to the interactive prediction model; the prediction module 1604 is also for determining a sample interactable component of the sample three-dimensional object; a plurality of corresponding sample interaction actions are preset for the sample interactable component; the elements of each sample interaction include a force acting on the sample interactable member; respectively decomposing acting forces corresponding to the interaction actions of a plurality of samples according to physical characteristics corresponding to the movement types of the sample interactable parts to obtain dynamic data and resistance data of each sample interaction action; determining the probability of successfully executing the sample interaction according to the difference between the dynamic data and the resistance data of the sample interaction; and taking a plurality of sample interaction actions corresponding to the sample interactable component and the probability of each sample interaction action as training data of the interaction prediction model to be trained so as to obtain the trained interaction prediction model.
In some embodiments, the interactive tool includes an adsorption tool and an adjustment tool; the interactive action is an action at an interactive location on the target three-dimensional object; the interaction module 1606 is further configured to control the suction tool to maintain a suction state at the interaction location and control the adjustment tool to apply a force in the adjusted interaction direction to determine the continuous movement of the interactable component.
In some embodiments, a reconstruction module 1614 is configured to perform segmentation processing on the current point cloud data for each interactable component, to obtain a component segmentation result corresponding to the interactable component; determining point cloud data of the interactable parts from the current point cloud data according to the part segmentation result corresponding to each interactable part; and carrying out three-dimensional reconstruction according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result of the target three-dimensional object.
In some embodiments, the interactable prediction results comprise an interactable probability map for characterizing a probability of successful execution of each interaction action; a reconstruction module 1614, configured to perform segmentation processing on the initial point cloud data and the current point cloud data for each interactable component, to obtain a first segmentation result corresponding to the interactable component in the initial point cloud data and a second segmentation result corresponding to the interactable component in the current point cloud data; according to the second segmentation result, determining point cloud data of the interactable component from the current point cloud data; for each interactable component, under the condition that the interactable component is successfully executed with the interaction action, updating the interactable probability map according to a first segmentation result corresponding to the interactable component to obtain an updated interactable probability map; the updated interactable probability map includes probabilities of successful execution of the interaction with the non-interacted component.
In some embodiments, the component motion parameter comprises a motion axis of the interactable component; the reconstruction module 1614 is configured to perform complement processing according to the motion axis of the interactable component and the point cloud data of the interactable component, so as to obtain complement point cloud data of the interactable component; performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a three-dimensional reconstruction result of the interactable component; performing complement processing on the point cloud data of the non-interactive component in the current point cloud data to obtain complement point cloud data of the non-interactive component; performing three-dimensional reconstruction according to the complement point cloud data of the non-interactive component to obtain a three-dimensional reconstruction result of the non-interactive component; and carrying out fusion processing on the three-dimensional reconstruction result of the interactable part and the three-dimensional reconstruction result of the non-interactable part to obtain a three-dimensional object reconstruction result.
In some embodiments, the component motion parameters include a type of motion, an axis of motion, and a range of motion; a fitting module 1610, configured to perform a circle fitting process on the motion trail to obtain a fitted circle; under the condition that the radius of the fitting circle is not larger than a preset type threshold value, determining the motion type to be rotation, and determining a motion range and a motion axis according to the fitting circle; under the condition that the radius of the fitting circle is larger than a preset type threshold value, determining the motion type to be translation, and performing linear fitting treatment on the motion track to obtain a fitting line segment; determining a motion range and a motion axis according to the fitted line segments; the reconstruction module 1614 is configured to perform three-dimensional reconstruction according to the complement point cloud data of the interactable component, so as to obtain a preliminary reconstruction result; mapping the motion type, the motion axis and the motion range of the interactable component to the preliminary reconstruction result to obtain a three-dimensional reconstruction result of the interactable component.
The modules in the intelligent driven three-dimensional object interactive reconstruction device can be all or partially realized by software, hardware and the combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and the internal structure of which may be as shown in fig. 17. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements an intelligently driven three-dimensional object interactive reconstruction method.
It will be appreciated by those skilled in the art that the structure shown in fig. 17 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples only represent a few embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. An intelligent driven three-dimensional object interactive reconstruction method, which is characterized by comprising the following steps:
determining initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component;
performing interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactive prediction result is used for representing an interactive action which can generate interaction with the target three-dimensional object;
Controlling an interaction tool to perform the interaction action with respect to the target three-dimensional object to interact with the interactable component; wherein, after interacting with the interactable component, the target three-dimensional object presents an internal structure;
acquiring a motion trail of the interactive tool when executing the interactive action;
fitting the motion trail and determining component motion parameters of the interactable component;
under the condition that the interactive action is successfully executed, determining current point cloud data; the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents an internal structure after being interacted with the interactable component;
and carrying out three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
2. The method of claim 1, wherein the interactively predicted result is an output derived from the initial point cloud data as an input to an interactive prediction model; the method further comprises a training step of the interactive prediction model; the training step of the interactive prediction model comprises the following steps:
determining a sample interactable component of a sample three-dimensional object; a plurality of corresponding sample interaction actions are preset for the sample interactable component; the elements of each of the sample interactions include a force acting on the sample interactable component;
Respectively decomposing acting forces corresponding to the sample interaction actions according to physical characteristics corresponding to the movement types of the sample interactable parts to obtain dynamic data and resistance data of each sample interaction action;
determining the probability of successfully executing the sample interaction according to the difference between the dynamic data and the resistance data of the sample interaction;
and taking a plurality of sample interaction actions corresponding to the sample interactable component and the probability of each sample interaction action as training data of the interaction prediction model to be trained so as to obtain the trained interaction prediction model.
3. The method of claim 1, wherein the interactive tool comprises an adsorption tool and an adjustment tool; the interactive action is an action at an interactive location on the target three-dimensional object; the control interaction tool performs the interaction action with respect to the target three-dimensional object to interact with the interactable component, comprising:
and controlling the adsorption tool to maintain an adsorption state at the interaction position, and controlling the adjustment tool to apply a force in the adjusted interaction direction so as to determine the continuous movement of the interactable component.
4. A method according to any one of claims 1 to 3, wherein said reconstructing the target three-dimensional object in three dimensions based on the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result comprises:
for each interactable component, carrying out segmentation processing on the current point cloud data to obtain a component segmentation result corresponding to the interactable component;
determining point cloud data of the interactable parts from the current point cloud data according to the part segmentation result corresponding to each interactable part;
and carrying out three-dimensional reconstruction according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result of the target three-dimensional object.
5. The method of claim 4, wherein the interactable prediction comprises an interactable probability map for characterizing a probability of successful performance of each interaction;
the step of performing segmentation processing on the current point cloud data for each interactable component to obtain a component segmentation result corresponding to the interactable component, includes:
For each interactable component, carrying out segmentation processing on the initial point cloud data and the current point cloud data to obtain a first segmentation result corresponding to the interactable component in the initial point cloud data and a second segmentation result corresponding to the interactable component in the current point cloud data;
the determining the point cloud data of the interactable component from the current point cloud data according to the component segmentation result corresponding to each interactable component comprises the following steps:
determining point cloud data of the interactable component from the current point cloud data according to the second segmentation result;
the method further comprises the steps of:
for each interactable component, under the condition that the interactable component is successfully executed with the interaction action, updating the interactable probability map according to a first segmentation result corresponding to the interactable component to obtain an updated interactable probability map; the updated interactable probability map comprises the probability of successfully executing interaction actions on the non-interacted parts.
6. The method of claim 4, wherein the component motion parameter comprises a motion axis of the interactable component; the three-dimensional reconstruction is performed according to the point cloud data of the interactable component, the point cloud data of the non-interactable component in the current point cloud data and the component motion parameters, so as to obtain a three-dimensional object reconstruction result of the target three-dimensional object, including:
Performing complement processing according to the motion axis of the interactable component and the point cloud data of the interactable component to obtain complement point cloud data of the interactable component;
performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a three-dimensional reconstruction result of the interactable component;
performing complement processing on the point cloud data of the non-interactive component in the current point cloud data to obtain complement point cloud data of the non-interactive component;
performing three-dimensional reconstruction according to the complement point cloud data of the non-interactable component to obtain a three-dimensional reconstruction result of the non-interactable component;
and carrying out fusion processing on the three-dimensional reconstruction result of the interactable component and the three-dimensional reconstruction result of the non-interactable component to obtain a three-dimensional object reconstruction result.
7. The method of claim 6, wherein the component motion parameters include a type of motion, an axis of motion, and a range of motion;
the fitting of the motion trail to determine the component motion parameters of the interactable component comprises:
performing circle fitting treatment on the motion trail to obtain a fitting circle;
under the condition that the radius of the fitting circle is not larger than a preset type threshold value, determining that the motion type is rotation, and determining the motion range and the motion axis according to the fitting circle;
Under the condition that the radius of the fitting circle is larger than the preset type threshold value, determining the motion type to be translation, and performing linear fitting treatment on the motion track to obtain a fitting line segment;
determining the motion range and the motion axis according to the fitted line segment;
the three-dimensional reconstruction is performed according to the complement point cloud data of the interactable component to obtain a three-dimensional reconstruction result of the interactable component, comprising:
performing three-dimensional reconstruction according to the complement point cloud data of the interactable component to obtain a preliminary reconstruction result;
mapping the motion type, the motion axis and the motion range of the interactable component to the preliminary reconstruction result to obtain a three-dimensional reconstruction result of the interactable component.
8. An intelligently driven three-dimensional object interactive reconstruction device, the device comprising:
the first determining module is used for determining initial point cloud data of a target three-dimensional object to be reconstructed; the target three-dimensional object includes an interactable component;
the prediction module is used for carrying out interactive prediction on the initial point cloud data to obtain an interactive prediction result; the interactive prediction result is used for representing an interactive action which can generate interaction with the target three-dimensional object;
An interaction module for controlling an interaction tool to perform the interaction action with respect to the target three-dimensional object to interact with the interactable component; wherein, after interacting with the interactable component, the target three-dimensional object presents an internal structure;
the acquisition module is used for acquiring a motion trail when the interactive tool executes the interactive action;
the fitting module is used for fitting the motion trail and determining the component motion parameters of the interactable component;
the second determining module is used for determining current point cloud data under the condition that the interactive action is successfully executed; the current point cloud data is used for representing a set of surface points when the target three-dimensional object presents an internal structure after being interacted with the interactable component;
and the reconstruction module is used for carrying out three-dimensional reconstruction on the target three-dimensional object according to the current point cloud data and the component motion parameters to obtain a three-dimensional object reconstruction result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310649977.5A 2023-06-02 2023-06-02 Intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium Active CN116402956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310649977.5A CN116402956B (en) 2023-06-02 2023-06-02 Intelligent driven three-dimensional object interactive reconstruction method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN116402956A true CN116402956A (en) 2023-07-07
CN116402956B CN116402956B (en) 2023-09-22

Family

ID=87014518


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927784A (en) * 2014-04-17 2014-07-16 中国科学院深圳先进技术研究院 Three-dimensional scanning method
WO2022099530A1 (en) * 2020-11-12 2022-05-19 深圳元戎启行科技有限公司 Motion segmentation method and apparatus for point cloud data, computer device and storage medium
US20220297290A1 (en) * 2021-03-16 2022-09-22 Robert Bosch Gmbh Device and method to improve learning of a policy for robots
CN115953535A (en) * 2023-01-03 2023-04-11 深圳华为云计算技术有限公司 Three-dimensional reconstruction method and device, computing equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Neil Nie et al.: "Structure from Action: Learning Interactions for 3D Articulated Object Structure Discovery", arXiv:2207.08997v2, pages 1-8 *

Also Published As

Publication number Publication date
CN116402956B (en) 2023-09-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant