CN118071968A - Intelligent interaction deep display method and system based on AR technology - Google Patents

Intelligent interaction deep display method and system based on AR technology

Info

Publication number
CN118071968A
Authority
CN
China
Prior art keywords: data, image, generate, dimensional, virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410461591.6A
Other languages
Chinese (zh)
Other versions
CN118071968B (en)
Inventor
廖文瑾
王启
赵云辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shine Exhibition Co ltd
Original Assignee
Shenzhen Shine Exhibition Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shine Exhibition Co ltd filed Critical Shenzhen Shine Exhibition Co ltd
Priority to CN202410461591.6A
Publication of CN118071968A
Application granted
Publication of CN118071968B
Legal status: Active (current)
Anticipated expiration


Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention relates to the technical field of virtual interaction, and in particular to an intelligent interaction deep display method and system based on AR technology. The method comprises the following steps: acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environment image to generate an environmental object area image; performing environmental object modeling on the environmental object area image to generate a three-dimensional object model; performing target position detection on the three-dimensional object model to generate object target position detection data; and constructing a recognition boundary frame for the three-dimensional object model based on the object target position detection data to generate an object recognition boundary frame. According to the invention, real-time three-dimensional modeling of the environment, accurate perception of the actual environment, and accurate positioning of the three-dimensional object model are performed from the object images, so that the processing speed and recognition accuracy of modeling and the fluency of interaction are improved.

Description

Intelligent interaction deep display method and system based on AR technology
Technical Field
The invention relates to the technical field of virtual interaction, in particular to an intelligent interaction deep display method and system based on an AR technology.
Background
With the popularization of smartphones and tablet computers, AR technology has entered the public eye, and people can experience the fun of AR games on their own phones, marking the entry of AR technology into a brand-new stage. Over the past few years, AR hardware has seen significant progress, such as micro-displays and lighter, more comfortable head-mounted devices. Meanwhile, AR glasses have begun to emerge, and with the continuous development of technology, intelligent interactive deep display has become the frontier of AR technology. This technology combines artificial intelligence, machine learning, and advanced sensors, making AR more intelligent and personalized. Users can control AR applications through voice, gestures, or eye interaction, making the experience more intuitive and natural. For example, in the retail industry, smart AR mirrors can automatically recommend clothing styles based on a customer's selections and preferences; in the field of education, intelligent AR teaching can adjust content and pace according to each student's learning pattern. However, current AR interaction technology is still not mature enough; in particular, virtual objects remain uncoordinated with and disconnected from real scenes, and object recognition is inaccurate and lacks refinement.
Disclosure of Invention
Based on this, it is necessary to provide an intelligent interactive deep display method and system based on AR technology, so as to solve at least one of the above technical problems.
In order to achieve the above purpose, an intelligent interactive deep display method based on AR technology comprises the following steps:
Step S1: acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
Step S2: detecting the target position of the three-dimensional object model to generate object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
Step S3: sensor data acquisition is carried out on the three-dimensional object model through the object motion track path data to obtain standard sensor acquisition data; the standard sensor acquisition data are imported into the three-dimensional object model for data space registration to generate sensor registration data; image depth hole filling is performed on the sensor registration data to generate a three-dimensional object hole filling map; virtual object reality positioning is carried out according to the three-dimensional object hole filling map to generate a virtual object position indicator;
Step S4: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
The invention can improve the efficiency and accuracy of the subsequent processing steps through image preprocessing. Common preprocessing includes denoising, color correction, and image enhancement, which improve image quality and contrast and reduce errors in the subsequent environmental object segmentation. Environmental object segmentation distinguishes the different objects in an image from the background and helps extract the outline and characteristics of each object in the environment; through it, the shape, size, and position of each object can be captured more accurately, providing reliable data support for subsequent modeling. Based on the environmental object region image, three-dimensional object modeling converts the objects in the image into a three-dimensional model, which provides a more realistic and stereoscopic environmental scene and benefits scene restoration and interaction in applications such as virtual reality and augmented reality. Detecting the target position of the three-dimensional object model accurately acquires the position information of objects in the scene, provides basic data for subsequent object recognition, tracking, and motion path analysis, and ensures accurate positioning of the target. Constructing the object recognition bounding box from the target position detection data further clarifies the boundary and shape of the object, and the generated bounding box information supports more accurate object recognition, tracking, and interaction with the environment in subsequent processing. Analyzing the motion path of the object recognition bounding box over consecutive frames captures the motion track and dynamic changes of the object in space; the generated motion path data help in understanding the motion pattern, speed, and direction of the object and play a key role in scene analysis and decision making. Acquiring sensor data along the object motion track path enables real-time data collection related to the object's motion, yields more comprehensive and accurate sensor data, and provides a finer information basis for subsequent analysis and processing. Importing the standard sensor acquisition data into the three-dimensional object model for data space registration ensures consistency between the sensor data and the spatial position of the object model, and the generated sensor registration data allow the information from different sensors to be analyzed jointly in the same coordinate system, improving the consistency of the data as a whole. Image depth hole filling on the sensor registration data fills data gaps caused by sensor limitations or other factors, and the generated three-dimensional object hole filling map provides more complete and coherent object surface information, enhancing spatial perception of the object. Virtual object reality positioning based on the three-dimensional object hole filling map accurately places a virtual object in the space of the object model, and the generated virtual object position indicator displays the precise position of the virtual object in the real scene, providing a more intuitive and immersive user experience.
The virtual object position indicator guides where virtual elements are placed in the real world, which means that virtual elements can be dynamically fused into the actual scene according to the user's environment, strengthening the sense of fusion between the virtual and the real. For example, the virtual elements may be adjusted in real time based on the user's position and orientation so that they blend into the environment more naturally. Once the virtual elements are correctly fused with the real world, a dynamic virtual element synthetic diagram can be generated; this image contains the fused picture of the virtual elements and the real world and provides the basis for subsequent analysis and processing, and through this step the user can see the interaction effect between the virtual elements and the real scene. In the dynamic virtual element synthetic diagram, user characteristic recognition, covering user behaviors and gestures, can be performed with computer vision techniques, which helps the system understand the user's behavior patterns and needs and thus provide a more personalized and intelligent interaction experience; for example, a user's gesture or action can be recognized to adjust the display or operation of the virtual elements. Therefore, by performing real-time three-dimensional modeling of the environment, accurate perception of the actual environment, and accurate positioning of the three-dimensional object model from the object images, the invention improves the processing speed and recognition accuracy of modeling and the fluency of interaction.
In this specification, an intelligent deep interaction display system based on AR technology is provided, which is configured to execute the above intelligent deep interaction display method based on AR technology, where the intelligent deep interaction display system based on AR technology includes:
the object recognition module is used for acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
The motion path analysis module is used for detecting the target position of the three-dimensional object model and generating object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
The virtual reality synchronization module is used for carrying out sensor data acquisition on the three-dimensional object model through the object motion track path data to obtain standard sensor acquisition data; importing the standard sensor acquisition data into the three-dimensional object model for data space registration to generate sensor registration data; performing image depth hole filling on the sensor registration data to generate a three-dimensional object hole filling map; and carrying out virtual object reality positioning according to the three-dimensional object hole filling map to generate a virtual object position indicator;
the user feedback module is used for carrying out real-time dynamic virtual element fusion through the virtual object position indicator to generate a dynamic virtual element synthetic diagram; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
The beneficial effects of the invention are as follows. By acquiring real-time environment images, the system stays synchronized with the actual environment and can better understand the user's surroundings and situation. Preprocessing removes noise from the image and adjusts its brightness, contrast, and the like, improving the accuracy and efficiency of subsequent processing. Segmenting the objects in the environment image and modeling them into a three-dimensional object model helps the system understand the environment structure and object layout and provides accurate environment information for the subsequent steps. By detecting the target position of the three-dimensional object model, the system can accurately identify where objects are in the environment and provide important data support for the subsequent steps. Generating object recognition bounding boxes helps identify object boundaries accurately and supports the subsequent object motion path analysis. Through sensor data acquisition and registration, the system obtains motion track data of objects in the environment and can better understand their dynamic behavior. Generating the three-dimensional object hole filling map helps the system understand the spatial structure of the environment and lays the foundation for positioning virtual objects. Through the virtual object position indicator, virtual elements are dynamically fused with the real world in real time and with accurate placement, enhancing the user's immersion and overall experience. By recognizing the user's behavior and gestures, the system learns the user's needs and preferences, adjusts the interactive interface in real time according to the feedback monitoring data, and provides a more intelligent and personalized user experience. Therefore, by performing real-time three-dimensional modeling of the environment, accurate perception of the actual environment, and accurate positioning of the three-dimensional object model from the object images, the invention improves the processing speed and recognition accuracy of modeling and the fluency of interaction.
Drawings
FIG. 1 is a schematic flow chart of steps of an AR technology-based intelligent interactive deep display method;
FIG. 2 is a flowchart illustrating the detailed implementation of step S2 in FIG. 1;
FIG. 3 is a flowchart illustrating the detailed implementation of step S24 in FIG. 2;
FIG. 4 is a flowchart illustrating the detailed implementation of step S3 in FIG. 1;
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following is a clear and complete description of the technical method of the present patent in conjunction with the accompanying drawings, and it is evident that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.
Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. The functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network devices and/or processor devices and/or microcontroller devices.
It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
To achieve the above objective, please refer to fig. 1 to 4, an intelligent interactive deep display method based on AR technology, the method includes the following steps:
Step S1: acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
Step S2: detecting the target position of the three-dimensional object model to generate object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
Step S3: sensor data acquisition is carried out on the three-dimensional object model through the object motion track path data to obtain standard sensor acquisition data; the standard sensor acquisition data are imported into the three-dimensional object model for data space registration to generate sensor registration data; image depth hole filling is performed on the sensor registration data to generate a three-dimensional object hole filling map; virtual object reality positioning is carried out according to the three-dimensional object hole filling map to generate a virtual object position indicator;
Step S4: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
The invention can improve the efficiency and accuracy of the subsequent processing steps through image preprocessing. Common preprocessing includes denoising, color correction, and image enhancement, which improve image quality and contrast and reduce errors in the subsequent environmental object segmentation. Environmental object segmentation distinguishes the different objects in an image from the background and helps extract the outline and characteristics of each object in the environment; through it, the shape, size, and position of each object can be captured more accurately, providing reliable data support for subsequent modeling. Based on the environmental object region image, three-dimensional object modeling converts the objects in the image into a three-dimensional model, which provides a more realistic and stereoscopic environmental scene and benefits scene restoration and interaction in applications such as virtual reality and augmented reality. Detecting the target position of the three-dimensional object model accurately acquires the position information of objects in the scene, provides basic data for subsequent object recognition, tracking, and motion path analysis, and ensures accurate positioning of the target. Constructing the object recognition bounding box from the target position detection data further clarifies the boundary and shape of the object, and the generated bounding box information supports more accurate object recognition, tracking, and interaction with the environment in subsequent processing. Analyzing the motion path of the object recognition bounding box over consecutive frames captures the motion track and dynamic changes of the object in space; the generated motion path data help in understanding the motion pattern, speed, and direction of the object and play a key role in scene analysis and decision making. Acquiring sensor data along the object motion track path enables real-time data collection related to the object's motion, yields more comprehensive and accurate sensor data, and provides a finer information basis for subsequent analysis and processing. Importing the standard sensor acquisition data into the three-dimensional object model for data space registration ensures consistency between the sensor data and the spatial position of the object model, and the generated sensor registration data allow the information from different sensors to be analyzed jointly in the same coordinate system, improving the consistency of the data as a whole. Image depth hole filling on the sensor registration data fills data gaps caused by sensor limitations or other factors, and the generated three-dimensional object hole filling map provides more complete and coherent object surface information, enhancing spatial perception of the object. Virtual object reality positioning based on the three-dimensional object hole filling map accurately places a virtual object in the space of the object model, and the generated virtual object position indicator displays the precise position of the virtual object in the real scene, providing a more intuitive and immersive user experience.
The virtual object position indicator guides where virtual elements are placed in the real world, which means that virtual elements can be dynamically fused into the actual scene according to the user's environment, strengthening the sense of fusion between the virtual and the real. For example, the virtual elements may be adjusted in real time based on the user's position and orientation so that they blend into the environment more naturally. Once the virtual elements are correctly fused with the real world, a dynamic virtual element synthetic diagram can be generated; this image contains the fused picture of the virtual elements and the real world and provides the basis for subsequent analysis and processing, and through this step the user can see the interaction effect between the virtual elements and the real scene. In the dynamic virtual element synthetic diagram, user characteristic recognition, covering user behaviors and gestures, can be performed with computer vision techniques, which helps the system understand the user's behavior patterns and needs and thus provide a more personalized and intelligent interaction experience; for example, a user's gesture or action can be recognized to adjust the display or operation of the virtual elements. Therefore, by performing real-time three-dimensional modeling of the environment, accurate perception of the actual environment, and accurate positioning of the three-dimensional object model from the object images, the invention improves the processing speed and recognition accuracy of modeling and the fluency of interaction.
In an embodiment of the present invention, referring to fig. 1, a schematic flow chart of the steps of the intelligent interactive deep display method based on AR technology is provided. In this example, the intelligent interactive deep display method based on AR technology includes the following steps:
Step S1: acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
In the embodiment of the invention, the environment image is captured using a camera, a depth camera, or other sensor devices, and environment image data is acquired in real time by means of an image processing library (such as OpenCV) or the API of the image acquisition device. Filters (e.g., Gaussian filters) are used to reduce image noise; parameters such as contrast and brightness are adjusted to improve image quality; image distortions such as lens distortion are corrected; and the image is resized as needed for subsequent processing. Different objects in the image are segmented out using image segmentation algorithms (e.g., pixel-based segmentation, semantic segmentation, instance segmentation), with object segmentation performed by a deep learning model (e.g., Mask R-CNN, U-Net). Based on the segmented object region image, three-dimensional object modeling is performed using three-dimensional reconstruction techniques (such as multi-view geometry, structured light, and deep learning); the object region image is registered with the environment map, and the object modeling result is mapped into three-dimensional space. The modeled object region images are then combined into a three-dimensional model, and a complete three-dimensional object model is generated from the geometric attributes and texture information of the object.
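As a non-limiting illustration of the acquisition and preprocessing described above, the following Python sketch uses OpenCV; the camera index, filter kernel, contrast/brightness gains, and calibration placeholders are illustrative assumptions rather than values prescribed by the patent.

```python
import cv2

# Hypothetical calibration results; in practice these would come from cv2.calibrateCamera.
CAMERA_MATRIX = None   # e.g. a 3x3 numpy array
DIST_COEFFS = None     # e.g. a 1x5 numpy array

def acquire_standard_environment_image(device_index=0, target_size=(1280, 720)):
    """Capture one real-time frame and apply the preprocessing named in step S1."""
    cap = cv2.VideoCapture(device_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("camera read failed")

    # Denoise with a Gaussian filter.
    frame = cv2.GaussianBlur(frame, (5, 5), 0)

    # Simple contrast/brightness adjustment (alpha = contrast gain, beta = brightness offset).
    frame = cv2.convertScaleAbs(frame, alpha=1.2, beta=10)

    # Correct lens distortion when calibration data is available.
    if CAMERA_MATRIX is not None and DIST_COEFFS is not None:
        frame = cv2.undistort(frame, CAMERA_MATRIX, DIST_COEFFS)

    # Resize to the resolution expected by downstream segmentation and modeling.
    return cv2.resize(frame, target_size)
```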
Step S2: detecting the target position of the three-dimensional object model to generate object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
In the embodiment of the invention, target position detection is carried out on the three-dimensional object model using a target detection algorithm (such as YOLO, Fast R-CNN, or SSD). Based on the three-dimensional object model that has been built, conversion to an appropriate representation (e.g., a point cloud or voxel grid) may be required for detection. The detected target position is converted into a suitable coordinate system so that it can be associated with the environment or other objects, and a recognition bounding box is constructed from the detected target position data. The bounding box may be a cube, a rectangle, or any other suitable shape that accurately represents the position and size of the object. The motion path of the object is analyzed based on the object position data of successive frames, and motion tracking algorithms (e.g., Kalman filters, optical flow methods, or methods based on inter-frame differences) may be used to estimate the motion of the object. By analyzing the change in the object's position between consecutive frames, its motion path information can be obtained; this information is recorded and stored as data comprising the object's position coordinates, velocity, acceleration, and other motion parameters, together with related information such as timestamps. The generated motion path data may be used for subsequent behavior analysis, path planning, and the like.
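The following sketch illustrates, under stated assumptions, how per-frame detection results could be accumulated into object motion path data (timestamped positions plus a finite-difference velocity); the `detector` callable and its coordinate convention are hypothetical stand-ins for whichever detection algorithm is chosen.

```python
import time
from dataclasses import dataclass, field

@dataclass
class MotionPathRecord:
    """Per-object motion path data: timestamped positions plus derived velocity."""
    positions: list = field(default_factory=list)   # entries (t, x, y, z)

    def add(self, t, x, y, z):
        self.positions.append((t, x, y, z))

    def velocity(self):
        # Finite-difference velocity between the last two samples.
        if len(self.positions) < 2:
            return None
        (t0, x0, y0, z0), (t1, x1, y1, z1) = self.positions[-2:]
        dt = (t1 - t0) or 1e-6
        return ((x1 - x0) / dt, (y1 - y0) / dt, (z1 - z0) / dt)

def track_object(frames, detector, object_id=0):
    """Run a detector on successive frames and accumulate object motion path data.

    `detector(frame)` is assumed to return a dict mapping object ids to
    (x, y, z) centre coordinates in the model coordinate system.
    """
    path = MotionPathRecord()
    for frame in frames:
        detections = detector(frame)
        if object_id in detections:
            x, y, z = detections[object_id]
            path.add(time.time(), x, y, z)
    return path
```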
Step S3: sensor data acquisition is carried out on the three-dimensional object model through the object motion track path data to obtain standard sensor acquisition data; the standard sensor acquisition data are imported into the three-dimensional object model for data space registration to generate sensor registration data; image depth hole filling is performed on the sensor registration data to generate a three-dimensional object hole filling map; virtual object reality positioning is carried out according to the three-dimensional object hole filling map to generate a virtual object position indicator;
In the embodiment of the invention, data acquisition is performed on the motion track of the three-dimensional object model using sensors (such as a camera, a lidar, or a depth camera). The motion track data include the object's position, velocity, and acceleration, together with related information such as timestamps. The acquired sensor data are processed and converted into a standard format and representation to ensure accuracy and consistency and to facilitate subsequent processing and analysis. The standard sensor acquisition data are imported into the three-dimensional object model for data space registration, whose purpose is to align the acquired sensor data with the three-dimensional object model for subsequent analysis and processing; sensor registration data are generated from the result of the registration and include the position and posture information of the sensor data in the coordinate system of the three-dimensional object model. Image depth hole filling is then performed on the sensor registration data: it fills the holes or missing parts in the depth image caused by sensor acquisition so as to obtain complete three-dimensional object surface information. The data after hole filling are integrated to generate a three-dimensional object hole filling map, which contains the surface topology and depth information of the object and is used for subsequent virtual object positioning and rendering. Based on the generated three-dimensional object hole filling map, the virtual object is positioned in reality; the positioning result may be the position and posture of the virtual object within the object model, and a virtual object position indicator is generated from this result. The indicator may take the form of an arrow, a label, a light shadow, or the like, and indicates the position and direction of the virtual object in three-dimensional space.
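A minimal sketch of the image depth hole filling step, assuming a millimetre-scale uint16 depth map in which zeros mark missing measurements and using OpenCV inpainting as one possible filling method (the patent does not mandate a specific algorithm):

```python
import cv2
import numpy as np

def fill_depth_holes(depth_map, max_depth=5000):
    """Fill zero-valued holes in a registered depth map.

    `depth_map` is assumed to be a uint16 array in millimetres, with 0 marking
    missing measurements, as typical depth cameras report.
    """
    mask = (depth_map == 0).astype(np.uint8)          # 1 where depth is missing
    # cv2.inpaint expects an 8-bit image, so work on a scaled copy.
    depth_8u = np.clip(depth_map.astype(np.float32) / max_depth * 255, 0, 255).astype(np.uint8)
    filled_8u = cv2.inpaint(depth_8u, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
    # Scale back to the original depth range; filled pixels are interpolated estimates.
    filled = filled_8u.astype(np.float32) / 255 * max_depth
    return np.where(mask == 1, filled, depth_map.astype(np.float32))
```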
Step S4: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
In the embodiment of the invention, dynamic virtual elements are fused in real time at the corresponding positions using the position and direction information provided by the virtual object position indicator; the virtual elements may take the form of images, videos, animations, and the like, and are configured according to the needs of the user and the scene. The virtual elements fused in real time are composited with the background to generate a dynamic virtual element synthetic diagram, which may be a video stream or a static image and shows the fusion effect of the virtual elements and the real scene. User characteristic recognition, including user behaviors and gestures, is performed on the generated dynamic virtual element synthetic diagram. The behavior and posture of the user in the scene are analyzed using computer vision techniques such as target detection and pose estimation, and the user's behavior and gesture recognition data are obtained from the result of the user feature recognition; the behavior recognition data include information such as the user's actions and postures, and the gesture recognition data include information such as the user's gestures and directions. Based on the user behavior recognition data and the user gesture recognition data, user feedback monitoring is performed: whether the user's behavior and gestures are consistent with what is expected is analyzed, and feedback information such as the user's interaction intention and satisfaction is detected. An intelligent user interaction display interface is then generated according to the user feedback monitoring result; the interface may be a virtual interface or a physical interface, so as to meet the user's needs and provide a better interaction experience.
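As one possible realization of real-time virtual element fusion, the sketch below alpha-blends an RGBA virtual element onto the camera frame at a 2-D position given by the virtual object position indicator; the function names and the 2-D simplification are illustrative assumptions (a full AR pipeline would also apply the pose and scale carried by the indicator).

```python
import numpy as np

def overlay_virtual_element(frame, element_rgba, indicator_xy):
    """Blend an RGBA virtual element onto a BGR camera frame at (x, y)."""
    x, y = indicator_xy
    h, w = element_rgba.shape[:2]
    # Clip the paste region to the frame bounds.
    y1, y2 = max(0, y), min(frame.shape[0], y + h)
    x1, x2 = max(0, x), min(frame.shape[1], x + w)
    if y1 >= y2 or x1 >= x2:
        return frame
    elem = element_rgba[y1 - y:y2 - y, x1 - x:x2 - x]
    alpha = elem[:, :, 3:4].astype(np.float32) / 255.0   # per-pixel opacity
    roi = frame[y1:y2, x1:x2].astype(np.float32)
    blended = alpha * elem[:, :, :3].astype(np.float32) + (1 - alpha) * roi
    frame[y1:y2, x1:x2] = blended.astype(np.uint8)
    return frame
```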
Preferably, step S1 comprises the steps of:
Step S11: acquiring a real-time environment image by using a camera;
Step S12: performing image brightness enhancement on the real-time environment image to generate a real-time environment brightness enhancement image; performing image filtering on the real-time environment brightness enhancement image to generate a real-time environment filtering image; carrying out local neighborhood preprocessing on the real-time environment filtering image to generate a standard environment image;
Step S13: carrying out environmental object recognition on the standard environmental image to generate environmental object boundary data; image segmentation is carried out on the standard environment image according to the environment object boundary data, and an environment object area image is generated;
step S14: and modeling the environmental object on the environmental object area image, thereby generating a three-dimensional object model.
The invention obtains the real-time environment image by using the camera, and provides real-time visual information for subsequent processing. The brightness enhancement, filtering and local neighborhood preprocessing are carried out on the real-time environment image, so that the image quality can be improved, the noise and interference can be reduced, and the accuracy and stability of the subsequent processing steps can be improved. By carrying out object recognition and segmentation on the standard environment image, different objects or object areas in the image can be separated, and accurate object boundary information is provided for further analysis and processing. Based on the image of the area of the environmental object, three-dimensional modeling of the environmental object can be performed, so that a virtual model of the object in the real environment is generated, the model can be used in the fields of virtual reality, augmented reality, object recognition, tracking and the like, and more accurate and reliable data support is provided.
In the embodiment of the invention, the real-time image of the environment is captured using a camera; this can be realized with common camera equipment such as a webcam or a smartphone camera. Brightness enhancement, filtering, and local neighborhood preprocessing are carried out on the real-time environment image to improve image quality and reduce noise: brightness enhancement may use techniques such as histogram equalization, image filtering may use filters such as Gaussian or median filters, and local neighborhood preprocessing may be realized by edge-preserving filtering and similar methods. Environmental object recognition is performed on the standard environment image, for which deep learning techniques such as convolutional neural networks (CNNs) can be used. The environmental object boundary data can be generated by a target detection algorithm such as YOLO or Fast R-CNN, and the image segmentation can adopt semantic or instance segmentation techniques such as Mask R-CNN. Three-dimensional modeling of the environmental object region image can use computer vision and graphics techniques, or three-dimensional reconstruction methods that combine multi-view images, such as structured light and stereoscopic vision; the three-dimensional information of the environmental object can also be obtained by stitching multiple images or by laser scanning.
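For the instance segmentation mentioned above, a sketch using a pre-trained Mask R-CNN from torchvision is shown below (assuming a recent torchvision that supports the `weights` argument); the score and mask thresholds are illustrative, and a BGR input from OpenCV is assumed.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

def segment_environment_objects(image_bgr, score_threshold=0.7):
    """Instance segmentation of the standard environment image with a pre-trained
    Mask R-CNN; the patent leaves the exact segmentation model open."""
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    # torchvision detection models expect RGB tensors scaled to [0, 1].
    image_rgb = image_bgr[:, :, ::-1].copy()
    with torch.no_grad():
        outputs = model([to_tensor(image_rgb)])[0]
    keep = outputs["scores"] > score_threshold
    # Each mask is a soft [1, H, W] map; threshold it to a binary object region.
    masks = (outputs["masks"][keep][:, 0] > 0.5).cpu().numpy()
    boxes = outputs["boxes"][keep].cpu().numpy()
    return masks, boxes   # object region masks and boundary data
```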
Preferably, step S14 includes the steps of:
step S141: extracting object feature information from the environmental object region image to generate environmental object feature information data, wherein the environmental object feature information data comprises environmental object color data, environmental object shape data and environmental object texture data;
step S142: performing illumination condition analysis on the image of the environment object area to generate environment light condition data, wherein the environment illumination condition data comprises light source position data and light source intensity data;
step S143: carrying out environment object parameter virtualization on environment object color data, environment object shape data and environment object texture data according to the light source position data and the light source intensity data to generate object virtualization data; performing three-dimensional point cloud conversion on the object virtualization data by utilizing a three-dimensional point cloud technology to generate three-dimensional point cloud object data;
Step S144: performing light source range influence analysis on the light source position data and the light source intensity data to generate light source range influence analysis data; performing background color difference analysis on the three-dimensional point cloud object data through the light source range influence analysis data to generate object background color difference data; and carrying out three-dimensional reconstruction on the three-dimensional point cloud object data based on the object background color difference data to generate a three-dimensional object model.
The invention is helpful to accurately represent the appearance characteristics of the object in the three-dimensional object model by extracting the color information of the environmental object. By extracting shape information of the object, the spatial structure and geometry of the object can be presented more accurately. Texture information is important for the surface detail and characteristics of an object, and can improve the reality and detail degree of a model. Determining the position of the light source helps to simulate the real lighting effect in the three-dimensional object model, so that the object presents a realistic appearance in different angles and scenes. The light source intensity influences the details such as shadow and highlight of the object, and is important for the reality and visual effect of the model. By virtualizing parameters of environmental objects, the appearance of the objects can be simulated under different environmental conditions, and the universality of the model is improved. By utilizing the three-dimensional point cloud technology, the surface of the object can be represented more flexibly, and more possibility is provided for subsequent analysis and processing. Determining the influence of the light source range is helpful for understanding the performance of the object under different illumination conditions, and further improves the realism of the model. The background color difference analysis is helpful for processing the color difference between the object and the background and improving the segmentation precision of the object. The three-dimensional reconstruction based on the color difference data can restore the shape and appearance of the object more accurately.
In an embodiment of the invention, an image of an object region in an environment is acquired by using a camera or other image acquisition device. Color information of the object is extracted using image processing techniques such as color space conversion or color histogram analysis. Shape information of the object is extracted from the image using methods such as edge detection, contour extraction, or shape descriptors. Texture information of the object surface is extracted using texture analysis algorithms, such as Local Binary Pattern (LBP) or gray level co-occurrence matrix (GLCM). The illumination in the image, including the position and intensity of the light source, is analyzed using image processing and computer vision techniques. The location of the light source in the image may be determined using a specific light source detection algorithm, such as an edge detection based method or a color feature based method. The intensity of the light source is estimated by analyzing information such as brightness and shadows in the image. And carrying out virtualization processing on object characteristic information such as color, shape, texture and the like according to the illumination condition data so as to simulate the appearance of an object under different illumination conditions. And converting the virtualized object parameters into three-dimensional point cloud data, wherein a point cloud generation algorithm such as nearest neighbor search or multi-view stereo matching can be adopted. The range of the light source is analyzed to determine the influence degree of the illumination on the object, and methods such as ray tracing or radiance analysis can be adopted. Color differences between the object and the background are detected and analyzed to determine the boundary and segmentation of the object, image segmentation and color analysis techniques may be used. And reconstructing the three-dimensional point cloud object data based on the background color difference data, wherein a three-dimensional reconstruction algorithm such as a surface reconstruction or voxel method can be adopted.
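A sketch of the feature extraction in step S141, using OpenCV color histograms, contour-based Hu moments for shape, and a Laplacian-variance texture measure standing in for the LBP/GLCM descriptors mentioned above; the mask is assumed to be a non-empty single-channel uint8 object mask aligned with the region image.

```python
import cv2
import numpy as np

def extract_object_features(region_bgr, mask):
    """Extract the color, shape and texture cues named in step S141."""
    # Color: per-channel histogram restricted to the object mask.
    color_hist = [cv2.calcHist([region_bgr], [c], mask, [32], [0, 256]).flatten()
                  for c in range(3)]
    color_hist = np.concatenate(color_hist)
    color_hist /= color_hist.sum() + 1e-9

    # Shape: largest contour of the mask summarised by Hu moments
    # (assumes the mask contains at least one object region).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    shape_desc = cv2.HuMoments(cv2.moments(largest)).flatten()

    # Texture: variance of the Laplacian as a simple surface-detail measure;
    # LBP or GLCM, as in the embodiment above, are drop-in alternatives.
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    texture_desc = np.array([cv2.Laplacian(gray, cv2.CV_64F).var()])

    return {"color": color_hist, "shape": shape_desc, "texture": texture_desc}
```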
Preferably, step S2 comprises the steps of:
Step S21: extracting object feature vectors of the three-dimensional object model by using a deep learning network to generate the object feature vectors; performing target position detection on the object feature vector based on a target detection algorithm to generate object target position detection data;
Step S22: performing position curvature analysis on the object target position detection data to generate object target position curvature data; comparing the object target position curvature data with a preset standard curvature threshold, and marking the corresponding object target position detection data as object position vertex data when the object target position curvature data is larger than or equal to the preset standard curvature threshold; when the curvature data of the object target position is smaller than a preset standard curvature threshold value, eliminating the corresponding object target position detection data;
step S23: performing recognition boundary frame construction on the three-dimensional object model based on the object position vertex data to generate an object recognition boundary frame;
Step S24: performing target matching on the three-dimensional object model to generate object motion trail data; performing object continuous frame motion estimation on the three-dimensional object model through the object recognition boundary frame and the object motion track data to generate an object motion vector set; and constructing an object motion path based on the object motion vector set, and generating object motion path data.
According to the invention, the feature vector of the three-dimensional object model is extracted by utilizing the deep learning network, so that the key information of the object can be captured, and the accuracy of object identification is improved. The position of the object can be effectively detected by combining the target detection algorithm, so that the system is more robust and accurate in processing complex scenes. Through the position curvature analysis, the system can more comprehensively understand the shape of the object, so that the understanding and the recognition of the characteristics of the object are improved in subsequent processing. According to the comparison of the curvature data and a preset standard curvature threshold value, the self-adaptive marking and the rejection of the object position data are realized, and the data quality is improved. And carrying out boundary frame construction based on object position vertex data, which is beneficial to accurately identifying the boundary of an object and providing clear object region information for subsequent processing. By utilizing the target matching technology, the position of the object in the continuous frames can be tracked, and the motion trail data of the object can be generated. And combining with the object recognition boundary box, the motion estimation of the object in the continuous frames is realized, and the understanding and analysis of the dynamic behavior of the object are improved. By constructing the object motion vector set, the system can generate more complete object motion path data, and provide more information for analysis of object behaviors.
As an example of the present invention, referring to fig. 2, the step S2 in this example includes:
Step S21: extracting object feature vectors of the three-dimensional object model by using a deep learning network to generate the object feature vectors; performing target position detection on the object feature vector based on a target detection algorithm to generate object target position detection data;
In embodiments of the present invention, an appropriate deep learning model is selected, such as a convolutional neural network (CNN), a residual network (ResNet), or a model specific to three-dimensional data such as PointNet. The input three-dimensional object model is preprocessed, including scaling, rotation, or padding operations, to ensure consistency and manageability of the input data. Features of the preprocessed three-dimensional object model are extracted with the deep learning network, mapping the model into a high-dimensional feature space that captures the key features of the object. If no off-the-shelf model is available, training or fine-tuning on the specific task and dataset is required to improve the performance of the feature extraction network. A target detection algorithm suitable for three-dimensional object detection is then selected, such as a deep-learning-based extension of a two-dimensional framework (e.g., Fast R-CNN, YOLO, SSD) or a dedicated three-dimensional detection algorithm (e.g., 3DSSD, MV3D). If no labeled dataset exists, a set of training data containing object position information needs to be labeled manually for training the target detection model. The labeled dataset is used to train the detection model to learn the position and shape characteristics of objects. The trained model then performs target detection on the feature vectors of the three-dimensional object model to determine the position and bounding box of the object.
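One way to realize the feature extraction of step S21 is sketched below with a pre-trained ResNet backbone applied to a rendered view of the three-dimensional object model (assuming a recent torchvision); the rendering step and input format are assumptions, and a point-cloud network such as PointNet could be substituted.

```python
import torch
import torchvision
from torchvision import transforms

def extract_object_feature_vector(rendered_view):
    """Map a rendered RGB view of the three-dimensional object model to a feature
    vector with a pre-trained ResNet backbone, standing in for the unspecified
    deep learning network of step S21."""
    backbone = torchvision.models.resnet50(weights="DEFAULT")
    backbone.fc = torch.nn.Identity()        # drop the classification head
    backbone.eval()
    preprocess = transforms.Compose([
        transforms.ToTensor(),               # HWC uint8 -> CHW float in [0, 1]
        transforms.Resize((224, 224)),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    with torch.no_grad():
        features = backbone(preprocess(rendered_view).unsqueeze(0))
    return features.squeeze(0)               # 2048-dimensional object feature vector
```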
Step S22: performing position curvature analysis on the object target position detection data to generate object target position curvature data; comparing the object target position curvature data with a preset standard curvature threshold, and marking the corresponding object target position detection data as object position vertex data when the object target position curvature data is larger than or equal to the preset standard curvature threshold; when the curvature data of the object target position is smaller than a preset standard curvature threshold value, eliminating the corresponding object target position detection data;
In the embodiment of the present invention, the curvature is first calculated for each item of object target position detection data. The curvature may be calculated by various methods, such as a curvature formula from differential geometry or a fitted-curve method. The calculated curvature value is associated with the corresponding object target position detection data to generate object target position curvature data. A standard curvature threshold is preset according to the specific requirements and application scenario, and the threshold can be adjusted according to the actual situation. The object target position curvature data are compared with the preset standard curvature threshold. Data whose curvature is greater than or equal to the threshold are marked as object position vertex data, indicating that those points are edges or corner points of the object. Data whose curvature is smaller than the threshold are eliminated, because those points lie on flat surfaces of the object or are noise.
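A sketch of the curvature thresholding in step S22, using a simple discrete turning-angle estimate over consecutive detected positions; both the curvature approximation and the threshold value are illustrative assumptions.

```python
import numpy as np

def filter_position_vertices(positions, curvature_threshold=0.5):
    """Keep only high-curvature detections as object position vertex data."""
    positions = np.asarray(positions, dtype=float)     # shape (N, 3)
    vertices = []
    for i in range(1, len(positions) - 1):
        a = positions[i] - positions[i - 1]
        b = positions[i + 1] - positions[i]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        if denom < 1e-9:
            continue
        # Turning angle between consecutive segments, normalised by mean segment length.
        cos_angle = np.clip(np.dot(a, b) / denom, -1.0, 1.0)
        curvature = np.arccos(cos_angle) / (0.5 * (np.linalg.norm(a) + np.linalg.norm(b)))
        if curvature >= curvature_threshold:            # edge or corner point: keep
            vertices.append(positions[i])
        # Points below the threshold are discarded as flat-surface or noise samples.
    return np.array(vertices)
```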
Step S23: performing recognition boundary frame construction on the three-dimensional object model based on the object position vertex data to generate an object recognition boundary frame;
In the embodiment of the present invention, the object position vertex data, i.e. the data marked as object position vertices in step S22, are acquired; they represent the positions of the edges or corners of the object. The bounding box may take various shapes, such as a rectangle or a cube, and its shape is determined according to the specific requirements and scenario. The position of the bounding box is determined from the object position vertex data, either by computing the minimum bounding box of the vertex data or from the distribution characteristics of the vertices. Based on the determined shape, size, and location, an object recognition bounding box is generated and associated with the corresponding three-dimensional object model. The generated object recognition bounding box may be used as output for subsequent object recognition, tracking, or other applications.
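A minimal sketch of step S23 for the axis-aligned case, deriving the recognition bounding box directly from the object position vertex data; the dictionary layout is an illustrative choice.

```python
import numpy as np

def build_recognition_bounding_box(vertex_data):
    """Construct an axis-aligned recognition bounding box from object position
    vertex data (one of the simple box shapes step S23 allows)."""
    pts = np.asarray(vertex_data, dtype=float)          # shape (N, 3)
    box_min = pts.min(axis=0)
    box_max = pts.max(axis=0)
    return {
        "min_corner": box_min,
        "max_corner": box_max,
        "center": (box_min + box_max) / 2.0,
        "size": box_max - box_min,
    }
```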
Step S24: performing target matching on the three-dimensional object model to generate object motion trail data; performing object continuous frame motion estimation on the three-dimensional object model through the object recognition boundary frame and the object motion track data to generate an object motion vector set; and constructing an object motion path based on the object motion vector set, and generating object motion path data.
In embodiments of the present invention, features, such as shape, color, texture, etc., are extracted from a three-dimensional object model or projection thereof. From the extracted features, descriptors describing the object, such as feature vectors or hash codes, are generated. Matching the object in the current frame with the object in the previous frame by adopting a matching algorithm (such as nearest neighbor matching, feature-based matching and the like) so as to determine the motion trail of the object. By target matching, the position and pose of the object in successive frames are tracked. And recording the position, the gesture and other information of the object in different frames to generate the motion trail data of the object. And estimating the motion vector of the object between two frames according to the position and posture data of the object in the continuous frames, and describing the motion condition of the object. And extracting the motion vector of the object from the result of the motion estimation of the continuous frames of the object. The extracted motion vectors are combined into a set as a representation of the object motion vector set. And planning a motion path of the object according to the object motion vector set, and considering smoothness and continuity of motion. And generating motion path data of the object according to the planned path, wherein the motion path data comprises a moving track and posture change of the object in space.
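The sketch below shows one way to derive the object motion vector set from per-frame positions obtained by target matching, and to integrate it back into a motion path; the data layout is an assumption.

```python
import numpy as np

def build_motion_vector_set(track_positions):
    """Turn per-frame object positions (from target matching) into the set of
    frame-to-frame motion vectors used in step S24."""
    pts = np.asarray(track_positions, dtype=float)       # shape (N, 3), in frame order
    # One motion vector per pair of consecutive frames.
    return pts[1:] - pts[:-1]

def build_motion_path(motion_vectors, start_position):
    """Integrate the motion vector set back into an object motion path."""
    path = np.cumsum(np.vstack([start_position, motion_vectors]), axis=0)
    return path                                           # shape (N, 3)
```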
Preferably, step S24 includes the steps of:
Step S241: performing frame shooting on the three-dimensional object model based on a preset time interval to obtain a three-dimensional object frame shooting image, wherein the three-dimensional object frame shooting image comprises an initial frame shooting image and an end frame shooting image; the initial frame shooting image and the end frame shooting image are subjected to image superposition to generate a frame shooting superposition image;
Step S242: carrying out object model tracking matching on the frame shooting coincident images to generate object motion trail data; performing continuous frame motion estimation on the three-dimensional object model through the object motion track data and the object identification boundary frame to generate an object motion vector set;
Step S243: performing data dimension reduction on the object motion vector set to generate an object continuous motion track vector; according to the Kalman filter, carrying out motion trail updating on the continuous motion trail vector of the object to generate an updating vector of the continuous motion trail of the object;
Step S244: and constructing an object motion trail path of the frame shooting coincident image through the object continuous motion trail update vector, and generating object motion trail path data.
The invention ensures the synchronism of the data through frame shooting based on a preset time interval. Image superposition helps reduce shooting inconsistency caused by object movement and improves the precision of subsequent processing; the resulting frame-shot coincident image provides more consistent and reliable input for subsequent object model tracking. Object model tracking matching on the frame-shot coincident images generates object motion track data, so that the position and pose of an object in continuous frames can be accurately tracked, providing the basic data of the object's motion. The generated object motion vector set supplies the information needed for subsequent track path construction. Data dimension reduction on the object motion vector set simplifies the data structure and reduces computational complexity, while Kalman filtering of the motion track vector enhances robustness to noise; the resulting object continuous motion trail update vector provides smoother and more accurate motion information. Constructing the track path of the frame-shot coincident image from the object continuous motion trail update vector yields the motion path of the object in three-dimensional space, and the generated object motion track path data provide key motion information for further analysis, visualization, control, and other applications.
As an example of the present invention, referring to fig. 3, the step S24 in this example includes:
Step S241: performing frame shooting on the three-dimensional object model based on a preset time interval to obtain a three-dimensional object frame shooting image, wherein the three-dimensional object frame shooting image comprises an initial frame shooting image and an end frame shooting image; the initial frame shooting image and the end frame shooting image are subjected to image superposition to generate a frame shooting superposition image;
In the embodiment of the invention, the three-dimensional object is frame-shot using a video camera or sensor; suitable equipment includes an ordinary camera, a depth camera, a laser scanner, or other three-dimensional imaging devices. Images or point-cloud data of the three-dimensional object are captured continuously over a preset time interval, the choice of which depends on the required motion resolution and real-time requirements. Within each time interval, the captured data are separated into an initial frame and an end frame: the initial frame is the first frame of each interval and the end frame is the last. Image processing and registration are performed on the initial and end frame shot images to ensure their consistency in space and time; this involves computer vision techniques such as image calibration, registration, feature extraction, and matching, with common methods including feature point matching, region matching, and optical flow estimation. In the overlapping image areas, a blending technique (e.g., weighted averaging) is used to fuse the two images so that the transition is natural. The images after superposition processing are combined or stored as the frame-shot coincident image, which serves as input to subsequent steps such as object motion tracking and trajectory estimation. Parameters of the frame shooting and image superposition processes, including camera parameters, illumination conditions, and object surface characteristics, are adjusted and optimized to ensure that the generated coincident image has good quality and accuracy.
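A minimal sketch of the registration and weighted-average blending described above, using ORB feature matching and a RANSAC homography in OpenCV; the file names and parameter values are placeholders, and a real deployment would tune them to the camera setup:

```python
import cv2
import numpy as np

initial = cv2.imread("initial_frame.png")   # placeholder file names
end = cv2.imread("end_frame.png")

# Detect and match features between the two frames
orb = cv2.ORB_create(1000)
k1, d1 = orb.detectAndCompute(initial, None)
k2, d2 = orb.detectAndCompute(end, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)[:100]

# Estimate the homography that maps the end frame onto the initial frame
src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp the end frame and blend with a weighted average
h, w = initial.shape[:2]
end_aligned = cv2.warpPerspective(end, H, (w, h))
overlay = cv2.addWeighted(initial, 0.5, end_aligned, 0.5, 0)  # frame-shot coincident image
cv2.imwrite("frame_overlay.png", overlay)
```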
Step S242: carrying out object model tracking matching on the frame shooting coincident images to generate object motion trail data; performing continuous frame motion estimation on the three-dimensional object model through the object motion track data and the object identification boundary frame to generate an object motion vector set;
In the embodiment of the invention, the objects in the coincident image are tracked and matched using computer vision techniques. Objects may be detected and tracked using a target detection algorithm (e.g., YOLO or Fast R-CNN) or a target tracking algorithm (e.g., Kalman filtering or optical flow tracking). For each object, the tracking algorithm provides information such as position and velocity, thereby generating the object's motion trajectory data. The tracked position information is combined to form the object's movement track data, which can be expressed as a time series recording the position and velocity of the object at each time point and used for subsequent motion estimation and analysis. Using the generated object motion trajectory data and the object recognition bounding box, the motion of the object is estimated between successive frames; motion estimation algorithms (e.g., optical flow, sparse feature matching, or convolutional neural networks) may be used to estimate the displacement and rotation of objects between frames. Combining the motion trajectory data with the object's recognition bounding box at each point in time allows the motion to be estimated more accurately. The motion estimated over successive frames is converted into a set of object motion vectors; each vector contains displacement, velocity, acceleration, and similar information describing the object's motion state in space, and the set can be used to analyze the object's motion pattern, predict its future position, and so on.
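A minimal sketch of the continuous-frame motion estimation, using dense Farneback optical flow averaged inside each object's 2-D bounding box; the frame file names and box coordinates are placeholders, and the patent does not prescribe this particular estimator:

```python
import cv2
import numpy as np

prev_gray = cv2.cvtColor(cv2.imread("frame_t0.png"), cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(cv2.imread("frame_t1.png"), cv2.COLOR_BGR2GRAY)

# Dense optical flow between consecutive frames
flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# One averaged (dx, dy) motion vector per object bounding box (x, y, w, h)
boxes = [(40, 60, 80, 120), (200, 50, 64, 64)]   # placeholder boxes
motion_vectors = []
for (x, y, w, h) in boxes:
    patch = flow[y:y + h, x:x + w]               # (h, w, 2) displacement field
    motion_vectors.append(patch.reshape(-1, 2).mean(axis=0))

motion_vectors = np.array(motion_vectors)        # object motion vector set (2-D part)
```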
Step S243: performing data dimension reduction on the object motion vector set to generate an object continuous motion track vector; according to the Kalman filter, carrying out motion trail updating on the continuous motion trail vector of the object to generate an updating vector of the continuous motion trail of the object;
In the embodiment of the invention, a dimension reduction technique (such as Principal Component Analysis (PCA) or Linear Discriminant Analysis (LDA)) is used to reduce the dimensionality of the object motion vector set. The purpose of dimension reduction is to lower the dimensionality of the vector set while preserving the main features of the data for subsequent trajectory update processing; the result is the object's continuous motion trajectory vectors, each containing information such as the object's position and velocity in space. The Kalman filter is a recursive filter for estimating the state of a system and is particularly suitable for continuous state estimation in the presence of noise. The dimension-reduced continuous motion trajectory vectors are input into the Kalman filter, which uses the current measurement and the previous estimate to compute the system state, thereby providing a more accurate state estimate. At each time step, the new measurement is fed into the Kalman filter, which produces an updated object state estimate. Through continuous updating, the object continuous motion trajectory update vector is generated; it reflects the motion trajectory of the object over time and accounts for the effects of measurement noise and system dynamics.
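A minimal sketch combining PCA dimension reduction with a constant-velocity Kalman filter over the reduced trajectory component; the matrix sizes, noise levels, and random input are illustrative choices, not values from the embodiment:

```python
import numpy as np
from sklearn.decomposition import PCA

motion_vectors = np.random.rand(50, 6)            # 50 frames, 6-D motion state
reduced = PCA(n_components=2).fit_transform(motion_vectors)

# Constant-velocity Kalman filter on the first principal component
F = np.array([[1.0, 1.0], [0.0, 1.0]])            # state transition (pos, vel)
Hm = np.array([[1.0, 0.0]])                       # we observe position only
Q = np.eye(2) * 1e-3                              # process noise
R = np.array([[1e-1]])                            # measurement noise
x = np.zeros((2, 1))
P = np.eye(2)

updated_track = []
for z in reduced[:, 0]:
    x = F @ x                                     # predict
    P = F @ P @ F.T + Q
    y = np.array([[z]]) - Hm @ x                  # update with new measurement
    S = Hm @ P @ Hm.T + R
    K = P @ Hm.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(2) - K @ Hm) @ P
    updated_track.append(float(x[0, 0]))          # smoothed trajectory value
```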
Step S244: and constructing an object motion trail path of the frame shooting coincident image through the object continuous motion trail update vector, and generating object motion trail path data.
In the embodiment of the invention, the object motion vector set is obtained from the frame-shot image sequence using a motion detection algorithm or an optical flow algorithm. Data dimension reduction is then applied to the vector set, for example with Principal Component Analysis (PCA), to reduce the data dimensionality while keeping the main motion information. Based on the dimension-reduced data, constructing the continuous motion track vectors involves interpolating or smoothing the motion vectors to obtain more continuous motion information. The continuous motion trajectory vector is updated with a Kalman filter, which helps estimate the actual motion state of the object and, by accounting for measurement error, improves the accuracy of the motion trail. From the filter output, the object continuous motion trajectory update vector is generated, containing the current position and velocity of the object. The object motion trajectory path is then constructed on the frame-shot coincident images using the successive trajectory update vectors, for example by drawing a trajectory line on the images or generating an image sequence representing the motion. Finally, the constructed path is converted into a data format, e.g., saved as a coordinate sequence or related data structure, for further analysis, visualization, or integration into other systems.
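A minimal sketch of the path construction, drawing the filtered positions as a polyline on the frame-shot coincident image and saving the path as a coordinate sequence; the point values and file names are placeholders:

```python
import json
import cv2
import numpy as np

overlay = cv2.imread("frame_overlay.png")              # frame-shot coincident image
track_points = np.array([[40, 60], [55, 72],
                         [70, 90], [88, 110]], np.int32)  # filtered positions

# Draw the trajectory line on the coincident image
cv2.polylines(overlay, [track_points.reshape(-1, 1, 2)],
              isClosed=False, color=(0, 255, 0), thickness=2)
cv2.imwrite("trajectory_overlay.png", overlay)

# Save the object motion trajectory path data as a coordinate sequence
with open("object_motion_path.json", "w") as f:
    json.dump(track_points.tolist(), f)
```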
Preferably, step S3 comprises the steps of:
Step S31: setting a sensor for the three-dimensional object model through object motion track path data, and acquiring data by using the sensor to obtain sensor acquisition data; carrying out data preprocessing on the sensor acquisition data to generate standard sensor acquisition data, wherein the data preprocessing comprises data denoising, data calibration and data standardization;
Step S32: importing the acquired data of the standard sensor into a three-dimensional object model for data space registration to generate sensor registration data; carrying out data set division on the sensor registration data to generate a model training set and a model testing set; model training is carried out on the model training set by using a support vector machine algorithm, and a three-dimensional object depth training model is generated; performing model test on the three-dimensional object depth training model according to the model test set, so as to generate a three-dimensional object depth prediction model;
Step S33: the sensor registration data are imported into a three-dimensional object depth prediction model to carry out image depth fusion, and a three-dimensional object depth image is generated; performing depth map cavity filling on the three-dimensional object depth image to generate a three-dimensional object cavity filling map;
step S34: virtual object reality positioning is carried out according to the three-dimensional object cavity filling map, and virtual object reality position positioning data are generated; pointer setting is performed on the three-dimensional object model based on the virtual object reality position positioning data, thereby generating a virtual object position pointer.
According to the invention, the motion trail of the object can be more accurately known through the motion trail path data of the object, and a reliable data base is provided for the subsequent steps. The preprocessing steps such as data denoising, data calibration and data standardization can improve the quality of data acquired by the sensor, reduce noise and errors, and further improve the accuracy of subsequent model training and prediction. The model training set is trained by using a support vector machine algorithm, so that a model aiming at three-dimensional object depth prediction can be constructed, and the model can effectively predict the depth information of an object. Based on a model trained by a support vector machine algorithm, depth prediction can be performed on sensor acquisition data, so that a depth image of a three-dimensional object is generated. And filling the depth map holes into the depth image of the three-dimensional object, so that the holes in the depth image can be filled, and the integrity and usability of data are improved. Based on the three-dimensional object cavity filling map and the virtual object real position positioning data, a position indicator of the virtual object can be generated, and accurate positioning of the virtual object in the real world is achieved.
As an example of the present invention, referring to fig. 4, the step S3 in this example includes:
Step S31: setting a sensor for the three-dimensional object model through object motion track path data, and acquiring data by using the sensor to obtain sensor acquisition data; carrying out data preprocessing on the sensor acquisition data to generate standard sensor acquisition data, wherein the data preprocessing comprises data denoising, data calibration and data standardization;
In the embodiment of the invention, the type of sensor to be used, such as a camera, a laser radar (LiDAR), or an Inertial Measurement Unit (IMU), is determined, and an appropriate sensor is selected according to the specific requirements. The sensor equipment is installed, positioned, and calibrated so that its position and orientation match the observation requirements of the object's motion track, and is then started to acquire the object's motion trail data. Environmental conditions during data acquisition should be kept favorable so that factors such as illumination and temperature do not introduce interference or errors into the sensor data. A suitable denoising algorithm, such as mean filtering or median filtering, is used to remove noise from the acquired data and improve data quality. The sensor data are then calibrated to eliminate deviations caused by the sensor's inherent error, mounting position offsets, and the like; this calibration involves adjusting the sensor's calibration parameters. Finally, the acquired data are standardized so that they share a uniform scale and range, which facilitates subsequent analysis and processing. The data after denoising, calibration, and standardization are integrated to form the standard sensor acquisition data set.
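A minimal sketch of the preprocessing chain for one sensor channel, with median-filter denoising, a simple offset-and-gain calibration, and z-score standardization; the synthetic readings and calibration constants are illustrative:

```python
import numpy as np
from scipy.signal import medfilt

raw = np.random.rand(500) * 10 + np.random.randn(500) * 0.5   # noisy readings

denoised = medfilt(raw, kernel_size=5)                # data denoising
calibrated = (denoised - 0.12) * 1.03                 # data calibration (offset, gain)
standardized = (calibrated - calibrated.mean()) / calibrated.std()  # data standardization

standard_sensor_data = standardized                   # standard sensor acquisition data
```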
Step S32: importing the acquired data of the standard sensor into a three-dimensional object model for data space registration to generate sensor registration data; carrying out data set division on the sensor registration data to generate a model training set and a model testing set; model training is carried out on the model training set by using a support vector machine algorithm, and a three-dimensional object depth training model is generated; performing model test on the three-dimensional object depth training model according to the model test set, so as to generate a three-dimensional object depth prediction model;
In the embodiment of the invention, the standard sensor acquisition data are registered with the three-dimensional object model to ensure the spatial consistency and accuracy of the data. Common registration methods, such as point cloud registration or feature point matching, can be adopted so that the sensor data agree with the coordinate system of the object model. The registered sensor data set is then divided into a model training set and a model test set, for example with cross-validation or a hold-out split, to keep the training and test data evenly distributed. The model training set is trained using a Support Vector Machine (SVM) algorithm; during training, suitable kernel functions (such as a linear, polynomial, or Gaussian kernel) and parameters are selected for tuning. After training, a three-dimensional object depth training model is generated that can predict the depth information of an object in three-dimensional space from sensor data. The trained model is then tested with the model test set to evaluate its performance on unseen data; evaluation indices may include accuracy, precision, recall, and error analysis of the depth predictions. After testing, model parameters or the optimization algorithm are adjusted according to the evaluation results, finally yielding the three-dimensional object depth prediction model.
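A minimal sketch of the training and test procedure, assuming the registered sensor data have been reduced to per-point feature vectors with known depth labels (the feature dimensionality, kernel parameters, and random data are illustrative); a support-vector regressor stands in for the SVM-based depth model:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

X = np.random.rand(1000, 8)                 # registered sensor features (placeholder)
y = np.random.rand(1000) * 5.0              # ground-truth depth in metres (placeholder)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

model = SVR(kernel="rbf", C=10.0, gamma="scale")
model.fit(X_train, y_train)                 # three-dimensional object depth training model

pred = model.predict(X_test)                # model test on unseen data
print("MAE:", mean_absolute_error(y_test, pred))
```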
Step S33: the sensor registration data are imported into a three-dimensional object depth prediction model to carry out image depth fusion, and a three-dimensional object depth image is generated; performing depth map cavity filling on the three-dimensional object depth image to generate a three-dimensional object cavity filling map;
In the embodiment of the invention, the sensor registration data are input into the three-dimensional object depth prediction model, which predicts the depth information of the object surface from the input sensor data. Various deep learning models, such as Convolutional Neural Networks (CNNs) or Deep Neural Networks (DNNs), may be employed, with the model structure and parameters chosen according to the specific needs. The sensor data are fused with the model's predictions to generate the three-dimensional object depth image. Depth map cavity filling is then performed on the generated depth image to fill missing regions and holes. Various filling algorithms may be used, such as image-segmentation-based or interpolation-based methods, with a suitable algorithm chosen for the situation; during filling, the depth values of neighboring pixels can be used to infer and fill the missing values so as to maintain the continuity and accuracy of the result, and object boundary and semantic information can be incorporated to improve the filling effect. After filling, a three-dimensional object cavity filling map is generated that contains the complete surface depth information of the object with the holes of the original depth image filled. This cavity filling map can serve as input data for subsequent applications such as three-dimensional object recognition, scene reconstruction, and virtual reality.
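A minimal sketch of the cavity filling step, treating zero-valued depth pixels as holes and filling them with OpenCV inpainting, one interpolation-style option among those mentioned above; the depth map here is synthetic:

```python
import cv2
import numpy as np

depth = (np.random.rand(240, 320) * 255).astype(np.uint8)   # synthetic depth image
depth[100:140, 150:200] = 0                                 # simulate a hole region

hole_mask = (depth == 0).astype(np.uint8) * 255             # mark missing pixels
filled = cv2.inpaint(depth, hole_mask, inpaintRadius=5,
                     flags=cv2.INPAINT_TELEA)               # fill from neighbors

cv2.imwrite("object_cavity_filling_map.png", filled)
```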
Step S34: virtual object reality positioning is carried out according to the three-dimensional object cavity filling map, and virtual object reality position positioning data are generated; pointer setting is performed on the three-dimensional object model based on the virtual object reality position positioning data, thereby generating a virtual object position pointer.
In the embodiment of the invention, the depth information and the geometric structure in the map are filled by utilizing the three-dimensional object cavity, so that the virtual object is positioned in reality. The position and orientation of the three-dimensional object in space can be identified using computer vision techniques such as feature point matching, stereo vision, and the like. The position and posture of the virtual object in the real world are deduced by using the input of sensor data, camera data and the like and combining a positioning algorithm. After the accurate position of the virtual object in the real world is determined, real position positioning data of the virtual object is generated, wherein the real position positioning data comprise position coordinates, rotation angles and other information. And applying the generated virtual object reality position positioning data to the three-dimensional object model. And accurately placing the virtual object position indicator at a corresponding position in the three-dimensional model according to the positioning data. The virtual object position indicator may be displayed in the real scene by a graphics rendering technique so as to be fused with the real environment. The style and shape of the pointer may be designed and customized as needed to ensure ease of recognition and use in a real world environment.
Preferably, step S34 includes the steps of:
Step S341: performing feature point matching on the three-dimensional object cavity filling map according to the object feature vector to generate object feature point matching data; performing real world location mapping based on the object feature point matching data to generate virtual object real location positioning data;
Step S342: carrying out gesture estimation through the object feature point matching data and the virtual object real position positioning data to generate virtual object gesture estimation data;
Step S343: projecting the three-dimensional object model to a three-dimensional object cavity filling map according to the virtual object posture estimation data to perform virtual object projection, and generating a virtual object projection image; the method comprises the steps that camera deployment is conducted based on virtual object projection images, and user information capturing is conducted through the cameras, so that user interaction information data are obtained;
Step S344: performing user interaction perception analysis on the virtual object projection image according to the user interaction information data to generate user interaction behavior detection data; performing interactive feedback generation based on the user interaction behavior detection data to obtain interactive feedback information data; and setting the indicator on the three-dimensional object model through the interactive feedback information data, so as to generate a virtual object position indicator.
The invention can help the system identify the object in the real world by the object feature point matching data, and realize the corresponding relation between the virtual object and the real world. The virtual object real position positioning data provides accurate position information of the virtual object in the real world for the subsequent steps by performing position mapping on the feature point matching data. The virtual object posture estimation data can accurately reflect the direction and the posture of the virtual object in the real world, so that the projection of the virtual object in the real world is more real and accurate. The system can sense the position and the behavior of the user in real time through the virtual object projection image and the user information data captured by the camera, so that more intelligent and interactive user experience is realized, and basic data is provided for subsequent user interaction analysis and feedback. Analysis of the user interaction information data may help the system understand the intent and behavior of the user, providing personalized and immediate interaction feedback. The interactive feedback information data provides more visual and effective guidance for the user by setting the indicator on the three-dimensional object model, and enhances the communication and interaction between the user and the virtual object.
In the embodiment of the invention, the object feature vector comprises features such as color, shape, texture and the like. The three-dimensional object cavity filling map may be feature point extracted and matched using computer vision techniques such as feature extraction algorithms (e.g., SIFT, SURF, etc.) to generate object feature point matching data. Based on the object feature point matching data, a position mapping algorithm (e.g., least squares method, RANSAC, etc.) is used to map the position of the virtual object in three-dimensional space into the real world, generating virtual object real position location data. And estimating the pose of the virtual object by using the object feature point matching data and the virtual object reality position positioning data and combining a pose estimation algorithm (such as a PnP algorithm, an iterative closest point algorithm and the like) to generate virtual object pose estimation data. And projecting the three-dimensional object model into the three-dimensional object cavity filling map according to the virtual object posture estimation data, and generating a virtual object projection image. And deploying a camera and capturing user information, such as user positions, gestures and the like, through the camera to obtain user interaction information data. And performing user interaction perception analysis, such as gesture recognition, behavior analysis and the like, by using the user interaction information data to generate user interaction behavior detection data. And generating interactive feedback information data, such as displaying an indicator on a virtual object projection image, providing a sound prompt and the like, according to the user interactive behavior detection data so as to realize user interactive feedback. And finally, setting an indicator on the three-dimensional object model through the interactive feedback information data to generate a virtual object position indicator for indicating the interactive position or mode of the user and the virtual object.
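A minimal sketch of the pose estimation and projection portion of this step, assuming 3-D model points matched to 2-D image points and known camera intrinsics (all numeric values below are placeholders); OpenCV's solvePnP recovers the virtual object pose:

```python
import cv2
import numpy as np

# Matched 3-D model points and their 2-D image locations (feature point matching data)
object_points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                          [0.5, 0.5, 1]], dtype=np.float32)
image_points = np.array([[320, 240], [420, 238], [424, 330], [318, 334],
                         [372, 200]], dtype=np.float32)

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)  # intrinsics
dist = np.zeros(5)                                                         # no distortion

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist,
                              flags=cv2.SOLVEPNP_ITERATIVE)
if ok:
    # rvec/tvec give the virtual object pose estimate; the reprojected points
    # drive the virtual object projection image.
    projected, _ = cv2.projectPoints(object_points, rvec, tvec, K, dist)
    print(projected.reshape(-1, 2))
```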
Preferably, step S4 comprises the steps of:
step S41: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated;
Step S42: performing interactive interface design on the dynamic virtual element synthetic diagram to generate an interactive interface design scheme; user behavior identification is carried out through the design scheme of the interactive interface, and user behavior identification data are obtained; performing user gesture recognition on the user behavior recognition data to obtain user gesture recognition data;
Step S43: constructing a user interaction three-dimensional model based on the user behavior recognition data and the user gesture recognition data, and generating a user interaction three-dimensional model; and carrying out user feedback monitoring through the user interaction behavior three-dimensional model to generate the user intelligent interaction display interface.
According to the invention, the fusion of the virtual element and the real scene can be realized through the virtual object position indicator, so that the virtual element looks like to interact with an object in the real world, the vivid feeling of the virtual element is enhanced, and the user experience is improved. Through the interactive interface design of the dynamic virtual element synthetic diagram, an interface conforming to the habit and intuition of a user can be designed, and the acceptance and the use efficiency of the user to the system are improved. Through the user behavior recognition data and the user gesture recognition data, the system can more accurately understand the behavior intention of the user, thereby providing a more intelligent and personalized interaction experience. By constructing the user interaction behavior three-dimensional model based on the user behavior recognition data and the user gesture recognition data, the behavior of the user can be more accurately simulated, and more realistic virtual interaction experience is provided for the user. The user feedback monitoring is carried out through the three-dimensional model of the user interaction behavior, the system can dynamically adjust the interaction interface according to feedback information of the user, intelligent interaction display is achieved, and user experience and interaction effect are improved.
In the embodiment of the invention, the virtual object position indicator can be a sensor, a camera, or a similar device that captures object position information in the real world. The captured real-world position information is fused with the position information of the virtual element, and this real-time fusion can be implemented with computer vision techniques or deep learning methods. A dynamic virtual element synthetic diagram is generated, i.e., the virtual elements are composited with the real scene so that they appear to interact with objects in the real world. Interactive interface design is performed on the dynamic virtual element synthetic diagram, for which graphic design software such as Adobe XD or Sketch can be used; the interface is designed according to user requirements and interaction modes, with attention to user experience and interface friendliness, and the resulting scheme covers the layout, style, and interaction modes of the interface elements. User behavior recognition is performed on the basis of the interactive interface design scheme, and the user's actions and behaviors can be recognized with computer vision techniques or deep learning methods. User gesture recognition is then performed on the user behavior recognition data, for example using joint detection or pose estimation techniques. The user interaction behavior three-dimensional model is constructed from the user behavior recognition data and the user gesture recognition data, and can be built with three-dimensional modeling software such as Blender or Maya. User feedback monitoring is performed through this three-dimensional model: real-time user feedback can be captured with sensors or cameras, and the interactive interface and the presentation of the virtual elements are adjusted according to that feedback.
Preferably, step S41 includes the steps of:
Step S411: virtual element selection is carried out in a preset virtual element library based on the virtual object position indicator, and virtual element selection data are obtained; performing virtual element parameter adjustment on the virtual element selection data according to the object motion track path data to generate virtual element adjustment parameter data;
step S412: performing virtual element environment interaction on the three-dimensional object model through the virtual element adjustment parameter data to generate virtual element environment interaction data; analyzing the real-time environment illumination data of the virtual element environment interaction data to obtain real-time environment illumination data;
Step S413: performing environment illumination consistency adjustment on the virtual element adjustment parameter data through the real-time environment illumination data to generate virtual environment illumination adjustment data; performing virtual element detail enhancement on the virtual object projection image by utilizing the virtual environment illumination adjustment data to generate a virtual element enhanced image;
Step S414: and carrying out real-time dynamic virtual element fusion on the virtual element enhanced image according to the deep neural network algorithm to generate a dynamic virtual element synthetic image.
The invention selects the proper virtual element in the virtual element library through the virtual object position indicator, can ensure that the selected virtual element is matched with the object in the real world, and enhances the sense of reality and fidelity of the virtual element. The parameters of the selected virtual element are adjusted according to the path data of the motion track of the object, and the parameters of the size, the shape, the color and the like of the virtual element can be adjusted according to the path and the speed of the motion of the object, so that the virtual element is matched with the motion of the object, and the fidelity and the interactivity of the virtual element are enhanced. The three-dimensional object model is interacted with the environment through the parameter data regulated by the virtual elements, so that the virtual elements interact with the surrounding environment, such as the elements of light, shadow, reflection and the like, and the sense of reality and fidelity of the virtual elements are enhanced. Real-time ambient illumination data parsing can ensure that virtual elements are matched with illumination conditions in a real environment, so that the virtual elements look more realistic. The environment illumination consistency adjustment is carried out on the virtual element adjustment parameters through the real-time environment illumination data, so that the consistency of the virtual element and the environment illumination can be ensured, and the integral sense and the sense of reality of the whole scene are enhanced. The virtual environment illumination adjustment data is utilized to carry out detail enhancement on the virtual object projection image, so that the details and texture of virtual elements can be enhanced, and the virtual object projection image is more real and vivid. The real-time dynamic virtual element fusion is carried out on the virtual element enhanced image through the deep neural network algorithm, so that the virtual element and the real-world image can be seamlessly fused, and a dynamic virtual element synthetic image with high fidelity and sense of reality is generated.
In the embodiment of the invention, the virtual object position indicator is designed to work through gesture control, camera capture, and similar means. A virtual element library is established that contains the models, textures, materials, and other information of various virtual elements, and a suitable virtual element is selected from the library using the position indicator to obtain the virtual element selection data. The parameters of the selected virtual element are adjusted using the object motion track path data, with a mathematical model or algorithm adjusting the element attributes according to the trajectory. The virtual element adjustment parameter data are applied to the three-dimensional object model so that it interacts with objects in the real environment, and the virtual element environment interaction data are generated using rendering techniques. The ambient illumination data are parsed in real time, with sensors, cameras, or similar devices capturing the illumination conditions of the surrounding environment, and the illumination attributes of the virtual element are adjusted according to the real-time data to maintain consistency. The virtual object is projected using the virtual environment illumination adjustment data, and details of the virtual elements, including shadows and highlights, are enhanced. The virtual element enhanced image is then fused with dynamic virtual elements in real time using a deep neural network algorithm, such as a generative adversarial network (GAN): the network is trained to learn the fusion rules between the virtual elements and the real scene, and in real-time application the trained network is applied to the virtual element enhanced image to generate the dynamic virtual element synthetic image. Finally, the above steps are integrated into one system to ensure smooth data transfer and processing, performance is optimized to guarantee real-time behavior and stability, and a user interface is designed so that the user can interact with the virtual elements.
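A minimal sketch of the illumination-consistency adjustment and fusion, using a brightness-matching gain plus alpha blending as a simple stand-in for the learned GAN-style fusion; the file names, gain clamp, and mask format are assumptions:

```python
import cv2
import numpy as np

frame = cv2.imread("camera_frame.png")                 # live real-scene frame (placeholder)
element = cv2.imread("virtual_element.png")            # rendered virtual element, same size
alpha = cv2.imread("virtual_element_mask.png",
                   cv2.IMREAD_GRAYSCALE) / 255.0       # element coverage mask

# Environment illumination consistency: scale the element to the scene brightness
scene_lum = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).mean()
element_lum = cv2.cvtColor(element, cv2.COLOR_BGR2GRAY).mean()
gain = np.clip(scene_lum / max(element_lum, 1e-3), 0.5, 2.0)
element_adj = np.clip(element.astype(np.float32) * gain, 0, 255)

# Real-time dynamic virtual element fusion by alpha blending
alpha3 = alpha[..., None]                              # broadcast over color channels
composite = (alpha3 * element_adj +
             (1.0 - alpha3) * frame.astype(np.float32)).astype(np.uint8)
cv2.imwrite("dynamic_virtual_element_composite.png", composite)
```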
In this specification, an intelligent deep interaction display system based on AR technology is provided, which is configured to execute the above intelligent deep interaction display method based on AR technology, where the intelligent deep interaction display system based on AR technology includes:
the object recognition module is used for acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
The motion path analysis module is used for detecting the target position of the three-dimensional object model and generating object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
The virtual reality synchronization module is used for acquiring sensor data of the three-dimensional object model through object motion track path data to obtain standard sensor acquisition data; importing the acquired data of the standard sensor into a three-dimensional object model for data space registration to generate sensor registration data; filling the image depth holes into the sensor registration data to generate a three-dimensional object hole filling map; virtual object reality positioning is carried out according to the three-dimensional object cavity filling map, and a virtual object position indicator is generated;
the user feedback module is used for carrying out real-time dynamic virtual element fusion through the virtual object position indicator to generate a dynamic virtual element synthetic diagram; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
The invention has the beneficial effects that the system can keep synchronous with the actual environment by acquiring the real-time environment image, so that the environment and the situation of the user can be better understood. Through preprocessing, noise in the image can be removed, and brightness, contrast and the like of the image can be adjusted, so that the accuracy and efficiency of subsequent processing are improved. The object in the environment image is segmented and modeled into a three-dimensional object model, so that the system is facilitated to understand the environment structure and the object layout, and accurate environment information is provided for the subsequent steps. By detecting the target position of the three-dimensional object model, the system can accurately identify the position of the object in the environment and provide important data support for the subsequent steps. Generating object recognition bounding boxes helps to accurately identify boundaries of objects, and provides important data support for subsequent object motion path analysis. By means of sensor data acquisition and registration, the system can acquire motion track data of objects in the environment, so that dynamic behaviors of the objects can be better understood. Generating a three-dimensional object cavity filling map is helpful for the system to understand the space structure in the environment, and provides a foundation for positioning the virtual object. Through the virtual object position indicator, the system can accurately fuse virtual elements with the real world, providing a more immersive experience. Through the virtual object position indicator, real-time dynamic fusion of virtual elements and the real world is realized, and the immersion and experience of a user are enhanced. By recognizing the behavior and gesture of the user, the system can learn the demands and preferences of the user, adjust the interactive interface in real time according to the feedback monitoring data, and provide more intelligent and personalized user experience. Therefore, the invention improves the processing speed and the recognition precision of modeling and the fluency of interaction by carrying out real-time environment modeling three-dimensional modeling, accurate perception of the actual environment and accurate positioning of the three-dimensional object model according to the object image.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An intelligent interaction deep display method based on an AR technology is characterized by comprising the following steps:
Step S1: acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environmental image to generate an environmental object area image; modeling an environmental object on the environmental object area image, thereby generating a three-dimensional object model;
Step S2: detecting the target position of the three-dimensional object model to generate object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on object target position detection data to generate an object recognition boundary frame; carrying out object continuous frame motion path analysis on the three-dimensional object model according to the object identification boundary frame to generate object motion path data;
Step S3: sensor data acquisition is carried out on the three-dimensional object model through object motion track path data, and standard sensor acquisition data are obtained; importing the acquired data of the standard sensor into a three-dimensional object model for data space registration to generate sensor registration data; filling the image depth holes into the sensor registration data to generate a three-dimensional object hole filling map; virtual object reality positioning is carried out according to the three-dimensional object cavity filling map, and a virtual object position indicator is generated;
Step S4: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated; user characteristic recognition is carried out on the dynamic virtual element synthetic graph, so that user behavior recognition data and user gesture recognition data are obtained; and carrying out user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
2. The AR technology-based intelligent interactive deep display method according to claim 1, wherein the step S1 comprises the steps of:
Step S11: acquiring a real-time environment image by using a camera;
Step S12: performing image brightness enhancement on the real-time environment image to generate a real-time environment brightness enhancement image; performing image filtering on the real-time environment brightness enhancement image to generate a real-time environment filtering image; carrying out local neighborhood preprocessing on the real-time environment filtering image to generate a standard environment image;
Step S13: carrying out environmental object recognition on the standard environmental image to generate environmental object boundary data; image segmentation is carried out on the standard environment image according to the environment object boundary data, and an environment object area image is generated;
step S14: and modeling the environmental object on the environmental object area image, thereby generating a three-dimensional object model.
3. The AR technology-based intelligent interactive deep display method according to claim 2, wherein step S14 comprises the steps of:
step S141: extracting object feature information from the environmental object region image to generate environmental object feature information data, wherein the environmental object feature information data comprises environmental object color data, environmental object shape data and environmental object texture data;
step S142: performing illumination condition analysis on the image of the environment object area to generate environment light condition data, wherein the environment illumination condition data comprises light source position data and light source intensity data;
step S143: carrying out environment object parameter virtualization on environment object color data, environment object shape data and environment object texture data according to the light source position data and the light source intensity data to generate object virtualization data; performing three-dimensional point cloud conversion on the object virtualization data by utilizing a three-dimensional point cloud technology to generate three-dimensional point cloud object data;
Step S144: performing light source range influence analysis on the light source position data and the light source intensity data to generate light source range influence analysis data; performing background color difference analysis on the three-dimensional point cloud object data through the light source range influence analysis data to generate object background color difference data; and carrying out three-dimensional reconstruction on the three-dimensional point cloud object data based on the object background color difference data to generate a three-dimensional object model.
4. The AR technology-based intelligent interactive deep display method according to claim 1, wherein step S2 comprises the steps of:
Step S21: extracting object feature vectors of the three-dimensional object model by using a deep learning network to generate the object feature vectors; performing target position detection on the object feature vector based on a target detection algorithm to generate object target position detection data;
Step S22: performing position curvature analysis on the object target position detection data to generate object target position curvature data; comparing the object target position curvature data with a preset standard curvature threshold, and marking the corresponding object target position detection data as object position vertex data when the object target position curvature data is larger than or equal to the preset standard curvature threshold; when the curvature data of the object target position is smaller than a preset standard curvature threshold value, eliminating the corresponding object target position detection data;
step S23: performing recognition boundary frame construction on the three-dimensional object model based on the object position vertex data to generate an object recognition boundary frame;
Step S24: performing target matching on the three-dimensional object model to generate object motion trail data; performing object continuous frame motion estimation on the three-dimensional object model through the object recognition boundary frame and the object motion track data to generate an object motion vector set; and constructing an object motion path based on the object motion vector set, and generating object motion path data.
5. The AR technology-based intelligent interactive deep display method according to claim 4, wherein the step S24 comprises the steps of:
Step S241: performing frame shooting on the three-dimensional object model based on a preset time interval to obtain a three-dimensional object frame shooting image, wherein the three-dimensional object frame shooting image comprises an initial frame shooting image and an end frame shooting image; the initial frame shooting image and the end frame shooting image are subjected to image superposition to generate a frame shooting superposition image;
Step S242: carrying out object model tracking matching on the frame shooting coincident images to generate object motion trail data; performing continuous frame motion estimation on the three-dimensional object model through the object motion track data and the object identification boundary frame to generate an object motion vector set;
Step S243: performing data dimension reduction on the object motion vector set to generate an object continuous motion track vector; according to the Kalman filter, carrying out motion trail updating on the continuous motion trail vector of the object to generate an updating vector of the continuous motion trail of the object;
Step S244: and constructing an object motion trail path of the frame shooting coincident image through the object continuous motion trail update vector, and generating object motion trail path data.
6. The AR technology-based intelligent interactive deep display method according to claim 1, wherein step S3 comprises the steps of:
Step S31: setting a sensor for the three-dimensional object model through object motion track path data, and acquiring data by using the sensor to obtain sensor acquisition data; carrying out data preprocessing on the sensor acquisition data to generate standard sensor acquisition data, wherein the data preprocessing comprises data denoising, data calibration and data standardization;
Step S32: importing the acquired data of the standard sensor into a three-dimensional object model for data space registration to generate sensor registration data; carrying out data set division on the sensor registration data to generate a model training set and a model testing set; model training is carried out on the model training set by using a support vector machine algorithm, and a three-dimensional object depth training model is generated; performing model test on the three-dimensional object depth training model according to the model test set, so as to generate a three-dimensional object depth prediction model;
Step S33: the sensor registration data are imported into a three-dimensional object depth prediction model to carry out image depth fusion, and a three-dimensional object depth image is generated; performing depth map cavity filling on the three-dimensional object depth image to generate a three-dimensional object cavity filling map;
step S34: virtual object reality positioning is carried out according to the three-dimensional object cavity filling map, and virtual object reality position positioning data are generated; pointer setting is performed on the three-dimensional object model based on the virtual object reality position positioning data, thereby generating a virtual object position pointer.
7. The AR technology-based intelligent interactive deep display method according to claim 6, wherein step S34 comprises the steps of:
Step S341: performing feature point matching on the three-dimensional object cavity filling map according to the object feature vector to generate object feature point matching data; performing real world location mapping based on the object feature point matching data to generate virtual object real location positioning data;
Step S342: carrying out gesture estimation through the object feature point matching data and the virtual object real position positioning data to generate virtual object gesture estimation data;
Step S343: projecting the three-dimensional object model to a three-dimensional object cavity filling map according to the virtual object posture estimation data to perform virtual object projection, and generating a virtual object projection image; the method comprises the steps that camera deployment is conducted based on virtual object projection images, and user information capturing is conducted through the cameras, so that user interaction information data are obtained;
Step S344: performing user interaction perception analysis on the virtual object projection image according to the user interaction information data to generate user interaction behavior detection data; performing interactive feedback generation based on the user interaction behavior detection data to obtain interactive feedback information data; and setting the indicator on the three-dimensional object model through the interactive feedback information data, so as to generate a virtual object position indicator.
8. The AR technology-based intelligent interactive deep display method according to claim 1, wherein step S4 comprises the steps of:
step S41: real-time dynamic virtual element fusion is carried out through a virtual object position indicator, and a dynamic virtual element synthetic diagram is generated;
Step S42: performing interactive interface design on the dynamic virtual element synthetic diagram to generate an interactive interface design scheme; user behavior identification is carried out through the design scheme of the interactive interface, and user behavior identification data are obtained; performing user gesture recognition on the user behavior recognition data to obtain user gesture recognition data;
Step S43: constructing a user interaction three-dimensional model based on the user behavior recognition data and the user gesture recognition data, and generating a user interaction three-dimensional model; and carrying out user feedback monitoring through the user interaction behavior three-dimensional model to generate the user intelligent interaction display interface.
9. The AR technology-based intelligent interactive deep presentation method according to claim 8, wherein step S41 comprises the steps of:
Step S411: virtual element selection is carried out in a preset virtual element library based on the virtual object position indicator, and virtual element selection data are obtained; performing virtual element parameter adjustment on the virtual element selection data according to the object motion track path data to generate virtual element adjustment parameter data;
step S412: performing virtual element environment interaction on the three-dimensional object model through the virtual element adjustment parameter data to generate virtual element environment interaction data; analyzing the real-time environment illumination data of the virtual element environment interaction data to obtain real-time environment illumination data;
Step S413: performing environment illumination consistency adjustment on the virtual element adjustment parameter data through the real-time environment illumination data to generate virtual environment illumination adjustment data; performing virtual element detail enhancement on the virtual object projection image by utilizing the virtual environment illumination adjustment data to generate a virtual element enhanced image;
Step S414: and carrying out real-time dynamic virtual element fusion on the virtual element enhanced image according to the deep neural network algorithm to generate a dynamic virtual element synthetic image.
10. An intelligent deep interactive display system based on AR technology, for executing the intelligent deep interactive display method based on AR technology as claimed in claim 1, the intelligent deep interactive display system based on AR technology comprising:
the object recognition module is used for acquiring a real-time environment image; performing image preprocessing on the real-time environment image to obtain a standard environment image; performing environmental object segmentation on the standard environment image to generate an environmental object area image; and performing environmental object modeling on the environmental object area image, thereby generating a three-dimensional object model;
the motion path analysis module is used for performing target position detection on the three-dimensional object model to generate object target position detection data; performing recognition boundary frame construction on the three-dimensional object model based on the object target position detection data to generate an object recognition boundary frame; and performing object continuous frame motion path analysis on the three-dimensional object model according to the object recognition boundary frame to generate object motion track path data;
the virtual reality synchronization module is used for acquiring sensor data of the three-dimensional object model through the object motion track path data to obtain standard sensor acquisition data; importing the standard sensor acquisition data into the three-dimensional object model for data space registration to generate sensor registration data; performing image depth hole filling on the sensor registration data to generate a three-dimensional object hole filling map; and performing virtual object reality positioning according to the three-dimensional object hole filling map to generate a virtual object position indicator;
the user feedback module is used for performing real-time dynamic virtual element fusion through the virtual object position indicator to generate a dynamic virtual element synthetic image; performing user characteristic recognition on the dynamic virtual element synthetic image to obtain user behavior recognition data and user gesture recognition data; and performing user feedback monitoring based on the user behavior recognition data and the user gesture recognition data to generate a user intelligent interaction display interface.
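For illustration only, the following structural sketch wires the four modules of claim 10 into one per-frame pipeline; the class and method names are hypothetical and only indicate the data flow between modules, not the patented implementation.

```python
# Hypothetical data-flow skeleton for the system of claim 10.
# Each injected module is assumed to expose a `process`-style method.
class ARDeepDisplayPipeline:
    def __init__(self, object_recognition, motion_path_analysis, vr_sync, user_feedback):
        self.object_recognition = object_recognition      # frame -> 3D object model
        self.motion_path_analysis = motion_path_analysis  # model -> motion track path
        self.vr_sync = vr_sync                            # model + path + sensors -> position indicator
        self.user_feedback = user_feedback                # indicator + user input -> display interface

    def process_frame(self, raw_frame, sensor_feed, user_input):
        model_3d = self.object_recognition.process(raw_frame)
        track_path = self.motion_path_analysis.process(model_3d)
        indicator = self.vr_sync.process(model_3d, track_path, sensor_feed)
        return self.user_feedback.process(indicator, user_input)
```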
CN202410461591.6A 2024-04-17 2024-04-17 Intelligent interaction deep display method and system based on AR technology Active CN118071968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410461591.6A CN118071968B (en) 2024-04-17 2024-04-17 Intelligent interaction deep display method and system based on AR technology

Publications (2)

Publication Number Publication Date
CN118071968A true CN118071968A (en) 2024-05-24
CN118071968B CN118071968B (en) 2024-06-25

Family

ID=91100723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410461591.6A Active CN118071968B (en) 2024-04-17 2024-04-17 Intelligent interaction deep display method and system based on AR technology

Country Status (1)

Country Link
CN (1) CN118071968B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110382066A (en) * 2017-03-06 2019-10-25 环球城市电影有限责任公司 Mixed reality observer system and method
US11282404B1 (en) * 2020-12-11 2022-03-22 Central China Normal University Method for generating sense of reality of virtual object in teaching scene
CN117611774A (en) * 2023-12-14 2024-02-27 深圳微盐红创设计院有限公司 Multimedia display system and method based on augmented reality technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant