WO2022267781A1 - Modeling method, related electronic device, and storage medium - Google Patents

Modeling method, related electronic device, and storage medium

Info

Publication number
WO2022267781A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
terminal device
images
modeling
frame
Prior art date
Application number
PCT/CN2022/093934
Other languages
English (en)
French (fr)
Inventor
贺勇
蔡波
许立升
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP22827279.5A (EP4343698A1)
Publication of WO2022267781A1


Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T7/00 Image analysis
            • G06T7/50 Depth or shape recovery
              • G06T7/55 Depth or shape recovery from multiple images
          • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
            • G06T17/10 Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
          • G06T2200/00 Indexing scheme for image data processing or generation, in general
            • G06T2200/08 Involving all processing steps from image acquisition to 3D model generation
            • G06T2200/24 Involving graphical user interfaces [GUIs]
          • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/10 Image acquisition modality
              • G06T2207/10016 Video; Image sequence
          • G06T2210/00 Indexing scheme for image generation or computer graphics
            • G06T2210/12 Bounding box

Definitions

  • the present application relates to the field of three-dimensional reconstruction, in particular to a modeling method and related electronic equipment and storage media.
  • 3D reconstruction applications/software can be used to model objects in 3D.
  • To model an object in 3D, users need to first use mobile tools (such as mobile phones, cameras, etc.) to collect the data required for the 3D modeling of the object (such as pictures, depth information, etc.), and then perform 3D reconstruction on the collected data to obtain the corresponding 3D model of the object.
  • However, the process of collecting the data required for 3D modeling and the process of reconstructing objects in 3D according to the collected data are relatively complicated, and place high demands on device hardware.
  • For example, the process of collecting the data required for 3D modeling requires the collection device (such as the above-mentioned mobile tool) to be equipped with special hardware such as a light detection and ranging (LIDAR) sensor or an RGB-depth (RGB-D) camera.
  • the process of 3D reconstruction of objects according to the collected data required for 3D modeling requires a processing device running 3D reconstruction applications to be equipped with a high-performance independent graphics card.
  • In view of this, the embodiment of the present application provides a modeling method, related electronic equipment, and a storage medium, which simplify the process of collecting the data required for 3D modeling and the process of reconstructing objects in 3D according to the collected data, and have low requirements for device hardware.
  • an embodiment of the present application provides a modeling method, the method is applied to a terminal device, and the method includes:
  • the terminal device displays a first interface, where the first interface includes the shooting picture of the terminal device.
  • the terminal device collects multiple frames of images corresponding to the target object to be modeled, and acquires an association relationship among the multiple frames of images.
  • the terminal device acquires a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
  • the terminal device displays a three-dimensional model corresponding to the target object.
  • the terminal device displays a first virtual bounding volume; the first virtual bounding volume includes multiple patches.
  • the terminal device collects multiple frames of images corresponding to the target object to be modeled in response to the collection operation, and acquiring the association relationship between the multiple frames of images includes:
  • when the terminal device is in the first pose, the terminal device collects the first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, the terminal device collects the second image and changes the display effect of the patch corresponding to the second image.
  • the terminal device may determine the matching information of the key frame according to the association relationship between the patch corresponding to the key frame and other patches.
  • the relationship between the patch corresponding to the key frame and other patches may include: which patches, in the patch model, are located above, below, to the left of, and to the right of the patch corresponding to the key frame.
  • For example, the mobile phone can determine that the other key frames associated with a given key frame include the key frame corresponding to patch 21, the key frame corresponding to patch 20, and the key frame corresponding to patch 2. Therefore, the mobile phone can obtain the matching information of that key frame, which includes the identification information of the key frame corresponding to patch 21, the identification information of the key frame corresponding to patch 20, and the identification information of the key frame corresponding to patch 2.
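  • For illustration, the following minimal Python sketch shows how matching information could be derived from patch adjacency under an assumed layout: a two-layer patch model with 20 patches per ring, with patch and key-frame numbers chosen to mirror the example above. The grid dimensions and the dictionary-based data layout are assumptions, not details from the patent.

```python
PATCHES_PER_LAYER = 20
LAYERS = 2                      # patches 1-20 on the upper layer, 21-40 below

def neighbor_patches(pid):
    """Patches above, below, left of and right of patch pid (rings wrap)."""
    layer, idx = divmod(pid - 1, PATCHES_PER_LAYER)
    base = layer * PATCHES_PER_LAYER
    neighbors = [base + (idx - 1) % PATCHES_PER_LAYER + 1,   # left
                 base + (idx + 1) % PATCHES_PER_LAYER + 1]   # right
    if layer > 0:
        neighbors.append(pid - PATCHES_PER_LAYER)            # above
    if layer < LAYERS - 1:
        neighbors.append(pid + PATCHES_PER_LAYER)            # below
    return neighbors

def matching_info(patch_to_frame):
    """patch_to_frame: dict patch id -> number of the key frame collected
    when that patch was lit. Returns frame -> associated frame numbers."""
    info = {}
    for pid, frame in patch_to_frame.items():
        info[frame] = [patch_to_frame[n] for n in neighbor_patches(pid)
                       if n in patch_to_frame]
    return info

# Patch 1 was lit by key frame 18; its neighbors are patches 20, 2 and 21.
print(matching_info({1: 18, 2: 26, 20: 45, 21: 59})[18])   # -> [45, 26, 59]
```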
  • In this way, the terminal device can achieve 3D modeling by relying only on an ordinary RGB camera to collect the data required for 3D modeling; the process of collecting the data required for 3D modeling does not need to rely on special hardware in the terminal device such as LIDAR sensors or RGB-D cameras.
  • the terminal device obtains the 3D model corresponding to the target object according to the multi-frame images and the correlation between the multi-frame images, which can effectively reduce the calculation load of the 3D modeling process and improve the efficiency of 3D modeling.
  • the user only needs to perform operations related to collecting the data required for 3D modeling on the terminal device side, and can then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, which makes operation easier and the user experience better.
  • the terminal device includes a first application, and before the terminal device displays the first interface, the method further includes: the terminal device displays a second interface in response to an operation of opening the first application.
  • the terminal device displaying the first interface includes: the terminal device displays the first interface in response to an operation of starting the 3D modeling function of the first application on the second interface.
  • the second interface may include a function control for starting the 3D modeling function
  • the user may click or touch the function control on the second interface
  • the mobile phone may respond to the operation of the user clicking or touching the function control on the second interface to start the 3D modeling function of the first application. That is, the user's operation of clicking or touching the functional control on the second interface is an operation of starting the 3D modeling function of the first application on the second interface.
  • the first virtual bounding volume includes one or more layers, and the plurality of surface patches are distributed on the one or more layers.
  • the structure of the patch model may include upper and lower layers, and each layer may include multiple patches.
  • the structure of the patch model may include upper, middle, and lower layers, and each layer may include multiple patches.
  • the structure of the patch model may also be a single-layer structure composed of multiple patches. No limitation is imposed here.
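  • As a rough illustration of such a layer structure, the sketch below generates patch positions for a virtual bounding volume made of one or more horizontal rings of patches around the object. The layer count, ring radius, and heights are illustrative parameters only; the patent does not specify the geometry.

```python
import math

def build_patch_centers(layers=2, patches_per_layer=20,
                        radius=0.5, layer_heights=(0.2, 0.5)):
    """Return (layer, index, x, y, z) for every patch center."""
    centers = []
    for layer in range(layers):
        z = layer_heights[layer]
        for i in range(patches_per_layer):
            angle = 2 * math.pi * i / patches_per_layer
            centers.append((layer, i,
                            radius * math.cos(angle),
                            radius * math.sin(angle),
                            z))
    return centers

print(len(build_patch_centers()))   # 40 patches: 2 layers x 20 patches
```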
  • the method further includes: the terminal device displays first prompt information, where the first prompt information is used to remind the user to place the target object at the center of the shooting picture.
  • the first prompt information may be "Please place the target object at the center of the screen".
  • the method further includes: the terminal device displays second prompt information; the second prompt information is used to remind the user to adjust one or more of: the shooting environment where the target object is located, the way of shooting the target object, and the proportion of the target object in the shooting picture.
  • the second prompt information may be "Place the object still on a solid-color plane with soft lighting, shoot around the object, and keep the object in the picture as large and complete as possible".
  • the subsequent data collection process can be faster, and the quality of the collected data can be better.
  • the method further includes: the terminal device detects the operation of generating the 3D model; the terminal device displays third prompt information in response to the operation of generating the 3D model, where the third prompt information is used to prompt the user that the target object is being modeled.
  • the third prompt information may be "modeling in progress".
  • the method further includes: the terminal device displays fourth prompt information, where the fourth prompt information is used to prompt the user that the modeling of the target object has been completed.
  • the fourth prompt information may be "modeling completed".
  • the terminal device displaying the 3D model corresponding to the target object further includes: the terminal device, in response to an operation of changing the display angle of the 3D model corresponding to the target object, changes the display angle of the 3D model corresponding to the target object.
  • the operation of changing the display angle of the 3D model corresponding to the target object includes dragging the 3D model corresponding to the target object to rotate clockwise or counterclockwise along a first direction.
  • the first direction may be any direction, such as a horizontal direction, a vertical direction, and the like.
  • In this way, the terminal device changes the display angle of the 3D model corresponding to the target object, achieving the effect of presenting the 3D model to the user from different angles.
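  • A minimal sketch of how such a drag-to-rotate gesture might be mapped to a display angle follows; the sensitivity constant and the yaw-only rotation are assumptions for illustration.

```python
DEGREES_PER_PIXEL = 0.4            # illustrative drag sensitivity

def on_drag(model_yaw_deg, drag_dx_pixels):
    """Update the model's display angle from a horizontal drag delta."""
    return (model_yaw_deg + drag_dx_pixels * DEGREES_PER_PIXEL) % 360.0

yaw = 0.0
yaw = on_drag(yaw, -150)           # drag 150 px to the left (counterclockwise)
print(f"display angle: {yaw:.1f} degrees")   # 300.0 degrees
```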
  • the terminal device displaying the 3D model corresponding to the target object further includes: the terminal device, in response to an operation of changing the display size of the 3D model corresponding to the target object, changes the display size of the 3D model corresponding to the target object.
  • the operation of changing the display size of the 3D model corresponding to the target object includes zooming in or zooming out the 3D model corresponding to the target object.
  • the zoom-out operation may be an operation in which the user slides two fingers toward each other on the 3D model preview interface.
  • the zoom-in operation may be an operation in which the user slides two fingers away from each other on the 3D model preview interface.
  • the 3D model preview interface is also an interface where the terminal device displays the 3D model corresponding to the target object.
  • the zoom-in or zoom-out operation performed by the user on the 3D model corresponding to the target object may also be a double-click operation or a long-press operation; alternatively, the 3D model preview interface may include function controls for the zoom-in and zoom-out operations, which is not limited here.
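  • The following sketch illustrates one plausible way to turn the two-finger gesture into a display scale, using the ratio of the current finger distance to the distance when the pinch began; the clamping limits are illustrative assumptions.

```python
import math

def finger_distance(p1, p2):
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def pinch_scale(base_scale, start_p1, start_p2, cur_p1, cur_p2,
                min_scale=0.25, max_scale=4.0):
    """Sliding inward shrinks the model; sliding outward enlarges it."""
    ratio = finger_distance(cur_p1, cur_p2) / finger_distance(start_p1, start_p2)
    return max(min_scale, min(max_scale, base_scale * ratio))

# Two fingers sliding outward: distance grows from 100 px to 180 px.
print(pinch_scale(1.0, (100, 300), (200, 300), (60, 300), (240, 300)))  # 1.8
```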
  • the association relationship between the multiple frames of images includes matching information of each frame of image in the multiple frames of images; the matching information of each frame of image includes the identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained according to the association relationship between that image and its corresponding patch, and the relationships among the multiple patches.
  • For example, the identification information of the other key frames associated with the key frame with picture number 18 is the picture numbers of those key frames, such as 26, 45, 59, 78, 89, 100, and 449.
  • the terminal device collecting multiple frames of images corresponding to the target object to be modeled in response to the collection operation, and obtaining the association relationship between the multiple frames of images, further includes: the terminal device determines the target object according to the shooting picture; when the terminal device captures the multiple frames of images, the target object is located at the center of the shooting picture.
  • the terminal device collecting multiple frames of images corresponding to the target object to be modeled includes: during the process of shooting the target object, the terminal device performs blur detection on each captured frame, and uses the images whose sharpness is greater than a first threshold as the images corresponding to the target object.
  • In this way, by performing blur detection on the pictures taken at each shooting position, the terminal device can obtain key frame pictures of better quality as the images corresponding to the target object.
  • the number of key frame images corresponding to each patch can be one or more.
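  • The patent does not name a specific blur metric, so the sketch below uses the common variance-of-Laplacian sharpness measure (via OpenCV) to keep only frames whose sharpness exceeds a first threshold; the threshold value is illustrative.

```python
import cv2

FIRST_THRESHOLD = 100.0            # illustrative sharpness threshold

def sharpness(image_bgr):
    """Variance of the Laplacian: higher means sharper (assumed metric)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def select_key_frames(captured_frames):
    """Keep frames whose sharpness exceeds the first threshold."""
    return [f for f in captured_frames if sharpness(f) > FIRST_THRESHOLD]
```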
  • displaying the three-dimensional model corresponding to the target object on the terminal device includes: displaying the three-dimensional model corresponding to the target object by the terminal device in response to an operation of previewing the three-dimensional model corresponding to the target object.
  • the terminal device may display a view button, and the user may click the view button, and the terminal device may display the 3D model corresponding to the target object in response to the user's operation of clicking the view button.
  • the user's operation of clicking the view button is an operation of previewing the 3D model corresponding to the target object.
  • the three-dimensional model corresponding to the target object includes a basic three-dimensional model of the target object and a texture of the surface of the target object.
  • the texture on the surface of the target object may be a texture map on the surface of the target object.
  • the 3D model of the target object can be generated according to the basic 3D model of the target object and the texture of the surface of the target object. Mapping the texture of the surface of the target object onto the surface of the basic 3D model in a specific way can restore the surface of the target object more realistically and make the model look more real.
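  • One common way to establish such a mapping is to project each vertex of the basic 3D model into a key frame using the camera intrinsics and pose, and use the resulting pixel coordinates as texture coordinates. The sketch below illustrates this; the patent does not fix the mapping method, so this is an assumed concrete choice (visibility and occlusion handling are omitted).

```python
import numpy as np

def project_vertices(vertices, K, R, t, image_w, image_h):
    """vertices: (N, 3) points in the world frame -> (N, 2) UVs in [0, 1].
    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation."""
    cam = R @ vertices.T + t.reshape(3, 1)          # world -> camera frame
    pix = K @ cam                                    # camera -> homogeneous pixels
    uv = (pix[:2] / pix[2]).T                        # perspective division
    return uv / np.array([image_w, image_h])         # normalize to [0, 1]
```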
  • the terminal device is connected to the server; the terminal device obtaining the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images includes: the terminal device sends the multi-frame images and the association relationship between the multi-frame images to the server; the terminal device receives the 3D model corresponding to the target object from the server.
  • In this design, the process of generating the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images can be completed on the server side. That is to say, this design realizes 3D modeling by utilizing the computing resources of the server. It can be applied to scenarios where the computing power of the terminal device is weak, improving the universality of the 3D modeling method.
  • the method further includes: the terminal device sends camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and time stamps respectively corresponding to the multiple frames of images to the server.
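  • A minimal sketch of what this per-image metadata might look like on the wire is given below; the field names, units, and the flattened 4x4 pose convention are assumptions for illustration, not a format defined by the patent.

```python
from dataclasses import dataclass, asdict

@dataclass
class FrameMetadata:
    image_name: str        # e.g. "IMG_0018.jpg"
    image_number: int      # picture number used in the matching information
    intrinsics: list       # assumed layout: [fx, fy, cx, cy]
    gravity: list          # gravity direction vector, e.g. [gx, gy, gz]
    camera_pose: list      # assumed: flattened 4x4 world-from-camera matrix
    timestamp_ms: int

meta = FrameMetadata("IMG_0018.jpg", 18, [1500.0, 1500.0, 960.0, 540.0],
                     [0.0, -9.81, 0.0], [1.0] * 16, 1687000000000)
payload = asdict(meta)     # serialized and uploaded alongside the key frame
```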
  • the method further includes: the terminal device receiving an indication message from the server, where the indication message is used to indicate to the terminal device that the server has completed modeling of the target object.
  • the terminal device may display the fourth prompt information after receiving the indication message.
  • the method further includes: the terminal device sends a download request message to the server, where the download request message is used to request to download the 3D model corresponding to the target object from the server.
  • the server may send the three-dimensional model corresponding to the target object to the terminal device.
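  • The transport protocol is not specified in the patent; the sketch below assumes plain HTTPS, and the endpoint URL and task identifier are purely hypothetical.

```python
import requests

MODEL_URL = "https://cloud.example.com/models"   # hypothetical endpoint

def download_model(task_id, out_path):
    """Send the download request message and save the returned 3D model."""
    resp = requests.get(f"{MODEL_URL}/{task_id}", timeout=60)
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)    # e.g. a mesh-plus-texture archive
```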
  • when the terminal device collects the data required for 3D modeling of the target object, it can also display the scanning progress on the first interface.
  • the first interface may include a scan button, and the terminal device may display the scan progress through the circular black filling effect of the scan button on the first interface.
  • when the UI presentation of the scan button differs, the way the mobile phone displays the scanning progress on the first interface can also differ, which is not limited here.
  • the terminal device may also not display the scanning progress; the user can learn the scanning progress from the lit patches of the first virtual bounding volume.
  • an embodiment of the present application provides a modeling device, which can be applied to a terminal device, and used to implement the modeling method described in the first aspect above.
  • the functions of the device can be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules or units corresponding to the above functions, for example, the device may include: a display unit and a processing unit. The display unit and the processing unit can be used to cooperate to implement the modeling method described in the first aspect above.
  • the display unit is configured to display a first interface, where the first interface includes the shooting picture of the terminal device.
  • the processing unit is configured to, in response to the collection operation, collect multiple frames of images corresponding to the target object to be modeled, acquire the association relationship between the multiple frames of images, and acquire a 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
  • the display unit is also used to display the three-dimensional model corresponding to the target object.
  • the display unit is further configured to display a first virtual bounding volume; the first virtual bounding volume includes multiple patches.
  • the processing unit is specifically configured to: collect the first image when the terminal device is in the first pose, and change the display effect of the patch corresponding to the first image; collect the second image when the terminal device is in the second pose, and change the display effect of the patch corresponding to the second image; and after the display effects of the plurality of patches of the first virtual bounding volume have been changed, acquire the association relationship between the multiple frames of images according to the plurality of patches.
  • the display unit and the processing unit are further configured to implement other display functions and processing functions in the method described in the first aspect above, which will not be repeated here.
  • In the case where the terminal device described in the first aspect sends the multi-frame images and the association relationship between the multi-frame images to the server, and the server generates the 3D model according to the multi-frame images and the association relationship between the multi-frame images, the modeling device may also include a sending unit and a receiving unit. The sending unit is used to send the multi-frame images and the association relationship between the multi-frame images to the server, and the receiving unit is used to receive the 3D model corresponding to the target object from the server.
  • an embodiment of the present application provides an electronic device, including: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
  • the embodiment of the present application further provides a computer program product, including computer-readable code; when the computer-readable code is run on an electronic device, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
  • the embodiment of the present application also provides a modeling method, the method is applied to a server, and the server is connected to the terminal device; the method includes: the server receives the multi-frame image corresponding to the target object sent from the terminal device and the association relationship between the multi-frame images; the server generates a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images; the server sends the three-dimensional model corresponding to the target object to the terminal device .
  • the method can utilize computing resources of the server to implement 3D modeling, and the process of generating a 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images can be completed on the server side.
  • the server performs 3D modeling in combination with the association relationship among the multiple frames of images, which can effectively reduce the computing load of the server and improve modeling efficiency.
  • when the server performs 3D modeling, it only needs to combine the association relationship between the multiple frames of images and perform feature detection and matching between each frame of image and the other images associated with that image, instead of performing feature detection and matching between each frame of image and all other images. In this way, associated images can be compared quickly, which can effectively reduce the computing load of the server and improve the efficiency of 3D modeling.
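  • The sketch below illustrates this restriction: features are matched only between a frame and the frames listed in its matching information, rather than across all pairs. ORB features with brute-force Hamming matching are an assumed concrete choice; the patent does not name a feature detector.

```python
import cv2

orb = cv2.ORB_create()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def match_associated(frames, matching_info):
    """frames: dict frame_id -> grayscale image;
    matching_info: dict frame_id -> list of associated frame ids."""
    features = {fid: orb.detectAndCompute(img, None)
                for fid, img in frames.items()}
    pair_matches = {}
    for fid, associated in matching_info.items():
        for other in associated:
            if (other, fid) in pair_matches:     # skip already-matched pairs
                continue
            d1, d2 = features[fid][1], features[other][1]
            if d1 is not None and d2 is not None:
                pair_matches[(fid, other)] = matcher.match(d1, d2)
    return pair_matches
```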
  • For example, for the first frame of image, the server can use the matching information of the first frame of image to quickly and accurately determine the mapping relationship between the textures of the other images associated with it and the surface of the basic 3D model of the target object; for each subsequent frame of image, the server can likewise use that image's matching information to quickly and accurately determine the mapping relationship between the textures of the images associated with it and the surface of the basic 3D model of the target object.
  • the method can be applied to some scenarios where the computing power of some terminal devices is weak, and the universality of the 3D modeling method can be improved.
  • this method also has the other beneficial effects described in the first aspect above, for example: the process of collecting the data required for 3D modeling does not need to rely on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera.
  • the server obtains the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images, which can effectively reduce the calculation load of the 3D modeling process and improve the efficiency of the 3D modeling, etc., and will not be repeated here.
  • the method further includes: the server receives camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and time stamps respectively corresponding to the multi-frame images sent from the terminal device.
  • the server generating the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images includes: the server generates the 3D model corresponding to the target object according to the multi-frame images, the association relationship between the multi-frame images, and the camera intrinsic parameters, gravity direction information, image names, image numbers, camera pose information, and time stamps respectively corresponding to the multi-frame images.
  • the association relationship between the multiple frames of images includes matching information of each frame of image in the multiple frames of images; the matching information of each frame of image includes the identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained according to the association relationship between that image and its corresponding patch, and the relationships among the multiple patches.
  • the embodiment of the present application provides a modeling device, which can be applied to a server to implement the modeling method described in the sixth aspect.
  • the functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware.
  • Hardware or software includes one or more modules or units corresponding to the above functions, for example, the device may include: a receiving unit, a processing unit, and a sending unit. The receiving unit, the processing unit and the sending unit may be used to cooperate to implement the modeling method described in the sixth aspect.
  • the receiving unit may be configured to receive multiple frames of images corresponding to the target object sent from the terminal device and an association between the multiple frames of images.
  • the processing unit may be configured to generate a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
  • the sending unit may be used to send the three-dimensional model corresponding to the target object to the terminal device.
  • the receiving unit, the processing unit, and the sending unit may be used to implement all the functions that the server in the method described in the sixth aspect may implement, and details will not be repeated here.
  • the embodiment of the present application provides an electronic device, including: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the method described in the sixth aspect and any possible implementation manner of the sixth aspect.
  • an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device implements the method described in the sixth aspect and any possible implementation manner of the sixth aspect.
  • the embodiment of the present application further provides a computer program product, including computer-readable code; when the computer-readable code is run on an electronic device, the electronic device implements the method described in the sixth aspect and any possible implementation manner of the sixth aspect.
  • the embodiment of the present application further provides a device-cloud collaboration system, including: a terminal device and a server, where the terminal device is connected to the server. The terminal device displays a first interface, and the first interface includes the shooting picture of the terminal device; the terminal device collects multiple frames of images corresponding to the target object to be modeled in response to the collection operation, and obtains the association relationship between the multiple frames of images. During the process of collecting the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume, and the first virtual bounding volume includes a plurality of patches. The terminal device collecting the multiple frames of images corresponding to the target object to be modeled in response to the collection operation and obtaining the association relationship between the multiple frames of images includes: when the terminal device is in the first pose, the terminal device collects the first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, the terminal device collects the second image and changes the display effect of the patch corresponding to the second image.
  • FIG. 1 is a schematic diagram of the composition of the device-cloud collaboration system provided by the embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a terminal device provided in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the main interface of the mobile phone provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of the main interface of the first application provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • FIG. 6 is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
  • FIG. 7A is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
  • FIG. 7B is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
  • FIG. 7C is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
  • FIG. 7D is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • FIG. 7E is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
  • FIG. 7F is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a patch model provided by an embodiment of the present application.
  • Fig. 9 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • FIG. 10 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • Fig. 11 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • Fig. 12 is a schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • FIG. 13 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • Fig. 14 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • Fig. 15 is a schematic diagram of the user performing counterclockwise rotation operation along the horizontal direction on the 3D model of the toy car provided by the embodiment of the present application;
  • FIG. 16 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • FIG. 17 is a schematic diagram of the user performing a zoom-out operation on the 3D model of the toy car provided by the embodiment of the present application;
  • FIG. 18 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • Fig. 19 is a schematic flow chart of the 3D modeling method provided by the embodiment of the present application.
  • FIG. 20 is a logical schematic diagram of a 3D modeling method implemented by a device-cloud collaboration system provided in an embodiment of the present application
  • Fig. 21 is a schematic structural diagram of a modeling device provided by an embodiment of the present application.
  • Fig. 22 is another structural schematic diagram of the modeling device provided by the embodiment of the present application.
  • Fig. 23 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
  • references to "one embodiment” or “some embodiments” or the like in this specification means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically stated otherwise.
  • the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.
  • the term “connected” includes both direct and indirect connections, unless otherwise stated.
  • the terms first and second are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features.
  • Three-dimensional (3D) reconstruction technology is widely used in virtual reality (VR), augmented reality (AR), extended reality (XR), mixed reality (MR), games, film and television, education, medical, and other fields.
  • For example, 3D reconstruction technology can be used to model characters, props, vegetation, etc. in games, to model character models in film and television, to realize modeling related to chemical structures in the field of education, and to realize modeling related to human body structures in the medical field.
  • At present, most 3D reconstruction applications/software that can realize 3D modeling need to run on a personal computer (PC), and only a small number of 3D reconstruction applications can realize 3D modeling on a mobile terminal (such as a mobile phone).
  • When a 3D reconstruction application on the PC side is used for 3D modeling, the user needs to use mobile tools (such as mobile phones, cameras, etc.) to collect the data required for 3D modeling (such as pictures, depth information, etc.) and upload the collected data to the PC side, after which the 3D reconstruction application on the PC side can perform 3D modeling processing based on the uploaded data.
  • When a 3D reconstruction application on the mobile terminal is used for 3D modeling, the user can directly use the mobile terminal to collect the data required for 3D modeling, and the 3D reconstruction application on the mobile terminal can directly perform 3D modeling processing based on the data collected by the mobile terminal.
  • However, when the user uses the mobile terminal to collect the data required for 3D modeling, the collection must rely on special hardware of the mobile terminal such as a light detection and ranging (LIDAR) sensor or an RGB-depth (RGB-D) camera; the data collection process for 3D modeling therefore places high demands on hardware.
  • the 3D reconstruction application of the PC/mobile terminal realizes 3D modeling, and the hardware requirements of the PC/mobile terminal are also relatively high.
  • the PC/mobile terminal may be required to be equipped with a high-performance independent graphics card.
  • In addition, the above-mentioned way of realizing 3D modeling on the PC side is relatively cumbersome. For example, after performing the relevant data collection operations on the mobile side, the user not only needs to copy or upload the collected data to the PC side, but also needs to perform related modeling operations in the 3D reconstruction application on the PC side.
  • the embodiment of the present application provides a 3D modeling method, which can be applied to a device-cloud collaboration system composed of a terminal device and a cloud.
  • the "device" of the device-cloud collaboration refers to the terminal device
  • the "cloud" refers to the cloud side, which can also be called a cloud server or a cloud platform.
  • the terminal device can collect the data required for 3D modeling, and after preprocessing the data required for 3D modeling, upload the preprocessed data to the cloud;
  • the cloud performs 3D modeling based on the received preprocessed data required for 3D modeling;
  • the terminal device can then download the 3D model obtained by the cloud's 3D modeling from the cloud, and provide a preview function for the 3D model.
  • the terminal device can realize 3D modeling only by relying on the ordinary RGB camera to collect the data required for 3D modeling.
  • the process of collecting the data required for 3D modeling does not need to rely on the terminal device having special hardware such as LIDAR sensors or RGB-D cameras; the process of 3D modeling is completed on the cloud, and there is no need for a high-performance discrete graphics card on the terminal device. That is, the 3D modeling method has relatively low requirements on the hardware of the terminal device.
  • the user only needs to perform operations related to collecting the data required for 3D modeling on the terminal device side, and can then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, which makes operation easier and the user experience better.
  • FIG. 1 is a schematic composition diagram of a device-cloud collaboration system provided by an embodiment of the present application.
  • the device-cloud collaboration system provided by the embodiment of the present application may include: a cloud 100 and a terminal device 200, and the terminal device 200 may be connected to the cloud 100 through a wireless network.
  • the cloud 100 is essentially a server.
  • the cloud 100 may be a single server or a server cluster composed of multiple servers, and the present application does not limit the implementation architecture of the cloud 100 .
  • the terminal device 200 may be an interactive electronic whiteboard with a shooting function, a mobile phone, a wearable device (such as a smart watch or a smart bracelet), a tablet computer, a notebook computer, a desktop computer, a portable electronic device (such as a laptop), an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart TV (such as a smart screen), a car computer, a smart speaker, an augmented reality (AR) device, a virtual reality (VR) device, or another smart device with a display screen, or professional shooting equipment such as a digital camera, an SLR/mirrorless camera, an action camera, a pan-tilt camera, or an unmanned aerial vehicle; the embodiment of the present application does not limit the specific type of the terminal device.
  • when the terminal device is a shooting device such as a pan-tilt camera or a drone, it also includes a display device that can provide a shooting interface, for displaying the collection interface for collecting the data required for 3D modeling, the preview interface of the 3D model, and the like.
  • the display device of the pan-tilt camera can be a mobile phone
  • the display device of the aerial drone can be a remote control device, etc.
  • a terminal device 200 is exemplarily shown in FIG. 1 .
  • the device-cloud collaboration system may include one or more terminal devices 200, and the multiple terminal devices 200 may be the same, different, or partly the same, which is not limited here.
  • the 3D modeling method provided in the embodiment of the present application is a process for realizing 3D modeling through interaction between each terminal device 200 and the cloud 100 .
  • FIG. 2 is a schematic structural diagram of the terminal device provided in the embodiment of the present application.
  • the mobile phone may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone jack 270D, a sensor module 280, a button 290, a motor 291, an indicator 292, a camera 293, a display screen 294, a subscriber identification module (SIM) card interface 295, and the like.
  • the processor 210 may include one or more processing units. For example, the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller can be the nerve center and command center of the mobile phone.
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 210 for storing instructions and data.
  • the memory in processor 210 is a cache memory.
  • the memory may hold instructions or data that the processor 210 has just used or uses cyclically. If the processor 210 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated access and reduces the waiting time of the processor 210, thereby improving the efficiency of the system.
  • processor 210 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, and/or a USB interface, etc.
  • the external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the mobile phone.
  • the external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function, such as saving music, video, and other files in the external memory card.
  • the internal memory 221 may be used to store computer-executable program codes including instructions.
  • the processor 210 executes various functional applications and data processing of the mobile phone by executing instructions stored in the internal memory 221 .
  • the internal memory 221 may also include an area for storing programs and an area for storing data.
  • the program storage area may store an operating system, at least one application required by a function (such as the first application described in the embodiment of the present application), and the like.
  • the storage data area can store data created during the use of the mobile phone (such as image data, phone book) and the like.
  • the internal memory 221 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the charging management module 240 is configured to receive charging input from the charger. While charging the battery 242, the charging management module 240 can also supply power to the mobile phone through the power management module 241.
  • the power management module 241 is used for connecting the battery 242 , the charging management module 240 , and the processor 210 .
  • the power management module 241 can also receive the input of the battery 242 to provide power for the mobile phone.
  • the wireless communication function of the mobile phone can be realized by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor and the baseband processor.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in a mobile phone can be used to cover single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile phone can realize audio functions, such as music playback and recording, through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the earphone interface 270D, and the application processor.
  • the sensor module 280 may include a pressure sensor 280A, a gyro sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, bone conduction sensor 280M, etc.
  • the camera 293 may include various types.
  • the camera 293 may include a telephoto camera, a wide-angle camera or an ultra-wide-angle camera with different focal lengths.
  • the field of view of the telephoto camera is small, suitable for shooting distant scenes in a small range; the field of view of the wide-angle camera is larger; and the field of view of the ultra-wide-angle camera is larger still, suitable for shooting a large-range picture.
  • the telephoto camera with a smaller field of view can be rotated so as to capture scenes in different ranges.
  • the mobile phone can capture raw images (also called RAW images or digital negatives) through the camera 293 .
  • the camera 293 includes at least a lens (lens) and a sensor (sensor).
  • When shooting, the shutter is opened, and light is transmitted to the sensor through the lens of the camera 293.
  • the sensor can convert the optical signal passing through the lens into an electrical signal, then perform analog-to-digital (A/D) conversion on the electrical signal, and output a corresponding digital signal.
  • This digital signal is the RAW image.
  • the mobile phone can perform subsequent ISP processing and YUV domain processing on the RAW image through the processor (such as the ISP or DSP), converting the RAW image into an image that can be used for display, such as a JPEG image or a high efficiency image file format (HEIF) image.
  • JPEG images or HEIF images can be transmitted to the display screen of the mobile phone for display, and/or transmitted to the memory of the mobile phone for storage.
  • In this way, the mobile phone can realize the shooting function.
  • the photosensitive element of the sensor may be a charge coupled device (CCD), and the sensor also includes an A/D converter.
  • Alternatively, the photosensitive element of the sensor may be a complementary metal-oxide-semiconductor (CMOS) device.
  • the ISP processing may include: bad pixel correction (DPC), RAW domain noise reduction, black level correction (BLC), lens shading correction (LSC), auto white balance (AWB), demosaic color interpolation, color correction matrix (CCM), dynamic range compression (DRC), gamma correction, 3D look-up table (LUT), YUV domain noise reduction, sharpening, detail enhancement, and the like.
  • the YUV domain processing may include: multi-frame registration, fusion, and noise reduction for high-dynamic range (HDR) images, super-resolution (SR) algorithms for improving clarity, skin beautification algorithms, distortion correction algorithms, blur algorithms, and the like.
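  • To make the pipeline concrete, the sketch below applies three of the listed ISP stages (black level correction, white-balance gains, and gamma) to an RGB array standing in for sensor output; the constants are illustrative, and a real ISP derives them from calibration and scene statistics.

```python
import numpy as np

def simple_isp(raw_rgb, black_level=64, wb_gains=(1.8, 1.0, 1.5), gamma=2.2):
    """Toy ISP: BLC, per-channel AWB gains, then gamma encoding."""
    img = raw_rgb.astype(np.float32)
    img = np.clip(img - black_level, 0, None)          # black level correction
    img *= np.array(wb_gains, dtype=np.float32)        # white-balance gains
    img = np.clip(img / max(img.max(), 1.0), 0.0, 1.0) # normalize to [0, 1]
    return (img ** (1.0 / gamma) * 255).astype(np.uint8)   # gamma encode

frame = np.random.randint(0, 1024, (4, 4, 3))          # stand-in sensor data
print(simple_isp(frame).shape)                         # (4, 4, 3)
```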
  • the display screen 294 is used to display images, videos and the like.
  • Display 294 includes a display panel.
  • the display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
  • the mobile phone may include 1 or N display screens 294, where N is a positive integer greater than 1.
  • display screen 294 may be used to display application program interfaces.
  • the mobile phone realizes the display function through the GPU, the display screen 294, and the application processor.
  • The GPU is a microprocessor for image processing, connecting the display screen 294 and the application processor, and is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
  • the structure shown in FIG. 2 does not constitute a specific limitation on the mobile phone.
  • the mobile phone may also include more or fewer components than those shown in FIG. 2 , or combine certain components, or separate certain components, or arrange different components, etc.
  • some components shown in FIG. 2 may be implemented in hardware, software, or a combination of software and hardware.
  • when the terminal device 200 is an interactive electronic whiteboard, a wearable device, a tablet computer, a notebook computer, a desktop computer, a portable electronic device, a UMPC, a netbook, a PDA, a smart TV, a car computer, a smart speaker, an AR device, a VR device, or another smart device with a display screen, or another form of terminal equipment such as a digital camera, an SLR/mirrorless camera, an action camera, a pan-tilt camera, or a drone, the specific structure of these other forms of terminal equipment can also refer to FIG. 2. Exemplarily, on the basis of the structure shown in FIG. 2, other forms of terminal equipment may have components added or removed, which will not be repeated here.
  • the terminal device 200 (such as a mobile phone) can run one or more application programs that support collecting the data required for 3D modeling, preprocessing the data required for 3D modeling, and previewing the 3D model. Such an application program may be called, for example, a 3D modeling application or a 3D reconstruction application.
  • the application program can call the camera of the terminal device 200 to take pictures according to the user's operation, collect the data required for 3D modeling, and preprocess the collected data.
  • the application program can also display a preview interface of the 3D model through the display screen of the terminal device 200 for the user to view and preview the 3D model.
  • Although the embodiment of the present application uses the mobile phone as an example for illustration, it should be understood that the 3D modeling method provided in the embodiment of the present application is also applicable to the other above-mentioned terminal devices with a shooting function.
  • the specific type of the terminal equipment is not limited.
  • the 3D modeling method may include the following three parts:
  • the user uses the mobile phone to collect the data required for 3D modeling of the target object.
  • the mobile phone preprocesses the collected data required for 3D modeling of the target object, and uploads the preprocessed data to the cloud.
  • the cloud performs 3D modeling based on the data uploaded by the mobile phone to obtain the 3D model of the target object.
  • in addition, the mobile phone can download the 3D model from the cloud for the user to preview.
  • the first application may be installed in the mobile phone, and the first application is the 3D modeling application or the 3D reconstruction application described in the foregoing embodiments.
  • the name of the first application may be "3D Rubik's Cube", and there is no limitation on the name of the first application here.
  • the main interface of the first application may include a function control for starting the 3D modeling function, and the user may click or touch the function control on the main interface of the first application.
  • the mobile phone can start the 3D modeling function of the first application in response to the user's operation of clicking or touching the function control.
  • the mobile phone can switch the display interface from the main interface of the first application to the 3D modeling data collection interface, start the shooting function of the camera, and display the pictures captured by the camera on the 3D modeling data collection interface.
  • the user can hold the mobile phone to collect data required for 3D modeling of the target object.
  • the mobile phone may display a function control for starting the first application on the main interface (or called desktop), such as: an application icon (or called a button) of the first application.
  • when the user wants to use the first application to perform 3D modeling of a certain target object, he or she may click or touch the application icon of the first application.
  • the mobile phone may start and run the first application and display the main interface of the first application in response to the user's operation of clicking or touching the application icon of the first application.
  • FIG. 3 is a schematic diagram of a main interface of a mobile phone provided by an embodiment of the present application.
  • the main interface 301 of the mobile phone may include an application icon 302 of the first application.
  • the main interface 301 of the mobile phone may also include application icons of application A, application B, application C and other applications.
  • the user can click or touch the application icon 302 on the main interface 301 of the mobile phone to trigger the mobile phone to start and run the first application and display the main interface of the first application.
  • the mobile phone may also display a function control for starting the first application on another display interface such as a pull-down interface or a negative screen.
  • the functional controls of the first application may be presented in the form of application icons, or in the form of other functional buttons, which is not limited here.
  • the drop-down interface refers to the display interface that appears after sliding down the top of the main interface of the mobile phone. Buttons for commonly used functions of the user can be displayed in the drop-down interface, such as WLAN, Bluetooth, etc., so that the user can quickly use related functions. For example, when the current display interface of the mobile phone is the desktop, the user can perform a downward sliding operation on the top of the mobile phone screen to trigger the mobile phone to switch the display interface from the desktop to the drop-down interface (or overlay and display the drop-down interface on the desktop).
  • the negative screen refers to the display interface that appears after sliding the main interface (or desktop) of the mobile phone to the right.
  • the negative screen can display the user's frequently used applications, functions, subscribed services and information, etc., which is convenient for users to quickly browse and use. For example, when the current display interface of the mobile phone is the desktop, the user can perform a rightward sliding operation on the screen of the mobile phone to trigger the mobile phone to switch the display interface from the desktop to a negative screen.
  • "negative screen" is just a term used in the embodiment of the present application; its meaning has been recorded in the embodiment of the present application, but its name does not constitute any limitation to the embodiment of the present application. In addition, in some other embodiments, the "negative screen" may also be called by other names such as "desktop assistant", "shortcut menu", "widget collection interface", etc., which is not limited here.
  • when the user wants to use the first application to perform 3D modeling of a certain target object, the voice assistant may also be used to control the mobile phone to start and run the first application.
  • the present application does not limit the way of starting the first application here.
  • FIG. 4 is a schematic diagram of a main interface of a first application provided in an embodiment of the present application.
  • the main interface 401 of the first application may include a functional control: "start modeling” 402, and the "start modeling” 402 is the above-mentioned functional control for starting the 3D modeling function.
  • the user may click or touch "start modeling" 402 on the main interface 401 of the first application.
  • the mobile phone can respond to the user's operation of clicking or touching "start modeling” 402, start the 3D modeling function of the first application, switch the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, and start the camera
  • the shooting function displays the pictures captured by the camera on the 3D modeling data collection interface.
  • FIG. 5 is a schematic diagram of a 3D modeling data collection interface provided by an embodiment of the present application.
  • the 3D modeling data collection interface 501 displayed by the mobile phone may include functional controls: a scan button 502 , and images captured by the camera of the mobile phone.
  • for example, please continue to refer to FIG. 5. Assuming that the user wants to perform 3D modeling on a toy car placed on a table, the user can point the camera of the mobile phone at the toy car.
  • the picture captured by the camera and displayed in the 3D modeling data collection interface 501 may include a toy car 503 and a table 504.
  • the user can move the shooting angle of the mobile phone to adjust the position of the toy car 503 so that it is at the central position of the mobile phone screen (that is, the 3D modeling data collection interface 501), and then click or touch the scan button 502 in the 3D modeling data collection interface 501. The mobile phone can respond to the user's operation of clicking or touching the scan button 502 and start collecting the data required for 3D modeling of the target object (that is, the toy car 503) at the center of the mobile phone screen.
  • that is, the mobile phone can take the object whose position in the picture is at the center of the mobile phone screen as the target object.
  • the 3D modeling data collection interface may be referred to as a first interface, and the main interface of the first application may be referred to as a second interface.
  • FIG. 6 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the mobile phone can also display a prompt message on the modeling data collection interface 501 : "Please place the target object in the center of the screen" 505 .
  • the target object is the object to be modeled, and the prompt information can be used to remind the user to place the position of the target object in the picture at the center of the mobile phone screen.
  • the display position of "Please place the target object in the center of the screen” 505 in the modeling data collection interface 501 may be above the scan button 502, or a position lower than the center of the screen, etc.
  • "Please place the target object There is no limitation on the display position of "put the object in the center of the screen” 505 in the modeling data collection interface 501.
  • the prompt message: "Please place the target object in the center of the screen" 505 is only an exemplary description. Placed in the central position of the mobile phone screen, this application does not limit the content of the prompt information.
  • the prompt information used to remind the user to place the position of the target object in the screen at the center of the screen of the mobile phone may be referred to as the first prompt information.
  • the mobile phone when the mobile phone responds to the user's operation of clicking or touching the scan button 502 and starts to collect the data required for 3D modeling of the target object, the user can hold the mobile phone around the target object to take pictures.
  • the mobile phone can collect 360-degree panoramic data of the target object.
  • the data collected by the mobile phone required for 3D modeling of the target object may include: the picture/image of the target object captured by the mobile phone during the process of shooting around the target object, and the picture may be in JPG/JPEG format.
  • the mobile phone can collect the RAW image corresponding to the target object through the camera, and then, the processor of the mobile phone can perform ISP processing and JPEG encoding on the RAW image to obtain the corresponding JPG/JPEG format image of the target object.
  • FIG. 7A is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • when the mobile phone responds to the user's operation of clicking or touching the scan button 502 and starts to collect the data required for 3D modeling of the target object (taking the toy car as an example), it may also display a mesh model 701 (or called a bounding volume or a virtual bounding volume) around the target object in the frame of the 3D modeling data collection interface 501.
  • the mesh model 701 can take the center of the target object as the central axis and cover around the target object.
  • the mesh model 701 may include upper and lower layers, each layer may include multiple meshes, the upper layer may be called the first layer, and the lower layer may be called the second layer.
  • Each patch in each layer may correspond to a range of angles within 360 degrees around the target object. For example, assuming that the number of patches in each layer is 20, each patch in each layer corresponds to an angle range of 18 degrees.
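  • As an illustration of this mapping, a minimal sketch (assuming 20 patches per layer, as in the example above; the function name is not from the patent) could be:

```python
PATCHES_PER_LAYER = 20                       # assumed, as in the example above
DEGREES_PER_PATCH = 360 / PATCHES_PER_LAYER  # 18 degrees per patch

def patch_index(azimuth_deg: float) -> int:
    """Map a shooting azimuth (degrees around the target object) to the
    index of the patch that should be lit up: 0-18 degrees -> patch 0,
    18-36 degrees -> patch 1, and so on."""
    return int((azimuth_deg % 360) // DEGREES_PER_PATCH)

# Example: a qualified picture captured at 25 degrees lights up the second patch.
assert patch_index(25.0) == 1
```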
  • for example, the user may shoot two circles around the target object: the first circle is to shoot around the target object with the mobile phone camera looking down at it (for example, looking down at an angle of 30 degrees, without limitation), and the second circle is to shoot around the target object with the mobile phone camera looking directly at it.
  • during the first circle of shooting, the mobile phone can sequentially light up the patches in the first layer as it moves around the target object.
  • during the second circle of shooting, the mobile phone can sequentially light up the patches in the second layer as it moves around the target object.
  • for example, when the mobile phone captures a qualified picture of the target object within the angle range of 0 to 18 degrees, the mobile phone can light up the first patch in the first layer.
  • when the mobile phone captures a qualified picture of the target object within the angle range of 18 degrees to 36 degrees, the mobile phone can light up the second patch in the first layer.
  • by analogy, when the mobile phone captures a qualified picture of the target object within the angle range of 342 degrees to 360 degrees, the mobile phone can light up the 20th patch in the first layer.
  • the user looks down at the target object with the camera of the mobile phone, and after completing the first round of shooting around the target object in a clockwise or counterclockwise direction, all 20 patches in the first layer can be lit.
  • the user looks directly at the target object with the camera of the mobile phone, and after completing the second round of shooting around the target object in a clockwise or counterclockwise direction, all 20 patches in the second layer can be lit.
  • the picture corresponding to the first patch may be called a first image
  • the picture corresponding to the second patch may be called a second image.
  • the pose when the mobile phone captures the first image may be called the first pose
  • the pose when the mobile phone shoots the second image may be called the second pose.
  • for example, after the mobile phone lights up the first patch in the first layer, the effect can be as shown by 702 in FIG. 7A.
  • the lit patch may present a different pattern or color (that is, the display effect is changed).
  • FIG. 7B is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the user looks down at the target object with the camera of the mobile phone; during the first circle of shooting around the target object in the counterclockwise direction, when the user holds the mobile phone and rotates counterclockwise around the target object by a certain angle (moves a certain distance), the mobile phone can continue to light up the second patch, the third patch, and so on in the first layer.
  • FIG. 7C is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the user looks down at the target object with the camera of the mobile phone; during the first circle of shooting around the target object in the counterclockwise direction, when the user holds the mobile phone and has rotated counterclockwise about half a circle around the target object, the mobile phone can light up half or more of the patches in the first layer.
  • FIG. 7D is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the user looks down at the target object with the camera of the mobile phone; during the first circle of shooting around the target object in the counterclockwise direction, when the user holds the mobile phone and moves counterclockwise back to the initial shooting position described in FIG. 7A (that is, when the user holds the mobile phone and completes a full counterclockwise circle around the target object), the mobile phone can light up all the patches in the first layer.
  • then, the user can adjust the shooting position of the mobile phone relative to the target object, lower the mobile phone by a certain distance so that the mobile phone camera directly faces the target object, and move around the target object in the counterclockwise direction to perform the second circle of shooting.
  • FIG. 7E is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the mobile phone can light up the first patch in the second layer.
  • FIG. 7F is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the rules for the mobile phone to light up each patch can be as follows:
  • 1) when the mobile phone is shooting the target object, it may collect a preview stream corresponding to the target object, and the preview stream includes multiple frames of pictures.
  • the mobile phone can shoot the target object at a frame rate of 24 frames per second, 30 frames per second, etc., and there is no limitation here.
  • the mobile phone can perform blur detection on each captured frame of picture, and obtain pictures whose sharpness is greater than a first threshold.
  • the first threshold can be determined according to the requirements and the blur detection algorithm, and its size is not limited. If the sharpness of the current frame does not meet the requirements (for example, it is less than or equal to the first threshold), the mobile phone continues to acquire the next frame of picture.
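  • The embodiment does not fix a particular blur detection algorithm; one common choice is the variance of the Laplacian, sketched below with OpenCV. The threshold value here is purely illustrative, standing in for the first threshold:

```python
import cv2

SHARPNESS_THRESHOLD = 100.0  # illustrative stand-in for the "first threshold"

def is_sharp_enough(frame_bgr) -> bool:
    """Return True if the frame's sharpness exceeds the first threshold.

    Variance of the Laplacian is used as the sharpness metric: blurry
    frames have fewer strong edges and therefore a lower variance."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() > SHARPNESS_THRESHOLD
```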
  • the mobile phone can perform key frame selection (or called key frame screening) on the pictures whose sharpness is greater than the first threshold, and obtain pictures whose features meet the requirements.
  • the features of the picture that meet the requirements may include: the features contained in the picture are relatively clear and rich, the features contained in the picture are easy to extract, and the features contained in the picture have less redundant information, etc. There are no restrictions on the algorithm and specific requirements for key frame selection.
  • in this way, the mobile phone can obtain some key frame pictures with better quality by performing blur detection and key frame selection on the pictures captured at each shooting position, and the number of key frame pictures corresponding to each patch can be one or more.
  • 2) the mobile phone calculates the camera pose information corresponding to the pictures obtained in 1) (that is, the pose information of the mobile phone camera).
  • for example, when the mobile phone supports the AR Engine capability, the AR Core capability, the ARKit capability, or other similar capabilities, the mobile phone can call the aforementioned capabilities to directly obtain the camera pose information corresponding to the picture.
  • the camera pose information may include qw, qx, qy, qz, tx, ty, tz.
  • qw, qx, qy, and qz represent a rotation matrix composed of unit quaternions, and tx, ty, and tz can form a translation matrix.
  • the rotation matrix and translation matrix can represent the relative positional relationship and angle between the camera (mobile phone camera) and the target object.
  • the mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system through the aforementioned rotation matrix and translation matrix, and obtain the coordinates of the target object in the camera coordinate system.
  • the world coordinate system may refer to a coordinate system whose origin is the center of the target object
  • the camera coordinate system may refer to a coordinate system whose origin is the camera center.
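  • A minimal sketch of this conversion, assuming the standard unit-quaternion-to-rotation-matrix formula and a pose stored as (qw, qx, qy, qz, tx, ty, tz):

```python
import numpy as np

def quat_to_rotation(qw, qx, qy, qz):
    """Build the 3x3 rotation matrix of a unit quaternion (standard formula)."""
    return np.array([
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ])

def world_to_camera(p_world, pose):
    """Transform a world-frame point into the camera frame: p_cam = R @ p_world + t."""
    qw, qx, qy, qz, tx, ty, tz = pose
    R = quat_to_rotation(qw, qx, qy, qz)
    return R @ np.asarray(p_world) + np.array([tx, ty, tz])

# The target object's center is the world origin, so its camera-frame
# coordinates equal the translation vector of the pose.
center_cam = world_to_camera([0.0, 0.0, 0.0], (1, 0, 0, 0, 0.1, 0.0, 1.2))
```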
  • 3) the mobile phone determines the relationship between the pictures obtained in 1) and each patch in the patch model, and obtains the patch corresponding to each picture.
  • specifically, the mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system according to the camera pose information (rotation matrix and translation matrix) corresponding to the picture, and obtain the coordinates of the target object in the camera coordinate system. Then, the mobile phone can determine the connection line between the camera coordinates and the coordinates of the target object; the patch that this connection line intersects is the patch corresponding to that frame of picture.
  • the camera coordinates are known parameters for the mobile phone.
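  • One way to realize this determination, sketched under the simplifying assumption that each patch is addressed by the azimuth and elevation of the camera as seen from the target center (the world origin); the layer split angle and all names here are illustrative, not taken from the patent:

```python
import numpy as np

def patch_for_pose(pose, patches_per_layer=20, elevation_split_deg=15.0):
    """Return (layer, patch index) crossed by the line from the target center
    to the camera. Reuses quat_to_rotation from the sketch above; assumes two
    layers split at elevation_split_deg, with higher elevations hitting the
    first (upper) layer."""
    qw, qx, qy, qz, tx, ty, tz = pose
    R = quat_to_rotation(qw, qx, qy, qz)
    cam_center = -R.T @ np.array([tx, ty, tz])  # camera position in the world frame

    x, y, z = cam_center
    azimuth = np.degrees(np.arctan2(y, x)) % 360
    elevation = np.degrees(np.arctan2(z, np.hypot(x, y)))

    layer = 0 if elevation >= elevation_split_deg else 1  # 0 = first/upper layer
    return layer, int(azimuth // (360 / patches_per_layer))
```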
  • the frame sequence file includes pictures corresponding to each lighted patch, and these pictures can be used as data required for 3D modeling of the target object.
  • the format of the picture included in the frame sequence file may be JPG format.
  • the pictures saved in the frame sequence file may be sequentially numbered as 001.jpg, 002.jpg, 003.jpg...etc.
  • Each frame of pictures included in the above frame sequence file can be called keyframes, and these keyframes can be used as the data collected by the mobile phone in the first part for 3D modeling of the target object.
  • a frame sequence file can also be called a key frame sequence file.
  • the user can hold the mobile phone and shoot the target object within 1.5 meters from the target object.
  • if the shooting distance is too short (for example, so short that the 3D modeling data collection interface cannot present the whole picture of the target object), the mobile phone can turn on the wide-angle camera for shooting.
  • when the user holds the mobile phone and shoots around the target object, the mobile phone lights up the patches in the patch model sequentially as it moves around the target object. This guides the user, through a dynamic UI on a 3D guidance interface (that is, the 3D modeling data collection interface that displays the mesh model), to collect the data required for 3D modeling of the target object, which enhances user interactivity and allows the user to intuitively perceive the data collection process.
  • the description about the patch model (or called the bounding volume) in the first part above is only an example.
  • the number of layers of the mesh model can also include more layers or fewer layers, and the number of meshes in each layer can be greater than 20 or less than 20.
  • the present application does not limit the number of layers of the mesh model or the number of patches in each layer.
  • FIG. 8 is a schematic structural diagram of a mesh model provided by an embodiment of the present application. Please refer to FIG. 8.
  • the structure of the patch model can be as shown in (a) in FIG. 8, including two upper and lower layers (that is, the structure described in the foregoing embodiments).
  • the structure of the mesh model may be as shown in (b) in FIG. 8 , including three layers, upper, middle and lower, and each layer may include multiple meshes.
  • the structure of the mesh model may be a one-layer structure composed of multiple meshes, as shown in (c) in FIG. 8 .
  • the structure of the mesh model may also be shown in (d) in FIG. 8 , including two upper and lower layers, and each layer may include multiple meshes.
  • the structures of the mesh models shown in FIG. 8 are all illustrative. The present application does not limit the structure of the mesh model and the inclination angle of each layer in the mesh model (the inclination angle relative to the central axis).
  • the mesh model described in the embodiment of the present application is a virtual model, and the mesh model can be preset in the mobile phone, for example, it can be configured in the file directory of the first application in the form of a configuration file.
  • multiple mesh models can be preset in the mobile phone.
  • the mobile phone collects the data required for 3D modeling of the target object, it can recommend a target mesh model that matches the target object according to the shape of the target object, or According to the user's selection operation, a target mesh model is selected from multiple mesh models, and the target mesh model is used to implement the guiding function described in the foregoing embodiments.
  • the target patch model can also be called a first virtual bounding volume.
  • FIG. 9 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • after the mobile phone switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface 501, when it detects that there is a target object in the picture, it can also display a prompt message on the modeling data collection interface 501: "Target object detected, click the button to start scanning" 506.
  • the button is the scan button 502 , and the prompt information can be used to remind the user to click the scan button 502 so that the mobile phone starts collecting data required for 3D modeling of the target object.
  • after the mobile phone responds to the user's operation of clicking or touching "start modeling" 402 to start the 3D modeling function of the first application and switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, it can also display relevant prompt information on the 3D modeling data collection interface to remind the user to adjust the shooting environment of the target object, the way of shooting the target object, the proportion of the object in the picture, and so on.
  • FIG. 10 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • the mobile phone can respond to the user's operation of clicking or touching "start modeling" 402, start the 3D modeling function of the first application, switch the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, and display prompt information 1001 on the 3D modeling data collection interface.
  • the content of the prompt information 1001 can be: "Place the object on a solid-color plane with soft light and shoot around the object; the object should occupy as large and complete a proportion of the picture as possible", which can be used to remind the user to adjust the shooting environment of the target object.
  • the mobile phone may first display a function control 1002 on the 3D modeling data collection interface; for example, the function control may be "Got it", "Confirm", and so on. After the user clicks the function control 1002, the mobile phone no longer displays the prompt information 1001 and the function control 1002, and presents the 3D modeling data collection interface as shown in FIG. 5 above.
  • in this way, the subsequent data collection process can be faster and the quality of the collected data can be better.
  • the prompt information 1001 may also be called second prompt information.
  • alternatively, after the mobile phone switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, it can display the prompt information 1001 on the 3D modeling data collection interface for a preset duration; after the preset duration, the mobile phone may automatically stop displaying the prompt information 1001 and present the 3D modeling data collection interface as shown in FIG. 5 above.
  • the preset duration may be 20 seconds, 30 seconds, etc., which is not limited here.
  • after the first part is completed, the second part can be executed automatically.
  • alternatively, after the mobile phone collects the data required for 3D modeling of the target object (each frame of picture included in the frame sequence file) according to the method described in the first part above, it can display, on the 3D modeling data collection interface, a function control for uploading to the cloud for 3D modeling.
  • the user can click the functional control for uploading to the cloud for 3D modeling.
  • the mobile phone may execute the second part in response to the operation of the user clicking the functional control for uploading to the cloud for 3D modeling.
  • FIG. 11 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
  • after the mobile phone collects the data required for 3D modeling of the target object (each frame of picture included in the frame sequence file) according to the method described in the first part above, it can display a function control on the 3D modeling data collection interface: "upload cloud modeling" 1101, where "upload cloud modeling" 1101 is the function control for uploading to the cloud for 3D modeling.
  • the user may click "Upload Cloud Modeling" 1101, and the mobile phone may execute the second part in response to the user's operation of clicking "Upload Cloud Modeling" 1101.
  • the operation of the user clicking "upload cloud modeling" 1101 may be referred to as an operation of generating a 3D model.
  • after the mobile phone collects the data required for 3D modeling of the target object (each frame of picture included in the frame sequence file) according to the method described in the first part above, it can also display, on the 3D modeling data collection interface, a prompt message for prompting the user that the mobile phone has collected the data required for 3D modeling of the target object, such as: "scanning is complete" 1102.
  • in addition, the mobile phone can also display an exit button 1103 (only marked in FIG. 11) in the 3D modeling data collection interface. During the process of executing the first part, the user can click the exit button 1103 at any time, and the mobile phone can respond to the user's operation of clicking the exit button 1103 and exit the execution process of the first part. After exiting the execution process of the first part, the mobile phone can switch the display interface from the 3D modeling data collection interface to the main interface of the first application as shown in FIG. 4.
  • the data collected by the mobile phone in the first part and required for 3D modeling of the target object is each frame picture (ie key frame) included in the frame sequence file mentioned in the first part.
  • the mobile phone preprocesses the collected data required for 3D modeling of the target object means: the mobile phone preprocesses the key frames included in the frame sequence file collected in the first part, specifically as follows:
  • the mobile phone calculates the matching information of each key frame, and saves the matching information of each key frame in the first file.
  • the first file may be a file in JavaScript object notation (JSON) format.
  • the matching information of the key frame may include: identification information of other key frames associated with the key frame.
  • for example, the matching information of a certain key frame may include the identification information of the key frames (such as the nearest key frames) respectively corresponding to the four orientations of that key frame: up, down, left, and right; the key frames respectively corresponding to these four orientations are the other key frames associated with that key frame.
  • the identification information of the key frame may be the picture number of the key frame.
  • the matching information of the key frame of the frame may be used to indicate: which other pictures in the frame sequence file are associated with the key frame of the frame.
  • the matching information of each key frame is obtained according to the association relationship between each key frame and the patch corresponding to each key frame, and the association relationship among the plurality of patches. That is, for each key frame, the mobile phone can determine the matching information of the key frame according to the association relationship between the patch corresponding to the key frame and other patches.
  • the relationship between the patch corresponding to the key frame and other patches may include: in the patch model, which patches are corresponding to the top, bottom, left, and right directions of the patch corresponding to the key frame.
  • for example, the association between patch 1 and other patches can include: the patch below patch 1 is patch 21, the patch on the left side of patch 1 is patch 20, and the patch on the right side of patch 1 is patch 2.
  • in this case, for the key frame corresponding to patch 1, the mobile phone can determine that the other key frames associated with that key frame include the key frame corresponding to patch 21, the key frame corresponding to patch 20, and the key frame corresponding to patch 2. Therefore, the mobile phone can obtain the matching information of that key frame, including the identification information of the key frame corresponding to patch 21, the identification information of the key frame corresponding to patch 20, and the identification information of the key frame corresponding to patch 2.
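  • A minimal sketch of deriving a key frame's matching information from the patch adjacency, assuming one key frame per patch; the adjacency dictionary and the patch-to-key-frame assignment below are illustrative placeholders, not values from the patent:

```python
# Illustrative adjacency for patch 1 in a two-layer model: the left and right
# neighbors are in the same layer, and the "down" neighbor is in the second layer.
patch_neighbors = {
    1: {"left": 20, "right": 2, "down": 21},  # patch 1 has no "up" neighbor
}

# Assumed mapping from patch id to the picture number of its key frame.
keyframe_of_patch = {1: 18, 2: 26, 20: 45, 21: 59}

def matching_info(patch_id):
    """Collect the picture numbers of the key frames on the neighboring patches."""
    return [
        keyframe_of_patch[neighbor]
        for neighbor in patch_neighbors[patch_id].values()
        if neighbor in keyframe_of_patch
    ]

# matching_info(1) -> [45, 26, 59]: the identification information of the
# other key frames associated with the key frame of patch 1.
```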
  • the first file also includes: the camera intrinsics corresponding to each key frame, gravity direction information (gravity), picture (image) name, picture number (index), camera pose information (slampose), timestamp (timestamp), and other information.
  • for example, the first file includes three parts: "intrinsics", "keyframes", and "matching_list".
  • the "intrinsics" part contains the camera intrinsic parameters;
  • the "keyframes" part contains the gravity direction information, picture name, picture number, camera pose information, timestamp, and other information corresponding to each key frame;
  • the "matching_list" part contains the matching information of each key frame.
  • the content of the first file could look like this:
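  • As a purely illustrative skeleton of such a file, built from the three parts and the fields described below (all numeric values are placeholders):

```json
{
  "intrinsics": {
    "cx": 0.0, "cy": 0.0, "fx": 0.0, "fy": 0.0,
    "k1": 0.0, "k2": 0.0, "k3": 0.0, "p1": 0.0, "p2": 0.0,
    "width": 0, "height": 0
  },
  "keyframes": [
    {
      "gravity": [0.0, 0.0, 0.0],
      "image": "18.jpg",
      "index": 18,
      "slampose": { "qw": 0.0, "qx": 0.0, "qy": 0.0, "qz": 0.0,
                    "tx": 0.0, "ty": 0.0, "tz": 0.0 },
      "timestamp": 0
    }
  ],
  "matching_list": [
    { "src_id": 18, "tgt_id": [26, 45, 59, 78, 89, 100, 449] }
  ]
}
```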
  • cx, cy, fx, fy, height, k1, k2, k3, p1, p2, and width are all camera internal references.
  • cx and cy represent the offset of the optical axis to the coordinate center of the projection plane
  • fx and fy represent the focal lengths in the x and y directions of the camera when shooting
  • k1, k2, and k3 represent the radial distortion coefficients
  • p1 and p2 represent the tangential distortion coefficients
  • height (height) and width (width) indicate the resolution of the camera when shooting.
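  • For orientation, a minimal pinhole-projection sketch using these intrinsics maps a camera-frame point to pixel coordinates (the distortion coefficients are ignored here for brevity):

```python
def project(point_cam, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates with the pinhole
    model; lens distortion (k1..k3, p1, p2) is ignored for brevity."""
    x, y, z = point_cam
    return fx * x / z + cx, fy * y / z + cy
```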
  • the gravity direction information can be obtained by the mobile phone according to the built-in gyroscope, which can indicate the offset angle when the mobile phone takes pictures.
  • 18.jpg represents the name of the picture (image), and 18 is the number (index) of the picture (only 18.jpg is used as an example here). That is, the above example shows the camera intrinsics (intrinsics), gravity direction information (gravity), picture (image) name, picture number (index), camera pose information (slampose), timestamp (timestamp), and matching information corresponding to the key frame numbered 18.
  • qw, qx, qy, qz, tx, ty, tz are all camera pose information.
  • qw, qx, qy, and qz represent a rotation matrix composed of unit quaternions, and tx, ty, and tz can form a translation matrix.
  • the rotation matrix and translation matrix can represent the relative positional relationship and angle between the camera (mobile phone camera) and the target object.
  • the mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system through the aforementioned rotation matrix and translation matrix, and obtain the coordinates of the target object in the camera coordinate system.
  • the world coordinate system may refer to a coordinate system whose origin is the center of the target object
  • the camera coordinate system may refer to a coordinate system whose origin is the camera center.
  • timestamp represents a timestamp, which means the time when the camera captures the key frame of this frame.
  • src_id indicates the picture number of each key frame, for example, in the content of the first file given above, the picture number is 18, and the "matching_list" part is the matching information of the key frame with picture number 18.
  • tgt_id indicates the picture numbers of other key frames associated with the key frame with picture number 18 (ie, identification information of other key frames associated with the key frame with picture number 18).
  • the picture numbers of other key frames associated with the key frame with picture number 18 include: 26, 45, 59, 78, 89, 100, 449 and so on.
  • key frames associated with the key frame with picture number 18 include: key frame with picture number 26, key frame with picture number 45, key frame with picture number 59, key frame with picture number 78, The key frame with picture number 89, the key frame with picture number 100, the key frame with picture number 449, etc.
  • the mobile phone packs the first file and all key frames in the frame sequence file (that is, all frame pictures included in the above frame sequence file).
  • the result obtained after the first file is packaged together with all the key frames in the frame sequence file is the preprocessed data, that is, the result of the mobile phone, in the second part, preprocessing the data collected in the first part for 3D modeling of the target object.
  • in other words, the preprocessed data may include: each key frame picture saved in the frame sequence file when the mobile phone shoots around the target object, and the first file including the matching information of each key frame.
  • after the mobile phone obtains the above-mentioned preprocessed data, it can send (that is, upload) the preprocessed data to the cloud, and the cloud can execute the third part: performing 3D modeling according to the data uploaded by the mobile phone to obtain the 3D model of the target object.
  • the process of 3D modeling in the cloud based on the data uploaded by the mobile phone can be as follows:
  • the cloud decompresses the received data packet (the data packet includes the frame sequence file and the first file) from the mobile phone, and extracts the frame sequence file and the above-mentioned first file.
  • the cloud performs 3D modeling processing according to the key frame picture included in the frame sequence file and the above-mentioned first file to obtain a 3D model of the target object.
  • the step of performing 3D modeling processing on the cloud according to the key frame pictures included in the frame sequence file and the above-mentioned first file may at least include: key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computing, dense point cloud computing, surface reconstruction, and texture generation.
  • key target extraction refers to the operation of separating the target object of interest in the key frame picture from the background, identifying and interpreting meaningful object entities from the image, and extracting different image features.
  • Feature detection and matching refers to: detecting distinctive pixels in a key frame picture as the feature points of that key frame picture; describing the salient feature points in different key frame pictures, and comparing the similarity of two descriptions to judge whether the feature points in different key frame pictures correspond to the same feature.
  • when the cloud performs feature detection and matching, for each key frame, the cloud can use the matching information of that key frame included in the first file (that is, the identification information of the other key frames associated with that key frame) to determine the other key frames associated with that key frame, and perform feature detection and matching between that key frame and the other key frames associated with it.
  • for example, the cloud can determine based on the first file that the other key frames associated with the key frame with picture number 18 include: the key frame with picture number 26, the key frame with picture number 45, the key frame with picture number 59, the key frame with picture number 78, the key frame with picture number 89, the key frame with picture number 100, the key frame with picture number 449, and so on. Then, the cloud can perform feature detection and matching between the key frame with picture number 18 and these associated key frames, and there is no need to perform feature detection and matching between the key frame with picture number 18 and all the other key frames in the frame sequence file.
  • that is, with the matching information of each key frame included in the first file, the cloud only needs to perform feature detection and matching between each key frame and the other key frames associated with it, and does not need to perform feature detection and matching between that key frame and all the other key frames in the frame sequence file, as the sketch below illustrates. In this way, the computing load on the cloud can be effectively reduced and the efficiency of 3D modeling can be improved.
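  • A minimal sketch of this restricted matching, assuming keyframes maps picture numbers to images and matching_list has the src_id/tgt_id structure of the first file shown above; ORB with brute-force Hamming matching is one common stand-in for the matcher, which the patent does not specify:

```python
import cv2

def match_features(img_a, img_b):
    """Pairwise feature detection and matching (ORB + brute-force Hamming)."""
    orb = cv2.ORB_create()
    _, des_a = orb.detectAndCompute(img_a, None)
    _, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return []
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return matcher.match(des_a, des_b)

def match_restricted(keyframes, matching_list):
    """Match each key frame only against its associated key frames:
    O(N*k) pairs instead of O(N^2), for N key frames with k neighbors each."""
    results = {}
    for entry in matching_list:
        src = entry["src_id"]
        for tgt in entry["tgt_id"]:
            results[(src, tgt)] = match_features(keyframes[src], keyframes[tgt])
    return results
```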
  • Global optimization and fusion refers to the use of global optimization and fusion algorithms to optimize and fuse the matching results of feature detection and matching.
  • the results of global optimization and fusion can be used to generate basic 3D models.
  • Sparse point cloud computing and dense point cloud computing refer to: generating 3D point cloud data corresponding to the target object according to the results of global optimization and fusion. Compared with images, point clouds have an irreplaceable advantage: depth.
  • the 3D point cloud data directly provides the data of the 3D space, while the image needs to reverse the 3D data through the perspective geometry.
  • Surface reconstruction refers to the use of 3D point cloud data to accurately restore the 3D surface shape of an object to obtain the basic 3D model of the target object.
  • Texture generation refers to: generating the texture (also called texture map) of the surface of the target object according to the key frame picture or the characteristics of the key frame picture. After obtaining the surface texture of the target object, the texture is mapped to the surface of the basic 3D model of the target object in a specific way, so that the surface of the target object can be restored more realistically and the target object looks more real.
  • the cloud can also quickly and accurately determine the mapping relationship between the texture and the surface of the basic 3D model of the target object based on the matching information of each key frame included in the first file, which can further improve the modeling efficiency and effectiveness.
  • for example, when the cloud determines the mapping relationship between the texture of a first key frame and the surface of the basic 3D model of the target object, it can combine the matching information of that key frame to quickly and accurately determine the mapping relationship between the textures of the other key frames associated with that key frame and the surface of the basic 3D model of the target object.
  • the cloud can generate the 3D model of the target object according to the basic 3D model of the target object and the texture of the surface of the target object.
  • the cloud can save the basic 3D model of the target object and the texture of the surface of the target object for downloading by the mobile phone.
  • in summary, the matching information of each key frame included in the first file can effectively improve the processing speed of 3D modeling, reduce the computing load on the cloud, and improve the efficiency of 3D modeling.
  • the basic 3D model of the target object can be stored in OBJ format, and the texture of the surface of the target object can be stored in JPG format (such as texture map).
  • the basic 3D model of the target object can be an OBJ file, and the texture on the surface of the target object can be a JPG file.
  • the cloud can save the basic 3D model of the target object and the texture of the surface of the target object for a certain period of time (such as 7 days), so that the mobile phone can download the basic 3D model and the texture within that period.
  • the cloud may also permanently retain the basic 3D model of the target object and the texture of the surface of the target object, which is not limited here.
  • the mobile phone can realize 3D modeling only by relying on ordinary RGB cameras (cameras) to collect the data required for 3D modeling.
  • the process of collecting data required for modeling does not need to rely on special hardware such as LIDAR sensors or RGB-D cameras on mobile phones.
  • the process of 3D modeling is completed in the cloud, and there is no need to rely on the mobile phone to be equipped with a high-performance discrete graphics card.
  • the 3D modeling method can obviously lower the threshold of 3D modeling, and has higher universal applicability to terminal equipment.
  • the mobile phone can enhance user interaction through a dynamic UI guidance, allowing the user to intuitively perceive the data collection process.
  • when the mobile phone shoots the target object at a certain position, it performs blur detection on each captured frame of picture to obtain pictures whose sharpness meets the requirements, which realizes the screening of key frames and obtains key frames that are effective and useful for modeling.
  • the mobile phone extracts the matching information of each key frame, and sends the first file including the matching information of each key frame and the frame sequence file composed of the key frames to the cloud for the cloud to carry out modeling (there is no need to send all the captured pictures). This can greatly reduce the complexity of 3D modeling on the cloud side, reduce the consumption of cloud hardware resources during the 3D modeling process, effectively reduce the computing load of cloud modeling, and improve the speed and effect of 3D modeling.
  • the cloud may send an indication message to the mobile phone to indicate that the cloud has completed the 3D modeling.
  • the mobile phone after receiving the above instruction message from the cloud, can automatically download the 3D model of the target object from the cloud for the user to preview the 3D model.
  • FIG. 12 is a schematic diagram of a 3D model preview interface provided by the embodiment of the present application.
  • the display interface can be switched from the 3D modeling data collection interface shown in FIG. 11 to the 3D model preview interface shown in FIG. 12 .
  • the mobile phone may display prompt information on the 3D model preview interface: "modeling" 1201, which is used to prompt the user that the target object is being 3D modeled.
  • "Modeling" 1201 may be referred to as the third prompt information.
  • the mobile phone may display the third prompt information after detecting that the user clicks on the above-mentioned function control "upload cloud modeling" 1101.
  • after the cloud completes the 3D modeling of the target object and obtains the 3D model of the target object, it can send an indication message to the mobile phone to indicate that the cloud has completed the 3D modeling. After receiving the indication message, the mobile phone can automatically download the 3D model of the target object from the cloud. For example, the mobile phone can send a download request message to the cloud, and the cloud can send the 3D model of the target object (that is, the basic 3D model of the target object and the texture of the surface of the target object) to the mobile phone according to the download request message.
  • FIG. 13 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • the mobile phone may also change the prompt information from "modeling in progress" 1201 to "modeling completed” 1301 to remind the user that the 3D model of the target object has been modeled.
  • a view button 1302 may also be included in the 3D model preview interface. The user can click the view button 1302, and the mobile phone can display the 3D model of the target object downloaded from the cloud in the 3D model preview interface in response to the user's operation of clicking the view button 1302.
  • "Modeling completed" 1301 may be referred to as the fourth prompt information.
  • FIG. 14 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • the mobile phone may display the 3D model 1401 of the toy car in the 3D model preview interface in response to the user's operation of clicking the view button 1302 .
  • the user can view the 3D model 1401 of the toy car in the 3D model preview interface shown in FIG. 14 .
  • the user's operation of clicking the view button 1302 is an operation of previewing the 3D model corresponding to the target object.
  • for example, the user can perform a rotation operation (such as a counterclockwise rotation) on the 3D model of the toy car along any direction (such as the horizontal direction, the vertical direction, etc.).
  • the mobile phone can respond to the aforementioned operations of the user and display the 3D models of the toy car at different angles (360 degrees) for the user on the 3D model preview interface.
  • Fig. 15 is a schematic diagram of the user performing counterclockwise rotation operation along the horizontal direction on the 3D model of the toy car provided by the embodiment of the present application. As shown in FIG. 15 , the user can use fingers to drag the 3D model of the toy car along the horizontal direction to rotate counterclockwise on the 3D model preview interface.
  • FIG. 16 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • the mobile phone can respond to the user's operation of dragging the 3D model of the toy car counterclockwise along the horizontal direction, and display the rendering effects at the angles shown in (a) and (b) in FIG. 16 for the user on the 3D model preview interface.
  • the presentation effects of the angles shown in (a), (b) and so on in FIG. 16 are only illustrative.
  • the angle at which the 3D model of the toy car is presented is related to the direction, distance, and number of times the user drags the 3D model of the toy car, and will not be shown here one by one.
  • when the user views the 3D model of the toy car in the 3D model preview interface, the user can also perform zoom-in or zoom-out operations on the 3D model of the toy car; the mobile phone can respond to the zoom-in or zoom-out operation performed by the user on the 3D model of the toy car and display the corresponding zoom-in or zoom-out effect of the 3D model of the toy car for the user on the 3D model preview interface.
  • FIG. 17 is a schematic diagram of a user performing a zoom-out operation on a 3D model of a toy car provided by an embodiment of the present application.
  • the user can use two fingers to slide inward (toward each other) on the 3D model preview interface, and this sliding operation is a zoom-out operation.
  • FIG. 18 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
  • the mobile phone can respond to the zoom-out operation performed by the user on the 3D model of the toy car, and display the zoomed-out rendering effect of the 3D model of the toy car for the user on the 3D model preview interface.
  • similarly, the user can use two fingers to slide outward (away from each other) on the 3D model preview interface, and this sliding operation is a zoom-in operation.
  • the mobile phone may respond to the zoom-in operation performed by the user on the 3D model of the toy car, and display the zoomed-in presentation effect of the 3D model of the toy car for the user on the 3D model preview interface, which will not be described in detail here.
  • the zoom-in operation or zoom-out operation performed by the user on the 3D model of the toy car above is an exemplary description.
  • the zoom-in or zoom-out operation performed by the user on the 3D model of the toy car can also be a double-click operation, a long-press operation, or the like; alternatively, the 3D model preview interface can also include function controls for zoom-in or zoom-out operations, which is not limited here.
  • in other embodiments, after the mobile phone receives the above instruction message from the cloud, it may also only display the 3D model preview interface as shown in FIG. 13 above.
  • the mobile phone downloads the 3D model of the target object from the cloud in response to the user's operation of clicking the view button 1302, and displays the 3D model of the target object in the 3D model preview interface for the user to preview.
  • the present application does not limit the triggering conditions for the mobile phone to download the 3D model of the target object from the cloud.
  • the user only needs to perform operations related to collecting data required for 3D modeling on the terminal device side, and then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, which makes the operation easier and the user experience can be better.
  • FIG. 19 is a schematic flowchart of a 3D modeling method provided by an embodiment of the present application.
  • the 3D modeling method may include S1901-S1913.
  • the mobile phone receives a first operation, where the first operation is an operation for starting a first application.
  • the first operation may be the above-mentioned operation of clicking or touching the application icon 302 of the first application in the main interface of the mobile phone shown in FIG. 3 .
  • the first operation may be an operation of clicking or touching a function control of the first application on a drop-down interface, or another display interface such as a negative one screen.
  • the first operation may also be the above-mentioned operation of controlling the mobile phone to start and run the first application through the voice assistant.
  • in response to the first operation, the mobile phone starts and runs the first application, and displays the main interface of the first application.
  • the main interface of the first application may be referred to as a second interface.
  • the mobile phone receives a second operation, where the second operation is an operation of starting the 3D modeling function of the first application.
  • the second operation may be the above-mentioned operation of clicking or touching the function control "start modeling" 402 in the main interface 401 of the first application shown in FIG. 4 .
  • in response to the second operation, the mobile phone displays a 3D modeling data collection interface, starts the camera shooting function, and displays the images captured by the camera on the 3D modeling data collection interface.
  • the 3D modeling data collection interface displayed by the mobile phone in response to the second operation may refer to the above-mentioned FIG. 5 .
  • the 3D modeling data acquisition interface may be referred to as a first interface.
  • the mobile phone receives a third operation.
  • the third operation is an operation of controlling the mobile phone to collect data required for 3D modeling of the target object.
  • the third operation may include the operation of the user clicking or touching the scan button 502 in the 3D modeling data collection interface shown in FIG. 5 above, and the operation of the user holding the mobile phone around the target object to take pictures.
  • the third operation may also be referred to as an acquisition operation.
  • the mobile phone acquires a frame sequence file composed of key frame pictures corresponding to the target object.
  • the mobile phone obtains the matching information of key frames of each frame in the frame sequence file, and obtains the first file.
  • The matching information of a key frame included in the first file may include: the identification information of the nearest key frames respectively corresponding to the four orientations of the key frame (up, down, left, and right); for example, the identification information may be the picture number mentioned above.
  • the mobile phone sends the frame sequence file and the first file to the cloud.
  • the cloud receives the frame sequence file and the first file.
  • the cloud performs 3D modeling according to the frame sequence file and the first file to obtain a 3D model of the target object.
  • The 3D model of the target object may include the basic 3D model of the target object and the texture of the target object's surface. Pasting the texture (texture map) of the target object's surface onto the basic 3D model yields the 3D model of the target object.
  • the cloud sends an indication message to the mobile phone, which is used to indicate that the cloud has completed the 3D modeling.
  • the mobile phone receives the indication message.
  • the mobile phone sends a download request message to the cloud, for requesting to download the 3D model of the target object.
  • the cloud receives the download request message.
  • the cloud sends the 3D model of the target object to the mobile phone.
  • the mobile phone receives a 3D model of the target object.
  • the mobile phone displays the 3D model of the target object.
  • the mobile phone displays the 3D model of the target object, which can be used for the user to preview the 3D model of the target object.
  • the effect of the mobile phone displaying the 3D model of the target object can refer to the above-mentioned figures 12, 13, 14, 16, and 18.
  • The 3D model of the target object displayed on the mobile phone can be rotated, zoomed in, zoomed out, and so on.
  • FIG. 20 is a schematic diagram of a 3D modeling method implemented by the device-cloud collaboration system provided in the embodiment of the present application.
  • the mobile phone may at least include an RGB camera (such as a camera) and a first application.
  • the first application is the above-mentioned 3D modeling application.
  • the RGB camera can be used to realize the shooting function of the mobile phone, shoot the target object to be modeled, and obtain the corresponding picture of the target object.
  • the RGB camera can transmit the captured pictures to the first application.
  • the first application may include a data acquisition and dynamic guidance module, a data processing module, a 3D model preview module, and a 3D model export module.
  • the data acquisition and dynamic guidance module can realize functions such as blur detection, key frame selection, guidance information calculation, and guidance interface update.
  • The blur detection function can be used to perform blur detection on each captured frame of picture (which may be called the input frame) and keep the pictures whose definition meets the requirements as key frames; if the definition of the current frame does not meet the requirements, the next frame is fetched.
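  • As a minimal sketch of what such a blur-detection step could look like (the variance-of-Laplacian measure, the threshold value, and the function name below are illustrative assumptions, not taken from the method itself):

```python
import cv2

def is_sharp_enough(frame_bgr, threshold=100.0):
    """Keep a frame as a key-frame candidate only if it is sharp enough."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # The variance of the Laplacian grows with edge content, i.e. sharpness;
    # a blurry frame yields a low value and is skipped.
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    return sharpness > threshold
```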
  • Through the key frame selection function, it can be judged whether the picture has already been stored in the frame sequence file; if not, the frame of picture is added to the frame sequence file.
  • Through the guidance information calculation function, the relationship between the picture and each patch in the patch model can be determined according to the camera pose information corresponding to the picture, obtaining the patch corresponding to the picture.
  • The correspondence between pictures and patches is the guidance information.
  • Through the guidance interface update function, the display effect of the patches in the patch model can be updated (that is, changed) according to the guidance information obtained through the aforementioned calculation, such as lighting up the patches.
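  • A minimal sketch of one way such guidance information could be computed, assuming the camera position is already expressed in object-centred coordinates and assuming the two-layer, 20-patch-per-layer bounding volume used as the example elsewhere in this document (the layer split angle, names, and numbering are illustrative):

```python
import math

LAYERS, PATCHES_PER_LAYER = 2, 20

def patch_for_camera(cam_pos, elevation_split_deg=15.0):
    """Map a camera position to the patch it should light up."""
    x, y, z = cam_pos
    # Azimuth around the object selects one of the 18-degree sectors;
    # elevation selects the looking-down ring (layer 0) or the eye-level ring.
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    layer = 0 if elevation >= elevation_split_deg else 1
    sector = int(azimuth // (360.0 / PATCHES_PER_LAYER))
    return layer * PATCHES_PER_LAYER + sector  # patch index in [0, 39]
```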
  • the data processing module can realize matching relationship calculation, matching list calculation, data packaging and other functions.
  • the matching relationship between key frames in the frame sequence file can be calculated through the matching relationship calculation function, such as whether they are adjacent or not.
  • Specifically, the data processing module can, through the matching relationship calculation function, calculate the matching relationship between the key frames in the frame sequence file according to the association relationship between the patches in the patch model.
  • the matching list calculation function can generate the matching list of each key frame according to the calculation result of the matching relationship calculation function, and the matching list of each key frame includes the matching information of each key frame.
  • the matching information of a key frame includes identification information of other key frames associated with the key frame of the frame.
  • the first file including the matching information of each key frame and the frame sequence file can be packaged through the data packaging function. After the first file and the frame sequence file are packaged, the mobile phone can send the packaged data package (including the first file and the frame sequence file) to the cloud.
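  • A minimal sketch of such a packaging step, assuming the key frames sit in a directory as numbered JPG files and the matching information is serialized as JSON (the file names and layout are assumptions for illustration):

```python
import json
import zipfile
from pathlib import Path

def pack_for_upload(frames_dir, matching_info, out_path="upload.zip"):
    """Bundle the first file (matching info) with the key-frame pictures."""
    frames_dir = Path(frames_dir)
    first_file = frames_dir / "matching.json"
    first_file.write_text(json.dumps(matching_info, indent=2))
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as z:
        for jpg in sorted(frames_dir.glob("*.jpg")):
            z.write(jpg, arcname=jpg.name)  # key frames: 001.jpg, 002.jpg, ...
        z.write(first_file, arcname=first_file.name)
    return out_path
```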
  • the cloud may include a data analysis module, a 3D modeling module, and a data storage module.
  • the data analysis module can analyze the received data packet to obtain the frame sequence file and the first file.
  • the 3D modeling module can perform 3D modeling according to the frame sequence file and the first file to obtain a 3D model.
  • the 3D modeling module can realize functions such as key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computing, dense point cloud computing, surface reconstruction, and texture generation.
  • Through the key target extraction function, the target object of interest in a key frame picture can be separated from the background, and different image features can be extracted by identifying and interpreting meaningful object entities in the image.
  • Through the feature detection and matching function, the distinctive pixels in a key frame picture can be detected as the feature points of that key frame picture; the salient feature points in different key frame pictures are described, and the similarity between two descriptions is compared to judge whether feature points in different key frame pictures are the same feature.
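  • A minimal sketch of pairwise feature matching restricted to associated key frames, which is what keeps the cost below all-pairs matching; ORB features and a brute-force matcher are illustrative stand-ins, since the document does not name a specific detector:

```python
import cv2

def match_associated_pairs(keyframes, matching_list):
    """keyframes: {frame_id: grayscale image};
    matching_list: {frame_id: ids of associated key frames}."""
    orb = cv2.ORB_create(nfeatures=2000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    features = {fid: orb.detectAndCompute(img, None)
                for fid, img in keyframes.items()}
    matches = {}
    for src, neighbours in matching_list.items():
        for dst in neighbours:
            # Only key frames listed as associated are compared.
            matches[(src, dst)] = matcher.match(features[src][1],
                                                features[dst][1])
    return matches
```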
  • the 3D point cloud data corresponding to the target object can be generated according to the results of feature detection and matching.
  • Through the surface reconstruction function, the 3D surface shape of the object can be accurately restored using the 3D point cloud data, obtaining the basic 3D model of the target object.
  • Through the texture generation function, the texture (also called texture map) of the target object's surface can be generated according to the key frame pictures or their features. After the texture of the target object's surface is obtained, the texture is mapped onto the surface of the basic 3D model of the target object in a specific manner to obtain the 3D model of the target object.
  • the 3D modeling module can store the 3D model of the target object in the data storage module.
  • the first application of the mobile phone can download the 3D model of the target object from the data storage module in the cloud. After downloading the 3D model of the target object, the first application may provide the user with a 3D model preview function through the 3D model preview module, or provide the user with a 3D model export function through the 3D model export module.
  • the specific process of the first application providing the user with a 3D model preview function through the 3D model preview module please refer to the foregoing embodiments.
  • the scanning progress can also be displayed on the 3D modeling data collection interface.
  • For example, the mobile phone can display the scanning progress through the ring-shaped black fill effect in the scan button on the 3D modeling data collection interface.
  • Depending on the UI presentation of the scan button, the manner in which the mobile phone displays the scanning progress on the 3D modeling data collection interface may differ, which is not limited here.
  • the mobile phone does not need to display the scanning progress, and the user can know the scanning progress according to the lighting of the patches in the patch model.
  • the steps of the 3D modeling method provided in the embodiments of the present application may also all be implemented on the terminal device side.
  • the functions implemented on the cloud side described in the foregoing embodiments may also all be implemented in the terminal device. That is, after obtaining the above-mentioned frame sequence file and the first file, the terminal device can directly locally generate a 3D model of the target object according to the frame sequence file and the first file, and provide functions such as preview and export of the 3D model.
  • The specific principle by which the terminal device generates the 3D model of the target object locally based on the frame sequence file and the first file is the same as the principle, described in the foregoing embodiments, by which the cloud generates the 3D model of the target object based on the frame sequence file and the first file; details are not repeated here.
  • An embodiment of the present application provides a modeling apparatus, which can be applied to a terminal device and is used to implement the steps performed by the terminal device in the 3D modeling method described in the foregoing embodiments.
  • the functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware.
  • Hardware or software includes one or more modules or units corresponding to the functions described above.
  • FIG. 21 is a schematic structural diagram of a modeling device provided by an embodiment of the present application.
  • the device may include: a display unit 2101 and a processing unit 2102 .
  • the display unit 2101 and the processing unit 2102 can be used to cooperate to implement the functions of the terminal device in the modeling method described in the foregoing method embodiments.
  • the display unit 2101 is configured to display a first interface, where the first interface includes a captured image of the terminal device.
  • The processing unit 2102 is configured to, in response to the collection operation, collect multiple frames of images corresponding to the target object to be modeled, acquire the association relationship among the multiple frames of images, and acquire a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between them.
  • the display unit 2101 is further configured to display a three-dimensional model corresponding to the target object.
  • During the collection of the multiple frames of images, the display unit 2101 is further configured to display a first virtual bounding volume; the first virtual bounding volume includes a plurality of patches.
  • the processing unit 2102 is specifically configured to collect the first image when the terminal device is in the first pose, and change the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, collect the second image, and Changing the display effect of the patches corresponding to the second image; after changing the display effects of the plurality of patches of the first virtual bounding volume, acquiring the association relationship between the multiple frames of images according to the plurality of patches.
  • the display unit 2101 and the processing unit 2102 are also configured to implement other display functions and processing functions of the terminal device in the modeling method described in the foregoing method embodiments, which will not be repeated here.
  • FIG. 22 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
  • For the implementation in which the terminal device sends the multi-frame images and the association relationship between the multi-frame images to the server, and the server generates the 3D model corresponding to the target object according to the multi-frame images and their association relationship,
  • the modeling device may also include a sending unit 2103 and a receiving unit 2104; the sending unit 2103 is configured to send the multi-frame images and the association relationship between them to the server.
  • the receiving unit 2104 is configured to receive the 3D model corresponding to the target object sent from the server.
  • The sending unit 2103 is also configured to implement other sending functions that the terminal device can implement in the methods described in the foregoing method embodiments, such as sending a download request message.
  • The receiving unit 2104 is also configured to implement other receiving functions that the terminal device can implement in the methods described in the foregoing method embodiments, such as receiving indication messages, which are not described here one by one.
  • apparatus may further include other modules or units configured to implement the functions of the terminal device in the methods described in the foregoing embodiments, which are not shown here one by one.
  • the embodiment of the present application further provides a modeling device, which can be applied to a server, and used to realize the function of the server in the 3D modeling method described in the foregoing embodiments.
  • the functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware.
  • Hardware or software includes one or more modules or units corresponding to the functions described above.
  • FIG. 23 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
  • the apparatus may include: a receiving unit 2301 , a processing unit 2302 and a sending unit 2303 .
  • the receiving unit 2301, the processing unit 2302, and the sending unit 2303 may be configured to cooperate to realize the server function in the modeling method described in the foregoing method embodiments.
  • the receiving unit 2301 may be configured to receive multiple frames of images corresponding to the target object sent from the terminal device and the association relationship among the multiple frames of images.
  • the processing unit 2302 may be configured to generate a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
  • the sending unit 2303 may be configured to send the 3D model corresponding to the target object to the terminal device.
  • the receiving unit 2301, the processing unit 2302, and the sending unit 2303 may be configured to implement all functions that can be implemented by the server in the modeling method described in the foregoing method embodiments, which will not be repeated here.
  • the division of units (or called modules) in the above device is only a division of logical functions, and may be fully or partially integrated into a physical entity or physically separated during actual implementation.
  • the units in the device can all be implemented in the form of software called by the processing element; they can also be implemented in the form of hardware; some units can also be implemented in the form of software called by the processing element, and some units can be implemented in the form of hardware.
  • each unit can be a separate processing element, or it can be integrated in a certain chip of the device. In addition, it can also be stored in the memory in the form of a program, which is called and executed by a certain processing element of the device. Function. In addition, all or part of these units can be integrated together, or implemented independently.
  • the processing element described here may also be referred to as a processor, and may be an integrated circuit with a signal processing capability. In the process of implementation, each step of the above method or each unit above may be implemented by an integrated logic circuit of hardware in the processor element or implemented in the form of software called by the processing element.
  • the units in the above device may be one or more integrated circuits configured to implement the above method, for example: one or more application specific integrated circuits (ASIC), or, one or more A digital signal processor (DSP), or, one or more field programmable gate arrays (FPGA), or a combination of at least two of these integrated circuit forms.
  • the processing element can be a general-purpose processor, such as a central processing unit (central processing unit, CPU) or other processors that can call programs.
  • these units can be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • the units of the above apparatus for implementing each corresponding step in the above method may be implemented in the form of a processing element scheduler.
  • the apparatus may include a processing element and a storage element, and the processing element invokes a program stored in the storage element to execute the methods described in the above method embodiments.
  • the storage element may be a storage element on the same chip as the processing element, that is, an on-chip storage element.
  • the program for executing the above method may be stored in a storage element on a different chip from the processing element, that is, an off-chip storage element.
  • the processing element invokes or loads a program from the off-chip storage element on the on-chip storage element, so as to invoke and execute the steps performed by the terminal device or the server in the methods described in the above method embodiments.
  • the embodiment of the present application may also provide an apparatus, such as an electronic device.
  • the electronic device may include: a processor; a memory; and a computer program; wherein the computer program is stored on the memory, and when the computer program is executed by the processor, the electronic device realizes the aforementioned implementation The steps performed by the terminal device or the server in the 3D modeling method described in the example.
  • the memory can be located inside the electronic device or outside the electronic device.
  • the processor includes one or more.
  • the electronic device may be a mobile phone, a large screen (such as a smart screen), a tablet computer, a wearable device (such as a smart watch, a smart bracelet, etc.), a TV, a car device, an augmented reality (augmented reality, AR) /Virtual reality (virtual reality, VR) equipment, notebook computer, ultra-mobile personal computer (ultra-mobile personal computer, UMPC), netbook, personal digital assistant (personal digital assistant, PDA) and other terminal equipment.
  • the unit of the device implementing each step in the above method may be configured as one or more processing elements, where the processing elements may be integrated circuits, for example: one or more ASICs, or, one or more Multiple DSPs, or, one or more FPGAs, or a combination of these types of integrated circuits. These integrated circuits can be integrated together to form a chip.
  • an embodiment of the present application further provides a chip, and the chip can be applied to the above-mentioned electronic device.
  • the chip includes one or more interface circuits and one or more processors; the interface circuits and processors are interconnected through lines; the processor receives and executes computer instructions from the memory of the electronic device through the interface circuits, so as to realize the Steps performed by a terminal device or a server in a 3D modeling method.
  • the embodiment of the present application also provides a computer program product, including computer readable code, when the computer readable code runs in the electronic device, the electronic device implements the terminal device or server in the 3D modeling method as described in the foregoing embodiments steps performed.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • the integrated unit is realized in the form of a software function unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • The software product is stored in a program product, such as a computer-readable storage medium, and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
  • An embodiment of the present application may also provide a computer-readable storage medium, the computer-readable storage medium including a computer program; when the computer program runs on an electronic device, the electronic device implements the methods described in the foregoing embodiments.
  • the embodiment of the present application also provides a device-cloud collaboration system.
  • the composition of the device-cloud collaboration system can refer to the above-mentioned FIG. 1 or FIG. 20, including a terminal device and a server, and the terminal device is connected to the server
  • the terminal device displays a first interface, and the first interface includes a shooting picture of the terminal device; the terminal device collects multiple frames of images corresponding to the target object to be modeled in response to the collection operation, and obtains the An association relationship between multiple frames of images; wherein, during the process of acquiring multiple frames of images corresponding to the target object, the terminal device displays a first virtual enclosure; the first virtual enclosure includes a plurality of patches;
  • The terminal device collecting multiple frames of images corresponding to the target object to be modeled in response to the collection operation and acquiring the correlation between the multiple frames of images includes: when the terminal device is in the first pose, the terminal device collects the first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, the terminal device collects the second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual enclosure are changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches. The terminal device sends the multi-frame images and their association relationship to the server; the server acquires the 3D model of the target object accordingly and sends it to the terminal device; and the terminal device displays the 3D model of the target object.
  • The terminal device can realize all the functions that the terminal device can realize in the 3D modeling method described in the foregoing method embodiments, and the server can realize all the functions that the server can realize in the 3D modeling method described in the foregoing method embodiments, which are not repeated here.


Abstract

The present application provides a modeling method, a related electronic device, and a storage medium, relating to the field of three-dimensional reconstruction. In the method, a terminal device can display a first interface including a shooting picture; in response to a collection operation, collect multiple frames of images corresponding to a target object to be modeled and acquire the association relationship between the multiple frames of images; and acquire and display a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between them. While collecting the multiple frames of images, the terminal device displays a first virtual bounding volume including a plurality of patches: when the terminal device is in a first pose, it collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, it collects a second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches have been changed, the association relationship between the multiple frames of images is acquired according to the plurality of patches. The method simplifies the 3D modeling process and places low requirements on device hardware.

Description

Modeling method, related electronic device, and storage medium
This application claims priority to Chinese Patent Application No. 202110715044.2, entitled "Modeling method, related electronic device, and storage medium", filed with the China National Intellectual Property Administration on June 26, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of three-dimensional reconstruction, and in particular to a modeling method, a related electronic device, and a storage medium.
Background
3D reconstruction applications/software can be used to build 3D models of objects. At present, to perform 3D modeling, a user first needs to use a mobile tool (such as a mobile phone or camera) to collect the data required for 3D modeling (such as pictures and depth information); a 3D reconstruction application can then reconstruct the object in 3D from the collected data to obtain the corresponding 3D model of the object.
In current 3D modeling approaches, however, both the process of collecting the data required for 3D modeling and the process of reconstructing the object in 3D from that data are rather complicated and place high requirements on device hardware. For example, collecting the data may require the collection device (such as the above mobile tool) to be equipped with special hardware such as a light detection and ranging (LIDAR) sensor or an RGB depth (RGB-D) camera, and reconstructing the object from the collected data may require the processing device running the 3D reconstruction application to be equipped with a high-performance discrete graphics card.
Summary
Embodiments of the present application provide a modeling method, a related electronic device, and a storage medium, which simplify the process of collecting the data required for 3D modeling and the process of reconstructing an object in 3D from the collected data, and place low requirements on device hardware.
In a first aspect, an embodiment of the present application provides a modeling method applied to a terminal device, the method including:
The terminal device displays a first interface, the first interface including a shooting picture of the terminal device. In response to a collection operation, the terminal device collects multiple frames of images corresponding to a target object to be modeled and acquires the association relationship between the multiple frames of images. The terminal device acquires a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between them, and displays the three-dimensional model corresponding to the target object.
During the collection of the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume, which includes a plurality of patches. Collecting the multiple frames of images in response to the collection operation and acquiring the association relationship between them includes:
When the terminal device is in a first pose, it collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, it collects a second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual bounding volume have been changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches.
Exemplarily, for each key frame, the terminal device can determine the matching information of that key frame from the association relationship between the patch corresponding to that key frame and the other patches, i.e., which patches lie in the four orientations (up, down, left, and right) of that patch in the patch model.
For example, taking a patch model with two layers of 20 patches each, and assuming a certain key frame corresponds to patch 1 in the first layer, the association relationship between patch 1 and the other patches may include: the patch below patch 1 is patch 21, the patch to its left is patch 20, and the patch to its right is patch 2. From these relationships, the mobile phone can determine that the other key frames associated with this key frame are the key frames corresponding to patches 21, 20, and 2, so the matching information of this key frame includes the identification information of those key frames.
Understandably, for the mobile phone, the association relationships between the different patches in the patch model are known quantities.
In this modeling method (or 3D modeling method), the terminal device can achieve 3D modeling relying only on an ordinary RGB camera to collect the data required; the collection process does not depend on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera. Acquiring the 3D model of the target object from the multiple frames of images together with their association relationship can effectively reduce the computing load of the 3D modeling process and improve its efficiency.
In addition, in this method the user only needs to perform the data-collection-related operations on the terminal device side and can then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, so operation is simpler and the user experience can be better.
In a possible design, the terminal device includes a first application, and before the terminal device displays the first interface, the method further includes: in response to an operation of opening the first application, the terminal device displays a second interface.
Displaying the first interface includes: in response to an operation of starting the three-dimensional modeling function of the first application on the second interface, the terminal device displays the first interface.
For example, the second interface may include a functional control for starting the 3D modeling function; the user can click or touch this control on the second interface, and in response the mobile phone starts the 3D modeling function of the first application. That is, the user's operation of clicking or touching this control on the second interface is the operation of starting the three-dimensional modeling function of the first application on the second interface.
In some embodiments, the first virtual bounding volume includes one or more layers, and the plurality of patches are distributed over the one or more layers.
For example, in one implementation the patch model may have an upper and a lower layer, each including multiple patches. In another implementation, it may have upper, middle, and lower layers, each including multiple patches. In yet another implementation, it may be a single layer composed of multiple patches. This is not limited here.
Optionally, the method further includes: the terminal device displays first prompt information, which is used to remind the user to place the target object at the central position of the shooting picture.
For example, the first prompt information may be "Please place the target object in the center of the screen".
Optionally, the method further includes: the terminal device displays second prompt information, which is used to remind the user to adjust one or more of the shooting environment of the target object, the way the target object is shot, and the screen proportion of the target object.
For example, the second prompt information may be "Place the object still on a solid-color surface with soft lighting, shoot around the object in a full circle, and keep the object as large and complete on the screen as possible".
In this embodiment, after the user adjusts the shooting environment of the target object, the object's screen proportion, and so on as prompted by the second prompt information, the subsequent data collection can be faster and the quality of the collected data can be better.
Optionally, before the terminal device acquires the 3D model corresponding to the target object according to the multiple frames of images and their association relationship, the method further includes: the terminal device detects an operation of generating the three-dimensional model, and in response displays third prompt information, which is used to indicate that the target object is being modeled.
For example, the third prompt information may be "Modeling".
Optionally, after the terminal device acquires the 3D model, the method further includes: the terminal device displays fourth prompt information, which is used to indicate that the modeling of the target object has been completed.
For example, the fourth prompt information may be "Modeling completed".
Optionally, displaying the 3D model further includes: in response to an operation of changing the display angle of the 3D model, the terminal device changes the display angle; the operation includes dragging the 3D model to rotate clockwise or counterclockwise along a first direction.
The first direction may be any direction, such as horizontal or vertical. By changing the display angle in response to the user's operation, the 3D model can be presented to the user at different angles.
Optionally, displaying the 3D model further includes: in response to an operation of changing the display size of the 3D model, the terminal device changes the display size; the operation includes zooming in or zooming out on the 3D model.
For example, the zoom-out operation may be the user sliding two fingers inward (toward each other) on the 3D model preview interface, and the zoom-in operation may be the user sliding two fingers outward (away from each other). The 3D model preview interface is the interface on which the terminal device displays the 3D model of the target object.
In some other implementations, the zoom-in or zoom-out operation may also be a double-click or long-press operation, or the 3D model preview interface may include a functional control for zooming in or out; this is not limited here.
In some embodiments, the association relationship between the multiple frames of images includes matching information of each frame of image; the matching information of each frame includes identification information of the other images associated with that image; and the matching information of each frame is obtained from the association relationship between that image and its corresponding patch, together with the association relationships between the plurality of patches.
For example, for the key frame with picture number 18, the identification information of the other key frames associated with it is the picture numbers of those key frames, such as 26, 45, 59, 78, 89, 100, 449, and so on.
Optionally, collecting the multiple frames of images in response to the collection operation and acquiring their association relationship further includes: the terminal device determines the target object from the shooting picture; when the terminal device collects the multiple frames of images, the target object is located at the central position of the shooting picture.
Optionally, collecting the multiple frames of images corresponding to the target object to be modeled includes: while shooting the target object, the terminal device performs blur detection on every captured frame and keeps the images whose sharpness is greater than a first threshold as the images corresponding to the target object.
For each shooting position (one shooting position may correspond to one patch), the terminal device obtains some key frame pictures of good quality by performing blur detection on the pictures captured at that position; the key frame pictures are the images corresponding to the target object, and each patch may correspond to one or more key frame pictures.
Optionally, displaying the 3D model of the target object includes: in response to an operation of previewing the 3D model, the terminal device displays the 3D model of the target object.
For example, the terminal device may display a view button; the user's operation of clicking this button is the operation of previewing the 3D model, and the terminal device displays the 3D model in response.
Optionally, the 3D model of the target object includes a basic 3D model of the target object and the texture of the target object's surface.
The texture of the target object's surface may be a texture map. The 3D model of the target object can be generated from the basic 3D model and the surface texture: mapping the texture onto the surface of the basic 3D model in a specific way restores the object's surface more realistically, making the target object look more lifelike.
In a possible design, the terminal device is connected to a server, and acquiring the 3D model corresponding to the target object includes: the terminal device sends the multiple frames of images and their association relationship to the server, and the terminal device receives the 3D model of the target object from the server.
In this design, generating the 3D model from the multiple frames of images and their association relationship can be completed on the server side; that is, the computing resources of the server are used for 3D modeling. This design suits scenarios where the terminal device's computing power is weak and improves the universality of the 3D modeling method.
Optionally, the method further includes: the terminal device sends the server the camera intrinsics, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to the multiple frames of images.
Optionally, the method further includes: the terminal device receives an indication message from the server, which indicates to the terminal device that the server has completed the modeling of the target object.
For example, after receiving the indication message, the terminal device may display the above fourth prompt information.
Optionally, before receiving the 3D model from the server, the method further includes: the terminal device sends a download request message to the server, requesting to download the 3D model of the target object.
After receiving the download request message, the server can send the 3D model of the target object to the terminal device.
Optionally, in some embodiments, while collecting the data required for 3D modeling the terminal device may also display the scanning progress on the first interface. For example, the first interface may include a scan button, and the progress may be displayed through a ring-shaped black fill effect inside the scan button.
Understandably, the way the phone displays the scanning progress on the first interface may differ depending on the UI presentation of the scan button; this is not limited here.
In some other embodiments, the terminal device need not display the scanning progress; the user can learn the progress from which patches of the first virtual bounding volume have been lit up.
In a second aspect, an embodiment of the present application provides a modeling apparatus, which can be applied to a terminal device to implement the modeling method of the first aspect. The functions of the apparatus can be realized by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules or units corresponding to the above functions. For example, the apparatus may include a display unit and a processing unit, which cooperate to implement the modeling method of the first aspect.
For example, the display unit is configured to display the first interface, which includes the shooting picture of the terminal device.
The processing unit is configured to, in response to the collection operation, collect the multiple frames of images corresponding to the target object to be modeled, acquire the association relationship between them, and acquire the 3D model of the target object according to the multiple frames of images and their association relationship.
The display unit is further configured to display the 3D model corresponding to the target object.
During the collection of the multiple frames of images, the display unit is further configured to display the first virtual bounding volume, which includes a plurality of patches. The processing unit is specifically configured to: collect the first image and change the display effect of its corresponding patch when the terminal device is in the first pose; collect the second image and change the display effect of its corresponding patch when the terminal device is in the second pose; and, after the display effects of the plurality of patches of the first virtual bounding volume have been changed, acquire the association relationship between the multiple frames of images according to the plurality of patches.
Optionally, the display unit and the processing unit are also used to implement the other display and processing functions of the method of the first aspect, which are not repeated here one by one.
Optionally, for the implementation of the first aspect in which the terminal device sends the multiple frames of images and their association relationship to the server and the server generates the 3D model of the target object accordingly, the modeling apparatus may further include a sending unit and a receiving unit: the sending unit is configured to send the multiple frames of images and their association relationship to the server, and the receiving unit is configured to receive the 3D model of the target object from the server.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a computer program, where the computer program is stored on the memory; when the computer program is executed by the processor, the electronic device implements the method of the first aspect or any possible implementation thereof.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium including a computer program; when the computer program runs on an electronic device, the electronic device implements the method of the first aspect or any possible implementation thereof.
In a fifth aspect, an embodiment of the present application further provides a computer program product including computer-readable code; when the code runs in an electronic device, the electronic device implements the method of the first aspect or any possible implementation thereof.
For the beneficial effects of the second to fifth aspects, refer to the first aspect; they are not repeated here.
In a sixth aspect, an embodiment of the present application further provides a modeling method applied to a server connected to a terminal device. The method includes: the server receives, from the terminal device, multiple frames of images corresponding to a target object and the association relationship between them; the server generates a 3D model of the target object according to the multiple frames of images and their association relationship; and the server sends the 3D model of the target object to the terminal device.
This method uses the server's computing resources for 3D modeling; generating the 3D model from the multiple frames of images and their association relationship can be completed on the server side. By combining the association relationship between the multiple frames of images during 3D modeling, the server can effectively reduce its computing load and improve modeling efficiency.
For example, during 3D modeling the server only needs to perform feature detection and matching between each frame of image and the other images associated with it, rather than between that image and all other images. In this way neighboring frames can be compared quickly, which effectively reduces the server's computing load and improves 3D modeling efficiency.
As another example, after determining the mapping between the texture of the first frame of image and the surface of the target object's basic 3D model, the server can combine the matching information of the first frame to quickly and accurately determine the mapping between the textures of the other images associated with the first frame and the surface of the basic 3D model. Similarly, for each subsequent frame of image, the server can combine that image's matching information to quickly and accurately determine the corresponding texture-to-surface mappings.
This method suits scenarios where the terminal device's computing power is weak and improves the universality of the 3D modeling method.
In addition, this method also has the other beneficial effects described in the first aspect: for example, collecting the data required for 3D modeling does not depend on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera, and the server acquiring the 3D model from the multiple frames of images together with their association relationship effectively reduces the computing load of the 3D modeling process and improves its efficiency; these are not repeated here one by one.
Optionally, the method further includes: the server receives, from the terminal device, the camera intrinsics, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to the multiple frames of images.
Generating the 3D model of the target object then includes: the server generates the 3D model according to the multiple frames of images, their association relationship, and the camera intrinsics, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to them.
Optionally, the association relationship between the multiple frames of images includes matching information of each frame of image; the matching information of each frame includes identification information of the other images associated with that image; and it is obtained from the association relationship between each image and its corresponding patch, together with the association relationships between the plurality of patches.
In a seventh aspect, an embodiment of the present application provides a modeling apparatus, which can be applied to a server to implement the modeling method of the sixth aspect. The functions of the apparatus can be realized by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules or units corresponding to the above functions. For example, the apparatus may include a receiving unit, a processing unit, and a sending unit, which cooperate to implement the modeling method of the sixth aspect.
For example, the receiving unit may be configured to receive, from the terminal device, the multiple frames of images corresponding to the target object and the association relationship between them. The processing unit may be configured to generate the 3D model of the target object according to the multiple frames of images and their association relationship. The sending unit may be configured to send the 3D model of the target object to the terminal device.
Optionally, the receiving unit, processing unit, and sending unit may be used to implement all functions that the server can implement in the method of the sixth aspect, which are not repeated here one by one.
In an eighth aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a computer program stored on the memory; when the computer program is executed by the processor, the electronic device implements the method of the sixth aspect or any possible implementation thereof.
In a ninth aspect, an embodiment of the present application provides a computer-readable storage medium including a computer program; when the computer program runs on an electronic device, the electronic device implements the method of the sixth aspect or any possible implementation thereof.
In a tenth aspect, an embodiment of the present application further provides a computer program product including computer-readable code; when the code runs in an electronic device, the electronic device implements the method of the sixth aspect or any possible implementation thereof.
For the beneficial effects of the seventh to tenth aspects, refer to the sixth aspect; they are not repeated here.
In an eleventh aspect, an embodiment of the present application further provides a device-cloud collaboration system, including a terminal device and a server connected to each other. The terminal device displays a first interface including its shooting picture; in response to a collection operation, the terminal device collects multiple frames of images corresponding to the target object to be modeled and acquires the association relationship between them, displaying during the collection a first virtual bounding volume that includes a plurality of patches. Collecting the multiple frames of images and acquiring their association relationship includes: when the terminal device is in a first pose, it collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, it collects a second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual bounding volume have been changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches. The terminal device sends the multiple frames of images and their association relationship to the server; the server acquires the 3D model of the target object accordingly and sends it to the terminal device; and the terminal device displays the 3D model of the target object.
For the beneficial effects of the eleventh aspect, refer to the first and sixth aspects; they are not repeated here.
It should be understood that descriptions of technical features, technical solutions, beneficial effects, or similar language in this application do not imply that all of the features and advantages can be achieved in any single embodiment. Rather, a description of a feature or beneficial effect means that at least one embodiment includes that particular technical feature, solution, or beneficial effect; such descriptions in this specification therefore do not necessarily refer to the same embodiment. Furthermore, the technical features, solutions, and beneficial effects described in the embodiments may be combined in any suitable manner. A person skilled in the art will understand that an embodiment may be implemented without one or more of the particular technical features, solutions, or beneficial effects of a specific embodiment; in other embodiments, additional features and beneficial effects may be identified in specific embodiments that do not embody all embodiments.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the composition of the device-cloud collaboration system provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the terminal device provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the main interface of the mobile phone provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the main interface of the first application provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the 3D modeling data collection interface provided by an embodiment of the present application;
FIG. 6 is another schematic diagram of the 3D modeling data collection interface provided by an embodiment of the present application;
FIGS. 7A to 7F are further schematic diagrams of the 3D modeling data collection interface provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of the patch model provided by an embodiment of the present application;
FIGS. 9 to 11 are further schematic diagrams of the 3D modeling data collection interface provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of the 3D model preview interface provided by an embodiment of the present application;
FIG. 13 is another schematic diagram of the 3D model preview interface provided by an embodiment of the present application;
FIG. 14 is a further schematic diagram of the 3D model preview interface provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of the user rotating the toy car's 3D model counterclockwise along the horizontal direction, provided by an embodiment of the present application;
FIG. 16 is a further schematic diagram of the 3D model preview interface provided by an embodiment of the present application;
FIG. 17 is a schematic diagram of the user performing a zoom-out operation on the toy car's 3D model, provided by an embodiment of the present application;
FIG. 18 is a further schematic diagram of the 3D model preview interface provided by an embodiment of the present application;
FIG. 19 is a schematic flowchart of the 3D modeling method provided by an embodiment of the present application;
FIG. 20 is a logical schematic diagram of the device-cloud collaboration system implementing the 3D modeling method, provided by an embodiment of the present application;
FIG. 21 is a schematic structural diagram of the modeling apparatus provided by an embodiment of the present application;
FIG. 22 is another schematic structural diagram of the modeling apparatus provided by an embodiment of the present application;
FIG. 23 is a further schematic structural diagram of the modeling apparatus provided by an embodiment of the present application.
Detailed Description
The terms used in the following embodiments are only for describing particular embodiments and are not intended to limit the present application. As used in the specification and the appended claims, the singular forms "a", "an", "the", "the above", "this", and "this one" are intended to also include forms such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments, "at least one" and "one or more" mean one or more than two (including two). The character "/" generally indicates an "or" relationship between the associated objects.
Reference to "one embodiment" or "some embodiments" in this specification means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the application. Thus, phrases such as "in one embodiment", "in some embodiments", "in some other embodiments", and "in still other embodiments" appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless specifically emphasized otherwise. The term "connected" includes both direct and indirect connections, unless otherwise stated.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features.
In the embodiments of the present application, words such as "exemplarily" or "for example" are used to mean examples, illustrations, or explanations. Any embodiment or design described as "exemplarily" or "for example" should not be construed as preferable or advantageous over other embodiments or designs; rather, these words are intended to present the relevant concepts in a concrete manner.
Three-dimensional (3D) reconstruction technology is widely used in fields such as virtual reality, augmented reality, extended reality (XR), mixed reality (MR), games, film and television, education, and medicine. For example, 3D reconstruction can be used to model characters, props, and vegetation in games, to model figures in film and television, to model chemical structures in education, or to model human anatomy in medicine.
At present, most 3D reconstruction applications/software capable of 3D modeling need to run on a personal computer (PC), and only a few can perform 3D modeling on a mobile device (such as a mobile phone). With a PC-side 3D reconstruction application, the user first uses a mobile tool (such as a phone or camera) to collect the data required for 3D modeling (such as pictures and depth information) and uploads the collected data to the PC, where the 3D reconstruction application performs the modeling processing. With a mobile-side application, the user can collect the data directly on the mobile device, and the mobile application performs the modeling processing directly on the collected data.
In both approaches, however, collecting the data required for 3D modeling on the mobile device depends on special hardware such as a light detection and ranging (LIDAR) sensor or an RGB depth (RGB-D) camera, so the data collection process has high hardware requirements. Performing the 3D modeling on the PC/mobile side also places high demands on the PC/mobile hardware; for example, a high-performance discrete graphics card may be required.
In addition, the PC-side approach is rather cumbersome: after performing the data collection on the mobile device, the user not only has to copy the collected data or transfer it over a network to the PC, but also has to perform the relevant modeling operations in the 3D reconstruction application on the PC.
Against this background, an embodiment of the present application provides a 3D modeling method that can be applied to a device-cloud collaboration system composed of a terminal device and a cloud. Here the "device" refers to the terminal device and the "cloud" refers to the cloud side, which may also be called a cloud server or cloud platform. In this method, the terminal device collects the data required for 3D modeling, preprocesses it, and uploads the preprocessed data to the cloud; the cloud performs 3D modeling on the received preprocessed data; and the terminal device can download the resulting 3D model from the cloud and provide a preview function for it.
In this 3D modeling method, the terminal device can achieve 3D modeling relying only on an ordinary RGB camera to collect the required data; the collection process does not depend on special hardware such as a LIDAR sensor or an RGB-D camera. The 3D modeling itself is completed in the cloud and does not depend on the terminal device having a high-performance discrete graphics card. That is, the method places low hardware requirements on the terminal device.
Moreover, compared with the PC-side approach above, in this method the user only needs to perform the data-collection-related operations on the terminal device side and can then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, so operation is simpler and the user experience can be better.
Exemplarily, FIG. 1 is a schematic diagram of the composition of the device-cloud collaboration system provided by an embodiment of the present application. As shown in FIG. 1, the system may include a cloud 100 and a terminal device 200, where the terminal device 200 may be connected to the cloud 100 through a wireless network.
The cloud 100 is a server. For example, in some embodiments the cloud 100 may be a single server or a server cluster composed of multiple servers; the present application does not limit the implementation architecture of the cloud 100.
Optionally, in the embodiments of the present application, the terminal device 200 may be an interactive electronic whiteboard with a shooting function, a mobile phone, a wearable device (such as a smart watch or smart band), a tablet computer, a notebook computer, a desktop computer, a portable electronic device (such as a laptop), an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart TV (such as a smart screen), a vehicle-mounted computer, a smart speaker, an augmented reality (AR) device, a virtual reality (VR) device, or another smart device with a display; or it may be a professional shooting device such as a digital camera, a single-lens reflex/mirrorless camera, an action camera, a gimbal camera, or a drone. The embodiments of the present application do not limit the specific type of the terminal device.
It should be understood that when the terminal device is a shooting device such as a gimbal camera or a drone, a display device providing a shooting interface is also involved, used to display the data-collection interface for 3D modeling, the 3D model preview interface, and so on. For example, the display device of a gimbal camera may be a mobile phone, and the display device of an aerial drone may be a remote control device.
It should be noted that FIG. 1 exemplarily shows one terminal device 200. However, the device-cloud collaboration system may include one or more terminal devices 200, which may be identical, different, or partially identical; this is not limited here. The 3D modeling method provided by the embodiments of the present application concerns the interaction between each terminal device 200 and the cloud 100.
Exemplarily, taking the terminal device 200 being a mobile phone as an example, FIG. 2 is a schematic structural diagram of the terminal device. As shown in FIG. 2, the phone may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, antenna 1, antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, a headphone jack 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display 294, a subscriber identification module (SIM) card interface 295, and so on.
The processor 210 may include one or more processing units; for example, it may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). The different processing units may be independent devices or may be integrated in one or more processors.
The controller may be the nerve center and command center of the phone. It can generate operation control signals according to instruction operation codes and timing signals, completing the control of instruction fetching and execution.
A memory may also be provided in the processor 210 for storing instructions and data. In some embodiments this memory is a cache, which can hold instructions or data that the processor 210 has just used or uses cyclically. If the processor 210 needs the instructions or data again, it can call them directly from this memory, avoiding repeated accesses, reducing the waiting time of the processor 210, and thus improving system efficiency.
In some embodiments, the processor 210 may include one or more interfaces, such as an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, and/or a USB interface.
The external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the phone's storage capacity. The external memory card communicates with the processor 210 through the external memory interface 220 to implement data storage, for example saving music and video files on the external memory card.
The internal memory 221 may be used to store computer-executable program code, which includes instructions. By running the instructions stored in the internal memory 221, the processor 210 executes the phone's various functional applications and data processing.
The internal memory 221 may also include a program storage area and a data storage area. The program storage area can store the operating system and the applications required by at least one function (such as the first application described in the embodiments of the present application); the data storage area can store data created during use of the phone (such as image data and the phone book). In addition, the internal memory 221 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, a flash memory device, or universal flash storage (UFS).
The charging management module 240 is used to receive charging input from a charger. While charging the battery 242, it can also supply power to the phone through the power management module 241. The power management module 241 connects the battery 242, the charging management module 240, and the processor 210, and can also receive input from the battery 242 to supply power to the phone.
The phone's wireless communication functions can be implemented through antenna 1, antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor, and the baseband processor. Antennas 1 and 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the phone may cover one or more communication frequency bands, and different antennas can be multiplexed to improve antenna utilization; for example, antenna 1 may be multiplexed as a diversity antenna for the wireless LAN. In some other embodiments, the antennas may be used in combination with a tuning switch.
The phone can implement audio functions, such as music playback and recording, through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the headphone jack 270D, and the application processor.
The sensor module 280 may include a pressure sensor 280A, a gyroscope sensor 280B, a barometric pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, a bone conduction sensor 280M, and so on.
The camera 293 may include multiple types, for example telephoto, wide-angle, or ultra-wide-angle cameras with different focal lengths. A telephoto camera has a small field of view and is suitable for shooting distant scenery within a small range; a wide-angle camera has a larger field of view; and an ultra-wide-angle camera, whose field of view is larger than the wide-angle camera's, can be used for shooting large scenes such as panoramas. In some embodiments, the telephoto camera with the smaller field of view can be rotated so as to shoot scenery in different ranges.
The phone can capture raw images (also called RAW images or digital negatives) through the camera 293. For example, the camera 293 includes at least a lens and a sensor. When taking photos or videos, the shutter opens and light passes through the lens to the sensor, which converts the optical signal into an electrical signal, performs analogue-to-digital (A/D) conversion, and outputs the corresponding digital signal, i.e., the RAW image. The phone can then use a processor (such as the ISP or DSP) to perform subsequent ISP processing and YUV-domain processing on the RAW image, converting it into a displayable image such as a JPEG image or a high efficiency image file format (HEIF) image. The JPEG or HEIF image can be transmitted to the phone's display for showing and/or to the phone's memory for storage. The phone thereby implements the shooting function.
In one possible design, the photosensitive element of the sensor may be a charge coupled device (CCD), and the sensor also includes an A/D converter. In another possible design, the photosensitive element of the sensor may be a complementary metal-oxide-semiconductor (CMOS).
Exemplarily, the ISP processing may include: bad pixel correction (DPC), RAW-domain noise reduction, black level correction (BLC), lens shading correction (LSC), auto white balance (AWB), demosaic color interpolation, color correction matrix (CCM), dynamic range compression (DRC), gamma, 3D look-up table (LUT), YUV-domain noise reduction, sharpening, detail enhancement, and so on. The YUV-domain processing may include: multi-frame registration, fusion, and noise reduction for high-dynamic-range (HDR) images, as well as super resolution (SR) algorithms for improving definition, skin beautification algorithms, distortion correction algorithms, bokeh algorithms, and so on.
The display 294 is used to display images, videos, and the like, and includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light emitting diodes (QLED), and so on. In some embodiments the phone may include 1 or N displays 294, where N is a positive integer greater than 1. For example, the display 294 may be used to display the application interface.
The phone implements the display function through the GPU, the display 294, and the application processor. The GPU is a microprocessor for image processing, connecting the display 294 and the application processor; it performs mathematical and geometric calculations for graphics rendering. The processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
It can be understood that the structure shown in FIG. 2 does not constitute a specific limitation on the phone. In some embodiments the phone may include more or fewer components than shown in FIG. 2, combine or split certain components, or arrange the components differently. Alternatively, some components shown in FIG. 2 may be implemented in hardware, software, or a combination of software and hardware.
In addition, when the terminal device 200 is an interactive electronic whiteboard, a wearable device, a tablet, a notebook, a desktop computer, a portable electronic device, a UMPC, a netbook, a PDA, a smart TV, a vehicle-mounted computer, a smart speaker, an AR device, a VR device, another smart device with a display, or a terminal device of another form such as a digital camera, single-lens reflex/mirrorless camera, action camera, gimbal camera, or drone, the specific structure of these other forms of terminal device can also refer to FIG. 2. Exemplarily, terminal devices of other forms may add or remove components on the basis of the structure given in FIG. 2, which is not repeated here one by one.
It should also be understood that the terminal device 200 (such as a mobile phone) may run one or more applications that can collect the data required for 3D modeling, preprocess that data, and support previewing the 3D model; such an application may be called a 3D modeling application or 3D reconstruction application. When the terminal device 200 runs such an application, the application can, according to the user's operations, call the camera of the terminal device 200 to shoot, collect the data required for 3D modeling, and preprocess it. In addition, the application can display a 3D model preview interface on the display of the terminal device 200 for the user to view and preview the 3D model.
Taking the terminal device 200 in the device-cloud collaboration system of FIG. 1 being a mobile phone as an example, and combining the scenario of a user performing 3D modeling with the phone, the 3D modeling method provided by the embodiments of the present application is exemplarily described below.
It should be noted that although the embodiments are described with the terminal device 200 being a mobile phone, the 3D modeling method provided by the embodiments of the present application is equally applicable to the other terminal devices with a shooting function mentioned above; the present application does not limit the specific type of the terminal device.
Taking 3D modeling of a certain target object as an example, the 3D modeling method provided by the embodiments of the present application may include the following three parts:
Part 1: the user uses the phone to collect the data required for 3D modeling of the target object.
Part 2: the phone preprocesses the collected data required for 3D modeling of the target object and uploads the preprocessed data to the cloud.
Part 3: the cloud performs 3D modeling from the data uploaded by the phone to obtain the 3D model of the target object.
Through Parts 1 to 3 above, 3D modeling of the target object is achieved. After the 3D model of the target object is obtained, the phone can download it from the cloud for the user to preview.
Parts 1 to 3 are described in detail below.
Part 1:
In the embodiments of the present application, a first application may be installed in the phone, this first application being the 3D modeling or 3D reconstruction application described in the foregoing embodiments. For example, the first application may be named "3D Cube"; the name of the first application is not limited here. When the user wants to model a target object in 3D, the first application can be launched on the phone. After launch, the main interface of the first application may include a functional control for starting the 3D modeling function, which the user can click or touch on the main interface. In response, the phone starts the 3D modeling function of the first application, switches the display from the first application's main interface to the 3D modeling data collection interface, starts the camera's shooting function, and displays the picture captured by the camera on the data collection interface. While the phone displays the 3D modeling data collection interface, the user can hold the phone and collect the data required for 3D modeling of the target object.
In some embodiments, the phone may display a functional control for launching the first application on the main interface (or desktop), such as the application icon (or button) of the first application. When the user wants to use the first application to model a target object in 3D, they can click or touch the application icon; after receiving this operation, the phone launches and runs the first application in response and displays its main interface.
For example, FIG. 3 is a schematic diagram of the main interface of the phone. As shown in FIG. 3, the phone's main interface 301 may include the application icon 302 of the first application, and may also include icons of other applications such as application A, application B, and application C. The user can click or touch the application icon 302 on the main interface 301 to trigger the phone to launch the first application and display its main interface.
In some other embodiments, the phone may also display a control for launching the first application on another display interface such as the pull-down interface or the minus-one screen. On the pull-down interface or the minus-one screen, the control of the first application may be presented as an application icon or as another functional button, which is not limited here.
The pull-down interface is the display interface that appears after sliding down from the top of the phone's main interface; it can display buttons for the user's frequently used functions, such as WLAN and Bluetooth, for quick access. For example, when the phone currently displays the desktop, the user can slide down from the top of the screen to trigger the phone to switch the display from the desktop to the pull-down interface (or to overlay the pull-down interface on the desktop). The minus-one screen is the display interface that appears after sliding the main interface (or desktop) to the right; it can display the user's frequently used applications and functions, subscribed services and news, and so on, for quick browsing and use. For example, when the desktop is displayed, the user can slide right on the screen to trigger the phone to switch the display from the desktop to the minus-one screen.
It can be understood that "minus-one screen" is merely a term used in the embodiments of the present application, whose meaning has been described herein; the name itself does not constitute any limitation on the embodiments. In some other embodiments, the "minus-one screen" may also be called, for example, "desktop assistant", "shortcut menu", or "widget collection interface", which is not limited here.
In still other embodiments, when the user wants to use the first application to model a target object in 3D, the phone can also be controlled through the voice assistant to launch and run the first application. The present application does not limit how the first application is launched.
Exemplarily, FIG. 4 is a schematic diagram of the main interface of the first application. As shown in FIG. 4, the main interface 401 of the first application may include the functional control "Start modeling" 402, which is the above-mentioned control for starting the 3D modeling function. The user can click or touch "Start modeling" 402 on the main interface 401; in response, the phone starts the 3D modeling function of the first application, switches the display from the main interface 401 to the 3D modeling data collection interface, starts the camera's shooting function, and displays the picture captured by the camera on the data collection interface.
Exemplarily, FIG. 5 is a schematic diagram of the 3D modeling data collection interface. As shown in FIG. 5, the data collection interface 501 displayed by the phone may include a functional control, the scan button 502, together with the picture captured by the phone's camera. For example, continuing with FIG. 5, suppose the user wants to model a toy car placed on a table: the user can point the camera at the toy car, and the captured picture shown in the interface 501 may include the toy car 503 and the table 504. The user can move the phone's shooting angle so that the toy car 503 is located at the central position of the phone screen (i.e., of the interface 501) and click or touch the scan button 502 in the interface 501; in response, the phone starts collecting the data required for 3D modeling of the target object at the central position of the screen (i.e., the toy car 503).
That is, in the embodiments of the present application, when the phone starts collecting the data required for 3D modeling in response to the user clicking or touching the scan button 502, it may take the object located at the central position of the phone screen as the target object.
In the embodiments of the present application, the 3D modeling data collection interface may be called the first interface, and the main interface of the first application may be called the second interface.
Optionally, FIG. 6 is another schematic diagram of the 3D modeling data collection interface. As shown in FIG. 6, after receiving the user's click or touch of the scan button 502, the phone may also display prompt information in the data collection interface 501: "Please place the target object in the center of the screen" 505. The target object here is the object to be modeled, and this prompt can be used to remind the user to place the target object at the central position of the phone screen.
Exemplarily, the display position of "Please place the target object in the center of the screen" 505 in the data collection interface 501 may be above the scan button 502, slightly below the center of the screen, and so on; the display position of this prompt in the interface 501 is not limited here.
In addition, it should be noted that the prompt "Please place the target object in the center of the screen" 505 is merely exemplary; in some other embodiments, other text may be used to remind the user to place the target object at the central position of the screen, and the content of this prompt is not limited. In the embodiments of the present application, the prompt used to remind the user to place the target object at the central position of the phone screen may be called the first prompt information.
In the embodiments of the present application, when the phone starts collecting the data required for 3D modeling in response to the user clicking or touching the scan button 502, the user can hold the phone and shoot around the target object; during this circuit, the phone can collect 360-degree panoramic data of the target object. The data collected by the phone may include the pictures/images of the target object captured while circling it, which may be in JPG/JPEG format. For example, the phone can capture RAW images of the target object through the camera, and the phone's processor can then perform ISP processing and JPEG encoding on the RAW images to obtain the corresponding JPG/JPEG pictures of the target object.
FIG. 7A is a further schematic diagram of the 3D modeling data collection interface. As shown in FIG. 7A, in one possible design, when the phone starts collecting the data required for 3D modeling of the target object (taking the toy car as an example) in response to the user clicking or touching the scan button 502, it may also display a patch model 701 (also called a bounding volume or virtual bounding volume) around the target object in the picture of the data collection interface 501. The patch model 701 may be centered on the target object's center and enclose the target object. The patch model 701 may include an upper and a lower layer, each comprising multiple patches; the upper layer may be called the first layer and the lower one the second layer. Each patch in each layer may correspond to an angular range within the 360 degrees around the target object. For example, if each layer has 20 patches, each patch in each layer corresponds to an 18-degree angular range.
When holding the phone to shoot the target object, the user needs to go around it twice: in the first circle the phone camera looks down at the target object (e.g., at a 30-degree downward angle, without limitation) while circling it, and in the second circle the phone camera faces the target object head-on while circling it. During the first circle, the phone can light up the first layer's patches one by one as the user moves around the object; during the second circle, it can light up the second layer's patches one by one.
For example, suppose the user looks down at the target object with the camera and shoots the first circle clockwise or counterclockwise, with the initial shooting angle being 0 degrees and 20 patches per layer. When the phone captures a picture of the target object within the 0–18-degree range, it can light up the 1st patch of the first layer; within 18–36 degrees, the 2nd patch; and so on, until within 342–360 degrees it lights up the 20th patch of the first layer. That is, after the user looks down at the object and completes the first circle clockwise or counterclockwise, all 20 patches of the first layer can be lit. Similarly, after the user faces the object head-on and completes the second circle clockwise or counterclockwise, all 20 patches of the second layer can be lit.
In the embodiments of the present application, the picture corresponding to the first patch may be called the first image, and the picture corresponding to the second patch the second image. The phone's attitude when shooting the first image may be called the first pose, and its attitude when shooting the second image the second pose.
Exemplarily, continuing with FIG. 7A, when the user looks down at the target object with the camera and shoots the first circle counterclockwise, the phone lighting up the 1st patch of the first layer may look like 702 in FIG. 7A. A lit patch can show a pattern or color different from the unlit patches (i.e., its display effect is changed).
FIG. 7B is a further schematic diagram of the 3D modeling data collection interface. As shown in FIG. 7B, building on FIG. 7A, during the first counterclockwise circle with the camera looking down at the object, when the user has rotated the phone around the object by a certain angle (moved a certain distance), the phone can continue lighting up the 2nd patch, the 3rd patch, and so on in the first layer.
FIG. 7C is a further schematic diagram. As shown in FIG. 7C, building on FIG. 7B, during the first circle, when the user has moved counterclockwise to the other side of the target object, half or more of the first layer's patches may be lit.
FIG. 7D is a further schematic diagram. As shown in FIG. 7D, building on FIG. 7C, when the user has moved counterclockwise back to the initial shooting position of FIG. 7A (i.e., has made a full counterclockwise circle around the object), the phone can light up all the patches in the first layer.
After the phone lights up all the patches in the first layer, the user can adjust the shooting position relative to the target object, lowering the phone by a certain distance so that the camera faces the object head-on, and shoot the second circle counterclockwise around the object.
For example, FIG. 7E is a further schematic diagram. As shown in FIG. 7E, when the user faces the object head-on and shoots the second circle counterclockwise, the phone can light up the 1st patch of the second layer.
Similarly, during the second counterclockwise circle with the camera facing the object head-on, as the user moves around the object the phone can gradually light up the second layer's patches. For example, FIG. 7F is a further schematic diagram: when the user has moved counterclockwise back to the initial shooting position of FIG. 7E (i.e., has made a full counterclockwise circle around the object), the phone can light up all the patches in the second layer. The process from the second layer's first patch being lit to all its patches being lit is similar to the first layer's lighting process; refer to the process shown in FIGS. 7A to 7D, which is not detailed again.
Optionally, in the embodiments of the present application, the rules by which the phone lights up each patch may be as follows:
1) When shooting the target object at a certain position, the phone performs blur detection and key frame selection on every captured frame of picture (which may be called an input frame) to obtain the pictures whose sharpness and picture features meet the requirements.
For example, in the embodiments of the present application, when shooting the target object the phone can collect a preview stream of the object comprising multiple frames of pictures; e.g., the phone may shoot the object at frame rates such as 24 fps or 30 fps, without limitation. The phone can perform blur detection on every captured frame and keep the pictures whose sharpness is greater than a first threshold; the first threshold can be determined according to requirements and the blur detection algorithm, and its size is not limited. If the current frame's sharpness does not meet the requirement (e.g., is less than or equal to the first threshold), the phone proceeds to the next frame. The phone can then perform key frame selection (or key frame screening) on the pictures whose sharpness is greater than the first threshold to obtain the pictures whose features meet the requirements. Meeting the requirements may include: the picture's features are fairly clear and rich, easy to extract, and contain little redundant information; the key frame selection algorithm and specific requirements are not limited here.
For each shooting position (one shooting position may correspond to one patch), the phone obtains some good-quality key frame pictures by performing blur detection and key frame selection on the pictures captured at that position; each patch may correspond to one or more key frame pictures.
2) The phone computes the camera pose information (i.e., the pose information of the phone's camera) corresponding to each picture obtained in 1).
For example, when the phone supports capabilities such as AR Engine, AR Core, or ARKit, it can call these capabilities to obtain the camera pose information corresponding to the picture directly.
Exemplarily, the camera pose information may include qw, qx, qy, qz, tx, ty, tz, where qw, qx, qy, qz form a rotation matrix as a unit quaternion and tx, ty, tz can form a translation matrix. The rotation matrix and translation matrix can express the relative position and angle between the camera (the phone's camera) and the target object. Using them, the phone can transform the target object's coordinates from the world coordinate system to the camera coordinate system, obtaining the object's coordinates in the camera coordinate system. Here, the world coordinate system may be the coordinate system with the target object's center as its origin, and the camera coordinate system may be the coordinate system with the camera center as its origin.
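As a minimal sketch of how such a pose could be applied, the following converts the unit quaternion (qw, qx, qy, qz) into the standard rotation matrix and transforms a world-coordinate point into camera coordinates; the function name and the exact convention (world-to-camera rather than camera-to-world) are illustrative assumptions:

```python
import numpy as np

def world_to_camera(p_world, q, t):
    """Transform a world-coordinate point into camera coordinates.

    q = (qw, qx, qy, qz) is the unit quaternion and t = (tx, ty, tz)
    is the translation carried in the camera pose information."""
    qw, qx, qy, qz = q
    # Standard rotation matrix of a unit quaternion.
    R = np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])
    return R @ np.asarray(p_world) + np.asarray(t)
```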
3) According to the camera pose information obtained in 2), the phone determines the relationship between the picture obtained in 1) and each patch in the patch model, obtaining the patch corresponding to that picture.
For example, as in 2) above, the phone can transform the target object's coordinates from the world coordinate system to the camera coordinate system according to the picture's camera pose information (the rotation and translation matrices), obtaining the object's coordinates in the camera coordinate system. Then, from the target object's coordinates in the camera coordinate system and the camera coordinates, the phone can determine the line between the camera coordinates and the target object's coordinates; the patch in the patch model intersected by that line is the patch corresponding to that frame of picture. The camera coordinates are known parameters for the phone.
4) The picture is stored in the frame sequence file, and the patch corresponding to it is lit up. The frame sequence file contains the picture corresponding to each lit patch; these pictures can serve as the data required for 3D modeling of the target object.
The pictures in the frame sequence file may be in JPG format; for example, the pictures saved in the frame sequence file may be numbered 001.jpg, 002.jpg, 003.jpg, and so on.
Understandably, after a certain patch is lit, the user can continue moving the phone to shoot the next angle, and the phone can continue lighting up the next patch according to the rules above.
Each frame of picture included in the frame sequence file may be called a key frame; these key frames can serve as the data collected by the phone in Part 1 for 3D modeling of the target object. The frame sequence file may also be called the key frame sequence file.
Optionally, in the embodiments of the present application, the user can hold the phone and shoot the target object within 1.5 meters of it; when the shooting distance is too close (e.g., when the data collection interface cannot show the whole target object), the phone can switch on the wide-angle camera for shooting.
In Part 1 above, as the user holds the phone and shoots around the target object, the phone lights up the patches of the patch model one by one as it moves around the object, thereby guiding the user through collecting the data required for 3D modeling of the target object. Dynamic UI guidance through the 3D guidance interface (i.e., the data collection interface displaying the patch model) enhances user interactivity and lets the user perceive the data collection process intuitively.
It should be noted that the description of the patch model (or bounding volume) in Part 1 is merely exemplary. In some other embodiments, the patch model may have more or fewer layers, and the number of patches per layer may be greater or smaller than 20; the present application limits neither the number of layers of the patch model nor the number of patches per layer.
For example, FIG. 8 is a schematic structural diagram of the patch model. Referring to FIG. 8, in one implementation the patch model's structure may be as shown in (a) of FIG. 8, with an upper and a lower layer, each including multiple patches (i.e., the structure described in the foregoing embodiments). In another implementation, as shown in (b) of FIG. 8, it may have upper, middle, and lower layers, each including multiple patches. In yet another implementation, as shown in (c) of FIG. 8, it may be a single-layer structure composed of multiple patches. In yet another implementation, as shown in (d) of FIG. 8, it may again include two layers, each with multiple patches.
The structures of the patch model shown in FIG. 8 are all exemplary. The present application limits neither the structure of the patch model nor the tilt angle of each layer (relative to the central axis).
It should be understood that the patch model described in the embodiments of the present application is a virtual model, which may be preset in the phone, e.g., configured in the file directory of the first application in the form of a configuration file.
In some embodiments, multiple patch models may be preset in the phone. When collecting the data required for 3D modeling of the target object, the phone may recommend a target patch model matching the target object's shape, or select a target patch model from the multiple patch models according to the user's selection operation, and use the target patch model to implement the guidance function described in the foregoing embodiments. The target patch model may also be called the first virtual bounding volume.
Optionally, FIG. 9 is a further schematic diagram of the 3D modeling data collection interface. As shown in FIG. 9, after the phone switches the display from the first application's main interface 401 to the data collection interface 501, when it detects a target object in the picture it may also display the prompt: "Target object detected, tap the button to start scanning" 506. The button is the scan button 502, and this prompt can be used to remind the user to tap the scan button 502 so that the phone starts collecting the data required for 3D modeling of the target object.
Optionally, in the embodiments of the present application, after starting the 3D modeling function of the first application in response to the user clicking or touching "Start modeling" 402 and switching the display to the data collection interface, the phone may also first display on that interface prompt information reminding the user to adjust the shooting environment of the target object, the way the target object is shot, the object's screen proportion, and so on.
For example, FIG. 10 is a further schematic diagram of the data collection interface. As shown in FIG. 10, after switching the display to the data collection interface, the phone may first display prompt information 1001, whose content may be "Place the object still on a solid-color surface with soft lighting, shoot around the object in a full circle, and keep the object as large and complete on the screen as possible". This can remind the user to adjust the shooting environment so that the object rests still on a solid-color surface under soft lighting, to shoot by circling the object once, and to keep the object's screen proportion as large and complete as possible. Continuing with FIG. 10, the phone may also first display a functional control 1002 on the data collection interface, e.g., "Got it" or "Confirm". After the user taps the control 1002, the phone stops displaying the prompt 1001 and the control 1002 and presents the data collection interface shown in FIG. 5.
In this embodiment, after the user adjusts the shooting environment of the target object, the object's screen proportion, and so on according to the prompt 1001, the subsequent data collection can be faster and the quality of the collected data can be better. The prompt 1001 may also be called the second prompt information.
In some embodiments, after switching the display to the data collection interface, the phone may also display the prompt 1001 for a preset duration, after which the phone automatically stops displaying it and presents the data collection interface shown in FIG. 5. The preset duration may be 20 seconds, 30 seconds, and so on, without limitation.
In some embodiments, after collecting the data required for 3D modeling of the target object (every frame of picture included in the frame sequence file) in the manner described in Part 1, the phone may execute Part 2 automatically.
In some other embodiments, after collecting that data the phone may display on the data collection interface a functional control for uploading to the cloud for 3D modeling; the user can tap this control, and the phone executes Part 2 in response.
For example, FIG. 11 is a further schematic diagram of the data collection interface. As shown in FIG. 11, after collecting the data required for 3D modeling in the manner described in Part 1, the phone may display the functional control "Upload for cloud modeling" 1101 on the data collection interface, which is the control for uploading to the cloud for 3D modeling. The user can tap "Upload for cloud modeling" 1101, and the phone executes Part 2 in response. In the embodiments of the present application, the user's operation of tapping "Upload for cloud modeling" 1101 may be called the operation of generating the three-dimensional model.
Optionally, continuing with FIG. 11, after collecting the data required for 3D modeling the phone may also display on the data collection interface prompt information indicating that the data has been collected, e.g., "Scan complete" 1102.
Optionally, referring to FIGS. 5–7E and 9–11 above, the phone may also display an exit button 1103 (marked only in FIG. 11) on the data collection interface. While the phone executes Part 1, the user can tap the exit button 1103 at any time, and the phone exits the Part 1 process in response. After exiting, the phone can switch the display from the data collection interface back to the first application's main interface shown in FIG. 4.
Part 2:
As seen from Part 1, the data the phone collects for 3D modeling of the target object is every frame of picture (i.e., key frame) included in the frame sequence file mentioned in Part 1. In Part 2, the phone's preprocessing of the collected data means preprocessing the key frames included in the frame sequence file collected in Part 1, specifically as follows:
1) The phone computes the matching information of each key frame and saves it in a first file. For example, the first file may be a file in JavaScript object notation (JSON) format.
For each key frame, its matching information may include: the identification information of the other key frames associated with it. For example, the matching information of a key frame may include the identification information of the key frames (e.g., the nearest key frames) respectively corresponding to the four orientations of that key frame (up, down, left, and right); the key frames corresponding to those four orientations are the other key frames associated with that key frame. The identification information of a key frame may be its picture number. The matching information of a key frame can thus indicate which pictures in the frame sequence file are associated with it.
Exemplarily, the matching information of each key frame is obtained from the association relationship between that key frame and its corresponding patch, together with the association relationships between the plurality of patches. That is, for each key frame, the phone can determine its matching information from the association relationship between the patch corresponding to that key frame and the other patches, i.e., which patches lie in the four orientations (up, down, left, and right) of that patch in the patch model.
For example, taking the patch model shown in FIG. 7A, and assuming a certain key frame corresponds to patch 1 in the first layer, the association relationship between patch 1 and the other patches may include: the patch below patch 1 is patch 21, the patch to its left is patch 20, and the patch to its right is patch 2. From these relationships, the phone can determine that the other key frames associated with this key frame are the key frames corresponding to patches 21, 20, and 2, so the matching information of this key frame includes the identification information of the key frames corresponding to patches 21, 20, and 2.
Understandably, for the phone, the association relationships between the different patches in the patch model are known quantities. (A minimal sketch of deriving neighbours from this known adjacency is given below.)
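The following sketch derives a patch's up/down/left/right neighbours under the two-layer, 20-patch-per-layer numbering used in the example above (patch 1's left and right neighbours are patches 20 and 2, and the patch below it is patch 21); the function name and dictionary layout are illustrative:

```python
def neighbour_patches(patch_id, layers=2, per_layer=20):
    """Return the up/down/left/right neighbours of a patch.

    Patches are numbered 1..per_layer on the first (upper) layer and
    per_layer+1..layers*per_layer on the lower layer(s)."""
    layer, col = divmod(patch_id - 1, per_layer)
    neighbours = {
        "left":  layer * per_layer + (col - 1) % per_layer + 1,
        "right": layer * per_layer + (col + 1) % per_layer + 1,
    }
    if layer > 0:
        neighbours["up"] = patch_id - per_layer
    if layer < layers - 1:
        neighbours["down"] = patch_id + per_layer
    return neighbours

# neighbour_patches(1) -> {'left': 20, 'right': 2, 'down': 21}
```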
Optionally, the first file further includes: the camera intrinsics, gravity direction information (gravity), image name, image number (index), camera pose information (slampose), timestamp, and other information corresponding to each key frame.
Exemplarily, the first file includes three parts: "intrinsics", "keyframes", and "matching_list". The "intrinsics" part holds the camera intrinsics; "keyframes" holds each key frame's gravity direction information, picture name, picture number, camera pose information, timestamp, and other information; and "matching_list" holds each key frame's matching information.
For example, the content of the first file may be as follows. (In the published document the file listing survives only as image references, Figures PCTCN2022093934-appb-000001 to -000003.)
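Since the original listing is not recoverable, the following is a reconstructed sketch consistent with the field descriptions below; the field nesting and all numeric values other than the picture number 18 and the associated picture numbers are illustrative placeholders:

```json
{
  "intrinsics": {
    "cx": 959.5, "cy": 719.5, "fx": 1445.0, "fy": 1445.0,
    "k1": 0.01, "k2": -0.02, "k3": 0.0, "p1": 0.0, "p2": 0.0,
    "width": 1920, "height": 1440
  },
  "keyframes": [
    {
      "image": "18.jpg",
      "index": 18,
      "gravity": { "x": 0.01, "y": -9.80, "z": 0.12 },
      "slampose": { "qw": 0.98, "qx": 0.05, "qy": -0.17, "qz": 0.02,
                    "tx": 0.31, "ty": -0.08, "tz": 1.24 },
      "timestamp": 1624672800123
    }
  ],
  "matching_list": [
    { "src_id": 18, "tgt_id": [26, 45, 59, 78, 89, 100, 449] }
  ]
}
```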
In the exemplary first-file content above, cx, cy, fx, fy, height, k1, k2, k3, p1, p2, and width are all camera intrinsics: cx and cy are the offsets of the optical axis from the center of the projection plane coordinates; fx and fy are the focal lengths in the x and y directions when the camera shoots; k1, k2, k3 are radial distortion coefficients; p1, p2 are tangential distortion coefficients; and height and width are the resolution at which the camera shoots.
x, y, z are the gravity direction information. The gravity direction information can be obtained by the phone from its built-in gyroscope and can express the phone's offset angle when the picture was taken.
18.jpg is the image name and 18 the image number (18.jpg is used here merely as an example). That is, the example above shows the camera intrinsics, gravity direction information, image name, image number (index), camera pose information (slampose), timestamp, and matching information corresponding to 18.jpg.
qw, qx, qy, qz, tx, ty, tz are all camera pose information: qw, qx, qy, qz form a rotation matrix as a unit quaternion, and tx, ty, tz can form a translation matrix. These matrices can express the relative position and angle between the camera (the phone's camera) and the target object. With them the phone can transform the target object's coordinates from the world coordinate system to the camera coordinate system, obtaining the object's coordinates in the camera coordinate system. The world coordinate system may be the coordinate system with the target object's center as origin, and the camera coordinate system the one with the camera center as origin.
timestamp is the time at which the camera shot that key frame.
src_id is the picture number of each key frame; in the exemplary content above, the picture number is 18, and the "matching_list" part holds the matching information of the key frame numbered 18. tgt_id holds the picture numbers of the other key frames associated with key frame 18 (i.e., the identification information of those key frames). In the example above, these include 26, 45, 59, 78, 89, 100, 449, and so on; that is, the other key frames associated with key frame 18 include the key frames numbered 26, 45, 59, 78, 89, 100, 449, and so on.
It should be noted that the above gives only part of the first file's content, taking the key frame numbered 18 as an example, and is not intended to limit the content of the first file.
2) The phone packages the first file together with all key frames in the frame sequence file (i.e., all frames of pictures included in the frame sequence file).
The result of packaging the first file with all key frames in the frame sequence file (e.g., called a package file or data package) is the preprocessed data obtained in Part 2 from the data collected in Part 1 for 3D modeling of the target object.
That is, in the embodiments of the present application, the preprocessed data may include: every key frame picture saved to the frame sequence file while the phone circled the target object, together with the first file containing each key frame's matching information.
After obtaining the preprocessed data, the phone can send (i.e., upload) it to the cloud, and the cloud can execute Part 3, performing 3D modeling from the data uploaded by the phone to obtain the target object's 3D model.
Part 3:
The cloud's 3D modeling from the data uploaded by the phone may proceed as follows:
1) The cloud decompresses the received data package from the phone (the data package includes the frame sequence file and the first file) and extracts the frame sequence file and the first file.
2) The cloud performs 3D modeling processing according to the key frame pictures included in the frame sequence file and the first file, obtaining the 3D model of the target object.
Exemplarily, this 3D modeling processing may at least include the steps of: key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computation, dense point cloud computation, surface reconstruction, and texture generation.
Key target extraction refers to separating the target object of interest in a key frame picture from the background, identifying and interpreting meaningful object entities in the image to extract different image features.
Feature detection and matching refers to: detecting the distinctive pixels in a key frame picture as the feature points of that key frame picture; describing the salient feature points in different key frame pictures; and comparing the similarity of two descriptions to judge whether the feature points in different key frame pictures are the same feature.
In the embodiments of the present application, when the cloud performs feature detection and matching, for each key frame it can determine, from that key frame's matching information in the first file (i.e., the identification information of the other key frames associated with it), which other key frames are associated with that key frame, and perform feature detection and matching between that key frame and its associated key frames.
For example, taking the first-file content exemplified above for the key frame numbered 18, the cloud can determine from the first file that the key frames associated with key frame 18 include those numbered 26, 45, 59, 78, 89, 100, 449, and so on. The cloud can then perform feature detection and matching between key frame 18 and the key frames numbered 26, 45, 59, 78, 89, 100, 449, and so on, without matching key frame 18 against all other key frames in the frame sequence file.
It can be seen that, in the embodiments of the present application, for each key frame the cloud only needs to combine that key frame's matching information from the first file and perform feature detection and matching between that key frame and its associated key frames; it does not need to match that key frame against all other key frames in the frame sequence file. This effectively reduces the cloud's computing load and improves 3D modeling efficiency.
Global optimization and fusion refers to using a global optimization and fusion algorithm to optimize and fuse the matching results of feature detection and matching; the result of global optimization and fusion can be used to generate the basic 3D model.
Sparse point cloud computation and dense point cloud computation refer to generating the 3D point cloud data corresponding to the target object from the results of global optimization and fusion. Compared with images, point clouds have an irreplaceable advantage: depth. 3D point cloud data directly provides data in three-dimensional space, whereas images require inferring 3D data through perspective geometry.
Surface reconstruction refers to accurately restoring the object's three-dimensional surface shape from the 3D point cloud data, obtaining the basic 3D model of the target object.
Texture generation refers to generating the texture (also called texture map) of the target object's surface from the key frame pictures or their features. After the surface texture is obtained, mapping the texture in a specific way onto the surface of the basic 3D model of the target object restores the object's surface more realistically, making the target object look more lifelike.
In the embodiments of the present application, the cloud can also combine the matching information of each key frame in the first file to determine the mapping between the textures and the surface of the basic 3D model quickly and accurately, further improving modeling efficiency and quality.
For example, after determining the mapping between the first key frame's texture and the surface of the basic 3D model, the cloud can combine the first key frame's matching information to quickly and accurately determine the mappings for the textures of the key frames associated with it. Similarly, for each subsequent key frame, the cloud can combine that key frame's matching information to quickly and accurately determine the mappings between the textures of its associated key frames and the surface of the basic 3D model.
After obtaining the basic 3D model of the target object and the texture of its surface, the cloud can generate the target object's 3D model from them. The cloud can store the basic 3D model and the surface texture for the phone to download.
It can be seen that, in the cloud's 3D modeling from the data uploaded by the phone, the matching information of each key frame included in the first file can effectively increase the processing speed of 3D modeling, reduce the cloud's computing load, and improve the efficiency of cloud-side 3D modeling.
Exemplarily, the basic 3D model of the target object may be stored in OBJ format, and the surface texture in JPG format (e.g., as a texture map); for instance, the basic 3D model may be an OBJ file and the surface texture a JPG file.
Optionally, the cloud may keep the basic 3D model and the surface texture for a certain duration (e.g., 7 days) and delete them automatically afterwards; alternatively, the cloud may keep them permanently. This is not limited here.
Through Parts 1 to 3 above, 3D modeling of the target object is achieved and its 3D model obtained. As Parts 1 to 3 show, in the 3D modeling method provided by the embodiments of the present application, the phone can achieve 3D modeling relying only on an ordinary RGB camera to collect the required data, without depending on the phone having special hardware such as a LIDAR sensor or an RGB-D camera; and the 3D modeling itself is completed in the cloud, without depending on the phone having a high-performance discrete graphics card. The method can thus markedly lower the threshold for 3D modeling and is more universally applicable to terminal devices. Moreover, while the user collects the data required for 3D modeling with the phone, the phone's dynamic UI guidance enhances user interactivity and lets the user perceive the data collection process intuitively.
In addition, in this 3D modeling method, when shooting the target object at a certain position the phone performs blur detection on every captured frame to obtain pictures of adequate sharpness, screening out key frames favorable to modeling. By extracting each key frame's matching information and sending the first file (containing that matching information) together with the frame sequence file of key frames to the cloud for modeling (without sending all captured pictures), the method can greatly reduce the complexity of cloud-side 3D modeling, reduce the modeling process's consumption of cloud hardware resources, effectively lower the cloud's modeling computing load, and improve the speed and quality of 3D modeling.
The process by which the phone downloads the 3D model from the cloud for the user to preview is exemplarily described below.
Exemplarily, after completing the 3D modeling of the target object and obtaining its 3D model, the cloud can send the phone an indication message indicating that the cloud has completed the 3D modeling.
In some embodiments, after receiving this indication message from the cloud, the phone can automatically download the target object's 3D model from the cloud for the user to preview.
For example, FIG. 12 is a schematic diagram of the 3D model preview interface. In Part 2 above, after sending the preprocessed data to the cloud, the phone can switch the display from the data collection interface of FIG. 11 to the preview interface of FIG. 12. As shown in FIG. 12, the phone can display the prompt "Modeling" 1201 on the preview interface to indicate to the user that the target object is being modeled in 3D. "Modeling" 1201 may be called the third prompt information.
In some embodiments, the phone may display the third prompt information after detecting the user's tap on the functional control "Upload for cloud modeling" 1101.
After completing the 3D modeling of the target object and obtaining its 3D model, the cloud can send the phone an indication message indicating that modeling is complete. After receiving it, the phone can automatically download the 3D model from the cloud: e.g., the phone can send a download request message to the cloud, and the cloud can send the target object's 3D model (i.e., the basic 3D model plus the surface texture) according to that request.
FIG. 13 is another schematic diagram of the 3D model preview interface. As shown in FIG. 13, after receiving the indication message, the phone can also change the prompt from "Modeling" 1201 to "Modeling completed" 1301 to inform the user that the target object's 3D model is done. The preview interface may also include a view button 1302; the user can tap it, and in response the phone displays the 3D model downloaded from the cloud in the preview interface. "Modeling completed" 1301 may be called the fourth prompt information.
Taking the target object being the toy car described in the foregoing embodiments as an example, FIG. 14 is a further schematic diagram of the 3D model preview interface. As shown in FIG. 14, in response to the user tapping the view button 1302, the phone can display the toy car's 3D model 1401 in the preview interface, where the user can view it. The user's operation of tapping the view button 1302 is the operation of previewing the 3D model corresponding to the target object.
可选地,用户在3D模型预览界面中对玩具汽车的3D模型进行查看时,可以对玩具汽车的3D模型执行沿着任意方向(如水平方向、竖直方向等)的逆时针转动操作或顺时针转动操作,手机可以响应于用户的前述操作,在3D模型预览界面为用户显示玩具汽车的3D模型在不同角度(360度)的呈现效果。
示例性地,图15为本申请实施例提供的用户对玩具汽车的3D模型执行沿着水平 方向的逆时针转动操作的示意图。如图15所示,用户可以使用手指在3D模型预览界面拖动玩具汽车的3D模型沿着水平方向进行逆时针转动。
图16为本申请实施例提供的3D模型预览界面的又一示意图。如图16所示,当用户可以使用手指在3D模型预览界面拖动玩具汽车的3D模型沿着水平方向进行逆时针转动时,手机可以响应于用户拖动玩具汽车的3D模型沿着水平方向进行逆时针转动的操作,在3D模型预览界面为用户显示如图16中的(a)、(b)等所示的角度的呈现效果。
可以理解的,图16中的(a)、(b)等所示的角度的呈现效果仅为示例性说明。玩具汽车的3D模型呈现的角度与用户拖动玩具汽车的3D模型的方向、距离、次数等有关,在此不再一一进行表示。
可选地,用户在3D模型预览界面中对玩具汽车的3D模型进行查看时,还可以对玩具汽车的3D模型执行放大或缩小的操作,手机可以响应于用户对玩具汽车的3D模型执行的放大或缩小的操作,在3D模型预览界面为用户显示玩具汽车的3D模型的放大效果或缩小效果。
示例性地,图17为本申请实施例提供的用户对玩具汽车的3D模型执行缩小操作的示意图。如图17所示,用户可以使用两根手指在3D模型预览界面向内侧(相对方向)进行滑动,该滑动操作即为缩小操作。
图18为本申请实施例提供的3D模型预览界面的又一示意图。如图18所示,当用户对玩具汽车的3D模型执行缩小操作时,手机可以响应于当用户对玩具汽车的3D模型执行的缩小操作,在3D模型预览界面为用户显示玩具汽车的3D模型缩小后的呈现效果。
类似的,用户对玩具汽车的3D模型执行的放大操作可以使用两根手指在3D模型预览界面向外侧(相反方向)进行滑动。当用户对玩具汽车的3D模型执行放大操作时,手机可以响应于当用户对玩具汽车的3D模型执行的放大操作,在3D模型预览界面为用户显示玩具汽车的3D模型放大后的呈现效果,不再详细赘述。
需要说明的是,上述用户对玩具汽车的3D模型执行的放大操作或缩小操作均为示例性说明。在其他一些实现方式中,用户对玩具汽车的3D模型执行的放大操作或缩小操作还可以是双击操作、长按操作,又或者,3D模型预览界面中还可以包括一个可以进行放大操作或缩小操作的功能控件等,在此不作限制。
另外一些实施例中,手机在接收到来自云端的上述指示消息后,也可以仅显示如上述图13所示的3D模型预览界面。当用户点击查看按钮1302后,手机再响应于用户点击查看按钮1302的操作,从云端下载目标物体的3D模型,并在3D模型预览界面中显示目标物体的3D模型,供用户进行预览。本申请对手机从云端下载目标物体的3D模型的触发条件也不作限制。
由上所述,本申请实施例中,用户仅需要在终端设备侧执行采集3D建模所需的数据相关的操作,然后在终端设备上查看或预览最终的3D模型即可。对用户而言,所有的操作均在终端设备侧完成,操作更加简单,用户体验可以更好。
为使本申请实施例提供的技术方案更加简洁明了,下面分别结合图19和图20对本申请实施例提供的3D建模方法的实现逻辑进行示例性说明。
示例性地,图19为本申请实施例提供的3D建模方法的流程示意图。如图19所示,该3D建模方法可以包括S1901-S1913。
S1901. The mobile phone receives a first operation, where the first operation is an operation of launching the first application.
For example, the first operation may be the above-described operation of tapping or touching the application icon 302 of the first application on the phone's home screen shown in FIG. 3. Alternatively, the first operation may be an operation of tapping or touching a function control that launches the first application in the pull-down interface, the minus-one screen, or another display interface. The first operation may also be the above-described operation of launching the first application through the voice assistant.
S1902. In response to the first operation, the phone launches the first application and displays the main interface of the first application.
The main interface of the first application can be seen in FIG. 4 above. The main interface of the first application may be called the second interface.
S1903. The phone receives a second operation, where the second operation is an operation of starting the 3D modeling function of the first application.
For example, the second operation may be the above-described operation of tapping or touching the function control "Start modeling" 402 in the main interface 401 of the first application shown in FIG. 4.
S1904. In response to the second operation, the phone displays the 3D modeling data collection interface, starts the shooting function of the camera, and displays the picture captured by the camera in the 3D modeling data collection interface.
The 3D modeling data collection interface displayed in response to the second operation can be seen in FIG. 5 above. The 3D modeling data collection interface may be called the first interface.
S1905. The phone receives a third operation, where the third operation is an operation of controlling the phone to collect the data required for 3D modeling of the target object.
For example, the third operation may include the user tapping or touching the scan button 502 in the 3D modeling data collection interface shown in FIG. 5, and the user holding the phone and shooting around the target object. The third operation may also be called a collection operation.
S1906. In response to the third operation, the phone obtains the frame sequence file composed of the key frame pictures corresponding to the target object.
S1907. The phone obtains the matching information of each key frame in the frame sequence file, yielding the first file.
The first file can be as described in the foregoing embodiments. For each key frame, the matching information of that key frame included in the first file may include: the identification information of the nearest key frames in each of the four directions (up, down, left, and right) of that key frame. For example, the identification information may be the picture number mentioned above.
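For illustration, this per-key-frame matching information could be serialized as follows. The patent describes only the content (identifiers of the nearest key frames in the four directions, plus, as in the example for frame 18 above, further associated frames); the JSON layout, field names, and the assignment of particular numbers to particular directions are assumptions.

```python
import json

# One entry per key frame in the frame sequence file. Which neighbor number
# belongs to which direction is illustrative; "others" holds the further
# associated frames from the frame-18 example, as an assumed layout.
match_info = {
    "18": {"up": 26, "down": 45, "left": 59, "right": 78,
           "others": [89, 100, 449]},
    # ... one entry per key frame
}

with open("first_file.json", "w") as f:
    json.dump(match_info, f, indent=2)
```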
S1908. The phone sends the frame sequence file and the first file to the cloud.
Correspondingly, the cloud receives the frame sequence file and the first file.
S1909. The cloud performs 3D modeling according to the frame sequence file and the first file, to obtain the 3D model of the target object.
The specific process of the cloud performing 3D modeling according to the frame sequence file and the first file can be found in the third part of the foregoing embodiments and is not repeated here. The 3D model of the target object may include the basic 3D model of the target object and the texture of the target object's surface. Applying the surface texture (texture map) to the basic 3D model yields the 3D model of the target object.
S1910. The cloud sends an indication message to the phone, indicating that the cloud has finished 3D modeling.
Correspondingly, the phone receives the indication message.
S1911. The phone sends a download request message to the cloud, requesting to download the 3D model of the target object.
Correspondingly, the cloud receives the download request message.
S1912. The cloud sends the 3D model of the target object to the phone.
Correspondingly, the phone receives the 3D model of the target object.
S1913. The phone displays the 3D model of the target object.
The phone displays the 3D model of the target object so that the user can preview it.
For example, the effect of the phone displaying the 3D model of the target object can be seen in FIGS. 12, 13, 14, 16, and 18 above. While previewing the 3D model, the user can rotate the displayed model, zoom it in, or zoom it out.
For the specific implementation and beneficial effects of the flow shown in FIG. 19, see the foregoing embodiments; they are not repeated here.
Exemplarily, FIG. 20 is a logical diagram of the device-cloud collaboration system implementing the 3D modeling method provided by an embodiment of this application.
As shown in FIG. 20, in the embodiments of this application, the mobile phone may at least include an RGB camera and the first application. The first application is the 3D modeling application described above.
The RGB camera can be used to implement the phone's shooting function, shooting the target object to be modeled to obtain the pictures corresponding to the target object. The RGB camera can transfer the captured pictures to the first application.
The first application may include a data collection and dynamic guidance module, a data processing module, a 3D model preview module, and a 3D model export module.
The data collection and dynamic guidance module can implement functions such as blur detection, key frame selection, guidance information computation, and guidance interface updating. For the pictures captured by the RGB camera, the blur detection function performs blur detection on each captured frame (which may be called an input frame) and keeps pictures whose sharpness meets the requirement as key frames; if the sharpness of the current frame does not meet the requirement, the next frame is fetched (see the sketch after this paragraph). The key frame selection function determines whether a picture is already stored in the frame sequence file and, if not, adds it to the frame sequence file. The guidance information computation function determines, from the camera pose information corresponding to a picture, the relationship between the picture and each patch of the patch model, obtaining the patch corresponding to that picture; the correspondence between pictures and patches is the guidance information. The guidance interface updating function updates (that is, changes) the display effect of the patches in the patch model according to the computed guidance information, for example by lighting up a patch.
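A common way to realize the blur-detection function is to score sharpness by the variance of the Laplacian, as in the hedged sketch below. The metric and the threshold value are assumptions, since the description only requires that sharpness exceed a first threshold.

```python
import cv2

SHARPNESS_THRESHOLD = 100.0   # hypothetical value for the "first threshold"

def is_sharp_enough(frame_bgr) -> bool:
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Blur suppresses high-frequency content, so a blurry frame yields a
    # low variance of the Laplacian response.
    return cv2.Laplacian(gray, cv2.CV_64F).var() > SHARPNESS_THRESHOLD
```

Frames failing the check are skipped and the next captured frame is tested, matching the "fetch the next frame" behaviour described above.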
The data processing module can implement functions such as matching relationship computation, matching list computation, and data packing. The matching relationship computation function computes the matching relationships between the key frames in the frame sequence file, for example whether they are adjacent; specifically, the data processing module can compute these matching relationships from the association relationships between the patches of the patch model. The matching list computation function generates, from the result of the matching relationship computation, a matching list for each key frame; the matching list of each key frame contains that key frame's matching information, for example the identification information of the other key frames associated with it. The data packing function packs the first file containing each key frame's matching information together with the frame sequence file (a packing sketch follows this paragraph). After packing, the phone can send the packed data package (including the first file and the frame sequence file) to the cloud.
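The data packing function can be illustrated with a standard ZIP archive, as below; the archive layout and file names are assumptions, since the description only requires that the first file and the frame sequence file be packed together.

```python
import zipfile
from pathlib import Path

def pack_for_upload(keyframe_dir: str, first_file: str,
                    out_path: str = "upload.zip") -> str:
    """Bundle the key-frame images (frame sequence file) and the first file
    into one archive for upload; the file layout is illustrative."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for img in sorted(Path(keyframe_dir).glob("*.jpg")):
            zf.write(img, arcname=f"frames/{img.name}")
        zf.write(first_file, arcname="first_file.json")
    return out_path
```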
The cloud may include a data parsing module, a 3D modeling module, and a data storage module. The data parsing module can parse the received data package to obtain the frame sequence file and the first file. The 3D modeling module can perform 3D modeling according to the frame sequence file and the first file, to obtain the 3D model.
For example, the 3D modeling module can implement functions such as key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computation, dense point cloud computation, surface reconstruction, and texture generation. The key target extraction function segments the target object of interest in the key frame pictures from the background, identifying and interpreting meaningful object entities in the image and extracting the corresponding image features. The feature detection and matching function detects distinctive pixels in the key frame pictures as feature points, describes the salient feature points in different key frame pictures, and compares the similarity of two descriptions to determine whether feature points in different key frame pictures correspond to the same feature. The global optimization and fusion, sparse point cloud computation, and dense point cloud computation functions generate the 3D point cloud data corresponding to the target object from the results of feature detection and matching. The surface reconstruction function accurately recovers the 3D surface shape of the object from the 3D point cloud data, yielding the basic 3D model of the target object. The texture generation function generates the surface texture of the target object (also called a texture map) from the key frame pictures or their features; after the texture is obtained, mapping it onto the surface of the basic 3D model in a specific way yields the 3D model of the target object.
After obtaining the 3D model of the target object, the 3D modeling module can store it in the data storage module.
The first application on the phone can download the 3D model of the target object from the cloud's data storage module. After downloading the 3D model, the first application can provide the user with a 3D model preview function through the 3D model preview module, or with a function of exporting the 3D model through the 3D model export module. For the specific process of the first application providing the preview function through the 3D model preview module, see the foregoing embodiments.
Optionally, in the embodiments of this application, when the phone collects the data required for 3D modeling of the target object in the first part, it may also display the scanning progress in the 3D modeling data collection interface. For example, as shown in FIGS. 7A to 7E above, the phone can display the scanning progress through the ring-shaped black fill effect of the scan button in the 3D modeling data collection interface. It can be understood that the way the phone displays the scanning progress in the 3D modeling data collection interface may differ with the UI presentation of the scan button; this is not limited here.
In some other embodiments, the phone may also omit displaying the scanning progress; the user can learn the scanning progress from which patches of the patch model have been lit up.
The above embodiments take the implementation of the 3D modeling method provided by the embodiments of this application in a device-cloud collaboration system composed of a terminal device and the cloud as an example. Optionally, in some other embodiments, all steps of the 3D modeling method provided by the embodiments of this application may also be implemented on the terminal device side. For example, for terminal devices with strong processing capabilities and ample computing resources, the functions described in the foregoing embodiments as implemented on the cloud side may all be implemented in the terminal device. That is, after obtaining the frame sequence file and the first file, the terminal device can directly generate the 3D model of the target object locally from the frame sequence file and the first file, and provide functions such as previewing and exporting the 3D model. The principle by which the terminal device locally generates the 3D model from the frame sequence file and the first file is the same as the principle described in the foregoing embodiments for the cloud, and is not repeated here.
It should be understood that the above embodiments are merely exemplary descriptions of the 3D modeling method provided by the embodiments of this application. In some other possible implementations, some execution steps may be removed or added in the above embodiments, or the order of some of the steps described above may be adjusted; this application places no limit on any of this.
Corresponding to the 3D modeling method described in the foregoing embodiments, an embodiment of this application provides a modeling apparatus, which can be applied to a terminal device to implement the steps that the terminal device can implement in the 3D modeling method of the foregoing embodiments. The functions of the apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above functions.
For example, FIG. 21 is a schematic structural diagram of the modeling apparatus provided by an embodiment of this application. As shown in FIG. 21, the apparatus may include a display unit 2101 and a processing unit 2102. The display unit 2101 and the processing unit 2102 can cooperate to implement the functions of the terminal device in the modeling method of the foregoing method embodiments.
For example, the display unit 2101 is configured to display the first interface, where the first interface includes the shooting picture of the terminal device.
The processing unit 2102 is configured to, in response to a collection operation, collect multiple frames of images corresponding to the target object to be modeled and obtain the association relationship between the multiple frames of images, and to obtain the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images.
The display unit 2101 is further configured to display the three-dimensional model corresponding to the target object.
During the collection of the multiple frames of images corresponding to the target object, the display unit 2101 is further configured to display a first virtual bounding volume, where the first virtual bounding volume includes multiple patches. The processing unit 2102 is specifically configured to: when the terminal device is in a first pose, collect a first image and change the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, collect a second image and change the display effect of the patch corresponding to the second image; and after the display effects of the multiple patches of the first virtual bounding volume have been changed, obtain the association relationship between the multiple frames of images according to the multiple patches.
Optionally, the display unit 2101 and the processing unit 2102 are further configured to implement the other display and processing functions of the terminal device in the modeling method of the foregoing method embodiments, which are not enumerated here.
Optionally, FIG. 22 is another schematic structural diagram of the modeling apparatus provided by an embodiment of this application. As shown in FIG. 22, for the implementation of the modeling method of the foregoing method embodiments in which the terminal device sends the multiple frames of images and the association relationship between them to a server, and the server generates the three-dimensional model corresponding to the target object from the multiple frames of images and the association relationship between them, the modeling apparatus may further include a sending unit 2103 and a receiving unit 2104. The sending unit 2103 is configured to send the multiple frames of images and the association relationship between them to the server, and the receiving unit 2104 is configured to receive the three-dimensional model corresponding to the target object sent from the server.
Optionally, the sending unit 2103 is further configured to implement other sending functions that the terminal device can implement in the methods of the foregoing method embodiments, such as sending the download request message; the receiving unit 2104 is further configured to implement other receiving functions that the terminal device can implement in the methods of the foregoing method embodiments, such as receiving the indication message. These are not enumerated here.
It should be understood that the apparatus may also include other modules or units for implementing the functions of the terminal device in the methods described in the foregoing embodiments, which are not all shown here.
Optionally, an embodiment of this application also provides a modeling apparatus that can be applied to a server to implement the functions of the server in the 3D modeling method of the foregoing embodiments. The functions of the apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above functions.
For example, FIG. 23 is yet another schematic structural diagram of the modeling apparatus provided by an embodiment of this application. As shown in FIG. 23, the apparatus may include a receiving unit 2301, a processing unit 2302, and a sending unit 2303. The receiving unit 2301, the processing unit 2302, and the sending unit 2303 can cooperate to implement the functions of the server in the modeling method of the foregoing method embodiments.
For example, the receiving unit 2301 can be configured to receive the multiple frames of images corresponding to the target object and the association relationship between the multiple frames of images sent from the terminal device. The processing unit 2302 can be configured to generate the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images. The sending unit 2303 can be configured to send the three-dimensional model corresponding to the target object to the terminal device.
Optionally, the receiving unit 2301, the processing unit 2302, and the sending unit 2303 can be configured to implement all functions that the server can implement in the modeling method of the foregoing method embodiments, which are not enumerated here.
It should be understood that the division of units (or modules) in the above apparatus is merely a division of logical functions; in actual implementation they may be fully or partially integrated into one physical entity or physically separated. The units in the apparatus may all be implemented in the form of software invoked by a processing element, or all in the form of hardware; alternatively, some units may be implemented in the form of software invoked by a processing element and others in the form of hardware.
For example, each unit may be a separately established processing element, or may be integrated into a chip of the apparatus for implementation; a unit may also be stored in a memory in the form of a program and invoked and executed by a processing element of the apparatus. Furthermore, all or some of these units may be integrated together or implemented independently. The processing element mentioned here, which may also be called a processor, may be an integrated circuit with signal processing capability. In the implementation process, each step of the above method or each of the above units may be implemented by an integrated logic circuit of hardware in a processor element, or in the form of software invoked by a processing element.
In one example, the units in the above apparatus may be one or more integrated circuits configured to implement the above methods, for example: one or more application specific integrated circuits (ASIC), or one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA), or a combination of at least two of these integrated circuit forms.
For another example, when a unit in the apparatus is implemented in the form of a processing element scheduling a program, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke a program. For another example, these units may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In one implementation, the units by which the above apparatus implements the corresponding steps of the above method may be implemented in the form of a processing element scheduling a program. For example, the apparatus may include a processing element and a storage element, where the processing element invokes a program stored by the storage element to execute the methods described in the above method embodiments. The storage element may be a storage element on the same chip as the processing element, that is, an on-chip storage element.
In another implementation, the program for executing the above method may reside in a storage element on a different chip from the processing element, that is, an off-chip storage element. In this case, the processing element invokes or loads the program from the off-chip storage element onto the on-chip storage element, to invoke and execute the steps performed by the terminal device or the server in the methods described in the above method embodiments.
For example, an embodiment of this application may also provide an apparatus, such as an electronic device. The electronic device may include: a processor; a memory; and a computer program; where the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the steps performed by the terminal device or the server in the 3D modeling method of the foregoing embodiments. The memory may be located inside or outside the electronic device, and there are one or more processors.
Exemplarily, the electronic device may be a terminal device such as a mobile phone, a large screen (such as a smart screen), a tablet computer, a wearable device (such as a smart watch or a smart band), a television, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
In yet another implementation, the units by which the apparatus implements the steps of the above method may be configured as one or more processing elements, where the processing element may be an integrated circuit, for example: one or more ASICs, or one or more DSPs, or one or more FPGAs, or a combination of these kinds of integrated circuits. These integrated circuits may be integrated together to form a chip.
For example, an embodiment of this application also provides a chip, which can be applied to the above electronic device. The chip includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the processor receives and executes computer instructions from the memory of the electronic device through the interface circuit, to implement the steps performed by the terminal device or the server in the 3D modeling method of the foregoing embodiments.
An embodiment of this application also provides a computer program product, including computer-readable code, which, when run in an electronic device, causes the electronic device to implement the steps performed by the terminal device or the server in the 3D modeling method of the foregoing embodiments.
From the description of the above implementations, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example for illustration; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the apparatus can be divided into different functional modules to complete all or some of the functions described above.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium.
Based on this understanding, the technical solution of the embodiments of this application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, such as a program. The software product is stored in a program product, such as a computer-readable storage medium, and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
For example, an embodiment of this application may also provide a computer-readable storage medium including a computer program, which, when run on an electronic device, causes the electronic device to implement the steps performed by the terminal device or the server in the 3D modeling method of the foregoing embodiments.
Optionally, an embodiment of this application also provides a device-cloud collaboration system, whose composition can be seen in FIG. 1 or FIG. 20 above, including a terminal device and a server, where the terminal device is connected to the server. The terminal device displays a first interface, where the first interface includes the shooting picture of the terminal device. In response to a collection operation, the terminal device collects multiple frames of images corresponding to the target object to be modeled and obtains the association relationship between the multiple frames of images; during the collection of the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume, where the first virtual bounding volume includes multiple patches. The collecting and obtaining includes: when the terminal device is in a first pose, the terminal device collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, the terminal device collects a second image and changes the display effect of the patch corresponding to the second image; and after the display effects of the multiple patches of the first virtual bounding volume have been changed, the terminal device obtains the association relationship between the multiple frames of images according to the multiple patches. The terminal device sends the multiple frames of images and the association relationship between them to the server; the server obtains the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between them; the server sends the three-dimensional model corresponding to the target object to the terminal device; and the terminal device displays the three-dimensional model corresponding to the target object.
Similarly, in this device-cloud collaboration system, the terminal device can implement all functions that the terminal device can implement in the 3D modeling method described in the foregoing method embodiments, and the server can implement all functions that the server can implement in the 3D modeling method described in the foregoing method embodiments; these are not enumerated here.
The above are merely specific implementations of this application, but the protection scope of this application is not limited thereto; any variation or replacement within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (24)

  1. A modeling method, characterized in that the method is applied to a terminal device, and the method comprises:
    the terminal device displaying a first interface, the first interface comprising a shooting picture of the terminal device;
    the terminal device, in response to a collection operation, collecting multiple frames of images corresponding to a target object to be modeled, and obtaining an association relationship between the multiple frames of images; wherein, during the collection of the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume; the first virtual bounding volume comprises multiple patches;
    wherein the terminal device, in response to the collection operation, collecting the multiple frames of images corresponding to the target object to be modeled and obtaining the association relationship between the multiple frames of images comprises:
    when the terminal device is in a first pose, the terminal device collecting a first image and changing a display effect of a patch corresponding to the first image;
    when the terminal device is in a second pose, the terminal device collecting a second image and changing a display effect of a patch corresponding to the second image;
    after the display effects of the multiple patches of the first virtual bounding volume have been changed, the terminal device obtaining the association relationship between the multiple frames of images according to the multiple patches;
    the terminal device obtaining a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images; and
    the terminal device displaying the three-dimensional model corresponding to the target object.
  2. The method according to claim 1, characterized in that the terminal device comprises a first application, and before the terminal device displays the first interface, the method further comprises:
    the terminal device, in response to an operation of opening the first application, displaying a second interface;
    wherein the terminal device displaying the first interface comprises:
    the terminal device, in response to an operation of starting a three-dimensional modeling function of the first application in the second interface, displaying the first interface.
  3. The method according to claim 1 or 2, characterized in that the first virtual bounding volume comprises one or more layers, and the multiple patches are distributed over the one or more layers.
  4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
    the terminal device displaying first prompt information, the first prompt information being used to remind a user to place the target object at a central position in the shooting picture.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    the terminal device displaying second prompt information; the second prompt information being used to remind the user to adjust one or more of: the shooting environment of the target object, the manner of shooting the target object, and the proportion of the screen occupied by the target object.
  6. The method according to any one of claims 1 to 5, characterized in that before the terminal device obtains the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images, the method further comprises:
    the terminal device detecting an operation of generating a three-dimensional model;
    the terminal device, in response to the operation of generating a three-dimensional model, displaying third prompt information, the third prompt information being used to prompt the user that the target object is being modeled.
  7. The method according to any one of claims 1 to 6, characterized in that after the terminal device obtains the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images, the method further comprises:
    the terminal device displaying fourth prompt information, the fourth prompt information being used to prompt the user that modeling of the target object has been completed.
  8. The method according to any one of claims 1 to 7, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object further comprises:
    the terminal device, in response to an operation of changing a display angle of the three-dimensional model corresponding to the target object, changing the display angle of the three-dimensional model corresponding to the target object; the operation of changing the display angle of the three-dimensional model corresponding to the target object comprising an operation of dragging the three-dimensional model corresponding to the target object to rotate clockwise or counterclockwise along a first direction.
  9. The method according to any one of claims 1 to 8, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object further comprises:
    the terminal device, in response to an operation of changing a display size of the three-dimensional model corresponding to the target object, changing the display size of the three-dimensional model corresponding to the target object; the operation of changing the display size of the three-dimensional model corresponding to the target object comprising an operation of zooming in or zooming out the three-dimensional model corresponding to the target object.
  10. The method according to any one of claims 1 to 9, characterized in that the association relationship between the multiple frames of images comprises matching information of each frame of image in the multiple frames of images;
    the matching information of each frame of image comprises identification information of other images, among the multiple frames of images, associated with the image; and
    the matching information of each frame of image is obtained according to the association relationship between each frame of image and the patch corresponding to that image, and the association relationships between the multiple patches.
  11. The method according to any one of claims 1 to 10, characterized in that the terminal device, in response to the collection operation, collecting the multiple frames of images corresponding to the target object to be modeled and obtaining the association relationship between the multiple frames of images further comprises:
    the terminal device determining the target object according to the shooting picture;
    wherein, when the terminal device collects the multiple frames of images, the position of the target object in the shooting picture is the central position of the shooting picture.
  12. The method according to any one of claims 1 to 11, characterized in that the terminal device collecting the multiple frames of images corresponding to the target object to be modeled comprises:
    the terminal device, during the shooting of the target object, performing blur detection on each captured frame of image, and collecting images whose sharpness is greater than a first threshold as the images corresponding to the target object.
  13. The method according to any one of claims 1 to 12, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object comprises:
    the terminal device, in response to an operation of previewing the three-dimensional model corresponding to the target object, displaying the three-dimensional model corresponding to the target object.
  14. The method according to any one of claims 1 to 13, characterized in that the three-dimensional model corresponding to the target object comprises a basic three-dimensional model of the target object and a texture of the surface of the target object.
  15. The method according to any one of claims 1 to 14, characterized in that the terminal device is connected to a server; and the terminal device obtaining the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images comprises:
    the terminal device sending the multiple frames of images and the association relationship between the multiple frames of images to the server; and
    the terminal device receiving the three-dimensional model corresponding to the target object sent from the server.
  16. The method according to claim 15, characterized in that the method further comprises:
    the terminal device sending to the server the camera intrinsic parameters, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to the multiple frames of images.
  17. The method according to claim 15 or 16, characterized in that the method further comprises:
    the terminal device receiving an indication message from the server, the indication message being used to indicate to the terminal device that the server has completed modeling of the target object.
  18. The method according to any one of claims 15 to 17, characterized in that before the terminal device receives the three-dimensional model corresponding to the target object sent from the server, the method further comprises:
    the terminal device sending a download request message to the server, the download request message being used to request the server for downloading the three-dimensional model corresponding to the target object.
  19. A modeling method, characterized in that the method is applied to a server, the server being connected to a terminal device; the method comprises:
    the server receiving multiple frames of images corresponding to a target object and an association relationship between the multiple frames of images sent from the terminal device;
    the server generating a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images; and
    the server sending the three-dimensional model corresponding to the target object to the terminal device.
  20. The method according to claim 19, characterized in that the method further comprises:
    the server receiving the camera intrinsic parameters, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to the multiple frames of images sent from the terminal device;
    wherein the server generating the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images comprises:
    the server generating the three-dimensional model corresponding to the target object according to the multiple frames of images, the association relationship between the multiple frames of images, and the camera intrinsic parameters, gravity direction information, image names, image numbers, camera pose information, and timestamps respectively corresponding to the multiple frames of images.
  21. The method according to claim 19 or 20, characterized in that the association relationship between the multiple frames of images comprises matching information of each frame of image in the multiple frames of images;
    the matching information of each frame of image comprises identification information of other images, among the multiple frames of images, associated with the image; and
    the matching information of each frame of image is obtained according to the association relationship between each frame of image and the patch corresponding to that image, and the association relationships between the multiple patches.
  22. A device-cloud collaboration system, characterized by comprising: a terminal device and a server, the terminal device being connected to the server;
    the terminal device displays a first interface, the first interface comprising a shooting picture of the terminal device;
    the terminal device, in response to a collection operation, collects multiple frames of images corresponding to a target object to be modeled and obtains an association relationship between the multiple frames of images; wherein, during the collection of the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume; the first virtual bounding volume comprises multiple patches;
    the terminal device, in response to the collection operation, collecting the multiple frames of images corresponding to the target object to be modeled and obtaining the association relationship between the multiple frames of images comprises:
    when the terminal device is in a first pose, the terminal device collects a first image and changes a display effect of a patch corresponding to the first image;
    when the terminal device is in a second pose, the terminal device collects a second image and changes a display effect of a patch corresponding to the second image;
    after the display effects of the multiple patches of the first virtual bounding volume have been changed, the terminal device obtains the association relationship between the multiple frames of images according to the multiple patches;
    the terminal device sends the multiple frames of images and the association relationship between the multiple frames of images to the server;
    the server obtains a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images;
    the server sends the three-dimensional model corresponding to the target object to the terminal device; and
    the terminal device displays the three-dimensional model corresponding to the target object.
  23. An electronic device, characterized by comprising: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device is caused to implement the method according to any one of claims 1 to 18, or the method according to any one of claims 19 to 21.
  24. A computer-readable storage medium, the computer-readable storage medium comprising a computer program, characterized in that, when the computer program is run on an electronic device, the electronic device is caused to implement the method according to any one of claims 1 to 18, or the method according to any one of claims 19 to 21.
PCT/CN2022/093934 2021-06-26 2022-05-19 Modeling method and related electronic device, and storage medium WO2022267781A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22827279.5A EP4343698A1 (en) 2021-06-26 2022-05-19 Modeling method and related electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110715044.2A CN115526925A (zh) 2021-06-26 2021-06-26 Modeling method and related electronic device, and storage medium
CN202110715044.2 2021-06-26

Publications (1)

Publication Number Publication Date
WO2022267781A1 true WO2022267781A1 (zh) 2022-12-29

Family

ID=84544082

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/093934 WO2022267781A1 (zh) 2021-06-26 2022-05-19 Modeling method and related electronic device, and storage medium

Country Status (3)

Country Link
EP (1) EP4343698A1 (zh)
CN (1) CN115526925A (zh)
WO (1) WO2022267781A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160335985A1 (en) * 2015-05-14 2016-11-17 Box, Inc. Rendering high bit depth grayscale images using gpu color spaces and acceleration
CN108108748A (zh) * 2017-12-08 2018-06-01 联想(北京)有限公司 一种信息处理方法及电子设备
CN109658507A (zh) * 2018-11-27 2019-04-19 联想(北京)有限公司 信息处理方法及装置、电子设备
CN110473292A (zh) * 2019-07-16 2019-11-19 江苏艾佳家居用品有限公司 一种三维场景中模型自动化加载布局方法

Also Published As

Publication number Publication date
EP4343698A1 (en) 2024-03-27
CN115526925A (zh) 2022-12-27

Similar Documents

Publication Publication Date Title
WO2021093793A1 (zh) Shooting method and electronic device
WO2021238325A1 (zh) Image processing method and apparatus
CN114205522B (zh) Telephoto shooting method and electronic device
WO2019134516A1 (zh) Panoramic image generation method and apparatus, storage medium, and electronic device
CN110288518B (zh) Image processing method and apparatus, terminal, and storage medium
WO2022042776A1 (zh) Shooting method and terminal
JP2022537614A (ja) Method, apparatus, and computer program for controlling multiple virtual characters
JP2021524957A (ja) Image processing method and apparatus, terminal, and computer program
WO2022007627A1 (zh) Method and apparatus for implementing image special effects, electronic device, and storage medium
CN113709355B (zh) Sliding zoom shooting method and electronic device
CN110290426B (zh) Method, apparatus, device, and storage medium for displaying resources
CN108776822B (zh) Target region detection method and apparatus, terminal, and storage medium
CN110796248A (zh) Data augmentation method, apparatus, device, and storage medium
CN105427369A (zh) Mobile terminal and method for generating its three-dimensional image
CN114640783B (zh) Photographing method and related device
EP4254938A1 (en) Electronic device and operation method therefor
CN114579016A (zh) Method for sharing an input device, electronic device, and system
CN115115679A (zh) Image registration method and related device
CN110956571A (zh) SLAM-based virtual-real fusion method and electronic device
WO2021185374A1 (zh) Image capturing method and electronic device
WO2021103919A1 (zh) Composition recommendation method and electronic device
CN114513689B (zh) Remote control method, electronic device, and system
CN113538227A (zh) Semantic-segmentation-based image processing method and related device
WO2022267781A1 (zh) Modeling method and related electronic device, and storage medium
WO2022257889A1 (zh) Display method and related apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22827279; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022827279; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2022827279; Country of ref document: EP; Effective date: 20231213)
NENP Non-entry into the national phase (Ref country code: DE)