WO2020215898A1 - Three-dimensional reconstruction method and device, system, model training method, and storage medium - Google Patents

Three-dimensional reconstruction method and device, system, model training method, and storage medium

Info

Publication number
WO2020215898A1
WO2020215898A1 · PCT/CN2020/077850 · CN2020077850W
Authority
WO
WIPO (PCT)
Prior art keywords
image
angle
dimensional
target object
target
Prior art date
Application number
PCT/CN2020/077850
Other languages
English (en)
French (fr)
Inventor
赵骥伯
Original Assignee
BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority to US 17/043,061 (published as US11403818B2)
Publication of WO2020215898A1

Classifications

    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/10 Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
    • G06T7/55 Depth or shape recovery from multiple images
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces
    • G06Q30/0643 Graphical representation of items or shoppers
    • G06T5/00 Image enhancement or restoration
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/60 Analysis of geometric attributes
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • A47G1/02 Mirrors used as equipment
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30196 Human being; Person
    • G06T2207/30244 Camera pose

Definitions

  • the embodiments of the present disclosure relate to a three-dimensional reconstruction method and device, system, model training method, and storage medium.
  • Some multi-function fitting mirrors can present a fitting effect without the user actually trying on the clothes, which makes fitting more convenient and saves fitting time.
  • the three-dimensional image is usually obtained by performing three-dimensional reconstruction processing on target images containing the target object that are captured by the camera at various shooting angles. Each shooting angle is located in an angle interval obtained by dividing the angle range [0, 360°).
  • to reconstruct the target object, it is necessary to obtain target images containing the target object at multiple shooting angles. For example, the camera shoots the target object at 15° intervals in clockwise or counterclockwise order, and the 24 frames of target images thus acquired are then processed for 3D reconstruction.
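The 15° sampling described above can be checked with simple arithmetic (a sketch; the variable names are illustrative):

```python
# Dividing the angle range [0, 360°) into 15° steps yields the
# 24 shooting angles / angle intervals mentioned above.
INTERVAL_WIDTH = 15
angles = list(range(0, 360, INTERVAL_WIDTH))  # 0°, 15°, ..., 345°
num_frames = len(angles)
```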
  • the inventor found that in the current three-dimensional reconstruction process, the target images obtained by the camera must be sorted by an algorithm to ensure that the angle intervals corresponding to adjacent target images are themselves adjacent; as a result, the amount of calculation during three-dimensional reconstruction processing is relatively large, and the efficiency of obtaining three-dimensional images is low.
  • Various embodiments of the present disclosure provide a three-dimensional reconstruction method, which includes:
  • Determining a shooting angle of the first image, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first camera captures the first image;
  • the determining the shooting angle of the target image includes:
  • the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image.
  • the method further includes:
  • Labeling the first image with a label, the label being used to mark the target object in the first image;
  • before performing three-dimensional reconstruction on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object, the method further includes:
  • the first image corresponding to each of the angle intervals is acquired.
  • the method further includes:
  • image quality scoring processing is performed on the multiple frames of first images to obtain an image quality score for each frame of the first image;
  • the method further includes:
  • the first image is modified to an image of a specified resolution, and the specified resolution is greater than or equal to the resolution threshold.
  • the three-dimensional reconstruction of the target object based on the target image of each angle interval to obtain the three-dimensional image of the target object includes:
  • if each of the multiple angle intervals has a corresponding target image, performing three-dimensional reconstruction on the target object to obtain a three-dimensional image of the target object.
  • the three-dimensional reconstruction of the target object based on the target image of each angle interval to obtain the three-dimensional image of the target object includes:
  • if the three-dimensional image is an incomplete three-dimensional image, repairing the incomplete three-dimensional image to obtain a repaired three-dimensional image.
  • Various embodiments of the present disclosure provide a three-dimensional reconstruction device, including:
  • a first acquiring module configured to acquire a first image acquired by a first camera, the first image including a target object
  • a first determining module configured to determine a shooting angle of the first image using an angle recognition model, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first camera captures the first image;
  • the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image;
  • the second determining module is configured to determine, based on the shooting angle of the first image, the angle interval corresponding to the first image among the multiple angle intervals included in the angle range [0, 360°), and to set the first image as a target image in that angle interval;
  • the three-dimensional reconstruction module is configured to perform three-dimensional reconstruction on the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object.
  • Various embodiments of the present disclosure provide a three-dimensional reconstruction system, including: a reconstruction server and a first camera, and the reconstruction server includes the above-mentioned three-dimensional reconstruction device.
  • the system further includes: a dressing mirror;
  • the dressing mirror is configured to send an acquisition request to the reconstruction server, and the acquisition request carries information of the target object;
  • the reconstruction server is configured to send an acquisition response to the dressing mirror based on the information of the target object, and the acquisition response carries the three-dimensional image of the target object.
  • a model training method configured to train an angle recognition model, and the method includes:
  • the shooting angle of the sample image is determined, where the shooting angle is used to characterize the shooting direction of the second camera relative to the sample object when the second camera captures the sample image;
  • the sample image is input to the deep learning model to obtain the predicted shooting angle of the sample image, and the classification accuracy of the shooting angle is determined according to the shooting angle and the predicted shooting angle of the sample image.
  • determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point includes:
  • the angle calculation formula is used to calculate the shooting angle of the sample image, where:
  • the three-dimensional coordinates of the first key point are (x1, y1, z1), and the three-dimensional coordinates of the second key point are (x2, y2, z2);
  • V1 represents the vector, in the XZ plane of the world coordinate system, of the line connecting the first key point and the second key point;
  • V2 represents a unit vector perpendicular to V1;
  • VZ represents a unit vector parallel to the Z axis of the world coordinate system;
  • the value produced by the formula is the shooting angle.
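The formula itself is not reproduced in this text. Given the variable descriptions above, one plausible form, stated here only as an assumption and not as the patent's verbatim formula, computes the shooting angle as the angle between V2 and VZ:

```latex
V_1 = \bigl(x_2 - x_1,\; 0,\; z_2 - z_1\bigr), \qquad
\alpha = \arccos\!\left(\frac{V_2 \cdot V_Z}{\lVert V_2 \rVert \, \lVert V_Z \rVert}\right)
```

Here α stands for the shooting angle; since V2 and VZ are described as unit vectors, the denominator equals 1.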
  • the training process further includes:
  • the shooting angle is corrected by using a correction calculation formula to obtain a corrected shooting angle, where:
  • θ1 is the shooting angle after correction;
  • θ2 is the shooting angle before correction.
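The correction formula is likewise not reproduced in this text. Because arccos yields values only in [0°, 180°], a correction is typically needed to cover the full range [0°, 360°); one form consistent with the variable descriptions, offered only as an assumption and not as the patent's verbatim formula, is:

```latex
\theta_1 = 360^{\circ} - \theta_2
```

applied when the relative position of the camera and the sample object (for example, the sign of a component of V1) indicates that the true angle lies in (180°, 360°).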
  • the training process before determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further includes:
  • the shooting angle of the sample image is a specified angle, and the specified angle is any one angle within a fixed range.
  • Various embodiments of the present disclosure provide a non-volatile computer-readable storage medium in which code instructions are stored, and the code instructions are executed by a processor to execute the above-mentioned three-dimensional reconstruction method.
  • Fig. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure
  • FIG. 2 is a block diagram of a model training system involved in a model training method according to an embodiment of the present disclosure
  • Fig. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure
  • Figure 4 is an effect diagram when the first camera shoots the target object
  • Fig. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
  • Fig. 6 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a training process according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of joint points of a sample object according to an embodiment of the present disclosure.
  • Fig. 9 is a block diagram of a three-dimensional reconstruction device according to an embodiment of the present disclosure.
  • Fig. 10 is a block diagram of a three-dimensional reconstruction device according to another embodiment of the present disclosure.
  • FIG. 11 is a block diagram of a three-dimensional reconstruction apparatus according to still another embodiment of the present disclosure.
  • FIG. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure.
  • the three-dimensional reconstruction system 100 may include: at least one first camera 101 and a reconstruction server 102.
  • the first camera 101 may be a surveillance camera such as an RGB camera or an infrared camera, and there are usually multiple first cameras 101.
  • the multiple first cameras 101 may be deployed at different locations in a mall or shop.
  • the reconstruction server 102 may be a server, or a server cluster composed of several servers, or a cloud computing service center, or a computer device.
  • the first camera 101 can establish a communication connection with the reconstruction server 102.
  • the three-dimensional reconstruction system 100 may further include: a dressing mirror 103.
  • the fitting mirror 103 can usually be deployed in a store such as a clothing store, and the fitting mirror 103 can provide users with virtual fitting services.
  • the fitting mirror 103 can establish a communication connection with the reconstruction server 102.
  • FIG. 2 is a block diagram of a model training system involved in a model training method provided by an embodiment of the present disclosure.
  • the model training system 200 may include: a second camera 201 and a training server 202.
  • the second camera 201 may be a depth camera or a binocular camera.
  • the second camera can acquire both a color map (also called an RGB map) and a depth map.
  • the pixel value of each pixel in the depth map is a depth value, and the depth value is used to indicate the distance of the corresponding pixel from the second camera.
  • the training server 202 may be a server, or a server cluster composed of several servers, or a cloud computing service center, or a computer device.
  • the second camera 201 may establish a communication connection with the training server 202 in a wired or wireless manner.
  • FIG. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure.
  • the three-dimensional reconstruction method is applicable to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1.
  • the three-dimensional reconstruction method may include:
  • Step S301 Obtain a first image collected by the first camera.
  • the first image contains the target object.
  • Step S302 Determine a shooting angle of the first image, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first image is captured.
  • an angle recognition model may be used to determine the shooting angle of the first image, and the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image.
  • Step S303 Based on the shooting angle, determine the angle interval corresponding to the first image among the multiple angle intervals included in the angle range [0, 360°), and set the first image as a target image of that angle interval.
  • the multiple angle intervals are obtained by dividing the angle range [0, 360°), and the angle values contained in each angle interval are different.
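The interval lookup in step S303 can be sketched as follows (the function name and equal-width intervals are illustrative; the method itself does not prescribe an implementation):

```python
def angle_interval(shooting_angle, interval_width=15.0):
    # Map a shooting angle in [0, 360°) to the index of its angle interval;
    # with 15° intervals there are 24 of them (indices 0..23).
    return int((shooting_angle % 360.0) // interval_width)
```

For example, a shooting angle of 10° falls into interval 0, i.e. [0°, 15°).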
  • Figure 4 is an effect diagram when the first camera shoots the target object.
  • the first camera 02 shoots the target object 01 from different directions, and the shooting direction of the first camera 02 relative to the target object 01 can be characterized by the shooting angle.
  • rotating counterclockwise around the target object 01, every 15° constitutes one angle interval.
  • each angle interval covers a range of shooting angles, so the shooting angle obtained in step S302 falls into exactly one of the multiple angle intervals.
  • Step S304 Perform three-dimensional reconstruction processing on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object.
  • the shooting angle of the first image can be determined by the angle recognition model, and based on the shooting angle, the angle interval corresponding to the first image among the multiple angle intervals can be determined. Therefore, when the first image is acquired, the shooting angle of the first image is also acquired.
  • when the target object is subsequently reconstructed in 3D, there is no need to use an additional algorithm to sort the multiple frames of first images; their sequence can be obtained directly from the shooting angles, which effectively reduces the amount of calculation during 3D reconstruction and improves the efficiency of acquiring 3D images.
  • in summary, the shooting angle of the first image can be determined through the angle recognition model, the angle interval corresponding to the first image among the multiple angle intervals can be determined based on the shooting angle, and the first image can be set as the target image of that angle interval; subsequently, three-dimensional reconstruction of the target object may be performed based on the target image in each angle interval to obtain a three-dimensional image of the target object.
  • because the shooting angle of each first image is acquired along with the image itself, no additional sorting algorithm is needed when the target object is subsequently reconstructed in 3D; the sequence of the multiple frames of first images follows directly from their shooting angles, which effectively reduces the amount of calculation during 3D reconstruction and improves the efficiency of acquiring 3D images.
  • FIG. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
  • the three-dimensional reconstruction method is applied to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1.
  • the three-dimensional reconstruction method may include:
  • Step S401 Acquire a first image collected by a first camera.
  • the first image is an image containing the target object collected by the first camera.
  • the target object can be a person, an animal, or an object. If the first camera is a surveillance camera and the first camera is deployed in a mall or store, the target object is a person in the mall or store.
  • when one frame of image collected by the first camera contains multiple objects, the reconstruction server may crop multiple frames of first images from that frame, and the target object in each frame of the first image is different.
  • Step S402 Determine whether the resolution of the first image is less than the resolution threshold.
  • if the resolution of the first image is less than the resolution threshold, step S403 is executed; if the resolution of the first image is not less than the resolution threshold, step S404 is executed.
  • the resolution threshold is 224 × 112.
  • Step S403 If the resolution of the first image is less than the resolution threshold, delete the first image.
  • the reconstruction server may determine whether the resolution of the first image is less than the resolution threshold, and if the reconstruction server determines that the resolution of the first image is less than the resolution threshold, delete the first image.
  • Step S404 If the resolution of the first image is not less than the resolution threshold, modify the first image to an image with a specified resolution.
  • the specified resolution is greater than or equal to the resolution threshold.
  • the first image may be modified to an image with the specified resolution. For example, if the resolution of the first image is greater than the specified resolution, the reconstruction server compresses the first image to the specified resolution; if the resolution of the first image is less than the specified resolution, the reconstruction server expands the first image to the specified resolution.
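The resolution handling of steps S402 to S404 can be sketched as a small decision function (a sketch with illustrative names; the 224 × 112 values are the example figures from the text):

```python
def resolution_action(width, height,
                      threshold=(224, 112), specified=(224, 112)):
    # Steps S402/S403: first images below the resolution threshold are deleted.
    if width < threshold[0] or height < threshold[1]:
        return "delete"
    # Step S404: otherwise compress or expand to the specified resolution.
    return "keep" if (width, height) == specified else "resize"
```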
  • Step S405 Mark the first image with a label, where the label is used to mark a target object in the first image.
  • a target recognition algorithm can be used to mark the first image.
  • the target recognition algorithm may be a pedestrian movement detection algorithm.
  • the reconstruction server can mark the first image with a label through the pedestrian movement detection algorithm, where the label is used to mark the target object in the first image.
  • the pedestrian movement detection algorithm may analyze at least one of the clothing feature, the face feature, and the morphological feature of the target object, so as to mark the first image with a label.
  • Step S406 Based on the tag, classify the first image into an image set.
  • first images with the same label contain the same target object. Therefore, the reconstruction server can classify first images containing the same target object into one image set based on their labels.
  • after step S406, the target objects in the first images within the same image set are the same, while the target objects in the first images of different image sets are different.
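Steps S405 and S406 amount to grouping first images by label; a minimal sketch (illustrative names):

```python
from collections import defaultdict

def build_image_sets(labeled_images):
    # labeled_images: iterable of (label, first_image) pairs, where a label
    # identifies one target object (e.g. from a pedestrian detection algorithm).
    image_sets = defaultdict(list)
    for label, image in labeled_images:
        image_sets[label].append(image)
    return dict(image_sets)  # one image set per target object
```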
  • Step S407 Use the angle recognition model to determine the shooting angle of the first image.
  • the reconstruction server may use an angle recognition model to determine the shooting angle of each frame of the first image.
  • the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image. The method for obtaining the angle recognition model will be introduced in the subsequent embodiments, and will not be repeated here.
  • the reconstruction server uses the angle recognition model to determine the shooting angle of the first image, which may include the following steps:
  • Step A1 Input the first image to the angle recognition model.
  • Step B1 Receive the angle information output by the angle recognition model.
  • Step C1 Determine the angle information as the shooting angle of the first image.
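Steps A1 to C1 reduce to a thin wrapper around the angle recognition model; the sketch below stands in for the trained model (whose architecture is not specified here) with any callable:

```python
def determine_shooting_angle(angle_model, first_image):
    # Step A1: input the first image to the angle recognition model.
    # Step B1: receive the angle information the model outputs.
    angle_info = angle_model(first_image)
    # Step C1: the angle information is the shooting angle, kept in [0, 360°).
    return float(angle_info) % 360.0
```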
  • Step S408 Determine, based on the shooting angle of the first image, the angle interval corresponding to the first image among the multiple angle intervals included in the angle range [0, 360°), and set the first image as a target image in that angle interval.
  • while the target object is walking, the reconstruction server can obtain, in real time, first images containing the target object captured by the first camera from different shooting angles. For example, if every 15° of clockwise or counterclockwise rotation around the target object is one angle interval, the number of angle intervals is 24.
  • the reconstruction server may determine the angle interval corresponding to the first image among multiple angle intervals based on the shooting angle of each frame of the first image, and set the first image as a target image of the angle interval.
  • for example, if the shooting angle of the first image is 10°, the angle interval corresponding to the first image is [0°, 15°), and the first image is set as the target image of the angle interval [0°, 15°); the target image is then used for three-dimensional reconstruction.
  • Step S409 Based on the image set, for each angle interval, determine whether there are more than two frames of first images in the same image set corresponding to the angle interval.
  • since the reconstruction server subsequently needs to refer to the target image corresponding to each angle interval to perform three-dimensional reconstruction of the target object, more than two frames of target images containing the same target object may correspond to one angle interval. If the reconstruction server directly referred to all of these first images when reconstructing the target object, the efficiency of the three-dimensional reconstruction could suffer. Therefore, based on the image sets, the reconstruction server can determine whether more than two frames of target images in the same image set correspond to an angle interval, that is, whether more than two target images containing the same target object correspond to that angle interval.
  • if more than two frames of target images in the same image set correspond to the angle interval, that is, more than two frames of target images containing the same target object correspond to the angle interval, go to step S410; otherwise, step S409 is repeatedly executed.
  • Step S410 If there are more than two frames of first images located in the same image set corresponding to the angle interval, perform image quality scoring processing on those first images to obtain the image quality score of each frame of the first image.
  • the reconstruction server may use an image quality scoring algorithm to score the more than two frames of first images corresponding to the angle interval, obtaining a quality score for each frame of the first image.
  • Step S411 Keep the first image with the highest image quality score, set the first image with the highest image quality score as the target image in the angle interval, and delete other first images.
  • the higher the quality score, the higher the definition of the first image.
  • the clearer the first image set as the target image of the corresponding angle interval, the better the quality of the three-dimensional image obtained in the subsequent three-dimensional reconstruction. Therefore, the reconstruction server can retain the first image with the highest image quality score and delete the other first images, ensuring that each angle interval corresponds to only one frame of first image with higher definition. This effectively improves the imaging quality of the three-dimensional image obtained in the subsequent 3D reconstruction of the target object, and reduces the number of frames of first images that need to be processed during three-dimensional reconstruction, thereby improving the efficiency of three-dimensional reconstruction of the target object.
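Steps S409 to S411 keep one best frame per angle interval; a sketch with a caller-supplied scoring function (the scoring algorithm itself is not specified in the text, and the names are illustrative):

```python
def keep_best_frames(frames_by_interval, quality_score):
    # frames_by_interval: dict mapping interval index -> list of first images
    # from one image set; quality_score: higher score = higher definition.
    # All but the highest-scored frame in each interval are dropped.
    return {interval: max(frames, key=quality_score)
            for interval, frames in frames_by_interval.items()}
```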
  • Step S412 Perform three-dimensional reconstruction on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object.
  • the three-dimensional reconstruction method may further include: obtaining, based on the image sets, the first images containing the same target object that correspond to each angle interval. It should also be noted that after obtaining the three-dimensional image of the target object, the reconstruction server may store the three-dimensional image in its memory.
  • the reconstruction server can perform three-dimensional reconstruction on the target object that meets the three-dimensional reconstruction conditions.
  • the embodiments of the present disclosure are schematically illustrated in the following two optional implementation manners.
  • in a first implementation manner, when the reconstruction server determines that each of the multiple angle intervals has a corresponding target image, the target object satisfies the three-dimensional reconstruction condition.
  • in this case, step S412 may include: if each of the multiple angle intervals has a target image, performing three-dimensional reconstruction on the target object based on the target image of each angle interval to obtain a three-dimensional image of the target object.
  • the reconstruction server may use a structure-from-motion (SfM) algorithm to perform three-dimensional reconstruction on the target object to obtain a three-dimensional image of the target object.
  • the process of the reconstruction server determining that the target image containing the same target object corresponds to each of the multiple angle intervals may include the following steps:
  • Step A2 For each image set, obtain the angle interval corresponding to each frame of the first image in the image set.
  • the reconstruction server can determine the angle interval corresponding to each frame of the first image in each image set. In the same image set, each angle interval corresponds to only one frame of the first image, and that first image is set as the target image of the angle interval. Therefore, for each image set, the reconstruction server can obtain in real time the angle interval corresponding to each frame of the first image in the image set.
  • Step B2 Determine whether the number of angle intervals corresponding to all target images is the same as the number of multiple angle intervals.
  • If the number of angle intervals corresponding to all target images is the same as the number of the multiple angle intervals, the reconstruction server can determine that each of the multiple angle intervals has a target image, that is, execute Step C2; if the number of angle intervals corresponding to all target images is different from the number of the multiple angle intervals, the reconstruction server can determine that at least one of the multiple angle intervals does not have a target image, and repeat Step A2.
  • Step C2 If the number of angle intervals corresponding to all target images is the same as the number of the multiple angle intervals, it is determined that each of the multiple angle intervals has a target image.
  • After determining that the target object has a target image in each of the multiple angle intervals, the reconstruction server may determine that the target object meets the three-dimensional reconstruction conditions.
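Steps A2 to C2 above can be sketched as follows. The number of angle intervals (six here, i.e. 60° each over [0, 360)) is an assumed configuration value; the disclosure only requires "multiple angle intervals":

```python
NUM_INTERVALS = 6  # assumed value for illustration

def meets_condition(image_set, num_intervals=NUM_INTERVALS):
    """Decide whether a target object meets the 3D reconstruction condition.

    image_set: list of (frame_id, interval_index) pairs, where each angle
    interval corresponds to at most one first image (Step A2).
    Returns True when every one of the multiple angle intervals has a
    target image (Steps B2/C2), i.e. the number of covered intervals
    equals the number of intervals.
    """
    covered = {interval for _, interval in image_set}
    return len(covered) == num_intervals
```

When `meets_condition` returns False, at least one interval lacks a target image and Step A2 is repeated as the camera keeps collecting frames.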
  • when the reconstruction server receives a three-dimensional reconstruction instruction carrying the information of the target object, the target object meets the three-dimensional reconstruction condition.
  • step S412 may include the following steps:
  • Step A3 When the three-dimensional reconstruction instruction is received, based on the information of the target object carried by the three-dimensional reconstruction instruction, obtain multiple frames of first images containing the target object.
  • the three-dimensional reconstruction system may further include: a dressing mirror, and the three-dimensional reconstruction instruction may be an instruction sent by the dressing mirror.
  • when the reconstruction server receives the three-dimensional reconstruction instruction carrying the information of the target object, the reconstruction server may obtain multiple frames of first images containing the target object based on the information of the target object.
  • the information about the target object may include at least one of clothing features, facial features, and morphological features. When the reconstruction server obtains a first image, it also analyzes at least one of the clothing features, facial features, and morphological features in that image; therefore, the reconstruction server can obtain, based on the information of the target object, multiple frames of first images containing the target object.
  • Step B3 Based on the multiple frames of first images, determine the angle interval corresponding to each frame of first image, set each frame of first image as the target image of its corresponding angle interval, and perform three-dimensional reconstruction of the target object corresponding to the first images to obtain a three-dimensional image of the target object.
  • the reconstruction server may use the SFM algorithm to perform three-dimensional reconstruction of the target object based on the first image containing the target object in each frame to obtain the three-dimensional image of the target object.
  • Step C3 Determine whether the three-dimensional image is an incomplete three-dimensional image.
  • Because the number of frames of first images on which the reconstruction server bases the reconstruction may be small, that is, at least one of the multiple angle intervals may lack a first image containing the target object, the three-dimensional image obtained after three-dimensional reconstruction may be an incomplete three-dimensional image; for example, the three-dimensional image may contain holes at certain angles.
  • the reconstruction server can determine whether the three-dimensional image is an incomplete three-dimensional image. If the three-dimensional image is an incomplete three-dimensional image, perform step D3; if the three-dimensional image is a complete three-dimensional image, the action ends.
  • Step D3 If the three-dimensional image is an incomplete three-dimensional image, repair the incomplete three-dimensional image to obtain a repaired three-dimensional image.
  • in order to obtain a three-dimensional image with higher image quality, after determining that the three-dimensional image is an incomplete three-dimensional image, the reconstruction server needs to repair the incomplete three-dimensional image.
  • the reconstruction server can repair the three-dimensional image according to the regular structure of the three-dimensional human body.
  • step S407 can be performed first, and then steps S405 to S406 can be performed. Steps can also be added or removed as appropriate.
  • In the three-dimensional reconstruction method provided by the embodiments of the present disclosure, the shooting angle of the first image can be determined through the angle recognition model; based on the shooting angle, the angle interval of the first image can be determined among multiple angle intervals, and the first image can be set as the target image of that angle interval; then, based on the target image corresponding to each angle interval, three-dimensional reconstruction processing may be performed on the target object to obtain a three-dimensional image of the target object.
  • Since the shooting angle of the first image is also acquired when the first image is acquired, when the target object is subsequently reconstructed in three dimensions, there is no need to use additional algorithms to sort the multiple frames of target images; the sequence of the multiple frames of first images can be obtained directly based on the shooting angles, which effectively reduces the amount of calculation during three-dimensional reconstruction and improves the efficiency of acquiring three-dimensional images.
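The point above — that the shooting angle both selects the interval and orders the frames without any extra matching algorithm — can be sketched as follows. The 60° interval width is an assumption for illustration:

```python
def angle_interval(shooting_angle, num_intervals=6):
    """Map a shooting angle in [0, 360) to its angle-interval index."""
    return int((shooting_angle % 360) // (360 / num_intervals))

def order_frames(frames):
    """frames: list of (image_id, shooting_angle) pairs.

    Because each frame already carries its shooting angle, the sequence
    needed for reconstruction is obtained by a plain sort on that angle,
    with no pairwise image matching required.
    """
    return [img for img, _ in sorted(frames, key=lambda f: f[1])]
```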
  • FIG. 6 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
  • the three-dimensional reconstruction method is applicable to the three-dimensional reconstruction system 100 shown in FIG. 1, and the three-dimensional reconstruction method may include:
  • Step S501 The first camera collects an image.
  • the first camera may be a surveillance camera; there may be multiple first cameras, which may be deployed at different locations in a mall or a store.
  • Step S502 The first camera sends the collected image to the reconstruction server.
  • the first camera may send the real-time collected images to the reconstruction server, so that the reconstruction server can perform three-dimensional reconstruction of the target object.
  • Step S503 The reconstruction server performs three-dimensional reconstruction on the target object based on the image collected by the first camera to obtain a three-dimensional image of the target object.
  • For the process by which the reconstruction server performs three-dimensional reconstruction of the target object based on the images collected by the first camera to obtain the three-dimensional image of the target object, please refer to the related content in the aforementioned steps S401 to S412, which will not be repeated here.
  • Step S504 The dressing mirror sends an acquisition request to the reconstruction server, and the acquisition request carries information of the target object.
  • the target object is a person standing in front of the dressing mirror.
  • the dressing mirror needs to obtain the three-dimensional image of the target object from the reconstruction server.
  • the dressing mirror may be provided with a dressing mirror camera.
  • the dressing mirror camera can collect information of the target object located in front of the dressing mirror, and send an acquisition request to the reconstruction server, and the acquisition request carries the information of the target object.
  • Step S505 The reconstruction server sends an acquisition response to the dressing mirror based on the information of the target object, and the acquisition response carries a three-dimensional image of the target object.
  • the reconstruction server may first determine whether the three-dimensional image of the target object is stored.
  • the target object information may be face information
  • when the reconstruction server obtains the three-dimensional image of the target object, it also obtains the face information of the target object. Therefore, the reconstruction server can determine, based on the face information, whether it stores a three-dimensional image of the target object.
  • If the reconstruction server determines that it stores a three-dimensional image of the target object, it may send an acquisition response carrying the three-dimensional image of the target object to the dressing mirror.
  • If the reconstruction server does not store the three-dimensional image of the target object, it may send a response to the dressing mirror indicating that the three-dimensional image of the target object is not stored in the reconstruction server; based on that response, the dressing mirror may send the reconstruction server a three-dimensional reconstruction instruction carrying the information of the target object; the reconstruction server may then, based on the three-dimensional reconstruction instruction, perform three-dimensional reconstruction on the target object and send the dressing mirror an acquisition response carrying the three-dimensional image of the target object.
  • For the process by which the reconstruction server performs three-dimensional reconstruction of the target object based on the three-dimensional reconstruction instruction, refer to the corresponding process in step S412 above, which will not be repeated here.
  • Step S506 The dressing mirror provides a virtual fitting service to the target object based on the acquisition response.
  • the dressing mirror may provide a virtual fitting service to the target object based on the acquisition response.
  • The image quality of the three-dimensional image of the target object carried in the acquisition response may be poor. For example, when the reconstruction server acquires the three-dimensional image based on only a small number of frames containing the target object, the image quality of the resulting three-dimensional image is poor. Therefore, the dressing mirror can analyze the image quality of the three-dimensional image of the target object carried in the response to determine whether it will affect the provision of the virtual fitting service to the target object. If there is an impact, the dressing mirror sends out a voice message prompting the target object to rotate in a circle, so that images of the target object at different shooting angles can be re-acquired through the dressing mirror camera, and the reconstruction server can rebuild the target object in three dimensions based on each frame of image, so that the obtained three-dimensional image of the target object has higher imaging quality.
  • Because the first camera is arranged in a store or a shopping mall, the reconstruction server can obtain in real time images collected by the first camera that contain the user at different shooting angles, and can directly perform three-dimensional reconstruction once the three-dimensional reconstruction conditions are satisfied, then send the obtained three-dimensional image to the dressing mirror.
  • When the user uses the dressing mirror, there is no need to rotate in a circle in front of the dressing mirror or wait for three-dimensional reconstruction; a three-dimensional image of the user can be obtained directly, which improves the user experience.
  • In the three-dimensional reconstruction method provided by the embodiments of the present disclosure, the shooting angle of an image can be determined through the angle recognition model; based on the shooting angle, the angle interval corresponding to the image can be determined among a plurality of angle intervals; and three-dimensional reconstruction can subsequently be performed on the images of the same target object corresponding to each angle interval to obtain the three-dimensional image of the target object.
  • Since the shooting angle of an image is acquired at the same time as the image itself, the order of the multiple frames of images can be obtained directly based on the shooting angles, which effectively reduces the amount of calculation in three-dimensional reconstruction and improves the efficiency of acquiring three-dimensional images.
  • When the user uses the dressing mirror, the user does not need to rotate in a circle in front of the dressing mirror or wait for three-dimensional reconstruction, and can directly obtain a three-dimensional image of the user, which improves the user experience.
  • the embodiment of the present disclosure also provides a model training method, which is used to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6.
  • This model training method is applied to the training server 202 in the model training system 200 shown in FIG. 2.
  • the model training method may include:
  • FIG. 7 is a flowchart of a training process according to at least one embodiment of the present disclosure.
  • the training process can include:
  • Step S601 Obtain a sample image containing a sample object and a depth map corresponding to the sample image collected by the second camera.
  • the sample object may be a person, an animal, or an object
  • the training server may use a second camera to collect a sample image containing the sample object and a depth map corresponding to the sample image.
  • the second camera may be a camera including a depth camera, or a binocular camera.
  • the second camera may be a device with a depth camera, such as a Kinect device. It should be noted that the second camera can simultaneously collect the depth map and the color map. Therefore, after the sample object is captured by the second camera, the training server can simultaneously obtain the sample image containing the sample object collected by the second camera and the depth map corresponding to the sample image.
  • Because the color map and depth map obtained by the second camera after shooting the sample object contain not only the sample object but also other background content, in order to facilitate subsequent image processing, after the second camera shoots the sample object, the training server also needs to crop the acquired depth map and color map, so that the cropped sample image and its corresponding depth map contain only the sample object.
  • Step S602 Obtain the first key point and the second key point of the sample object from the depth map.
  • the first key point and the second key point of the sample object may be two shoulder joint points of the person, respectively.
  • the Kinect device can collect all the joint points of the sample object. For example, as shown in FIG. 8, the Kinect device can collect 14 joint points of the sample object.
  • the training server can obtain two shoulder joint points a and b of the sample object in the depth map.
  • Step S603 Determine the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point.
  • the shooting angle is used to characterize the direction when the second camera shoots the sample object.
  • the angle between the direction perpendicular to the line connecting the first key point and the second key point and the Z-axis direction in the world coordinate system can be determined as the shooting angle of the sample image.
  • the Z-axis direction in the world coordinate system is usually parallel to the optical axis direction of the second camera.
  • the training server may determine the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point.
  • the X-axis and Y-axis components of the key point's three-dimensional coordinates can determine the position of the key point in the depth map, and the key point's three-dimensional coordinate component on the Z axis can determine the depth value of the key point. It should be noted that after obtaining the sample image and its corresponding depth map, the training server can determine the three-dimensional coordinates of any point in the depth map.
  • determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point may include: calculating the shooting angle of the sample image by using an angle calculation formula, and The angle calculation formula is:
  • where the three-dimensional coordinates of the first key point are (x1, y1, z1) and the three-dimensional coordinates of the second key point are (x2, y2, z2);
  • V1 represents the vector, in the XZ plane of the world coordinate system, along the line connecting the first key point and the second key point;
  • V2 represents a unit vector perpendicular to V1;
  • VZ represents a unit vector parallel to the Z axis in the world coordinate system;
  • θ represents the shooting angle.
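Although the angle calculation formula itself is not reproduced here, the symbol definitions above admit the following sketch: V1 is the keypoint-connecting vector projected into the XZ plane, V2 a unit vector perpendicular to V1 in that plane, and VZ the unit Z-axis vector. The arccos form below is a plausible reading (an assumption, not the patent's verbatim formula); it yields an angle in [0°, 180°], which the later back-facing correction extends to the full circle:

```python
import numpy as np

def shooting_angle(p1, p2):
    """p1, p2: (x, y, z) coordinates of the two shoulder joint points.

    Returns the angle, in degrees, between V2 (perpendicular to the
    shoulder line in the XZ plane) and the Z axis.
    """
    x1, _, z1 = p1
    x2, _, z2 = p2
    v1 = np.array([x2 - x1, z2 - z1])   # shoulder line projected to XZ
    v2 = np.array([-v1[1], v1[0]])      # perpendicular to V1 in XZ plane
    v2 = v2 / np.linalg.norm(v2)
    vz = np.array([0.0, 1.0])           # unit vector along the Z axis
    cos_theta = np.clip(np.dot(v2, vz), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))
```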
  • If the training server determines the shooting angle of the sample image based only on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, there may be two frames of sample images that have the same shooting angle but were taken with the second camera in different shooting directions.
  • For example, the shooting angle when the second camera shoots the sample object in the current shooting direction is the same as the shooting angle when the second camera shoots the sample object after the shooting direction is rotated by 180°. Therefore, in order to distinguish two sample images with the same shooting angle but different shooting directions of the second camera, after step S603, the training process may further include:
  • Step A4 Based on the sample image, determine whether the orientation posture of the sample object relative to the second camera is the back-facing posture.
  • the training server may determine whether the orientation posture of the sample object relative to the second camera is the back orientation posture or the forward orientation posture based on the sample image.
  • the training server determines that the orientation posture of the sample object relative to the second camera is the back-facing posture, the shooting angle of the sample object needs to be corrected, and step B4 is executed; when the training server determines the orientation of the sample object relative to the second camera When the posture is forward facing posture, there is no need to correct the shooting angle of the sample object.
  • Step B4 When the orientation posture of the sample object relative to the second camera is the back-facing posture, a correction calculation formula is used to correct the shooting angle to obtain the corrected shooting angle.
  • the correction calculation formula is:
  • θ1 = θ2 + 180°; where θ1 is the shooting angle after correction, and θ2 is the shooting angle before correction.
  • In this way, when the training server determines that the orientation posture of the sample object relative to the second camera is the back-facing posture, the shooting angle of the sample image is corrected, so that the shooting angles of any two sample images captured by the second camera in different shooting directions are also different.
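The back-facing correction can be sketched as below. A modulo is added as an assumption so the corrected angle stays within [0°, 360°); the formula as stated gives only θ1 = θ2 + 180°:

```python
def correct_angle(theta2, back_facing):
    """Apply the back-facing correction to a raw shooting angle.

    theta2: shooting angle before correction, in degrees.
    back_facing: True when the sample object faces away from the camera.
    """
    return (theta2 + 180.0) % 360.0 if back_facing else theta2
```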
  • In some cases (for example, when the first key point and the second key point are close together), the accuracy with which the training server determines the shooting angle of the sample image based on their three-dimensional coordinates is low. Therefore, in order to improve the accuracy of the shooting angle of the sample image, before step S603, the training process may also include:
  • Step A5 Determine whether the distance between the first key point and the second key point is less than a distance threshold.
  • the first key point and the second key point in the sample object may be the two shoulder joint points a and b of the person, and the distance threshold It can be the distance between the head joint point c and the neck joint point d.
  • the training server can calculate the distance between the two shoulder joint points a and b of the sample object in the depth map and compare it with the distance threshold (that is, the distance between the head joint point c and the neck joint point d) to determine whether the distance between the first key point and the second key point is less than the distance threshold. If the distance between the first key point and the second key point is less than the distance threshold, step B5 is executed; if the distance between the first key point and the second key point is not less than the distance threshold, the above step S603 is executed.
  • Step B5 If the distance between the first key point and the second key point is less than the distance threshold, it is determined that the shooting angle of the sample image is the specified angle.
  • the specified angle is any angle within the angle interval of the fixed range.
  • the training server may determine the shooting angle of the sample image as the specified angle.
  • the specified angle may be 90° or 270°.
  • the training server needs to determine the orientation posture of the sample object relative to the second camera based on the sample image, and determine, based on that orientation posture, whether the shooting angle of the sample image is 90° or 270°.
  • the orientation posture of the sample object relative to the second camera may further include: a rightward orientation posture and a leftward orientation posture. When the orientation posture of the sample object relative to the second camera is rightward orientation posture, the shooting angle of the sample image is 90°; when the orientation posture of the sample object relative to the second camera is leftward orientation posture, the sample The shooting angle of the image is 270°.
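Steps A5 and B5 can be sketched as follows: when the two shoulder joints nearly overlap in the depth map (a side view), the angle estimate from their coordinates is unreliable, so a specified angle of 90° or 270° is assigned from the left/right orientation posture. The `facing_right` flag is an assumed input standing in for the posture determination:

```python
import math

def resolve_side_view(shoulder_a, shoulder_b, head, neck, facing_right):
    """Return a specified angle for near-side views, else None.

    shoulder_a, shoulder_b, head, neck: (x, y, z) joint coordinates.
    The distance threshold is the head-to-neck joint distance (Step A5).
    """
    dist = math.dist(shoulder_a, shoulder_b)
    threshold = math.dist(head, neck)
    if dist < threshold:
        # Step B5: side view, assign the specified angle
        return 90.0 if facing_right else 270.0
    return None  # fall through to the normal angle computation (S603)
```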
  • Step S604 Input the sample image to the deep learning model to obtain the predicted shooting angle of the sample image, and determine the classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.
  • the deep learning model can learn the correspondence between the sample image and the shooting angle. After the deep learning model has learned the correspondence between the sample image and the shooting angle, the predicted shooting angle of the sample image can be obtained, and the classification accuracy of the shooting angle is determined according to the shooting angle and the predicted shooting angle of the sample image. The training server can then determine whether the classification accuracy is greater than a set threshold: when the classification accuracy is greater than or equal to the set threshold, the training on the sample image ends, and new sample images can then be input to the deep learning model; when the classification accuracy is less than the set threshold, step S604 is repeated to input the sample image to the deep learning model again.
  • the angle recognition model can be obtained by performing the training process of steps S601 to S604 multiple times, until the accuracy with which the angle recognition model classifies the shooting angles of the sample images in the training image set reaches the set threshold.
  • the loss value LOSS of the loss function can be determined according to the shooting angle and the predicted shooting angle of the sample image, and the loss value of the loss function can be determined by the following calculation formula:
  • where a represents the predicted shooting angle of the sample image, CE represents cross entropy, and MSE represents mean square error.
  • the other parameter configuration of the deep learning model is as follows: the resolution of the input sample image is 224 ⁇ 112, the optimizer used is the Adam optimizer, and the number of iterations is 50 times. Among them, one iteration means that the deep learning model learns the correspondence between the sample image and the shooting angle of the sample image once.
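A minimal sketch of a loss consistent with the description above: a cross-entropy term over the predicted angle-class distribution plus a mean-square-error term between the predicted and true angle. How the two terms are weighted and how angles map to classes are assumptions, since the formula itself is not reproduced here:

```python
import numpy as np

def combined_loss(class_probs, true_class, pred_angle, true_angle):
    """LOSS = CE + MSE, per the description of the loss function.

    class_probs: predicted probability for each angle class (sums to 1).
    true_class: index of the ground-truth angle class.
    pred_angle, true_angle: predicted and ground-truth angles in degrees.
    """
    ce = -np.log(class_probs[true_class] + 1e-12)  # cross entropy
    mse = (pred_angle - true_angle) ** 2           # mean square error
    return float(ce + mse)
```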
  • the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5 or FIG. 6 can be obtained through the above steps.
  • the angle recognition model can output the shooting angle of the target image.
  • At least one embodiment of the present disclosure provides a three-dimensional reconstruction device, including: a processor; and
  • a memory stores program code executable by the processor, and when the program code is executed by the processor, the processor is configured to:
  • an angle recognition model is used to determine the shooting angle of the first image, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first camera shoots the first image, and the angle recognition model is a model obtained by learning and training on sample images and the shooting angles of the sample images;
  • FIG. 9 shows a block diagram of a three-dimensional reconstruction device according to an embodiment of the present disclosure.
  • the three-dimensional reconstruction apparatus 700 can be integrated in the reconstruction server 102 in the three-dimensional reconstruction system 100 as shown in FIG. 1.
  • the three-dimensional reconstruction device 700 may include:
  • the first acquisition module 701 is configured to acquire a target image collected by a first camera, where the target image is an image containing the target object.
  • the first determining module 702 is configured to determine the shooting angle of the target image using an angle recognition model, where the shooting angle is used to characterize the shooting direction when the first camera shoots the target image, and the angle recognition model is a model trained by learning sample images and the shooting angles of the sample images.
  • the second determining module 703 is configured to determine an angle interval corresponding to the target image in a plurality of angle intervals included in the angle range [0, 360°) based on the shooting angle.
  • the three-dimensional reconstruction module 704 is configured to perform three-dimensional reconstruction processing on the target object based on the target image containing the target object corresponding to each angle interval to obtain a three-dimensional image of the target object.
  • the first determining module 702 is configured to: input a target image to the angle recognition model; receive angle information output by the angle recognition model; and determine the angle information as a shooting angle.
  • Fig. 10 is a block diagram of a three-dimensional reconstruction apparatus according to another embodiment of the present disclosure.
  • the three-dimensional reconstruction device 700 may further include:
  • the marking module 705 is configured to mark the target image to obtain a label of the target image, and the label is used to mark the target object in the target image.
  • the marking module 705 uses a target recognition algorithm to mark the target image.
  • the classification module 706 is configured to classify the target image containing the target object into an image set based on the tag of the target image.
  • the second acquisition module 707 is configured to acquire a target image corresponding to each angle interval and containing the target object based on the image collection.
  • the three-dimensional reconstruction apparatus 700 may further include:
  • the first judgment module 708 is configured to determine, based on the image set, whether each angle interval corresponds to more than two target images located in the image set.
  • the scoring module 709 is configured to, if there are more than two frames of target images in the same image set corresponding to the angle interval, perform image quality scoring processing on the target images of more than two frames in the image set to obtain each frame of target image The image quality rating of.
  • the first deletion module 710 is configured to retain the target image with the highest image quality score and delete other target images.
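The behavior of modules 708 to 710 can be sketched as follows: when an angle interval in an image set holds more than one target image, keep only the frame with the highest image-quality score and discard the rest. The scoring function itself is assumed to exist elsewhere:

```python
def keep_best_per_interval(scored_images):
    """scored_images: list of (interval_index, image_id, quality_score).

    Returns a dict mapping each interval index to the image id with the
    highest image-quality score in that interval; all other target images
    for the interval are effectively deleted.
    """
    best = {}
    for interval, image_id, score in scored_images:
        if interval not in best or score > best[interval][1]:
            best[interval] = (image_id, score)
    return {k: v[0] for k, v in best.items()}
```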
  • Fig. 11 is a block diagram of a three-dimensional reconstruction device according to another embodiment of the present disclosure.
  • the three-dimensional reconstruction device 700 may further include:
  • the second judgment module 711 is configured to determine whether the resolution of the target image is less than the resolution threshold.
  • the second deleting module 712 is configured to delete the target image if the resolution of the target image is less than the resolution threshold.
  • the modification module 713 is configured to modify the target image to an image of a specified resolution if the resolution of the target image is not less than the resolution threshold, and the specified resolution is greater than or equal to the resolution threshold.
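Modules 711 to 713 can be sketched as below: target images below the resolution threshold are deleted, and the remainder are modified to a specified resolution. The actual resize is delegated to a hypothetical `resize` callable, and the threshold and specified-resolution values are assumptions for illustration:

```python
def filter_and_normalize(images, resize, threshold=(224, 112),
                         specified=(224, 112)):
    """images: list of (image_id, (width, height)) pairs.

    Deletes images whose resolution is below the threshold, and returns
    the rest resized to the specified resolution via resize(image_id, size),
    where the specified resolution is >= the threshold.
    """
    kept = []
    for image_id, (w, h) in images:
        if w < threshold[0] or h < threshold[1]:
            continue  # resolution below threshold: delete the target image
        kept.append(resize(image_id, specified))
    return kept
```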
  • the three-dimensional reconstruction module 704 is configured to: if each of the multiple angle intervals corresponds to a target image containing the target object, based on the target image corresponding to each angle interval , Performing three-dimensional reconstruction on the target object to obtain a three-dimensional image of the target object.
  • the 3D reconstruction module 704 is configured to: when a 3D reconstruction instruction is received, obtain, based on the information of the target object to be reconstructed carried in the 3D reconstruction instruction, multiple frames of target images containing the target object to be reconstructed; based on each frame of target image containing the target object to be reconstructed, perform three-dimensional reconstruction of the target object to be reconstructed to obtain a three-dimensional image of the target object to be reconstructed; determine whether the three-dimensional image is an incomplete three-dimensional image; and if the three-dimensional image is an incomplete three-dimensional image, repair the incomplete three-dimensional image to obtain the repaired three-dimensional image.
  • the shooting angle of the target image can be determined through the angle recognition model, and based on the shooting angle, the angle interval corresponding to the target image can be determined in multiple angle intervals, Subsequently, a three-dimensional reconstruction process may be performed on the target object based on the target image corresponding to each angle interval and containing the target object to obtain a three-dimensional image of the target object.
  • Since the shooting angle of the target image is also obtained when the target image is acquired, when the target object is subsequently reconstructed in three dimensions, no additional algorithm is needed to sort the multiple frames of target images; the order of the multiple frames of target images can be obtained directly based on the shooting angles, which effectively reduces the amount of calculation during three-dimensional reconstruction and improves the efficiency of acquiring three-dimensional images.
  • the embodiment of the present disclosure also provides a model training device.
  • the model training device can be integrated in the training server 202 in the model training system 200 shown in FIG. 2.
  • the model training device is configured to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5 or FIG. 6.
  • the model training device may include:
  • the training module is configured to perform the training process multiple times until the classification accuracy of the angle recognition model for the shooting angles of the sample images in the training image set reaches a set threshold.
  • the training process can include:
  • determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the second key point includes:
  • the angle calculation formula is used to calculate the shooting angle of the sample image.
  • the angle calculation formula is: V1 = (x2 − x1, 0, z2 − z1), α = arccos(V2·VZ / (|V2|·|VZ|));
  • the three-dimensional coordinates of the first key point are (x1, y1, z1), and the three-dimensional coordinates of the second key point are (x2, y2, z2);
  • V1 denotes the vector, in the XZ plane of the world coordinate system, of the line connecting the first key point and the second key point;
  • V2 denotes a unit vector perpendicular to V1;
  • VZ denotes a unit vector parallel to the Z axis of the world coordinate system;
  • α denotes the shooting angle.
  • the training process further includes:
  • the training process before determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the second key point, the training process further includes:
  • the shooting angle of the sample image is determined to be a specified angle, the specified angle being any angle within an angle interval of a fixed range.
  • At least one embodiment of the present disclosure also provides a three-dimensional reconstruction system, which may include: a reconstruction server and a first camera.
  • the structure of the three-dimensional reconstruction system can refer to the structure shown in the three-dimensional reconstruction system shown in FIG. 1.
  • the reconstruction server may include: the three-dimensional reconstruction apparatus 700 shown in FIG. 9, FIG. 10 or FIG. 11.
  • the three-dimensional reconstruction system includes: a dressing mirror.
  • the dressing mirror is configured to send an acquisition request to the reconstruction server when the target object is detected, the acquisition request being used to request a three-dimensional image of the target object from the reconstruction server and carrying the information of the target object;
  • the reconstruction server is configured to send an acquisition response to the dressing mirror based on the information of the target object, and the acquisition response carries a three-dimensional image of the target object.
  • At least one embodiment of the present disclosure also provides a model training system.
  • the model training system may include a training server and a second camera.
  • the structure of the model training system can refer to the structure shown in the model training system shown in FIG. 2.
  • the training server may include: the training module shown in the foregoing embodiment.
  • At least one embodiment of the present disclosure also provides a non-volatile computer-readable storage medium in which code instructions are stored, the code instructions being executed by a processor to perform the three-dimensional reconstruction method shown in the above embodiments, for example, the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6.
  • At least one embodiment of the present disclosure also provides a computer-readable storage medium.
  • the storage medium is a non-volatile storage medium.
  • the storage medium stores code instructions, and the code instructions are executed by a processor to perform the model training method shown in the foregoing embodiments, for example, the training process shown in FIG. 7.
  • the terms "first" and "second" are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance.
  • the term "plurality" refers to two or more, unless specifically defined otherwise.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk, or an optical disk.

Abstract

一种三维重建方法,包括:获取第一摄像机采集的第一图像,所述第一图像为包含目标对象的图像;确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向;基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中,确定所述第一图像对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像;以及基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。还公开了一种三维重建装置、三维重建系统、模型训练方法以及存储介质。

Description

三维重建方法及装置、系统、模型训练方法、存储介质
本公开要求于2019年4月24日提交的申请号为201910333474.0、名称为“三维重建方法及装置、系统、模型训练方法、存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本公开中。
技术领域
本公开的实施例涉及一种三维重建方法及装置、系统、模型训练方法、以及存储介质。
背景技术
一些具有多种功能的试衣镜可以在用户不实际试穿衣服的情况下提供试穿效果,提高了试衣的便利性,节省了试衣的时间。
在使用试衣镜之前,需要获取目标对象(通常为使用试衣镜的用户)的三维图像,该三维图像通常是通过对摄像头在各个拍摄角度获取的包含目标对象的目标图像进行三维重建处理而得到的。每个拍摄角度位于对角度范围[0,360°)进行分割得到的一个角度区间内。为了得到质量较高的三维图像,需要在多个拍摄角度分别获取包含目标对象的目标图像,例如,采用摄像头按照顺时针顺序或逆时针顺序每间隔15°对目标对象进行拍摄,然后再将获取的24帧目标图像进行三维重建处理。
在实现本公开的过程中,发明人发现目前在三维重建过程中,需要通过算法对摄像头获取的目标图像进行排序,保证相邻的两帧目标图像对应的两个角度区间相邻,导致三维重建处理时的运算量较大,获取三维图像的效率较低。
发明内容
本公开的多种实施例提供了一种三维重建方法,所述方法包括:
获取第一摄像机采集的第一图像,所述第一图像为包含目标对象的图像;
确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向;
基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中,确定所述第一图像对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像;以及
基于各个所述角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
在本公开的一些实施例中,所述确定所述第一图像的拍摄角度,包括:
向角度识别模型输入所述第一图像;
接收所述角度识别模型输出的角度信息;以及
将所述角度信息确定为所述拍摄角度;
其中,所述角度识别模型为对样本图像与所述样本图像的拍摄角度进行学习训练得到的模型。
在本公开的一些实施例中,在所述获取第一摄像机采集的第一图像之后,所述方法还包括:
用标签对所述第一图像进行标记,所述标签用于对所述目标图像中的目标对象进行标记;
基于所述标签,将包含所述目标对象的第一图像归类为一个图像集合;
在基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像之前,所述方法还包括:
基于所述图像集合,获取各个所述角度区间对应的所述第一图像。
在本公开的一些实施例中,在基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中,确定所述第一图像对应的角度区间之后,所述方法还包括:
对于每个角度区间,判断所述图像集合是否包含多帧第一图像对应于所述角度区间;
若在所述图像集合中包含多帧第一图像对应于所述角度区间,对所述多帧第一图像进行图像质量评分处理,得到所述多帧第一图像中的每帧第一图像的图像质量评分;以及
保留图像质量评分最高的第一图像,删除其他第一图像。
在本公开的一些实施例中,在所述获取第一摄像机采集的第一图像之后,所述方法还包括:
判断所述第一图像的分辨率是否小于分辨率阈值;
若所述第一图像的分辨率小于所述分辨率阈值,删除所述第一图像;
若所述第一图像的分辨率不小于所述分辨率阈值,将所述第一图像修改为指定分辨率的图像,所述指定分辨率大于或等于所述分辨率阈值。
在本公开的一些实施例中,所述基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像,包括:
当所述多个角度区间中的每个角度区间均具有对应的目标图像时,基于每个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
在本公开的一些实施例中,所述基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像,包括:
当接收到三维重建指令时,基于所述三维重建指令携带的目标对象的信息,获取多帧包含所述目标对象的第一图像;
基于每帧第一图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像;以及
判断所述三维图像是否为非完整的三维图像;
若所述三维图像为非完整的三维图像,对所述非完整的三维图像进行修复,得到修复后的三维图像。
本公开的多种实施例提供了一种三维重建装置,包括:
第一获取模块,配置为获取第一摄像机采集的第一图像,所述第一图像包括目标对象;
第一确定模块,配置为采用角度识别模型确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向,所述角度识别模型为对样本图像与所述样本图像的拍摄角度进行学习训练得到的模型;
第二确定模块,配置为基于所述第一图像的拍摄角度,在角度范围[0,360°)包括的多个角度区间中确定所述第一图像对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像;
三维重建模块,配置为基于各个所述角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
本公开的多种实施例提供了一种三维重建系统,包括:重建服务器和第一摄像机,所述重建服务器包括上述三维重建装置。
在本公开的一些实施例中,所述系统还包括:试衣镜;
在检测到目标对象时,所述试衣镜配置为向所述重建服务器发送获取请求,所述获取请求携带有所述目标对象的信息;
所述重建服务器用于基于所述目标对象的信息,向所述试衣镜发送获取响应,所述获取响应携带有所述目标对象的三维图像。
本公开的多种实施例提供了一种模型训练方法,配置为训练角度识别模型,所述方法包括:
执行多次训练过程,直至所述角度识别模型对训练图像集中的样本图像的拍摄角度分类准确度达到设定阈值,其中,所述训练过程包括:
获取第二摄像机采集的包含样本对象的样本图像以及与所述样本图像对应的深度图;
获取所述深度图中的样本对象中的第一关键点和第二关键点;
基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度,所述拍摄角度用于表征所述第二摄像机拍摄所述样本图像时相对于所述样本对象的方向;
向深度学习模型输入样本图像,得到所述样本图像的预测拍摄角度,根据所述样本图像的拍摄角度和预测拍摄角度,确定所述拍摄角度的分类准确度。
在本公开的一些实施例中,基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度,包括:
采用角度计算公式计算所述样本图像的拍摄角度,所述角度计算公式为:
V1=(x2−x1, 0, z2−z1),α = arccos(V2·VZ/(|V2|·|VZ|));
其中,所述第一关键点的三维坐标为(x1, y1, z1),所述第二关键点的三维坐标为(x2, y2, z2);V1表示所述第一关键点和所述第二关键点的连线在世界坐标系中的XZ平面内的向量;V2表示与V1垂直的单位向量;VZ表示世界坐标系中与Z轴平行的单位向量;α表示所述拍摄角度。
在本公开的一些实施例中,在基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度之后,所述训练过程还包括:
基于所述样本图像,判断所述样本对象相对于所述第二摄像机的朝向姿态是否为背对朝向姿态;
当所述样本对象相对于所述第二摄像机的朝向姿态为所述背对朝向姿态时,采用修正计算公式对所述拍摄角度进行修正处理,得到修正后的拍摄角度,所述修正计算公式为:
α1=α2+180°;
其中,α1为修正后的拍摄角度;α2为修正前的拍摄角度。
在本公开的一些实施例中,基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度之前,所述训练过程还包括:
判断所述第一关键点和所述第二关键点之间的距离是否小于距离阈值;
若所述第一关键点和所述第二关键点之间的距离小于所述距离阈值,确定所述样本图像的拍摄角度为指定角度,所述指定角度为位于固定范围的角度区间内的任一角度。
本公开的多个实施例提供了一种非易失性的计算机可读存储介质,所述存储介质中存储有代码指令,所述代码指令由处理器执行,以执行上述三维重建方法。
附图说明
为了更清楚地说明本公开实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是根据本公开一个实施例的三维重建方法所涉及的三维重建系统的框图;
图2是根据本公开的一个实施例的模型训练方法所涉及的模型训练系统的框图;
图3是根据本公开的一个实施例的三维重建方法的流程图;
图4是第一摄像机拍摄第一对象时的效果图;
图5是根据本公开的另一实施例的三维重建方法的流程图;
图6是根据本公开的又一实施例的三维重建方法的流程图;
图7是根据本公开的一个实施例的训练过程的流程图;
图8是根据本公开的一个实施例的样本对象的关节点的示意图;
图9是根据本公开的一个实施例三维重建装置的框图;
图10是根据本公开的另一个实施例的三维重建装置的框图;
图11是根据本公开的又一个实施例的三维重建装置的框图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
请参考图1,图1是根据本公开的一个实施例的三维重建方法所涉及的三维重建系统的框图。该三维重建系统100可以包括:至少一个第一摄像机101和重建服务器102。
该第一摄像机101通常可以为包含RGB摄像头或红外摄像头等的监控摄像机,该第一摄像机101通常设置为多个。多个第一摄像机101可以部署在商场或商店中的不同位置处。该重建服务器102可以是一台服务器,或者由若干台服务器组成的服务器集群,或者是一个云计算服务中心,或者为一个计算机设备。该第一摄像机101可以与重建服务器102建立通信连接。
可选的,该三维重建系统100还可以包括:试衣镜103。该试衣镜103通常可以部署在诸如服装店的商店中,该试衣镜103能够为用户提供虚拟试衣服务。该试衣镜103可以与重建服务器102建立通信连接。
请参考图2,图2是本公开实施例提供的模型训练方法所涉及的模型训练系统的框图。该模型训练系统200可以包括:第二摄像机201和训练服务器202。
该第二摄像机201可以为包含深度摄像头的摄像机,或者为双目摄像机。该第二摄像机既可以获取彩色图(也称RGB图),也可以获取深度图。该深度图中每个像素点的像素值为深度值,该深度值用于表示对应像素点距离第二摄像机的距离。该训练服务器202可以是一台服务器,或者由若干台服务器组成的服务器集群,或者是一个云计算服务中心,或者为一个计算机设备。该第二摄像机201可以通过有线方式或者无线方式与训练服务器202建立通信连接。
请参考图3,图3是根据本公开的一个实施例的三维重建方法的流程图。该三维重建方法适用于图1所示的三维重建系统100中的重建服务器102。该三维重建方法可以包括:
步骤S301、获取第一摄像机采集的第一图像。该第一图像包含目标对象。
步骤S302、确定第一图像的拍摄角度,该拍摄角度用于表征第一摄像机拍摄第一图像时相对于目标对象的拍摄方向。在本公开的一些实施例中,可以采用角度识别模型确定所述第一图像的拍摄角度,该角度识别模型为对样本图像与样本图像的拍摄角度进行学习训练得到的模型。
步骤S303、基于所述拍摄角度,确定所述第一图像在角度范围[0,360°)包括的多个角度区间中对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像。
在本公开实施例中,该多个角度区间是对角度范围[0,360°)进行划分得到的,每个角度区间所包含的角度值均不同。
示例的,请参考图4,图4是第一摄像机拍摄目标对象时的效果图,以目标对象01为中心,第一摄像机02从不同的方向拍摄该目标对象01,该第一摄像机02拍摄目标对象01时的拍摄方向可以用拍摄角度表征。例如,以目标对象01为中心逆时针旋转每隔15°为一个角度区间。每个角度区间中所包含的角度信息为步骤S302获取的所述拍摄角度,因此在步骤S302获取的所述拍摄角度属于该多个角度区间中的一个角度区间。
步骤S304、基于各个角度区间的目标图像,对目标对象进行三维重建处理,得到目标对象的三维图像。
在相关技术中,为了能够获取目标对象的三维图像,需要通过摄像头获取目标图像在各个拍摄角度的包含目标对象的目标图像。由于通过摄像头获取的目标图像缺乏关于该目标图像的拍摄角度的信息,因此需要根据各个目标图像的拍摄顺序,对各个目标图像进行排序,保证相邻的两帧图像对应的两个角度区间相邻。该方法在进行三维重建时的运算量较大,导致获取三维图像的效率较低。
而在本公开实施例中,可以通过角度识别模型确定出第一图像的拍摄角度,基于该拍摄角度可以确定第一图像在多个角度区间中对应的角度区间。因此,在获取第一图像时,还获取了该第一图像的拍摄角度,在后续对目标对象进行三维重建时,无需再采用额外的算法对多帧第一图像进行排序,可以直接基于拍摄角度获取到多帧第一图像的顺序,有效的减少了三维重建时的运算量,提高了获取三维图像时的效率。
综上所述,在根据本公开实施例的三维重建方法中,通过角度识别模型可 以确定出第一图像的拍摄角度,基于该拍摄角度可以确定第一图像在多个角度区间中对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像,后续可以基于各个角度区间的目标图像,对该目标对象进行三维重建,得到该目标对象的三维图像。在获取第一图像时,还获取了该第一图像的拍摄角度,在后续对目标对象进行三维重建时,无需再采用额外的算法对多帧第一图像进行排序,可以直接基于拍摄角度获取到多帧第一图像的顺序,有效的减少了三维重建时的运算量,提高了获取三维图像时的效率。
请参考图5,图5是根据本公开另一实施例的三维重建方法的流程图。该三维重建方法应用于图1示出的三维重建系统100中的重建服务器102。该三维重建方法可以包括:
步骤S401、获取第一摄像机采集的第一图像。
该第一图像为第一摄像机采集的包含目标对象的图像。该目标对象可以为人、动物或者物体。若该第一摄像机为监控摄像摄像机,且该第一摄像机部署在商场或商店内,则该目标对象为在商场或商店内的人。
在本公开实施例中,假设第一摄像机采集到的一帧图像中包含了多个不同的目标对象,则重建服务器可以在该帧图像中截取多帧第一图像,每帧第一图像中的目标对象均不同。
步骤S402、判断第一图像的分辨率是否小于分辨率阈值。
示例的,若第一图像的分辨率小于分辨率阈值,执行步骤S403;若第一图像的分辨率不小于分辨率阈值,执行步骤S404。例如,该分辨率阈值为:224×112。
步骤S403、若第一图像的分辨率小于分辨率阈值,删除该第一图像。
在本公开的一些实施例中,若获取到的第一图像的分辨率较低,则后续进行三维重建后得到的三维图像的显示效果较差。因此,可以在三维重建之前,删除分辨率较低的第一图像。示例的,重建服务器可以判断第一图像的分辨率是否小于分辨率阈值,若重建服务器判断出第一图像分辨率小于分辨率阈值,删除该第一图像。
步骤S404、若第一图像的分辨率不小于分辨率阈值,将该第一图像修改为分辨率为指定分辨率的图像。该指定分辨率大于或等于分辨率阈值。
在本公开的一些实施例中,若重建服务器判断出第一图像分辨率不小于分辨率阈值,为了便于后续进行三维重建,可以将该第一图像修改为分辨率为指定分辨率的图像。示例的,若第一图像的分辨率大于该指定分辨率,重建服务器需要将第一图像的分辨率压缩为指定分辨率;若第一图像的分辨率小于该指定分辨率,重建服务器需要将第一图像的分辨率扩大至指定分辨率。
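A minimal sketch of steps S402 to S404, assuming resolution is handled as a (height, width) pair; the 224×112 threshold comes from the embodiment, while the dict shape of a frame is a hypothetical stand-in for a real image object:

```python
# Frames below the resolution threshold are deleted; the rest are tagged with
# the "specified resolution" they should be compressed or upscaled to.
RES_THRESHOLD = (224, 112)
SPECIFIED_RES = (224, 112)  # must be >= the threshold

def screen_by_resolution(frames):
    """Drop low-resolution frames and mark survivors for normalization."""
    kept = []
    for frame in frames:
        h, w = frame["size"]
        if h < RES_THRESHOLD[0] or w < RES_THRESHOLD[1]:
            continue  # below threshold: delete the frame
        kept.append(dict(frame, resize_to=SPECIFIED_RES))
    return kept
```

In a real pipeline the `resize_to` tag would drive an actual image-resize call; here it only records the decision.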
步骤S405、用标签对所述第一图像进行标记,所述标签用于标记所述第一图像中的目标对象。
例如,可以采用目标识别算法对第一图像进行标记。在本公开的一些实施例中,假设第一摄像机部署在商店或商场中,目标对象为在商店或商场中的人,则该目标识别算法可以为行人动线检测算法。重建服务器通过该行人动线检测算法可以用标签对第一图像进行标记,其中,该标签用于标记第一图像中的目标对象。示例的,行人动线检测算法可以对目标对象的着装特征、人脸特征和形态特征中的至少一个进行分析,从而用标签对第一图像进行标记。
步骤S406、基于所述标签,将第一图像归类为一个图像集合。
在本公开实施例中,若多帧第一图像具有相同的标签,则这些第一图像中包含相同的目标对象。因此,重建服务器可以基于标签,将包含相同目标对象的第一图像归类为一个图像集合。
需要说明的是,通过步骤S406将第一图像进行分类后,位于相同图像集合的第一图像中的目标对象相同,而位于不同的图像集合的第一图像中的目标对象不同。
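A sketch of steps S405 and S406, assuming an upstream recognition step (for example the pedestrian-trajectory detection the text mentions) has already produced one label per frame; the (label, image) pair interface is a hypothetical simplification:

```python
from collections import defaultdict

def group_into_image_sets(labeled_frames):
    """Group frames so that all frames sharing a label, i.e. containing the
    same target object, land in one image set."""
    image_sets = defaultdict(list)
    for label, image in labeled_frames:
        image_sets[label].append(image)
    return dict(image_sets)
```

Frames in the same returned set show the same target object; frames in different sets show different ones, matching the note above.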
步骤S407、采用角度识别模型确定第一图像的拍摄角度。
在本公开实施例中,重建服务器可以采用角度识别模型确定每一帧第一图像的拍摄角度。该角度识别模型为对样本图像与样本图像的拍摄角度进行学习训练得到的模型。该角度识别模型的获取方法在后续的实施例进行介绍,在此不做赘述。
示例的,重建服务器采用角度识别模型确定第一图像的拍摄角度,可以包括以下步骤:
步骤A1、向角度识别模型输入第一图像。
步骤B1、接收该角度识别模型输出的角度信息。
步骤C1、将角度信息确定为第一图像的拍摄角度。
步骤S408、基于第一图像的拍摄角度,确定第一图像所对应的角度区间,其中,所述角度区间为包含在角度范围[0,360°)的多个角度区间中的一个角度区间,并将所述第一图像设置为所述角度区间的目标图像。
在本公开实施例中,假设第一摄像机布置在商店或商场中,目标对象为在商店或商场中的人,则在目标对象行走过程中,重建服务器可以实时获取第一摄像机在不同拍摄角度采集的包含该目标对象的第一图像。示例的,以该目标对象为中心顺时针或逆时针旋转每隔15°为一个角度区间,则该多个角度区间的个数为24个。重建服务器可以基于每帧第一图像的拍摄角度,在多个角度区间中确定该第一图像对应的角度区间,并将第一图像设置为所述角度区间的目标图像。例如,假设第一图像的拍摄角度为10°,则该第一图像对应的角度区间为[0,15°),并将第一图像设置为角度区间为[0,15°)的目标图像,所述目标图像用于进行三维重建。
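The interval assignment in step S408 can be sketched as follows; this is an assumption-level illustration of the 24 × 15° split, not the patented implementation:

```python
# [0, 360) is split into 24 intervals of 15 degrees; each frame is filed as
# the target image of the interval its shooting angle falls into.
INTERVAL_WIDTH = 15
NUM_INTERVALS = 360 // INTERVAL_WIDTH  # 24

def interval_index(shooting_angle):
    """Return the index of the 15-degree interval containing the angle."""
    return int(shooting_angle % 360) // INTERVAL_WIDTH

def assign_target_image(target_images, image, shooting_angle):
    """Record `image` as the target image of its angle interval."""
    target_images[interval_index(shooting_angle)] = image
    return target_images
```

For example, a shooting angle of 10° falls into interval 0, i.e. [0, 15°), exactly as in the paragraph above.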
步骤S409、基于图像集合,对于每个角度区间,判断是否有位于同一图像集合内的多于两帧第一图像对应于该角度区间。
在本公开实施例中,由于重建服务器后续需要参考每个角度区间对应的目标图像,对该目标对象进行三维重建,而可能会有多于两帧包含同一目标对象的目标图像对应于一个角度区间,若重建服务器直接参考对应于该角度区间的包含同一目标对象的多个第一图像对该目标对象进行三维重建,可能会影响对该目标对象进行三维重建的效率。因此,该重建服务器可以基于图像集合,判断是否有多于两帧位于同一图像集合内的目标图像对应于一个角度区间,即,判断是否有多于两帧包含同一目标对象的目标图像对应于一个角度区间。
在本公开的一些实施例中,对于每个角度区间,重建服务器可以在同一图像集合中判断是否有多于两帧的目标图像对应于该角度区间。若有位于同一图像集合内的多于两帧目标图像对应于该角度区间,即,有包含同一目标对象的多于两帧目标图像对应于该角度区间,执行步骤S410;若未有位于同一图像集合内的多于两帧目标图像对应于该角度区间,重复执行步骤S409。
步骤S410、若有位于相同的图像集合内的多于两帧第一图像对应于该角度区间,对位于相同的图像集合内的多于两帧第一图像进行图像质量评分处理,得到每帧第一图像的图像质量评分。
示例的,若有位于相同的图像集合内的多于两帧第一图像对应于该角度区间,重建服务器可以采用图像质量评分算法对对应于该角度区间的多于两帧第一图像进行图像质量评分处理,得到每帧第一图像的质量评分。
步骤S411、保留图像质量评分最高的第一图像,将图像质量评分最高的第一图像设置为所述角度区间的目标图像,并删除其他第一图像。
在本公开实施例中,质量评分越高代表第一图像的清晰度越高,将该第一图像设置为相应的角度区间的目标图像,基于所述目标图像后续进行三维重建时得到的三维图像的质量越好。因此,重建服务器可以保留图像质量评分最高的第一图像,删除其他第一图像,保证每个角度区间只对应一帧清晰度较高的第一图像,有效地提高了后续对目标对象进行三维重建所得到的三维图像的成像质量,并且减少了三维重建时需要处理的第一图像的帧数,从而提高了对目标对象进行三维重建的效率。
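Steps S409 to S411 reduce each interval of one image set to its sharpest frame. A sketch, with the image-quality scoring function left abstract since the text does not fix a particular scoring algorithm:

```python
def keep_best_per_interval(candidates, score):
    """`candidates` maps an angle-interval index to the frames of one image
    set that fell into that interval; keep only the highest-scoring frame
    per interval and discard the rest."""
    return {idx: max(frames, key=score) for idx, frames in candidates.items()}
```

Any scorer works as `score`; the toy test below uses string length purely as a stand-in metric.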
步骤S412、基于各个角度区间的目标图像,对该目标对象进行三维重建,得到该目标对象的三维图像。
需要说明的是,包含同一目标对象的第一图像属于同一图像集合,因此,在该步骤S412之前,该三维重建方法还可以包括:基于图像集合,获取对应于各个角度区间的包含同一目标对象的第一图像。还需要说明的是,重建服务器在得到目标对象的三维图像后,可以将该三维图像存储在重建服务器的存储器中。
在本公开实施例中,重建服务器可以对满足三维重建条件的目标对象进行三维重建。该三维重建条件有多种,本公开实施例以以下两种可选的实现方式进行示意性说明。
在第一种可选的实现方式中,在重建服务器确定出对应于多个角度区间中的每个角度区间的目标图像时,该目标对象满足三维重建条件。
此时,该步骤S412可以包括:若多个角度区间中的每个角度区间均具有目标图像,基于每个角度区间的目标图像,对目标对象进行三维重建,得到目标对象的三维图像。示例的,重建服务器可以采用运动重构(英文:Structure from motion;简称:SFM)算法对目标对象进行三维重建,得到该目标对象的三维图像。
在本公开实施例中,重建服务器确定包含同一目标对象的目标图像对应于多个角度区间中的每个角度区间的过程,可以包括以下步骤:
步骤A2、对于每个图像集合,获取该图像集合中的每帧第一图像对应的角度区间。
在本公开实施例中,通过上述步骤S401至步骤S411后,重建服务器可以确定出每个图像集合中的每帧第一图像对应的角度区间,且在同一图像集合中,每个角度区间仅对应一帧第一图像,并将该第一图像设置为该角度区间的目标图像,因此对于每个图像集合,重建服务器可以实时获取该图像集合中每帧第一图像对应的角度区间。
步骤B2、判断所有目标图像所对应的角度区间的个数与多个角度区间的个数是否相同。
示例的,若所有目标图像所对应的角度区间的个数与多个角度区间的个数相同,重建服务器可以确定出多个角度区间中的每个角度区间均具有目标图像,也即是,执行步骤C2;若所有目标图像所对应的角度区间的个数与多个角度区间的个数不同,重建服务器可以确定出多个角度区间中的至少一个角度区间不具有目标图像,重复执行步骤A2。
步骤C2、若所有目标图像的个数与多个角度区间的个数相同,确定多个角度区间中的每个角度区间均具有目标图像。
在本公开实施例中,在重建服务器确定出多个角度区间中的每个角度区间均具有目标图像后,重建服务器可以确定出目标对象在多个角度区间中的每个角度区间均具有目标图像,该目标对象满足三维重建条件。
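The readiness check of steps A2 to C2 can be sketched as a simple coverage test over the 24 intervals; the dict-keyed-by-interval representation is an assumption carried over from the earlier sketches:

```python
# Reconstruction is triggered for an image set only once every one of the
# 24 angle intervals has a target image for that object.
NUM_INTERVALS = 24  # 360 degrees split into 15-degree intervals

def satisfies_reconstruction_condition(target_images):
    """`target_images` maps interval index -> target image for one object."""
    return all(idx in target_images for idx in range(NUM_INTERVALS))
```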
在第二种可选的实现方式中,在重建服务器接收到携带目标对象的信息的三维重建指令时,该目标对象满足三维重建条件。
此时,该步骤S412可以包括以下几个步骤:
步骤A3、当接收到三维重建指令时,基于三维重建指令携带的目标对象的信息,获取多帧包含目标对象的第一图像。
示例的,三维重建系统还可以包括:试衣镜,该三维重建指令可以为试衣镜发送的指令。在本公开实施例中,当重建服务器接收到携带目标对象的信息的三维重建指令时,该重建服务器可以基于目标对象的信息,获取多帧包含目标对象的第一图像。
例如,关于该目标对象的信息可以包含着装特征、人脸特征和形态特征中的至少一个。由于三维重建服务器获取到第一图像后,也会对该目标图像的着装特征、人脸特征和形态特征中的至少一个进行分析,因此重建服务器能够基于目标对象的信息,获取多帧包含该目标对象的第一图像。
步骤B3、基于所述多帧第一图像,确定所述多帧第一图像所对应的角度区间,并将所述多帧第一图像分别确定为所对应的角度区间的目标图像;基于所对应的角度区间的目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
示例的,重建服务器可以基于每帧包含目标对象的第一图像,采用SFM算法,对该目标对象进行三维重建,得到该目标对象的三维图像。
步骤C3、判断三维图像是否为非完整的三维图像。
在本公开实施例中,由于对目标对象进行三维重建时,重建服务器所基于的第一图像的帧数较少,也即是,多个角度区间中可能存在至少一个角度区间不具有包含目标对象的第一图像,因此三维重建后得到的三维图像可能是非完整的三维图像,例如,该三维图像中包含孔洞。重建服务器可以判断该三维图像是否为非完整的三维图像,若三维图像为非完整的三维图像,执行步骤D3;若三维图像为完整的三维图像,结束动作。
步骤D3、若三维图像为非完整的三维图像,对非完整的三维图像进行修复,得到修复后的三维图像。
在本公开实施例中,重建服务器为了能够获取到图像质量较高的三维图像,在判断出三维图像为非完整的三维图像后,重建服务器需要对该非完整的三维图像进行修复,得到修复后的三维图像。例如,假设待重建的目标对象为人型目标对象,重建服务器可以通过人体的三维图像的规律对该三维图像进行修复。
需要说明的是,本公开实施例提供的三维重建方法步骤的先后顺序可以进行适当调整,例如,可以先执行步骤407,再执行步骤405至步骤406。步骤也可以根据情况进行相应增减,任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本公开的保护范围之内,因此不再赘述。
综上所述,本公开实施例提供的三维重建方法,通过角度识别模型可以确定出目标图像的拍摄角度,基于该拍摄角度可以在多个角度区间中,确定第一图像的角度区间,并将第一图像设置为角度区间的目标图像,后续可以基于各个角度区间对应的的目标图像,对该目标对象进行三维重建处理,得到该目标对象的三维图像。在获取第一图像时,还获取了该第一图像的拍摄角度,在后续对目标对象进行三维重建时,无需再采用额外的算法对多帧目标图像进行排序,可以直接基于拍摄角度获取到多帧第一图像的顺序,有效的减少了三维重建时的运算量,提高了获取三维图像时的效率。
请参考图6,图6是根据本公开又一个实施例的三维重建方法的流程图。该 三维重建方法适用于图1示出的三维重建系统100,该三维重建方法可以包括:
步骤S501、第一摄像机采集图像。
在本公开实施例中,该第一摄像机可以为监控摄像机,该第一摄像机的个数为多个,可以部署在商场或商店中的不同位置处。
步骤S502、第一摄像机向重建服务器发送所采集的图像。
在本公开实施例中,第一摄像机可以将其实时采集的图像发送给重建服务器,便于重建服务器对目标对象进行三维重建。
步骤S503、重建服务器基于第一摄像机采集的图像对目标对象进行三维重建,得到目标对象的三维图像。
需要说明的是,重建服务器基于第一摄像机采集的图像对目标对象进行三维重建,得到目标对象的三维图像的过程,可以参考前述步骤S401至步骤S412中的相关内容,在此不再赘述。
步骤S504、试衣镜向重建服务器发送获取请求,所述获取请求携带有目标对象的信息。
在本公开实施例中,假设目标对象为站在试衣镜前的人,该试衣镜为了能够为该目标对象提供虚拟试衣服务,该试衣镜需要在重建服务器中获取该目标对象的三维图像。
示例的,该试衣镜中可以设置有试衣镜摄像头。该试衣镜摄像头可以采集位于该试衣镜之前的目标对象的信息,并向重建服务器发送获取请求,该获取请求携带有该目标对象的信息。
步骤S505、重建服务器基于所述目标对象的信息,向试衣镜发送获取响应,所述获取响应携带有该目标对象的三维图像。
在本公开的一些实施例中,重建服务器在接收到携带有所述目标对象的信息的该获取请求后,可以先基于所述目标对象的信息,判断是否存储有该目标对象的三维图像。
在本公开的一些实施例中,该目标对象信息可以为人脸信息,重建服务器在获取目标对象的三维图像时,也会获取该目标对象的人脸信息。因此,重建服务器可以基于人脸信息判断其是否存储有该目标对象的三维图像。
若存储有该目标对象的三维图像,示例的,若该获取请求中所携带的目标对象的人脸信息,与重建服务器存储的三维图像对应的人脸信息相同,则重建服务器可以确定其存储有该目标对象的三维图像。此时,该重建服务器可以向 该试衣镜发送携带有该目标对象的三维图像的获取响应。
若未存储有该目标对象的三维图像,示例的,在重建服务器中存储的所有的三维图像所对应人脸信息,与该获取请求中所携带的目标对象的人脸信息均不同,则重建服务器可以确定其未存储有该目标对象的三维图像。此时,重建服务器可以向试衣镜发送响应,所述响应指示重建服务器中未存储有该目标对象的三维图像;试衣镜可以基于该响应向重建服务器发送携带有该目标对象的信息的三维重建指令;重建服务器可以基于该三维重建指令,对该目标对象进行三维重建后,向该试衣镜发送携带有该目标对象的三维图像的获取响应。其中,重建服务器基于三维重建指令,对该目标对象进行三维重建的过程可以参考上述步骤S412中的对应过程,在此不再赘述。
步骤S506、试衣镜基于获取响应向该目标对象提供虚拟试衣服务。
在本公开实施例中,试衣镜在接收到重建服务器发送的携带有该目标对象的三维图像的获取响应后,可以基于该获取响应向该目标对象提供虚拟试衣服务。
需要说明的是,获取响应所携带的目标对象的三维图像的图像质量可能较差,例如,当重建服务器基于较少帧数的包含该目标对象的图像获取三维图像时,重建服务器获取的三维图像的图像质量较差。因此,试衣镜可以对获取响应所携带的该目标对象的三维图像的图像质量进行分析,确定是否会对向目标对象提供虚拟试衣服务存在影响。若存在影响,试衣镜会发出提示目标对象旋转一圈的语音信息。此时,在目标对象旋转后,通过试衣镜摄像头可以重新在不同拍摄角度采集该目标对象的图像,重建服务器可以基于每帧图像重新对该目标对象进行三维重建,此时,得到的该目标对象的三维图像的成像质量较高。
在相关技术中,在用户使用试衣镜的过程中,通常需要让用户在试衣镜前旋转一圈,便于试衣镜中安装的摄像头采集包含该用户的不同拍摄角度的图像,然后再进行三维重建,得到该用户的三维图像。因此,用户在使用试衣镜的过程中,需要等待三维重建获取三维图像的时长,用户体验较差。
而在本公开实施例中,第一摄像机布置在商店或商场中,重建服务器可以实时获取第一摄像机采集的包含该用户的不同拍摄角度的图像,在满足三维重建的条件后,可以直接进行三维重建,然后再将得到的三维图像发送给试衣镜。当该用户使用该试衣镜时,无需在试衣镜前旋转一圈,且无需等待三维重建获取三维图像,可以直接获取到该用户的三维图像,改善了用户体验。
需要说明的是,本公开实施例提供的三维重建方法步骤的先后顺序可以进行适当调整,步骤也可以根据情况进行相应增减,任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本公开的保护范围之内,因此不再赘述。
综上所述,本公开实施例提供的三维重建方法,通过角度识别模型可以确定出图像的拍摄角度,基于该拍摄角度可以在多个角度区间中,确定图像对应的角度区间,后续可以基于对应于各个角度区间的包含同一目标对象的图像,对该目标对象进行三维重建,得到该目标对象的三维图像。在获取图像的同时获取了该图像的拍摄角度,在后续对目标对象进行三维重建时,无需再采用额外的算法对多帧图像进行排序,可以直接基于拍摄角度获取到多帧图像的顺序,有效的减少了三维重建时的运算量,提高了获取三维图像时的效率。并且,当用户使用试衣镜时,无需在试衣镜前旋转一圈,且无需等待三维重建获取三维图像,可以直接获取该用户的三维图像,改善了用户体验。
本公开实施例还提供了一种模型训练方法,该模型训练方法用于训练图3、图5或图6示出的三维重建方法中所采用的角度识别模型。该模型训练方法应用于图2示出的模型训练系统200中的训练服务器202。该模型训练方法可以包括:
执行多次训练过程,直至角度识别模型对训练图像集中的样本图像的拍摄角度分类准确度达到设定阈值。
其中,请参考图7,图7是根据本公开的至少一个实施例的训练过程的流程图。该训练过程可以包括:
步骤S601、获取第二摄像机采集的包含样本对象的样本图像以及与该样本图像对应的深度图。
在本公开实施例中,该样本对象可以为人、动物或者物体,训练服务器可以采用第二摄像机采集包含样本对象的样本图像以及与该样本图像对应的深度图。该第二摄像机可以为包含深度摄像头的摄像机,或者为双目摄像机。例如,该第二摄像机可以为诸如Kinect设备等带有深度摄像头的设备。需要说明的是,该第二摄像机能够同时采集深度图和彩色图,因此通过该第二摄像机拍摄样本对象后,训练服务器可以同时获取第二摄像机采集的包含样本对象的样本图像以及与该样本图像对应的深度图。
还需要说明的是,第二摄像机在拍摄样本对象后,获取到的彩色图和深度图中,不仅包含该样本对象,还包含除该样本对象之外的其他背景图像,为了便于后续进行的图像处理,在第二摄像机对样本对象进行拍摄后,训练服务器还需要对获取到的深度图和彩色图进行截取,使得截取后的样本图像以及其对应的深度图中仅包含样本对象。
步骤S602、从深度图中获取样本对象的第一关键点和第二关键点。
在本公开实施例中,若样本对象为人,则该样本对象的第一关键点和第二关键点可以分别为人的两个肩部关节点。需要说明的是,Kinect设备在采集到含有样本对象的深度图后,该Kinect设备可以采集该样本对象的所有的关节点。例如,如图8所示,该Kinect设备可以采集该样本对象的14个关节点,此时,训练服务器可以获取该深度图中的样本对象的两个肩部关节点a和b。
步骤S603、基于第一关键点的三维坐标和第二关键点的三维坐标,确定样本图像的拍摄角度。该拍摄角度用于表征第二摄像机拍摄样本对象时的方向。
在本公开实施例中,可以将第一关键点和第二关键点的连线的垂直方向,与世界坐标系中的Z轴方向的夹角确定为该样本图像的拍摄角度。该世界坐标系中的Z轴方向通常与第二摄像机的光轴方向平行。训练服务器可以基于第一关键点的三维坐标和第二关键点的三维坐标,确定样本图像的拍摄角度。
其中,关键点的三维坐标在X轴和Y轴分量可以确定出该关键点在深度图中的位置,关键点的三维坐标在Z轴上的分量可以确定该关键点的深度值。需要说明的是,训练服务器在获取到样本图像以及其对应的深度图后,可以确定出该深度图中任意一点的三维坐标。
在本公开的一些实施例中,基于该第一关键点的三维坐标和第二关键点的三维坐标,确定该样本图像的拍摄角度,可以包括:采用角度计算公式计算样本图像的拍摄角度,该角度计算公式为:
V1=(x2−x1, 0, z2−z1),α = arccos(V2·VZ/(|V2|·|VZ|));
其中,第一关键点的三维坐标为(x1, y1, z1),第二关键点的三维坐标为(x2, y2, z2);V1表示第一关键点和第二关键点的连线在世界坐标系中的XZ平面内的向量;V2表示与V1垂直的单位向量;VZ表示世界坐标系中与Z轴平行的单位向量;α表示拍摄角度。
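A numeric sketch of this angle computation, consistent with the variable definitions above: V1 is the shoulder line projected into the XZ plane, V2 a unit vector perpendicular to V1 in that plane, and the angle is measured against the Z axis. The choice of which of the two perpendicular directions to take for V2 is an assumption here:

```python
import math

def shooting_angle(p1, p2):
    """p1, p2 are (x, y, z) world coordinates of the two shoulder key points."""
    dx, dz = p2[0] - p1[0], p2[2] - p1[2]
    norm = math.hypot(dx, dz)           # |V1| in the XZ plane
    v2 = (dz / norm, -dx / norm)        # unit vector perpendicular to V1 in XZ
    cos_a = v2[1]                       # dot product with V_Z = (0, 0, 1)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
```

With the shoulders lying along the X axis the angle is 0°, and with the shoulder line along the Z axis it is 90°, matching the geometry described in the text.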
在本公开实施例中,训练服务器确定样本图像的拍摄角度时有以下两种特殊情况:
第一种特殊情况,训练服务器若仅基于第一关键点的三维坐标和第二关键点的三维坐标,确定样本图像的拍摄角度,则可能会存在两帧拍摄角度相同,但第二摄像机的拍摄方向不同的样本图像。例如,第二摄像机以当前拍摄方向拍摄样本对象时的拍摄角度,和第二摄像机的拍摄方向旋转180°后拍摄样本对象时的拍摄角度相同。因此,为了区分两帧拍摄角度相同,但第二摄像机的拍摄方向不同的样本图像,在步骤S603之后,该训练过程还可以包括:
步骤A4、基于样本图像,判断样本对象相对于第二摄像机的朝向姿态是否为背对朝向姿态。
在本公开实施例中,训练服务器可以基于样本图像,判断出样本对象相对于第二摄像机的朝向姿态是背向朝向姿态,还是正向朝向姿态。当训练服务器判断出样本对象相对于第二摄像机的朝向姿态为背对朝向姿态时,需要对样本对象的拍摄角度进行修正,执行步骤B4;当训练服务器判断出样本对象相对于第二摄像机的朝向姿态为正向朝向姿态时,无需对样本对象的拍摄角度进行修正。
步骤B4、当样本对象相对于第二摄像机的朝向姿态为背对朝向姿态时,采用修正计算公式对拍摄角度进行修正处理,得到修正后的拍摄角度。该修正计算公式为:
α1=α2+180°;其中,α1为修正后的拍摄角度;α2为修正前的拍摄角度。
在本公开实施例中,训练服务器为了区分两帧拍摄角度相同,但拍摄方向不同的样本图像,可以在判断出样本对象相对于第二摄像机的朝向姿态为背对朝向姿态时,将该样本图像的拍摄角度进行修正,使得第二摄像机以不同拍摄方向拍摄的任意两帧样本图像的拍摄角度也不同。
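The back-facing correction above can be sketched in one line; the text states α1 = α2 + 180°, and wrapping the result back into [0, 360°) is an added assumption to keep it inside the document's angle range:

```python
def correct_shooting_angle(angle, back_facing):
    """Apply the correction of step B4 when the sample faces away from the
    camera; otherwise return the angle unchanged."""
    return (angle + 180.0) % 360.0 if back_facing else angle
```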
第二种特殊情况,若样本对象相对于第二摄像机的朝向姿态为侧向姿态,样本对象中的第一关键点和第二关键点几近重合,训练服务器基于第一关键点的三维坐标和第二关键点的三维坐标,确定样本图像的拍摄角度的准确性较低。因此,为了提高样本图像的拍摄角度的准确性,在步骤S603之前,该训练过程还可以包括:
步骤A5、判断第一关键点和第二关键点之间的距离是否小于距离阈值。
在本公开的一些实施例中,如图8所示,假设样本对象为人,该样本对象中的第一关键点和第二关键点可以分别为人的两个双肩关节点a和b,该距离阈值可以为头关节点c与颈关节点d之间的距离。训练服务器可以计算出深度图中的样本对象中的两个双肩关节点a和b之间的距离,与距离阈值(即,头关节点c和颈关节点d之间的距离)进行比较,以判断第一关键点和第二关键点之间的距离是否小于距离阈值。若第一关键点和第二关键点之间的距离小于距离阈值,执行步骤B5;若第一关键点和第二关键点之间的距离不小于距离阈值,执行上述步骤S603。
步骤B5、若第一关键点和第二关键点之间的距离小于距离阈值,确定样本图像的拍摄角度为指定角度。该指定角度为位于固定范围的角度区间内的任一角度。
在本公开实施例中,训练服务器在判断出第一关键点和第二关键点之间的距离小于距离阈值后,该训练服务器可以将样本图像的拍摄角度确定为指定角度。
示例的,该指定角度可以为90°或270°。为了更精确的确定样本图像的拍摄角度,训练服务器需要基于样本图像,确定样本对象相对于第二摄像机的朝向姿态,并基于该样本对象相对于第二摄像机的朝向姿态确定样本图像的拍摄角度为90°,还是为270°。例如,样本对象相对于第二摄像机的朝向姿态还可以包括:向右侧向朝向姿态和向左侧向朝向姿态。当样本对象相对于第二摄像机的朝向姿态为向右侧朝向姿态时,该样本图像的拍摄角度为90°;当样本对象相对于第二摄像机的朝向姿态为向左侧朝向姿态时,该样本图像的拍摄角度为270°。
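Steps A5 and B5 can be sketched as follows. Using the head-to-neck joint distance as the threshold follows the text, as does mapping a right-facing pose to 90° and a left-facing pose to 270°; representing points as coordinate tuples is an assumption:

```python
import math

def side_view_angle(shoulder_a, shoulder_b, head, neck, facing_right):
    """Return the specified angle when the shoulders nearly coincide (side
    view), or None when the regular angle formula should be used instead."""
    if math.dist(shoulder_a, shoulder_b) < math.dist(head, neck):
        return 90.0 if facing_right else 270.0
    return None
```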
步骤S604、向深度学习模型输入样本图像,得到样本图像的预测拍摄角度,根据样本图像的拍摄角度和预测拍摄角度,确定拍摄角度的分类准确度。
在本公开实施例中,深度学习模型可以学习样本图像与拍摄角度之间的对应关系。在深度学习模型对样本图像与拍摄角度之间的对应关系学习完成之后,可以得到样本图像的预测拍摄角度。根据该样本图像的拍摄角度和预测拍摄角度,确定拍摄角度的分类准确度后,训练服务器可以判断该分类准确度是否大于设定阈值;当该分类准确度大于或等于设定阈值时,结束对该样本图像的训练,后续可以向深度学习模型输入新样本图像;当该分类准确度小于设定阈值时,重复执行步骤S604,重新向深度学习模型输入该样本图像。
需要说明的是,通过多次上述步骤S601至步骤S604的训练过程,可以得到角度识别模型,且该角度识别模型对训练图像集中的样本图像的拍摄角度分类准确度达到设定阈值。
示例的,在上述步骤S604中,根据样本图像的拍摄角度和预测拍摄角度可以确定出损失函数的损失值LOSS,该损失函数的损失值可以通过以下计算公式确定:
LOSS = λ·CE + (1−λ)·MSE;
其中,a代表样本图像的预测拍摄角度,α代表样本图像的真实的拍摄角度,CE代表交叉熵,MSE代表均方差,λ代表二者的融合系数。
该深度学习模型的其他参数配置如下:输入的样本图像的分辨率为224×112,使用的优化器为Adam优化器,迭代次数为50次。其中,一次迭代指的是:深度学习模型对样本图像与该样本图像的拍摄角度之间的对应关系进行一次学习。
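A sketch of a fused loss of the form λ·CE + (1−λ)·MSE, treating the predicted interval distribution and the predicted angle as separate model outputs; the exact way the two terms are combined in the patent and the concrete value of the fusion coefficient λ are assumptions here:

```python
import math

def fused_loss(pred_probs, true_idx, pred_angle, true_angle, lam=0.5):
    """Combine a cross-entropy term over angle-interval classes with a
    squared-error term over the angle value itself."""
    ce = -math.log(max(pred_probs[true_idx], 1e-12))  # cross-entropy term
    mse = (pred_angle - true_angle) ** 2              # squared-error term
    return lam * ce + (1.0 - lam) * mse
```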
需要说明的是,通过上述步骤可以得到图3、图5或图6示出的三维重建方法中所采用的角度识别模型。当对该角度识别模型输入目标图像时,该角度模型可以输出该目标图像的拍摄角度。
本公开的至少一个实施例提供了一种三维重建装置,包括:处理器;以及
存储器,所述存储器上存储有可被所述处理器执行的程序代码,当所述程序代码被所述处理器执行时,将所述处理器配置为:
获取第一摄像机采集的第一图像,所述第一图像包括目标对象;
采用角度识别模型确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向,所述角度识别模型为对样本图像与所述样本图像的拍摄角度进行学习训练得到的模型;
基于所述第一图像的拍摄角度,在角度范围[0,360°)包括的多个角度区间中确定所述第一图像对应的角度区间,并将该第一图像设置为所述角度区间的目标图像;
基于各个所述角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
本公开的至少一个实施例提供了一种三维重建装置。图9示出了根据本公开一个实施例的三维重建装置的框图。该三维重建装置700可以集成在如图1示出的三维重建系统100中的重建服务器102中。该三维重建装置700可以包括:
第一获取模块701,配置为获取第一摄像机采集的目标图像,目标图像为包含目标对象的图像。
第一确定模块702,配置为采用角度识别模型确定目标图像的拍摄角度,拍摄角度用于表征第一摄像机拍摄目标图像时的拍摄方向,角度识别模型为对样本图像与样本图像的拍摄角度进行学习训练得到的模型。
第二确定模块703,配置为基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中确定目标图像对应的角度区间。
三维重建模块704,配置为基于各个角度区间对应的包含目标对象的目标图像,对目标对象进行三维重建处理,得到目标对象的三维图像。
在本公开的一些实施例中,该第一确定模块702,配置为:向角度识别模型输入目标图像;接收角度识别模型输出的角度信息;将角度信息确定为拍摄角度。
图10是根据本公开另一实施例的三维重建装置的框图。该三维重建装置700还可以包括:
标记模块705,配置为对目标图像进行标记,得到目标图像的标签,所述标签用于对目标图像中的目标对象进行标记。例如,所述标记模块705采用目标识别算法对目标图像进行标记。
归类模块706,配置为基于目标图像的标签,将包含所述目标对象的目标图像归类为一个图像集合。
第二获取模块707,配置为基于所述图像集合,获取各个角度区间对应的包含所述目标对象的目标图像。
在本公开的一些实施例中,如图10所示,该三维重建装置700还可以包括:
第一判断模块708,配置为基于图像集合,判断每个角度区间是否对应有多于两帧位于所述图像集合内的目标图像。
评分模块709,配置为若角度区间对应有多于两帧位于相同的图像集合内的目标图像,对多于两帧位于所述图像集合内的目标图像进行图像质量评分处理,得到每帧目标图像的图像质量评分。
第一删除模块710,配置为保留图像质量评分最高的目标图像,删除其他目标图像。
图11是根据本公开又一个实施例的三维重建装置的框图。该三维重建装置700还可以包括:
第二判断模块711,配置为判断目标图像的分辨率是否小于分辨率阈值。
第二删除模块712,配置为若目标图像的分辨率小于分辨率阈值,删除该目标图像。
修改模块713,配置为若目标图像的分辨率不小于分辨率阈值,将目标图像修改为指定分辨率的图像,指定分辨率大于或等于分辨率阈值。
在本公开的一些实施例中,该三维重建模块704,配置为:若多个角度区间中的每个角度区间均对应有包含所述目标对象的目标图像,基于每个角度区间对应的目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
在本公开的一些实施例中,该三维重建模块704,配置为:当接收到三维重建指令时,基于三维重建指令携带的待重建的目标对象的信息,获取多帧包含待重建的目标对象的目标图像;基于每帧包含待重建的目标对象的目标图像,对待重建的目标对象进行三维重建,得到待重建的目标对象的三维图像;判断三维图像是否为非完整的三维图像;若三维图像为非完整的三维图像,对非完整的三维图像进行修复,得到修复后的三维图像。
综上所述,在根据本公开实施例的三维重建装置中,通过角度识别模型可以确定出目标图像的拍摄角度,基于该拍摄角度,可以在多个角度区间中确定目标图像对应的角度区间,后续可以基于各个角度区间对应的包含所述目标对象的目标图像,对所述目标对象进行三维重建处理,得到所述目标对象的三维图像。在获取目标图像时,还获取了该目标图像的拍摄角度,在后续对目标对象进行三维重建时,无需再采用额外的算法对多帧目标图像进行排序,可以直接基于拍摄角度获取多帧目标图像的顺序,有效的减少了三维重建时的运算量,提高了获取三维图像时的效率。
本公开实施例还提供了一种模型训练装置。该模型训练装置可以集成在图2示出的模型训练系统200中的训练服务器202中。该模型训练装置配置为训练图3、图5或图6示出的三维重建方法中所采用的角度识别模型。该模型训练装置可以包括:
训练模块,配置为执行多次训练过程,直至角度识别模型对训练图像集中的样本图像的拍摄角度分类准确度达到设定阈值。该训练过程可以包括:
获取第二摄像机采集的包含样本对象的样本图像以及与样本图像对应的深度图;
获取深度图中的样本对象的第一关键点和第二关键点;
基于所述第一关键点和所述第二关键点的三维坐标,确定样本图像的拍摄角度,拍摄角度用于表征第二摄像机拍摄样本对象时的方向;
向深度学习模型输入样本图像,得到样本图像的预测拍摄角度,根据样本图像的拍摄角度和预测拍摄角度,确定拍摄角度的分类准确度。
在本公开的一些实施例中,基于所述第一关键点和所述第二关键点的三维坐标,确定样本图像的拍摄角度,包括:
采用角度计算公式计算样本图像的拍摄角度,角度计算公式为:
V1=(x2−x1, 0, z2−z1),α = arccos(V2·VZ/(|V2|·|VZ|));
其中,第一关键点的三维坐标为(x1, y1, z1),第二关键点的三维坐标为(x2, y2, z2);V1表示第一关键点和第二关键点的连线在世界坐标系中的XZ平面内的向量;V2表示与V1垂直的单位向量;VZ表示世界坐标系中与Z轴平行的单位向量;α表示拍摄角度。
在本公开的一些实施例中,在基于所述第一关键点和所述第二关键点的三维坐标,确定样本图像的拍摄角度之后,该训练过程还包括:
基于样本图像,判断样本对象相对于第二摄像机的朝向姿态是否为背对朝向姿态;当样本对象相对于第二摄像机的朝向姿态为背对朝向姿态时,采用修正计算公式对拍摄角度进行修正处理,得到修正后的拍摄角度,修正计算公式为:α1=α2+180°;其中,α1为修正后的拍摄角度;α2为修正前的拍摄角度。
在本公开的一些实施例中,在基于所述第一关键点和所述第二关键点的三维坐标,确定样本图像的拍摄角度之前,该训练过程还包括:
判断所述第一关键点和所述第二关键点之间的距离是否小于距离阈值;若所述第一关键点和所述第二关键点之间的距离小于距离阈值,确定样本图像的拍摄角度为指定角度,指定角度为位于固定范围的角度区间内的任一角度。
本公开的至少一个实施例还提供了一种三维重建系统,该三维重建系统可以包括:重建服务器和第一摄像机。该三维重建系统的结构可以参考图1示出的三维重建系统示出的结构。该重建服务器可以包括:图9、图10或图11示出的三维重建装置700。
在本公开的一些实施例中,该三维重建系统包括:试衣镜。所述试衣镜配置为在检测到目标对象时,向重建服务器发送获取请求,所述获取请求用于请求从所述重建服务器获取所述目标对象的三维图像,所述获取请求携带有所述目标对象的信息;重建服务器配置为基于所述目标对象的信息,向所述试衣镜发送获取响应,所述获取响应携带有所述目标对象的三维图像。
本公开的至少一个实施例还提供了一种模型训练系统,该模型训练系统可以包括:训练服务器和第二摄像机。该模型训练系统的结构可以参考图2示出的模型训练系统示出的结构。该训练服务器可以包括:上述实施例示出的训练模块。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统和装置的工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
本公开的至少一个实施例还提供了一种非易失性计算机可读存储介质,该存储介质中存储有代码指令,该代码指令由处理器执行,以执行上述实施例示出的三维重建方法,例如,图3、图5或图6示出的三维重建方法。
本公开的至少一个实施例还提供了一种计算机可读存储介质,该存储介质为非易失性存储介质,该存储介质中存储有代码指令,该代码指令由处理器执行,以执行上述实施例示出的模型训练方法,例如,图7示出的训练过程。
在本公开实施例中,术语“第一”和“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性。术语“多个”指两个或两个以上,除非另有明确的限定。
本领域普通技术人员可以理解,实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器、磁盘或光盘等。
以上所述仅为本公开的可选的实施例,并不用以限制本公开,凡在本公开的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本公开的保护范围之内。

Claims (15)

  1. 一种三维重建方法,包括:
    获取第一摄像机采集的第一图像,所述第一图像为包含目标对象的图像;
    确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向;
    基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中,确定所述第一图像对应的角度区间,并将所述第一图像设置为所述角度区间的目标图像;以及
    基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
  2. 根据权利要求1所述的方法,其中,所述确定所述第一图像的所述拍摄角度,包括:
    向角度识别模型输入所述第一图像;
    接收所述角度识别模型输出的角度信息;以及
    将所述角度信息确定为所述拍摄角度;
    其中,所述角度识别模型为对样本图像与所述样本图像的拍摄角度进行学习训练得到的模型。
  3. 根据权利要求1所述的方法,其中,在所述获取第一摄像机采集的第一图像之后,所述方法还包括:
    用标签对所述第一图像进行标记,所述标签用于对所述第一图像中的目标对象进行标记;
    基于所述第一图像的标签,将包含所述目标对象的第一图像归类为一个图像集合;
    基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像之前,所述方法还包括:
    基于所述图像集合,获取各个所述角度区间对应的第一图像。
  4. 根据权利要求3所述的方法,其中,在基于所述拍摄角度,在角度范围[0,360°)包括的多个角度区间中,确定所述第一图像对应的角度区间之后,所述方法还包括:
    对于每个角度区间,判断所述图像集合是否包含多帧第一图像对应于所述角度区间;
    若在所述图像集合中包含多帧第一图像对应于所述角度区间,对所述多帧第一图像进行图像质量评分,得到所述多帧第一图像中的每帧第一图像的图像质量评分;以及
    保留图像质量评分最高的第一图像,删除其他第一图像。
  5. 根据权利要求1至4任一所述的方法,其中,在所述获取第一摄像机采集的第一图像之后,所述方法还包括:
    判断所述第一图像的分辨率是否小于分辨率阈值;
    若所述第一图像的分辨率小于所述分辨率阈值,删除所述第一图像;以及
    若所述第一图像的分辨率不小于所述分辨率阈值,将所述第一图像修改为指定分辨率的图像,所述指定分辨率大于或等于所述分辨率阈值。
  6. 根据权利要求1至4任一所述的方法,其中,所述基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像,包括:
    当所述多个角度区间中的每个角度区间均具有对应的目标图像时,基于每个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
  7. 根据权利要求1至4任一所述的方法,其中,所述基于各个角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像,包括:
    当接收到三维重建指令时,基于所述三维重建指令携带的目标对象的信息,获取多帧第一图像;
    基于所述多帧第一图像,确定所述多帧第一图像所对应的角度区间,并将所述多帧第一图像分别确定为所对应的角度区间的目标图像;
    基于所对应的角度区间的目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像;以及
    判断所述三维图像是否为非完整的三维图像;
    若所述三维图像为非完整的三维图像,对所述非完整的三维图像进行修复,得到修复后的三维图像。
  8. 一种三维重建装置,包括:
    处理器;以及
    存储器,所述存储器上存储有可被所述处理器执行的程序代码,当所述程序代码被所述处理器执行时,将所述处理器配置为:
    获取第一摄像机采集的第一图像,所述第一图像包括目标对象;
    采用角度识别模型确定所述第一图像的拍摄角度,所述拍摄角度用于表征所述第一摄像机拍摄所述第一图像时相对于所述目标对象的拍摄方向,所述角度识别模型为对样本图像与所述样本图像的拍摄角度进行学习训练得到的模型;
    基于所述第一图像的拍摄角度,在角度范围[0,360°)包括的多个角度区间中确定所述第一图像对应的角度区间,并将该第一图像设置为所述角度区间的目标图像;
    基于各个所述角度区间的所述目标图像,对所述目标对象进行三维重建,得到所述目标对象的三维图像。
  9. 一种三维重建系统,其包括:重建服务器和第一摄像机,所述重建服务器包括:权利要求8所述的三维重建装置。
  10. 根据权利要求9所述的系统,其还包括:试衣镜;
    在检测到目标对象时,所述试衣镜配置为向所述重建服务器发送获取请求,所述获取请求携带有所述目标对象的信息;
    所述重建服务器用于基于所述目标对象的信息,向所述试衣镜发送获取响应,所述获取响应携带有所述目标对象的三维图像。
  11. 一种模型训练方法,配置为训练角度识别模型,所述方法包括:
    执行多次训练过程,直至所述角度识别模型对训练图像集中的样本图像的拍摄角度分类准确度达到设定阈值,其中,所述训练过程包括:
    获取第二摄像机采集的包含样本对象的样本图像以及与所述样本图像对应的深度图;
    获取所述深度图中的样本对象中的第一关键点和第二关键点;
    基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度,所述拍摄角度用于表征所述第二摄像机拍摄所述样本图像时相对于所述样本对象的方向;
    向深度学习模型输入样本图像,得到所述样本图像的预测拍摄角度,根据所述样本图像的拍摄角度和预测拍摄角度,确定所述拍摄角度的分类准确度。
  12. 根据权利要求11所述的方法,其中,基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度,包括:
    采用角度计算公式计算所述样本图像的拍摄角度,所述角度计算公式为:
    V1=(x2−x1, 0, z2−z1),α = arccos(V2·VZ/(|V2|·|VZ|));
    其中,所述第一关键点的三维坐标为(x1, y1, z1),所述第二关键点的三维坐标为(x2, y2, z2);V1表示所述第一关键点和所述第二关键点的连线在世界坐标系中的XZ平面内的向量;V2表示与V1垂直的单位向量;VZ表示世界坐标系中与Z轴平行的单位向量;α表示所述拍摄角度。
  13. 根据权利要求11所述的方法,其中,在基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度之后,所述训练过程还包括:
    基于所述样本图像,判断所述样本对象相对于所述第二摄像机的朝向姿态是否为背对朝向姿态;
    当所述样本对象相对于所述第二摄像机的朝向姿态为所述背对朝向姿态时,采用修正计算公式对所述拍摄角度进行修正处理,得到修正后的拍摄角度,所述修正计算公式为:
    α1=α2+180°;
    其中,α1为修正后的拍摄角度;α2为修正前的拍摄角度。
  14. 根据权利要求11至13任一所述的方法,其中,基于所述第一关键点的三维坐标和所述第二关键点的三维坐标,确定所述样本图像的拍摄角度之前,所述训练过程还包括:
    判断所述第一关键点和所述第二关键点之间的距离是否小于距离阈值;
    若所述第一关键点和所述第二关键点之间的距离小于所述距离阈值,确定所述样本图像的拍摄角度为指定角度,所述指定角度为位于固定范围的角度区间内的任一角度。
  15. 一种非易失性的计算机可读存储介质,其中,所述存储介质中存储有代码指令,所述代码指令由处理器执行,以执行权利要求1至7任一所述的三维重建方法。
PCT/CN2020/077850 2019-04-24 2020-03-04 三维重建方法及装置、系统、模型训练方法、存储介质 WO2020215898A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/043,061 US11403818B2 (en) 2019-04-24 2020-03-04 Three-dimensional reconstruction method, apparatus and system, model training method and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910333474.0A CN111862296B (zh) 2019-04-24 2019-04-24 三维重建方法及装置、系统、模型训练方法、存储介质
CN201910333474.0 2019-04-24

Publications (1)

Publication Number Publication Date
WO2020215898A1 true WO2020215898A1 (zh) 2020-10-29

Family

ID=72940873

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/077850 WO2020215898A1 (zh) 2019-04-24 2020-03-04 三维重建方法及装置、系统、模型训练方法、存储介质

Country Status (3)

Country Link
US (1) US11403818B2 (zh)
CN (1) CN111862296B (zh)
WO (1) WO2020215898A1 (zh)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210365673A1 (en) * 2020-05-19 2021-11-25 Board Of Regents, The University Of Texas System Method and apparatus for discreet person identification on pocket-size offline mobile platform with augmented reality feedback with real-time training capability for usage by universal users
CN112734856A (zh) * 2021-01-05 2021-04-30 恒信东方文化股份有限公司 Method and system for determining the shooting angle of clothing
CN114125304A (zh) * 2021-11-30 2022-03-01 维沃移动通信有限公司 Photographing method and apparatus
CN114125304B (zh) * 2021-11-30 2024-04-30 维沃移动通信有限公司 Photographing method and apparatus

Also Published As

Publication number Publication date
CN111862296B (zh) 2023-09-29
CN111862296A (zh) 2020-10-30
US11403818B2 (en) 2022-08-02
US20210375036A1 (en) 2021-12-02

Similar Documents

Publication Publication Date Title
WO2020215898A1 (zh) Three-dimensional reconstruction method, apparatus and system, model training method, and storage medium
CN109934065B (zh) Method and apparatus for gesture recognition
JP6942488B2 (ja) Image processing apparatus, image processing system, image processing method, and program
US11403874B2 (en) Virtual avatar generation method and apparatus for generating virtual avatar including user selected face property, and storage medium
JP6685827B2 (ja) Image processing apparatus, image processing method, and program
JP2019087229A (ja) Information processing apparatus, control method of information processing apparatus, and program
CN111382613B (zh) Image processing method, apparatus, device, and medium
JP4951498B2 (ja) Face image recognition apparatus, face image recognition method, face image recognition program, and recording medium storing the program
CN106981078B (zh) Gaze correction method and apparatus, intelligent conference terminal, and storage medium
JP2004094491A (ja) Face orientation estimation apparatus, face orientation estimation method, and face orientation estimation program
CN111160291B (zh) Human eye detection method based on depth information and CNN
JP2019096113A (ja) Processing apparatus, method, and program for keypoint data
JP2019117577A (ja) Program, learning processing method, learning model, data structure, learning apparatus, and object recognition apparatus
WO2021218568A1 (zh) Image depth determination method, living-body recognition method, circuit, device, and medium
WO2022174594A1 (zh) Multi-camera-based bare-hand tracking and display method, apparatus, and system
US20210319234A1 (en) Systems and methods for video surveillance
EP3791356B1 (en) Perspective distortion correction on faces
CN112200056B (zh) Face liveness detection method and apparatus, electronic device, and storage medium
US20160093028A1 (en) Image processing method, image processing apparatus and electronic device
JP2018113021A (ja) Information processing apparatus, control method therefor, and program
US11615549B2 (en) Image processing system and image processing method
WO2021134311A1 (zh) Method and apparatus for switching photographed object, and image processing method and apparatus
WO2022237048A1 (zh) Pose acquisition method and apparatus, electronic device, storage medium, and program
JP6798609B2 (ja) Video analysis apparatus, video analysis method, and program
EP3699865B1 (en) Three-dimensional face shape derivation device, three-dimensional face shape deriving method, and non-transitory computer readable medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20793950

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20793950

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06/05/2022)
