WO2020215898A1 - Three-dimensional reconstruction method and device, system, model training method, storage medium - Google Patents
Three-dimensional reconstruction method and device, system, model training method, storage medium
- Publication number
- WO2020215898A1 (PCT/CN2020/077850)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- angle
- dimensional
- target object
- target
- Prior art date
Classifications
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/10—Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
- G06T7/55—Depth or shape recovery from multiple images
- G06T5/00—Image enhancement or restoration
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/60—Analysis of geometric attributes
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
- G06Q30/0641—Electronic shopping [e-shopping]: shopping interfaces
- G06Q30/0643—Graphical representation of items or shoppers
- A47G1/02—Mirrors used as equipment
- G06T2207/10016—Video; Image sequence
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30196—Human being; Person
- G06T2207/30244—Camera pose
Definitions
- the embodiments of the present disclosure relate to a three-dimensional reconstruction method and device, system, model training method, and storage medium.
- Some multifunctional fitting mirrors can show a fitting effect without the user actually trying on clothes, which makes fitting more convenient and saves fitting time.
- the three-dimensional image is usually obtained by performing three-dimensional reconstruction processing on target images containing the target object, captured by a camera at various shooting angles. Each shooting angle falls within one of the angle intervals obtained by dividing the angle range [0, 360°).
- to reconstruct the target object, target images containing it must be obtained at multiple shooting angles. For example, the camera shoots the target object at 15° intervals in clockwise or counterclockwise sequence, and the 24 acquired frames of target images are then processed for 3D reconstruction.
- the inventor found that in the current three-dimensional reconstruction process, the target images obtained by the camera must be sorted by an algorithm to ensure that the angle intervals corresponding to two adjacent target images are adjacent. As a result, the amount of calculation during three-dimensional reconstruction is relatively large, and the efficiency of obtaining three-dimensional images is low.
- Various embodiments of the present disclosure provide a three-dimensional reconstruction method, which includes:
- Determining a shooting angle of the first image, where the shooting angle is used to characterize the shooting direction relative to the target object when the first camera captures the first image;
- the determining of the shooting angle of the first image includes:
- the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image.
- the method further includes:
- Labeling the first image with a label, the label being used to mark a target object in the first image;
- before performing three-dimensional reconstruction on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object, the method further includes:
- the first image corresponding to each of the angle intervals is acquired.
- the method further includes:
- image quality scoring processing is performed on the multiple frames of first images to obtain the image quality score of each frame of the first images;
- the method further includes:
- the first image is modified to an image of a specified resolution, and the specified resolution is greater than or equal to the resolution threshold.
- the three-dimensional reconstruction of the target object based on the target image of each angle interval to obtain the three-dimensional image of the target object includes:
- each of the multiple angle intervals has a corresponding target image
- perform three-dimensional reconstruction on the target object to obtain a three-dimensional image of the target object.
- the three-dimensional reconstruction of the target object based on the target image of each angle interval to obtain the three-dimensional image of the target object includes:
- if the three-dimensional image is an incomplete three-dimensional image, repairing the incomplete three-dimensional image to obtain a repaired three-dimensional image.
- Various embodiments of the present disclosure provide a three-dimensional reconstruction device, including:
- a first acquiring module configured to acquire a first image acquired by a first camera, the first image including a target object
- a first determining module configured to determine a shooting angle of the first image using an angle recognition model, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first camera captures the first image;
- the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image;
- the second determining module is configured to determine the angle interval corresponding to the first image among a plurality of angle intervals included in the angle range [0, 360°) based on the shooting angle of the first image, and to set the first image as a target image in that angle interval;
- the three-dimensional reconstruction module is configured to perform three-dimensional reconstruction on the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object.
- Various embodiments of the present disclosure provide a three-dimensional reconstruction system, including: a reconstruction server and a first camera, and the reconstruction server includes the above-mentioned three-dimensional reconstruction device.
- the system further includes: a dressing mirror;
- the dressing mirror is configured to send an acquisition request to the reconstruction server, and the acquisition request carries information of the target object;
- the reconstruction server is configured to send an acquisition response to the dressing mirror based on the information of the target object, and the acquisition response carries the three-dimensional image of the target object.
- a model training method configured to train an angle recognition model, and the method includes:
- the shooting angle of the sample image is determined, where the shooting angle is used to characterize the shooting direction of the second camera relative to the sample object when the sample image is captured;
- the sample image is input to the deep learning model to obtain the predicted shooting angle of the sample image, and the classification accuracy of the shooting angle is determined according to the shooting angle and the predicted shooting angle of the sample image.
- determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point includes:
- the angle calculation formula is used to calculate the shooting angle of the sample image, and the angle calculation formula is:
- the three-dimensional coordinates of the first key point are (x₁, y₁, z₁), and the three-dimensional coordinates of the second key point are (x₂, y₂, z₂);
- V₁ represents the vector, in the XZ plane of the world coordinate system, along the line connecting the first key point and the second key point;
- V₂ represents a unit vector perpendicular to V₁;
- V_Z represents a unit vector parallel to the Z axis in the world coordinate system;
- α represents the shooting angle.
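The angle calculation formula itself does not survive in this text, so the following sketch is only an assumption reconstructed from the variable definitions above: V₁ is the XZ-plane vector between the two key points (e.g. the shoulder joints), V₂ a unit vector perpendicular to V₁, and α the angle between V₂ and the Z axis. The function name and the use of `atan2` are illustrative, not the patented formula.

```python
import math

def shooting_angle(p1, p2):
    """Estimate a shooting angle in [0, 360) from two key points of the
    sample object, given as (x, y, z) world coordinates. This is a
    hedged reconstruction, not the formula from the patent (which is an
    image in the original publication). Assumes the two key points are
    not vertically aligned (their XZ projections differ).
    """
    x1, _, z1 = p1
    x2, _, z2 = p2
    # V1: projection onto the XZ plane of the line joining the key points.
    v1 = (x2 - x1, z2 - z1)
    # V2: a unit vector perpendicular to V1 (V1 rotated 90 degrees in the XZ plane).
    n = math.hypot(v1[0], v1[1])
    v2 = (-v1[1] / n, v1[0] / n)
    # alpha: angle between V2 and the unit vector V_Z along the Z axis.
    return math.degrees(math.atan2(v2[0], v2[1])) % 360.0
```

For example, two key points lying along the X axis give an angle of 0°, and rotating the pair rotates the computed angle accordingly.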
- the training process further includes:
- the shooting angle is corrected by using a correction calculation formula to obtain a corrected shooting angle, where the correction calculation formula is:
- α₁ is the shooting angle after correction;
- α₂ is the shooting angle before correction;
- before determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further includes:
- determining that the shooting angle of the sample image is a specified angle, where the specified angle is any one angle within a fixed range.
- Various embodiments of the present disclosure provide a non-volatile computer-readable storage medium in which code instructions are stored, and the code instructions are executed by a processor to execute the above-mentioned three-dimensional reconstruction method.
- Fig. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure
- FIG. 2 is a block diagram of a model training system involved in a model training method according to an embodiment of the present disclosure
- Fig. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure
- Figure 4 is an effect diagram when the first camera shoots the target object
- Fig. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
- Fig. 6 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
- FIG. 7 is a flowchart of a training process according to an embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of joint points of a sample object according to an embodiment of the present disclosure.
- Fig. 9 is a block diagram of a three-dimensional reconstruction device according to an embodiment of the present disclosure.
- Fig. 10 is a block diagram of a three-dimensional reconstruction device according to another embodiment of the present disclosure.
- FIG. 11 is a block diagram of a three-dimensional reconstruction apparatus according to still another embodiment of the present disclosure.
- FIG. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure.
- the three-dimensional reconstruction system 100 may include: at least one first camera 101 and a reconstruction server 102.
- the first camera 101 may generally be a surveillance camera such as an RGB camera or an infrared camera, and multiple first cameras 101 are usually deployed.
- the multiple first cameras 101 may be deployed at different locations in a mall or shop.
- the reconstruction server 102 may be a server, or a server cluster composed of several servers, or a cloud computing service center, or a computer device.
- the first camera 101 can establish a communication connection with the reconstruction server 102.
- the three-dimensional reconstruction system 100 may further include: a dressing mirror 103.
- the fitting mirror 103 can usually be deployed in a store such as a clothing store, and the fitting mirror 103 can provide users with virtual fitting services.
- the fitting mirror 103 can establish a communication connection with the reconstruction server 102.
- FIG. 2 is a block diagram of a model training system involved in a model training method provided by an embodiment of the present disclosure.
- the model training system 200 may include: a second camera 201 and a training server 202.
- the second camera 201 may be a depth camera or a binocular camera.
- the second camera can acquire both a color map (also called an RGB map) and a depth map.
- the pixel value of each pixel in the depth map is a depth value, and the depth value is used to indicate the distance of the corresponding pixel from the second camera.
- the training server 202 may be a server, or a server cluster composed of several servers, or a cloud computing service center, or a computer device.
- the second camera 201 may establish a communication connection with the training server 202 in a wired or wireless manner.
- FIG. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure.
- the three-dimensional reconstruction method is applicable to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1.
- the three-dimensional reconstruction method may include:
- Step S301 Obtain a first image collected by the first camera.
- the first image contains the target object.
- Step S302 Determine a shooting angle of the first image, where the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first image is captured.
- an angle recognition model may be used to determine the shooting angle of the first image, and the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image.
- Step S303 Based on the shooting angle, determine the angle interval corresponding to the first image among the multiple angle intervals included in the angle range [0, 360°), and set the first image as a target image of that angle interval.
- the multiple angle intervals are obtained by dividing the angle range [0, 360°), and the angle values contained in each angle interval are different.
- Figure 4 is an effect diagram when the first camera shoots the target object.
- the first camera 02 shoots the target object 01 from different directions, and the shooting direction of the first camera 02 relative to the target object 01 can be characterized by the shooting angle.
- rotating counterclockwise around the target object 01, every 15° forms one angle interval.
- the angle values included in the angle intervals cover the shooting angle obtained in step S302, and therefore the shooting angle obtained in step S302 belongs to one of the multiple angle intervals.
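The mapping from a shooting angle to its angle interval can be sketched as a simple index computation. This is an illustrative assumption of one way to implement the division described above, not code from the patent:

```python
def angle_interval(shooting_angle, width=15.0):
    """Map a shooting angle in [0, 360) to the index of its angle interval.

    With the 15-degree division of the embodiment this yields 24 intervals:
    index 0 covers [0, 15), index 1 covers [15, 30), and so on. The
    function name and signature are illustrative.
    """
    return int(shooting_angle % 360.0 // width)
```

For instance, a shooting angle of 10° falls in interval 0 ([0, 15°)) and 359.9° falls in interval 23.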
- Step S304 Perform three-dimensional reconstruction processing on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object.
- the shooting angle of the first image can be determined by the angle recognition model, and based on that shooting angle, the angle interval corresponding to the first image among the multiple angle intervals can be determined; the first image is then set as the target image of that angle interval, and three-dimensional reconstruction of the target object can subsequently be performed based on the target image in each angle interval to obtain a three-dimensional image of the target object.
- therefore, when a first image is acquired, its shooting angle is acquired along with it. When the target object is subsequently reconstructed in 3D, there is no need for an additional algorithm to sort the multiple frames of first images: their order can be obtained directly from the shooting angles, which effectively reduces the amount of calculation during 3D reconstruction and improves the efficiency of acquiring 3D images.
- FIG. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
- the three-dimensional reconstruction method is applied to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1.
- the three-dimensional reconstruction method may include:
- Step S401 Acquire a first image collected by a first camera.
- the first image is an image containing the target object collected by the first camera.
- the target object can be a person, an animal, or an object. If the first camera is a surveillance camera deployed in a mall or store, the target object is a person in the mall or store.
- the reconstruction server may extract multiple first images from a single frame of image, where the target object in each first image is different.
- Step S402 Determine whether the resolution of the first image is less than the resolution threshold.
- if the resolution of the first image is less than the resolution threshold, step S403 is executed; if the resolution of the first image is not less than the resolution threshold, step S404 is executed.
- the resolution threshold is: 224 ⁇ 112.
- Step S403 If the resolution of the first image is less than the resolution threshold, delete the first image.
- the reconstruction server may determine whether the resolution of the first image is less than the resolution threshold, and if the reconstruction server determines that the resolution of the first image is less than the resolution threshold, delete the first image.
- Step S404 If the resolution of the first image is not less than the resolution threshold, modify the first image to an image with a specified resolution.
- the specified resolution is greater than or equal to the resolution threshold.
- the first image may be modified to an image with the specified resolution. For example, if the resolution of the first image is greater than the specified resolution, the reconstruction server compresses it to the specified resolution; if the resolution of the first image is less than the specified resolution, the reconstruction server expands it to the specified resolution.
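The decision logic of steps S402 to S404 can be sketched as follows. The 224 × 112 threshold comes from the text; the specified resolution of 448 × 224 is an illustrative assumption (the patent only requires it to be greater than or equal to the threshold), and the actual resizing is left to the caller's image library:

```python
RES_THRESHOLD = (224, 112)   # resolution threshold from step S402
SPECIFIED_RES = (448, 224)   # illustrative specified resolution, assumed here

def filter_resolution(width, height):
    """Return the target resolution for a first image, or None if the
    frame should be deleted (steps S402 to S404). Only the decision
    logic is encoded; rescaling itself is not shown."""
    if width < RES_THRESHOLD[0] or height < RES_THRESHOLD[1]:
        return None            # step S403: below threshold, delete the image
    return SPECIFIED_RES       # step S404: compress or expand to the specified resolution
```

A 300 × 200 frame would be rescaled to the specified resolution, while a 100 × 400 frame would be dropped.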
- Step S405 Mark the first image with a label, where the label is used to mark a target object in the first image.
- a target recognition algorithm can be used to mark the first image.
- the target recognition algorithm may be a pedestrian movement detection algorithm.
- the reconstruction server can mark the first image with a label through the pedestrian movement detection algorithm, where the label is used to mark the target object in the first image.
- the pedestrian movement detection algorithm may analyze at least one of the clothing feature, the face feature, and the morphological feature of the target object, so as to mark the first image with a label.
- Step S406 Based on the tag, classify the first image into an image set.
- first images with the same label contain the same target object. Therefore, the reconstruction server can classify the first images containing the same target object into an image set based on their labels.
- after step S406, the target objects in the first images within the same image set are the same, while the target objects in first images of different image sets are different.
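The grouping in steps S405 and S406 amounts to classifying first images by their target-object label. A minimal sketch, assuming the labels have already been produced by the pedestrian movement detection algorithm:

```python
from collections import defaultdict

def group_by_label(first_images):
    """Classify first images into image sets by target-object label
    (steps S405 to S406). `first_images` is assumed here to be an
    iterable of (label, image) pairs; the pair layout is illustrative."""
    image_sets = defaultdict(list)
    for label, image in first_images:
        image_sets[label].append(image)
    return dict(image_sets)   # one image set per distinct target object
```

All images in a resulting set then share one target object, as step S406 requires.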
- Step S407 Use the angle recognition model to determine the shooting angle of the first image.
- the reconstruction server may use an angle recognition model to determine the shooting angle of each frame of the first image.
- the angle recognition model is a model obtained by learning and training the sample image and the shooting angle of the sample image. The method for obtaining the angle recognition model will be introduced in the subsequent embodiments, and will not be repeated here.
- the reconstruction server uses the angle recognition model to determine the shooting angle of the first image, which may include the following steps:
- Step A1 Input the first image to the angle recognition model.
- Step B1 Receive the angle information output by the angle recognition model.
- Step C1 Determine the angle information as the shooting angle of the first image.
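Steps A1 to C1 can be sketched by treating the angle recognition model as an opaque callable. The callable interface and the normalization to [0, 360°) are assumptions for demonstration, not the patent's implementation:

```python
def determine_shooting_angle(model, first_image):
    """Steps A1 to C1: input the first image to the angle recognition
    model, receive the angle information it outputs, and use that as
    the shooting angle. `model` stands in for the trained deep learning
    model, whose exact interface is not given in this excerpt."""
    angle_info = model(first_image)    # steps A1/B1: input image, receive output
    return float(angle_info) % 360.0   # step C1: treat the output as the shooting angle
```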
- Step S408 Determine the angle interval corresponding to the first image based on its shooting angle, where the angle interval is one of the multiple angle intervals included in the angle range [0, 360°), and set the first image as a target image in that angle interval.
- while the target object is walking, the reconstruction server can obtain in real time the first images containing the target object captured by the first camera from different shooting angles. For example, if every 15° of clockwise or counterclockwise rotation around the target object forms an angle interval, the number of angle intervals is 24.
- the reconstruction server may determine the angle interval corresponding to the first image among multiple angle intervals based on the shooting angle of each frame of the first image, and set the first image as a target image of the angle interval.
- for example, if the shooting angle of the first image is 10°, the angle interval corresponding to the first image is [0, 15°); the first image is set as the target image of the angle interval [0, 15°), and that target image is used for three-dimensional reconstruction.
- Step S409 Based on the image set, for each angle interval, determine whether there are more than two frames of first images in the same image set corresponding to the angle interval.
- since the reconstruction server subsequently refers to the target image corresponding to each angle interval to perform three-dimensional reconstruction of the target object, more than two frames of target images containing the same target object may correspond to one angle interval. If the reconstruction server referred to all of these first images when reconstructing the target object, the efficiency of the three-dimensional reconstruction could suffer. Therefore, based on the image sets, the reconstruction server can determine whether more than two frames of first images in the same image set correspond to an angle interval, that is, whether more than two first images containing the same target object correspond to that angle interval.
- if more than two frames of target images in the same image set correspond to the angle interval, step S410 is executed; otherwise, step S409 is repeated.
- Step S410 If there are more than two frames of first images located in the same image set corresponding to the angle interval, perform image quality scoring processing on more than two frames of first images located in the same image set to obtain the first image of each frame The image quality rating of an image.
- the reconstruction server may use an image quality scoring algorithm to perform image quality scoring on the more than two frames of first images corresponding to the angle interval, obtaining the quality score of each frame of the first images.
- Step S411 Keep the first image with the highest image quality score, set the first image with the highest image quality score as the target image in the angle interval, and delete other first images.
- the higher the quality score, the higher the definition of the first image, and the better the quality of the three-dimensional image obtained when that first image is set as the target image of the corresponding angle interval and used in the subsequent three-dimensional reconstruction. Therefore, the reconstruction server retains the first image with the highest image quality score and deletes the other first images, ensuring that each angle interval corresponds to only one frame of first image with higher definition. This effectively improves the imaging quality of the three-dimensional image obtained in the subsequent 3D reconstruction of the target object, and reduces the number of frames of first images that need to be processed during three-dimensional reconstruction, thereby improving the efficiency of the three-dimensional reconstruction of the target object.
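The selection in steps S410 and S411 keeps only the highest-scored frame per angle interval. A sketch, assuming the quality scores have already been produced by an image quality scoring algorithm (not shown):

```python
def best_per_interval(candidates):
    """Steps S410 to S411: when several first images from the same image
    set fall into one angle interval, keep only the frame with the
    highest image quality score. `candidates` is assumed to map an
    interval index to a list of (quality_score, image) pairs."""
    return {
        interval: max(frames, key=lambda f: f[0])[1]   # top-scored frame wins
        for interval, frames in candidates.items()
    }
```

After this step, each angle interval holds exactly one target image for the image set.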
- Step S412 Perform three-dimensional reconstruction on the target object based on the target image in each angle interval to obtain a three-dimensional image of the target object.
- the three-dimensional reconstruction method may further include: obtaining, based on the image sets, the first images containing the same target object that correspond to each angle interval. It should also be noted that, after obtaining the three-dimensional image of the target object, the reconstruction server may store the three-dimensional image in its memory.
- the reconstruction server can perform three-dimensional reconstruction on the target object that meets the three-dimensional reconstruction conditions.
- the embodiments of the present disclosure are schematically illustrated in the following two optional implementation manners.
- in the first implementation manner, when the reconstruction server determines that a target image corresponds to each of the multiple angle intervals, the target object satisfies the three-dimensional reconstruction condition.
- step S412 may include: if each of the multiple angle intervals has a target image, performing three-dimensional reconstruction on the target object based on the target image of each angle interval to obtain a three-dimensional image of the target object.
- the reconstruction server may use a Structure from Motion (SfM) algorithm to perform three-dimensional reconstruction of the target object to obtain a three-dimensional image of the target object.
- the process by which the reconstruction server determines that the same target object has a target image corresponding to each of the multiple angle intervals may include the following steps:
- Step A2 For each image set, obtain the angle interval corresponding to each frame of the first image in the image set.
- the reconstruction server can determine the angle interval corresponding to each frame of the first image in each image set; in the same image set, each angle interval corresponds to only one frame of the first image, and that first image is set as the target image of the angle interval. Therefore, for each image set, the reconstruction server can obtain, in real time, the angle interval corresponding to each frame of the first image in the image set.
- Step B2 Determine whether the number of angle intervals corresponding to all target images is the same as the number of multiple angle intervals.
- if the number of angle intervals corresponding to all target images is the same as the number of the multiple angle intervals, the reconstruction server can determine that each of the multiple angle intervals has a target image, that is, execute Step C2; if the number of angle intervals corresponding to all target images is different from the number of the multiple angle intervals, the reconstruction server can determine that at least one of the multiple angle intervals does not yet have a target image, and repeat Step A2.
- Step C2 If the number of angle intervals corresponding to all target images is the same as the number of the multiple angle intervals, it is determined that each of the multiple angle intervals has a target image.
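The check in Steps B2 and C2 can be sketched in a few lines of Python; the container name `target_images` and the interval count of 8 are illustrative assumptions, not from the patent.

```python
# Hypothetical sketch of Steps B2/C2: decide whether the target object
# satisfies the 3D-reconstruction condition by checking that every angle
# interval already has a retained target image.
NUM_INTERVALS = 8  # e.g. [0, 360) divided into eight 45-degree intervals

def meets_reconstruction_condition(target_images: dict) -> bool:
    # target_images maps an angle-interval index -> the retained frame
    return len(target_images) == NUM_INTERVALS

frames = {i: f"frame_{i}.jpg" for i in range(7)}   # interval 7 still empty
print(meets_reconstruction_condition(frames))      # False: repeat Step A2
frames[7] = "frame_7.jpg"
print(meets_reconstruction_condition(frames))      # True: Step C2 holds
```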
- the reconstruction server may determine that the target object has a target image in each of the multiple angle intervals; at this point, the target object meets the three-dimensional reconstruction condition.
- in a second implementation, when the reconstruction server receives a three-dimensional reconstruction instruction carrying the information of the target object, the target object meets the three-dimensional reconstruction condition.
- step S412 may include the following steps:
- Step A3 When the three-dimensional reconstruction instruction is received, based on the information of the target object carried by the three-dimensional reconstruction instruction, obtain multiple frames of first images containing the target object.
- the three-dimensional reconstruction system may further include: a dressing mirror, and the three-dimensional reconstruction instruction may be an instruction sent by the dressing mirror.
- when the reconstruction server receives the three-dimensional reconstruction instruction carrying the information of the target object, it may obtain, based on the information of the target object, multiple frames of first images containing the target object.
- the information of the target object may include at least one of clothing features, facial features, and morphological features. When the reconstruction server obtains a first image, it also analyzes at least one of the clothing features, facial features, and morphological features in that image; therefore, the reconstruction server can obtain, based on the information of the target object, the multiple frames of first images containing the target object.
- Step B3 Based on each frame of the first image, perform three-dimensional reconstruction on the target object to obtain a three-dimensional image of the target object. Based on the multiple frames of first images, the corresponding angle intervals of the multiple frames of first images are determined, and the multiple frames of first images are determined as the target images in the corresponding angle intervals;
- the reconstruction server may use the SFM algorithm to perform three-dimensional reconstruction of the target object based on the first image containing the target object in each frame to obtain the three-dimensional image of the target object.
- Step C3 Determine whether the three-dimensional image is an incomplete three-dimensional image.
- since the number of frames of first images on which the reconstruction server bases the reconstruction may be small, that is, at least one of the multiple angle intervals may lack a first image containing the target object,
- the three-dimensional image obtained after three-dimensional reconstruction may be an incomplete three-dimensional image; for example, the three-dimensional image may contain holes at the angles that lack a first image.
- the reconstruction server can determine whether the three-dimensional image is an incomplete three-dimensional image. If the three-dimensional image is an incomplete three-dimensional image, perform step D3; if the three-dimensional image is a complete three-dimensional image, the action ends.
- Step D3 If the three-dimensional image is an incomplete three-dimensional image, repair the incomplete three-dimensional image to obtain a repaired three-dimensional image.
- in order to obtain a three-dimensional image with higher image quality, after determining that the three-dimensional image is an incomplete three-dimensional image, the reconstruction server needs to repair the incomplete three-dimensional image.
- the reconstruction server can repair the three-dimensional image according to the general structural regularities of the three-dimensional human body.
- step S407 can be performed first, and then steps S405 to S406 can be performed; steps can also be added or removed as the situation requires.
- the shooting angle of the first image can be determined through the angle recognition model; based on the shooting angle, the angle interval of the first image can be determined among the multiple angle intervals, and the first image is set as the target image of that angle interval; then, based on the target image corresponding to each angle interval, three-dimensional reconstruction may be performed on the target object to obtain a three-dimensional image of the target object.
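As a minimal sketch of the interval lookup described above (the interval count of 8 is an assumed example; the text only requires that the intervals partition [0, 360°)):

```python
def angle_to_interval(angle: float, num_intervals: int = 8) -> int:
    """Map a shooting angle in degrees to its angle-interval index,
    assuming equal-width intervals over [0, 360)."""
    width = 360.0 / num_intervals
    return int((angle % 360.0) // width)

print(angle_to_interval(0.0))     # 0
print(angle_to_interval(90.0))    # 2  (falls in [90, 135))
print(angle_to_interval(359.9))   # 7
```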
- since the shooting angle of the first image is also acquired when the first image is acquired, when the target object is subsequently reconstructed in three dimensions, there is no need to use an additional algorithm to sort the multiple frames of target images; the order of the multiple frames of first images can be obtained directly based on the shooting angles, which effectively reduces the amount of calculation during three-dimensional reconstruction and improves the efficiency of acquiring three-dimensional images.
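The ordering argument above can be illustrated directly: because each frame carries its shooting angle, a plain sort by angle recovers the sequence without any image-matching algorithm (the file names here are made up):

```python
# Hypothetical frames tagged with their recognized shooting angles.
frames = [("b.jpg", 135.0), ("c.jpg", 315.0), ("a.jpg", 45.0)]

# The reconstruction order falls out of a single key sort.
ordered = [name for name, _ in sorted(frames, key=lambda f: f[1])]
print(ordered)  # ['a.jpg', 'b.jpg', 'c.jpg']
```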
- FIG. 6 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure.
- the three-dimensional reconstruction method is applicable to the three-dimensional reconstruction system 100 shown in FIG. 1, and the three-dimensional reconstruction method may include:
- Step S501 The first camera collects an image.
- the first camera may be a surveillance camera; there may be multiple first cameras, which may be deployed at different locations in a mall or a store.
- Step S502 The first camera sends the collected image to the reconstruction server.
- the first camera may send the real-time collected images to the reconstruction server, so that the reconstruction server can perform three-dimensional reconstruction of the target object.
- Step S503 The reconstruction server performs three-dimensional reconstruction on the target object based on the image collected by the first camera to obtain a three-dimensional image of the target object.
- for the process by which the reconstruction server performs three-dimensional reconstruction of the target object based on the images collected by the first camera to obtain the three-dimensional image of the target object, refer to the related content in the aforementioned steps S401 to S412, which will not be repeated here.
- Step S504 The dressing mirror sends an acquisition request to the reconstruction server, and the acquisition request carries information of the target object.
- the target object is a person standing in front of the dressing mirror.
- the dressing mirror needs to obtain the three-dimensional image of the target object from the reconstruction server.
- the dressing mirror may be provided with a dressing mirror camera.
- the dressing mirror camera can collect information of the target object located in front of the dressing mirror, and send an acquisition request to the reconstruction server, and the acquisition request carries the information of the target object.
- Step S505 The reconstruction server sends an acquisition response to the dressing mirror based on the information of the target object, and the acquisition response carries a three-dimensional image of the target object.
- the reconstruction server may first determine whether the three-dimensional image of the target object is stored.
- the information of the target object may be face information.
- when the reconstruction server obtains the three-dimensional image of the target object, it also obtains the face information of the target object; therefore, the reconstruction server can determine, based on the face information, whether it stores a three-dimensional image of the target object.
- if so, the reconstruction server can determine that it stores a three-dimensional image of the target object, and at this time may send an acquisition response carrying the three-dimensional image of the target object to the dressing mirror.
- otherwise, the reconstruction server may send a response to the dressing mirror indicating that the three-dimensional image of the target object is not stored in the reconstruction server; based on this response, the dressing mirror may send the reconstruction server a three-dimensional reconstruction instruction carrying the information of the target object; the reconstruction server may then perform three-dimensional reconstruction on the target object based on the three-dimensional reconstruction instruction, and send the dressing mirror an acquisition response carrying the three-dimensional image of the target object.
- for the process by which the reconstruction server performs three-dimensional reconstruction of the target object based on the three-dimensional reconstruction instruction, refer to the corresponding process in step S412 above, which will not be repeated here.
- Step S506 The dressing mirror provides a virtual fitting service to the target object based on the acquisition response.
- the dressing mirror may provide a virtual fitting service to the target object based on the acquisition response.
- the image quality of the three-dimensional image of the target object carried in the acquisition response may be poor;
- for example, when the reconstruction server acquires the three-dimensional image based on only a small number of frames of images containing the target object,
- the image quality of the resulting three-dimensional image is poor. Therefore, the dressing mirror can analyze the image quality of the three-dimensional image of the target object carried in the response to determine whether it will affect the provision of the virtual fitting service to the target object. If it will, the dressing mirror sends out a voice message prompting the target object to rotate in a circle.
- the images of the target object can then be re-acquired at different shooting angles through the dressing mirror camera, and the reconstruction server can reconstruct the target object in three dimensions based on each frame of image,
- so that the imaging quality of the obtained three-dimensional image of the target object is higher.
- the first camera is arranged in a store or a shopping mall
- the reconstruction server can obtain, in real time, images collected by the first cameras that contain the user at different shooting angles, can directly perform three-dimensional reconstruction once the three-dimensional reconstruction condition is satisfied, and can then send the obtained three-dimensional image to the dressing mirror.
- when the user uses the dressing mirror, there is no need to rotate in a circle in front of the dressing mirror or to wait for three-dimensional reconstruction; a three-dimensional image of the user can be obtained directly, which improves the user experience.
- the shooting angle of an image can be determined through the angle recognition model; based on the shooting angle, the angle interval corresponding to the image can be determined among the multiple angle intervals; subsequently, three-dimensional reconstruction can be performed based on the images of the same target object corresponding to each angle interval to obtain the three-dimensional image of the target object.
- the shooting angle of the image is acquired at the same time as the image is acquired.
- the order of the multiple frames of images can be obtained directly based on the shooting angles, which effectively reduces the amount of calculation during three-dimensional reconstruction and improves the efficiency of acquiring three-dimensional images.
- when the user uses the dressing mirror, the user does not need to rotate in a circle in front of the dressing mirror or wait for three-dimensional reconstruction; a three-dimensional image of the user can be obtained directly, which improves the user experience.
- the embodiment of the present disclosure also provides a model training method, which is used to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6.
- This model training method is applied to the training server 202 in the model training system 200 shown in FIG. 2.
- the model training method may include:
- FIG. 7 is a flowchart of a training process according to at least one embodiment of the present disclosure.
- the training process can include:
- Step S601 Obtain a sample image containing a sample object and a depth map corresponding to the sample image collected by the second camera.
- the sample object may be a person, an animal, or an object
- the training server may use a second camera to collect a sample image containing the sample object and a depth map corresponding to the sample image.
- the second camera may be a camera including a depth camera, or a binocular camera.
- the second camera may be a device with a depth camera, such as a Kinect device. It should be noted that the second camera can collect the depth map and the color map simultaneously; therefore, after the second camera captures the sample object, the training server can simultaneously obtain the sample image containing the sample object collected by the second camera and the depth map corresponding to the sample image.
- since the color map and depth map obtained by the second camera after shooting the sample object contain not only the sample object but also other background content, in order to facilitate subsequent image processing, after the second camera shoots the sample object, the training server also needs to crop the acquired depth map and color map so that the cropped sample image and its corresponding depth map contain only the sample object.
- Step S602 Obtain the first key point and the second key point of the sample object from the depth map.
- the first key point and the second key point of the sample object may be two shoulder joint points of the person, respectively.
- the Kinect device can collect all the joint points of the sample object. For example, as shown in FIG. 8, the Kinect device can collect 14 joint points of the sample object.
- the training server can obtain two shoulder joint points a and b of the sample object in the depth map.
- Step S603 Determine the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point.
- the shooting angle is used to characterize the direction when the second camera shoots the sample object.
- the angle between the direction perpendicular to the line connecting the first key point and the second key point and the Z-axis direction of the world coordinate system can be determined as the shooting angle of the sample image.
- the Z-axis direction in the world coordinate system is usually parallel to the optical axis direction of the second camera.
- the training server may determine the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point.
- the X-axis and Y-axis components of the key point's three-dimensional coordinates can determine the position of the key point in the depth map, and the key point's three-dimensional coordinate component on the Z axis can determine the depth value of the key point. It should be noted that after obtaining the sample image and its corresponding depth map, the training server can determine the three-dimensional coordinates of any point in the depth map.
- determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point may include: calculating the shooting angle of the sample image by using an angle calculation formula, where the angle calculation formula is:
- the three-dimensional coordinates of the first key point are (x 1 , y 1 , z 1 ), and the three-dimensional coordinates of the second key point are (x 2 , y 2 , z 2 );
- V 1 represents the vector, in the XZ plane of the world coordinate system, of the line connecting the first key point and the second key point;
- V 2 represents a unit vector perpendicular to V 1 ;
- V Z represents a unit vector parallel to the Z axis in the world coordinate system;
- θ represents the shooting angle.
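The formula itself is not legible in this text, so the following is a hedged reconstruction from the parameter definitions above: V1 is the key-point line projected into the world XZ plane, V2 a unit vector perpendicular to V1 in that plane, and θ the angle between V2 and the Z axis. The choice of which of the two perpendicular directions to take for V2 is an assumption.

```python
import math

def shooting_angle(p1, p2):
    """Angle (degrees) between V2 and the world Z axis, where V2 is a unit
    vector perpendicular, in the XZ plane, to the key-point line V1."""
    x1, _, z1 = p1
    x2, _, z2 = p2
    vx, vz = x2 - x1, z2 - z1        # V1 projected into the XZ plane
    n = math.hypot(vx, vz)
    v2x, v2z = -vz / n, vx / n       # one of the two unit normals (assumed sign)
    cos_theta = v2z                  # dot(V2, VZ) with VZ = (0, 0, 1)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Shoulders level along X, person facing the camera: 0 degrees.
print(round(shooting_angle((-0.2, 1.5, 2.0), (0.2, 1.5, 2.0))))  # 0
# Shoulders along Z (profile view): 90 degrees.
print(round(shooting_angle((0.0, 1.5, 1.8), (0.0, 1.5, 2.2))))   # 90
```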
- since the training server determines the shooting angle of the sample image based only on the three-dimensional coordinates of the first key point and the second key point, there may be two frames of sample images that have the same shooting angle but different shooting directions of the second camera.
- the shooting angle when the second camera shoots the sample object in the current shooting direction is the same as the shooting angle when the second camera shoots the sample object after the shooting direction is rotated by 180°. Therefore, in order to distinguish two sample images with the same shooting angle but different shooting directions of the second camera, after step S603, the training process may further include:
- Step A4 Based on the sample image, determine whether the orientation posture of the sample object relative to the second camera is the back-facing posture.
- the training server may determine whether the orientation posture of the sample object relative to the second camera is the back orientation posture or the forward orientation posture based on the sample image.
- when the training server determines that the orientation posture of the sample object relative to the second camera is the back-facing posture, the shooting angle of the sample object needs to be corrected, and step B4 is executed; when the training server determines that the orientation posture of the sample object relative to the second camera is the forward-facing posture, there is no need to correct the shooting angle of the sample object.
- Step B4 When the orientation posture of the sample object relative to the second camera is the back-facing posture, a correction calculation formula is used to correct the shooting angle to obtain the corrected shooting angle.
- the correction calculation formula is:
- θ1 = θ2 + 180°; where θ1 is the shooting angle after correction, and θ2 is the shooting angle before correction.
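A one-line sketch of the Step B4 correction; wrapping the result back into [0, 360°) with a modulo is an added assumption, since the formula as stated only gives the +180° offset.

```python
def correct_angle(theta2: float) -> float:
    """theta1 = theta2 + 180 degrees, wrapped into [0, 360) (wrap assumed)."""
    return (theta2 + 180.0) % 360.0

print(correct_angle(30.0))    # 210.0
print(correct_angle(270.0))   # 90.0
```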
- the training server may determine that the orientation posture of the sample object relative to the second camera is the back-facing posture, and correct the shooting angle of the sample image accordingly, so that the shooting angles of any two sample images captured by the second camera in different shooting directions are also different.
- when the first key point and the second key point are close together, the accuracy with which the training server determines the shooting angle of the sample image based on their three-dimensional coordinates is low. Therefore, in order to improve the accuracy of the shooting angle of the sample image, before step S603, the training process may also include:
- Step A5 Determine whether the distance between the first key point and the second key point is less than a distance threshold.
- the first key point and the second key point in the sample object may be the two shoulder joint points a and b of the person, and the distance threshold It can be the distance between the head joint point c and the neck joint point d.
- the training server can calculate the distance between the two shoulder joint points a and b of the sample object in the depth map and compare it with the distance threshold (that is, the distance between the head joint point c and the neck joint point d) to determine whether the distance between the first key point and the second key point is less than the distance threshold. If the distance between the first key point and the second key point is less than the distance threshold, step B5 is executed; if it is not less than the distance threshold, the above step S603 is executed.
- Step B5 If the distance between the first key point and the second key point is less than the distance threshold, it is determined that the shooting angle of the sample image is the specified angle.
- the specified angle is any angle within the angle interval of the fixed range.
- the training server may determine the shooting angle of the sample image as the specified angle.
- the specified angle may be 90° or 270°.
- the training server needs to determine the orientation posture of the sample object relative to the second camera based on the sample image, and determine, based on that orientation posture, whether the shooting angle of the sample image is 90° or 270°.
- the orientation posture of the sample object relative to the second camera may further include: a rightward orientation posture and a leftward orientation posture. When the orientation posture of the sample object relative to the second camera is rightward orientation posture, the shooting angle of the sample image is 90°; when the orientation posture of the sample object relative to the second camera is leftward orientation posture, the sample The shooting angle of the image is 270°.
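Steps A5/B5 and the left/right disambiguation above can be sketched as follows; all point values and the `facing_right` flag are illustrative, and the distance threshold follows the head-to-neck definition given in the text.

```python
import math

def profile_angle(shoulder_a, shoulder_b, head, neck, facing_right):
    """If the shoulder key points are closer than the head-neck distance,
    the person is seen in profile: pin the angle to 90 or 270 degrees.
    Returns None when the regular angle formula of step S603 applies."""
    if math.dist(shoulder_a, shoulder_b) < math.dist(head, neck):
        return 90.0 if facing_right else 270.0
    return None

# Near-profile view: shoulders almost overlap in the depth map.
print(profile_angle((0.0, 1.5, 2.0), (0.05, 1.5, 2.0),
                    (0.0, 1.8, 2.0), (0.0, 1.6, 2.0), True))    # 90.0
# Frontal view: shoulder distance well above the threshold.
print(profile_angle((-0.2, 1.5, 2.0), (0.2, 1.5, 2.0),
                    (0.0, 1.8, 2.0), (0.0, 1.6, 2.0), True))    # None
```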
- Step S604 Input the sample image to the deep learning model to obtain the predicted shooting angle of the sample image, and determine the classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.
- the deep learning model can learn the correspondence between the sample image and the shooting angle. After the deep learning model has learned this correspondence, the predicted shooting angle of the sample image can be obtained, and the classification accuracy of the shooting angle is determined according to the shooting angle and the predicted shooting angle of the sample image. The training server can then determine whether the classification accuracy is greater than or equal to a set threshold: when it is, the training on the sample image ends, and new sample images can be input to the deep learning model; when the classification accuracy is less than the set threshold, step S604 is repeated and the sample image is input to the deep learning model again.
- the angle recognition model can be obtained by performing the training process of step S601 to step S604 multiple times, until the accuracy with which the angle recognition model classifies the shooting angles of the sample images in the training image set reaches the set threshold.
- the loss value LOSS of the loss function can be determined according to the shooting angle and the predicted shooting angle of the sample image, and the loss value of the loss function can be determined by the following calculation formula:
- a represents the predicted shooting angle of the sample image
- CE represents cross entropy
- MSE represents mean square error
- the other parameter configuration of the deep learning model is as follows: the resolution of the input sample image is 224 ⁇ 112, the optimizer used is the Adam optimizer, and the number of iterations is 50 times. Among them, one iteration means that the deep learning model learns the correspondence between the sample image and the shooting angle of the sample image once.
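The exact LOSS formula is not legible in this text; only its terms are named, cross entropy (CE) and mean square error (MSE) over the predicted shooting angle. A plausible, explicitly assumed combination is a simple sum, sketched here in NumPy:

```python
import numpy as np

def loss(pred_logits, pred_angle, true_class, true_angle):
    """Assumed LOSS = CE(class prediction) + MSE(angle regression).
    The split into a classification head and a regression head is an
    illustrative assumption, not stated in the text."""
    z = pred_logits - pred_logits.max()          # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    ce = -np.log(probs[true_class])              # cross entropy
    mse = (pred_angle - true_angle) ** 2         # mean square error
    return ce + mse

# Uniform logits over 8 angle classes and a perfect angle prediction:
print(float(loss(np.zeros(8), 45.0, 3, 45.0)))   # ln(8), about 2.079
```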
- the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5 or FIG. 6 can be obtained through the above steps.
- the angle recognition model can output the shooting angle of the target image.
- At least one embodiment of the present disclosure provides a three-dimensional reconstruction device, including: a processor; and
- a memory storing program code executable by the processor, wherein when the program code is executed by the processor, the processor is configured to:
- the angle recognition model is used to determine the shooting angle of the first image, the shooting angle is used to characterize the shooting direction of the first camera relative to the target object when the first camera shoots the first image, and the angle recognition model is a model obtained by learning and training on sample images and the shooting angles of the sample images;
- FIG. 9 shows a block diagram of a three-dimensional reconstruction device according to an embodiment of the present disclosure.
- the three-dimensional reconstruction apparatus 700 can be integrated in the reconstruction server 102 in the three-dimensional reconstruction system 100 as shown in FIG. 1.
- the three-dimensional reconstruction device 700 may include:
- the first acquisition module 701 is configured to acquire a target image collected by a first camera, where the target image is an image containing the target object.
- the first determining module 702 is configured to determine the shooting angle of the target image using an angle recognition model, where the shooting angle is used to characterize the shooting direction when the first camera shoots the target image, and the angle recognition model is a model trained by learning sample images and the shooting angles of the sample images.
- the second determining module 703 is configured to determine an angle interval corresponding to the target image in a plurality of angle intervals included in the angle range [0, 360°) based on the shooting angle.
- the three-dimensional reconstruction module 704 is configured to perform three-dimensional reconstruction processing on the target object based on the target image containing the target object corresponding to each angle interval to obtain a three-dimensional image of the target object.
- the first determining module 702 is configured to: input a target image to the angle recognition model; receive angle information output by the angle recognition model; and determine the angle information as a shooting angle.
- Fig. 10 is a block diagram of a three-dimensional reconstruction apparatus according to another embodiment of the present disclosure.
- the three-dimensional reconstruction device 700 may further include:
- the marking module 705 is configured to mark the target image to obtain a label of the target image, and the label is used to mark the target object in the target image.
- the marking module 705 uses a target recognition algorithm to mark the target image.
- the classification module 706 is configured to classify the target image containing the target object into an image set based on the tag of the target image.
- the second acquisition module 707 is configured to acquire a target image corresponding to each angle interval and containing the target object based on the image collection.
- the three-dimensional reconstruction apparatus 700 may further include:
- the first judging module 708 is configured to determine, based on the image set, whether each angle interval corresponds to more than two target images located in the image set.
- the scoring module 709 is configured to, if more than two frames of target images in the same image set correspond to one angle interval, perform image quality scoring on those target images in the image set to obtain the image quality score of each frame of target image.
- the first deletion module 710 is configured to retain the target image with the highest image quality score and delete other target images.
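The behavior of the scoring module 709 and the deletion module 710 can be sketched as a per-interval maximum; the tuple layout is an illustrative assumption.

```python
def retain_best(scored_frames):
    """Keep, for each angle interval, only the frame with the highest
    image-quality score; all other frames are dropped (i.e. 'deleted')."""
    best = {}
    for interval, frame, score in scored_frames:
        if interval not in best or score > best[interval][1]:
            best[interval] = (frame, score)
    return {interval: frame for interval, (frame, _) in best.items()}

scored = [(0, "a.jpg", 0.7), (0, "b.jpg", 0.9), (1, "c.jpg", 0.5)]
print(retain_best(scored))  # {0: 'b.jpg', 1: 'c.jpg'}
```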
- Fig. 11 is a block diagram of a three-dimensional reconstruction device according to another embodiment of the present disclosure.
- the three-dimensional reconstruction device 700 may further include:
- the second judging module 711 is configured to determine whether the resolution of the target image is less than a resolution threshold.
- the second deleting module 712 is configured to delete the target image if the resolution of the target image is less than the resolution threshold.
- the modification module 713 is configured to modify the target image to an image of a specified resolution if the resolution of the target image is not less than the resolution threshold, and the specified resolution is greater than or equal to the resolution threshold.
- the three-dimensional reconstruction module 704 is configured to: if each of the multiple angle intervals corresponds to a target image containing the target object, perform three-dimensional reconstruction on the target object based on the target image corresponding to each angle interval to obtain a three-dimensional image of the target object.
- the three-dimensional reconstruction module 704 is configured to: when a three-dimensional reconstruction instruction is received, obtain, based on the information of the target object to be reconstructed carried in the instruction, multiple frames of target images containing the target object to be reconstructed; perform three-dimensional reconstruction on the target object to be reconstructed based on each frame of target image to obtain its three-dimensional image; determine whether the three-dimensional image is an incomplete three-dimensional image; and, if it is, repair the incomplete three-dimensional image to obtain a repaired three-dimensional image.
- the shooting angle of the target image can be determined through the angle recognition model, and based on the shooting angle, the angle interval corresponding to the target image can be determined in multiple angle intervals, Subsequently, a three-dimensional reconstruction process may be performed on the target object based on the target image corresponding to each angle interval and containing the target object to obtain a three-dimensional image of the target object.
- in this way, when each frame of the target image is acquired, its shooting angle is also obtained. When the target object is subsequently reconstructed in 3D, no additional algorithm is needed to sort the multi-frame target images: their order follows directly from the shooting angles, which effectively reduces the amount of calculation during 3D reconstruction and improves the efficiency of acquiring 3D images.
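The interval-based ordering described above can be sketched in a few lines. This is an illustrative sketch, not the patent's implementation: the choice of 12 intervals of 30° each is an assumed example (the text only fixes the overall range [0, 360°)), and `interval_of` / `bucket_frames` are hypothetical helper names.

```python
NUM_INTERVALS = 12                      # assumed example: 12 intervals of 30 degrees
INTERVAL_WIDTH = 360 / NUM_INTERVALS

def interval_of(shooting_angle: float) -> int:
    """Map a shooting angle in [0, 360) to the index of its angle interval."""
    return int(shooting_angle % 360 // INTERVAL_WIDTH)

def bucket_frames(frames):
    """frames: iterable of (frame_id, shooting_angle) pairs.

    Returns {interval_index: frame_id}. Because the key is the interval
    index, iterating the buckets in index order yields the frames already
    ordered by shooting direction -- no separate sorting pass is needed.
    """
    buckets = {}
    for frame_id, angle in frames:
        buckets[interval_of(angle)] = frame_id   # keep the latest frame per interval
    return buckets

def ready_for_reconstruction(buckets) -> bool:
    """True once every angle interval has a target image (cf. module 704)."""
    return len(buckets) == NUM_INTERVALS
```

Reconstruction would then consume `[buckets[i] for i in sorted(buckets)]`, which is the angle-ordered frame sequence the passage refers to.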
- the embodiment of the present disclosure also provides a model training device.
- the model training device can be integrated in the training server 202 in the model training system 200 shown in FIG. 2.
- the model training device is configured to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5 or FIG.
- the model training device may include:
- the training module is configured to perform the training process multiple times, until the classification accuracy of the angle recognition model for the shooting angles of the sample images in the training image set reaches a set threshold.
- the training process can include:
- determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the second key point includes:
- the angle calculation formula is used to calculate the shooting angle of the sample image.
- the angle calculation formula is:
- the three-dimensional coordinates of the first key point are (x1, y1, z1), and the three-dimensional coordinates of the second key point are (x2, y2, z2);
- V1 represents the vector, in the XZ plane of the world coordinate system, along the line connecting the first key point and the second key point;
- V2 represents a unit vector perpendicular to V1;
- VZ represents a unit vector parallel to the Z axis in the world coordinate system;
- α represents the shooting angle.
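A minimal numeric sketch of the quantities defined above. The closed-form angle formula itself is not reproduced in this excerpt, so the sketch rests on an assumption: that α is recovered as the angle between V2 and VZ via the arccosine of their dot product. `shooting_angle` is a hypothetical helper name, with the two key points taken as, e.g., the shoulder joints.

```python
import math

def shooting_angle(p1, p2):
    """p1, p2: (x, y, z) world coordinates of the first and second key points."""
    x1, _, z1 = p1
    x2, _, z2 = p2
    # V1: the key-point line projected into the XZ plane of the world frame
    v1 = (x2 - x1, z2 - z1)
    # V2: unit vector perpendicular to V1 (a 90-degree rotation within the XZ plane)
    norm = math.hypot(*v1)
    v2 = (-v1[1] / norm, v1[0] / norm)
    # VZ is the unit vector along Z, so dot(V2, VZ) is simply V2's z component
    cos_alpha = v2[1]
    # assumption: alpha is the angle between V2 and VZ, clamped against rounding error
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_alpha))))
```

Note that arccos only yields values in [0°, 180°], i.e. half a turn; the correction described later (α1 = α2 + 180° when the sample object faces away from the camera) is what extends the result to the full [0°, 360°) range.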
- before determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the second key point, the training process further includes:
- judging whether the distance between the first key point and the second key point is less than a distance threshold; if the distance is less than the distance threshold, determining the shooting angle of the sample image to be a specified angle, where the specified angle is any angle within an angle interval of a fixed range.
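The pre-check above guards the degenerate case where the two key points nearly coincide, which makes the direction of the key-point line unstable. A sketch, with the threshold value and the specified angle chosen as assumed examples since the excerpt fixes neither:

```python
import math

DISTANCE_THRESHOLD = 0.05   # assumed example value, in the depth map's units
SPECIFIED_ANGLE = 90.0      # assumed: "any angle within the fixed-range interval"

def robust_shooting_angle(p1, p2, compute_angle):
    """Fall back to the specified angle when the key points are too close
    for a reliable direction estimate; otherwise defer to compute_angle."""
    if math.dist(p1, p2) < DISTANCE_THRESHOLD:
        return SPECIFIED_ANGLE
    return compute_angle(p1, p2)
```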
- At least one embodiment of the present disclosure also provides a three-dimensional reconstruction system, which may include: a reconstruction server and a first camera.
- the structure of the three-dimensional reconstruction system can refer to the structure shown in the three-dimensional reconstruction system shown in FIG. 1.
- the reconstruction server may include: the three-dimensional reconstruction apparatus 700 shown in FIG. 9, FIG. 10 or FIG. 11.
- the three-dimensional reconstruction system includes: a fitting mirror.
- the fitting mirror is configured to send an acquisition request to the reconstruction server when a target object is detected, where the acquisition request is used to request the three-dimensional image of the target object from the reconstruction server and carries information of the target object;
- the reconstruction server is configured to send an acquisition response to the fitting mirror based on the information of the target object, where the acquisition response carries the three-dimensional image of the target object.
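The mirror/server exchange described above is a simple request/response lookup. The sketch below only illustrates the data flow; the field names (`target_object`, `three_d_image`) are assumptions, as the excerpt says only that the request carries the target object's information and the response carries its three-dimensional image.

```python
def handle_acquisition_request(request: dict, image_store: dict) -> dict:
    """Reconstruction-server side: answer a mirror's acquisition request
    with the stored 3D image of the identified target object."""
    target_info = request["target_object"]      # assumed field name
    return {
        "target_object": target_info,
        "three_d_image": image_store.get(target_info),  # None if not yet reconstructed
    }
```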
- At least one embodiment of the present disclosure also provides a model training system.
- the model training system may include a training server and a second camera.
- the structure of the model training system can refer to the structure shown in the model training system shown in FIG. 2.
- the training server may include: the training module shown in the foregoing embodiment.
- At least one embodiment of the present disclosure also provides a non-volatile computer-readable storage medium in which code instructions are stored; the code instructions are executed by a processor to perform the three-dimensional reconstruction method shown in the above embodiments, for example, the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG.
- At least one embodiment of the present disclosure also provides a computer-readable storage medium.
- the storage medium is a non-volatile storage medium.
- the storage medium stores code instructions, and the code instructions are executed by a processor to perform the model training method shown in the foregoing embodiments, for example, the training process shown in FIG. 7.
- first and second are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance.
- plurality refers to two or more, unless specifically defined otherwise.
- the program can be stored in a computer-readable storage medium.
- the storage medium mentioned may be, for example, a read-only memory, a magnetic disk, or an optical disk.
Abstract
Description
Claims (15)
- A three-dimensional reconstruction method, comprising: acquiring a first image captured by a first camera, the first image being an image containing a target object; determining a shooting angle of the first image, the shooting angle being used to characterize the shooting direction of the first camera relative to the target object when the first camera captures the first image; determining, based on the shooting angle, an angle interval corresponding to the first image among a plurality of angle intervals included in the angle range [0, 360°), and setting the first image as a target image of that angle interval; and performing three-dimensional reconstruction on the target object based on the target images of the respective angle intervals, to obtain a three-dimensional image of the target object.
- The method according to claim 1, wherein determining the shooting angle of the first image comprises: inputting the first image into an angle recognition model; receiving angle information output by the angle recognition model; and determining the angle information as the shooting angle; wherein the angle recognition model is a model obtained by learning and training on sample images and the shooting angles of the sample images.
- The method according to claim 1, wherein after acquiring the first image captured by the first camera, the method further comprises: marking the first image with a label, the label being used to mark the target object in the first image; and classifying, based on the label of the first image, first images containing the target object into an image set; and wherein, before performing three-dimensional reconstruction on the target object based on the target images of the respective angle intervals to obtain the three-dimensional image of the target object, the method further comprises: acquiring, based on the image set, the first image corresponding to each angle interval.
- The method according to claim 3, wherein after determining, based on the shooting angle, the angle interval corresponding to the first image among the plurality of angle intervals included in the angle range [0, 360°), the method further comprises: for each angle interval, judging whether the image set contains multiple frames of first images corresponding to the angle interval; if the image set contains multiple frames of first images corresponding to the angle interval, performing image quality scoring on the multiple frames of first images to obtain an image quality score for each frame of the first images; and retaining the first image with the highest image quality score and deleting the other first images.
- The method according to any one of claims 1 to 4, wherein after acquiring the first image captured by the first camera, the method further comprises: judging whether the resolution of the first image is less than a resolution threshold; if the resolution of the first image is less than the resolution threshold, deleting the first image; and if the resolution of the first image is not less than the resolution threshold, modifying the first image into an image of a specified resolution, the specified resolution being greater than or equal to the resolution threshold.
- The method according to any one of claims 1 to 4, wherein performing three-dimensional reconstruction on the target object based on the target images of the respective angle intervals to obtain the three-dimensional image of the target object comprises: when each of the plurality of angle intervals has a corresponding target image, performing three-dimensional reconstruction on the target object based on the target image of each angle interval, to obtain the three-dimensional image of the target object.
- The method according to any one of claims 1 to 4, wherein performing three-dimensional reconstruction on the target object based on the target images of the respective angle intervals to obtain the three-dimensional image of the target object comprises: when a three-dimensional reconstruction instruction is received, acquiring multiple frames of first images based on information of the target object carried in the three-dimensional reconstruction instruction; determining, based on the multiple frames of first images, the angle intervals to which they correspond, and setting each of the multiple frames of first images as the target image of its corresponding angle interval; performing three-dimensional reconstruction on the target object based on the target images of the corresponding angle intervals, to obtain the three-dimensional image of the target object; and judging whether the three-dimensional image is an incomplete three-dimensional image; if the three-dimensional image is an incomplete three-dimensional image, repairing the incomplete three-dimensional image to obtain a repaired three-dimensional image.
- A three-dimensional reconstruction apparatus, comprising: a processor; and a memory storing program code executable by the processor, wherein the program code, when executed by the processor, configures the processor to: acquire a first image captured by a first camera, the first image containing a target object; determine a shooting angle of the first image by using an angle recognition model, the shooting angle being used to characterize the shooting direction of the first camera relative to the target object when capturing the first image, and the angle recognition model being a model obtained by learning and training on sample images and the shooting angles of the sample images; determine, based on the shooting angle of the first image, the angle interval corresponding to the first image among a plurality of angle intervals included in the angle range [0, 360°), and set the first image as a target image of that angle interval; and perform three-dimensional reconstruction on the target object based on the target images of the respective angle intervals, to obtain a three-dimensional image of the target object.
- A three-dimensional reconstruction system, comprising: a reconstruction server and a first camera, the reconstruction server comprising the three-dimensional reconstruction apparatus according to claim 8.
- The system according to claim 9, further comprising: a fitting mirror; wherein the fitting mirror is configured to, upon detecting a target object, send an acquisition request to the reconstruction server, the acquisition request carrying information of the target object; and the reconstruction server is configured to send, based on the information of the target object, an acquisition response to the fitting mirror, the acquisition response carrying a three-dimensional image of the target object.
- A model training method, configured to train an angle recognition model, the method comprising: performing a training process multiple times until the classification accuracy of the angle recognition model for the shooting angles of sample images in a training image set reaches a set threshold, wherein the training process comprises: acquiring a sample image containing a sample object captured by a second camera and a depth map corresponding to the sample image; acquiring a first key point and a second key point of the sample object in the depth map; determining a shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the shooting angle being used to characterize the direction of the second camera relative to the sample object when capturing the sample image; and inputting the sample image into a deep learning model to obtain a predicted shooting angle of the sample image, and determining the classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.
- The method according to claim 11, wherein after determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further comprises: judging, based on the sample image, whether the orientation posture of the sample object relative to the second camera is a back-facing posture; and when the orientation posture of the sample object relative to the second camera is the back-facing posture, performing correction processing on the shooting angle by using a correction calculation formula to obtain a corrected shooting angle, the correction calculation formula being: α1 = α2 + 180°, where α1 is the corrected shooting angle and α2 is the shooting angle before correction.
- The method according to any one of claims 11 to 13, wherein before determining the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further comprises: judging whether the distance between the first key point and the second key point is less than a distance threshold; and if the distance between the first key point and the second key point is less than the distance threshold, determining the shooting angle of the sample image to be a specified angle, the specified angle being any angle within an angle interval of a fixed range.
- A non-volatile computer-readable storage medium, wherein code instructions are stored in the storage medium, and the code instructions are executed by a processor to perform the three-dimensional reconstruction method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/043,061 US11403818B2 (en) | 2019-04-24 | 2020-03-04 | Three-dimensional reconstruction method, apparatus and system, model training method and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910333474.0A CN111862296B (zh) | 2019-04-24 | 2019-04-24 | 三维重建方法及装置、系统、模型训练方法、存储介质 |
CN201910333474.0 | 2019-04-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020215898A1 true WO2020215898A1 (zh) | 2020-10-29 |
Family
ID=72940873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/077850 WO2020215898A1 (zh) | 2019-04-24 | 2020-03-04 | 三维重建方法及装置、系统、模型训练方法、存储介质 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11403818B2 (zh) |
CN (1) | CN111862296B (zh) |
WO (1) | WO2020215898A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734856A (zh) * | 2021-01-05 | 2021-04-30 | 恒信东方文化股份有限公司 | 一种服装的拍摄角度的确定方法及系统 |
US20210365673A1 (en) * | 2020-05-19 | 2021-11-25 | Board Of Regents, The University Of Texas System | Method and apparatus for discreet person identification on pocket-size offline mobile platform with augmented reality feedback with real-time training capability for usage by universal users |
CN114125304A (zh) * | 2021-11-30 | 2022-03-01 | 维沃移动通信有限公司 | 拍摄方法及其装置 |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10586379B2 (en) * | 2017-03-08 | 2020-03-10 | Ebay Inc. | Integration of 3D models |
US11727656B2 (en) | 2018-06-12 | 2023-08-15 | Ebay Inc. | Reconstruction of 3D model with immersive experience |
CN112907726B (zh) * | 2021-01-25 | 2022-09-20 | 重庆金山医疗技术研究院有限公司 | 一种图像处理方法、装置、设备及计算机可读存储介质 |
CN112884638A (zh) * | 2021-02-02 | 2021-06-01 | 北京东方国信科技股份有限公司 | 虚拟试衣方法及装置 |
CN112819953B (zh) * | 2021-02-24 | 2024-01-19 | 北京创想智控科技有限公司 | 三维重建方法、网络模型训练方法、装置及电子设备 |
CN113141498B (zh) * | 2021-04-09 | 2023-01-06 | 深圳市慧鲤科技有限公司 | 信息生成方法、装置、计算机设备及存储介质 |
CN114821497A (zh) * | 2022-02-24 | 2022-07-29 | 广州文远知行科技有限公司 | 目标物位置的确定方法、装置、设备及存储介质 |
CN114999644B (zh) * | 2022-06-01 | 2023-06-20 | 江苏锦业建设工程有限公司 | 一种建筑人员疫情防控可视化管理系统及管理方法 |
CN115222814B (zh) * | 2022-06-02 | 2023-09-01 | 珠海云洲智能科技股份有限公司 | 救援设备导引方法、装置、终端设备及存储介质 |
CN115361500A (zh) * | 2022-08-17 | 2022-11-18 | 武汉大势智慧科技有限公司 | 用于三维建模的图像获取方法、系统及三维建模方法 |
CN117671402A (zh) * | 2022-08-22 | 2024-03-08 | 华为技术有限公司 | 识别模型训练方法、装置以及可移动智能设备 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090160858A1 (en) * | 2007-12-21 | 2009-06-25 | Industrial Technology Research Institute | Method for reconstructing three dimensional model |
CN102244732A (zh) * | 2011-07-01 | 2011-11-16 | 深圳超多维光电子有限公司 | 一种立体摄像机的参数设置方法、装置及该立体摄像机 |
CN104992441A (zh) * | 2015-07-08 | 2015-10-21 | 华中科技大学 | 一种面向个性化虚拟试衣的真实人体三维建模方法 |
CN108053476A (zh) * | 2017-11-22 | 2018-05-18 | 上海大学 | 一种基于分段三维重建的人体参数测量系统及方法 |
CN109272576A (zh) * | 2018-09-30 | 2019-01-25 | Oppo广东移动通信有限公司 | 一种数据处理方法、mec服务器、终端设备及装置 |
CN109377524A (zh) * | 2018-10-29 | 2019-02-22 | 山东师范大学 | 一种单幅图像深度恢复方法和系统 |
CN109461180A (zh) * | 2018-09-25 | 2019-03-12 | 北京理工大学 | 一种基于深度学习的三维场景重建方法 |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1131495C (zh) * | 1996-08-29 | 2003-12-17 | 三洋电机株式会社 | 特征信息赋予方法及装置 |
CN101271582B (zh) * | 2008-04-10 | 2010-06-16 | 清华大学 | 基于多视角二维图像并结合sift算法的三维重建方法 |
US8963829B2 (en) * | 2009-10-07 | 2015-02-24 | Microsoft Corporation | Methods and systems for determining and tracking extremities of a target |
US8861800B2 (en) * | 2010-07-19 | 2014-10-14 | Carnegie Mellon University | Rapid 3D face reconstruction from a 2D image and methods using such rapid 3D face reconstruction |
EP2906062B1 (en) * | 2013-01-16 | 2016-03-16 | Van de Velde NV | Fitting room mirror |
CN104966316B (zh) * | 2015-05-22 | 2019-03-15 | 腾讯科技(深圳)有限公司 | 一种3d人脸重建方法、装置及服务器 |
CN106709746A (zh) * | 2015-11-17 | 2017-05-24 | 北京三件客科技有限公司 | 3d扫描、模型测量一体化的互联网服装定制系统 |
CN105427369A (zh) * | 2015-11-25 | 2016-03-23 | 努比亚技术有限公司 | 移动终端及其三维形象的生成方法 |
CN105760836A (zh) * | 2016-02-17 | 2016-07-13 | 厦门美图之家科技有限公司 | 基于深度学习的多角度人脸对齐方法、系统及拍摄终端 |
CN107103641A (zh) * | 2017-03-23 | 2017-08-29 | 微景天下(北京)科技有限公司 | 三维重建成像系统和三维重建成像方法 |
CN107392086B (zh) * | 2017-05-26 | 2020-11-03 | 深圳奥比中光科技有限公司 | 人体姿态的评估装置、系统及存储装置 |
CN109522775B (zh) * | 2017-09-19 | 2021-07-20 | 杭州海康威视数字技术股份有限公司 | 人脸属性检测方法、装置及电子设备 |
CN108229332B (zh) * | 2017-12-08 | 2020-02-14 | 华为技术有限公司 | 骨骼姿态确定方法、装置及计算机可读存储介质 |
CN108921000B (zh) * | 2018-04-16 | 2024-02-06 | 深圳市深网视界科技有限公司 | 头部角度标注、预测模型训练、预测方法、设备和介质 |
US11386614B2 (en) * | 2018-06-15 | 2022-07-12 | Google Llc | Shading images in three-dimensional content system |
CN108960093A (zh) * | 2018-06-21 | 2018-12-07 | 阿里体育有限公司 | 脸部转动角度的识别方法及设备 |
CN109389054A (zh) * | 2018-09-21 | 2019-02-26 | 北京邮电大学 | 基于自动图像识别和动作模型对比的智能镜子设计方法 |
US11062454B1 (en) * | 2019-04-16 | 2021-07-13 | Zoox, Inc. | Multi-modal sensor data association architecture |
-
2019
- 2019-04-24 CN CN201910333474.0A patent/CN111862296B/zh active Active
-
2020
- 2020-03-04 US US17/043,061 patent/US11403818B2/en active Active
- 2020-03-04 WO PCT/CN2020/077850 patent/WO2020215898A1/zh active Application Filing
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210365673A1 (en) * | 2020-05-19 | 2021-11-25 | Board Of Regents, The University Of Texas System | Method and apparatus for discreet person identification on pocket-size offline mobile platform with augmented reality feedback with real-time training capability for usage by universal users |
CN112734856A (zh) * | 2021-01-05 | 2021-04-30 | 恒信东方文化股份有限公司 | 一种服装的拍摄角度的确定方法及系统 |
CN114125304A (zh) * | 2021-11-30 | 2022-03-01 | 维沃移动通信有限公司 | 拍摄方法及其装置 |
CN114125304B (zh) * | 2021-11-30 | 2024-04-30 | 维沃移动通信有限公司 | 拍摄方法及其装置 |
Also Published As
Publication number | Publication date |
---|---|
CN111862296B (zh) | 2023-09-29 |
CN111862296A (zh) | 2020-10-30 |
US11403818B2 (en) | 2022-08-02 |
US20210375036A1 (en) | 2021-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020215898A1 (zh) | 三维重建方法及装置、系统、模型训练方法、存储介质 | |
CN109934065B (zh) | 一种用于手势识别的方法和装置 | |
JP6942488B2 (ja) | 画像処理装置、画像処理システム、画像処理方法、及びプログラム | |
US11403874B2 (en) | Virtual avatar generation method and apparatus for generating virtual avatar including user selected face property, and storage medium | |
JP6685827B2 (ja) | 画像処理装置、画像処理方法及びプログラム | |
JP2019087229A (ja) | 情報処理装置、情報処理装置の制御方法及びプログラム | |
CN111382613B (zh) | 图像处理方法、装置、设备和介质 | |
JP4951498B2 (ja) | 顔画像認識装置、顔画像認識方法、顔画像認識プログラムおよびそのプログラムを記録した記録媒体 | |
CN106981078B (zh) | 视线校正方法、装置、智能会议终端及存储介质 | |
JP2004094491A (ja) | 顔向き推定装置および顔向き推定方法ならびに顔向き推定プログラム | |
CN111160291B (zh) | 基于深度信息与cnn的人眼检测方法 | |
JP2019096113A (ja) | キーポイントデータに関する加工装置、方法及びプログラム | |
JP2019117577A (ja) | プログラム、学習処理方法、学習モデル、データ構造、学習装置、および物体認識装置 | |
WO2021218568A1 (zh) | 图像深度确定方法及活体识别方法、电路、设备和介质 | |
WO2022174594A1 (zh) | 基于多相机的裸手追踪显示方法、装置及系统 | |
US20210319234A1 (en) | Systems and methods for video surveillance | |
EP3791356B1 (en) | Perspective distortion correction on faces | |
CN112200056B (zh) | 人脸活体检测方法、装置、电子设备及存储介质 | |
US20160093028A1 (en) | Image processing method, image processing apparatus and electronic device | |
JP2018113021A (ja) | 情報処理装置およびその制御方法、プログラム | |
US11615549B2 (en) | Image processing system and image processing method | |
WO2021134311A1 (zh) | 拍摄对象切换方法及装置、图像处理方法及装置 | |
WO2022237048A1 (zh) | 位姿获取方法、装置、电子设备、存储介质及程序 | |
JP6798609B2 (ja) | 映像解析装置、映像解析方法およびプログラム | |
EP3699865B1 (en) | Three-dimensional face shape derivation device, three-dimensional face shape deriving method, and non-transitory computer readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20793950 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20793950 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06/05/2022) |