WO2021249401A1 - Model generation method, image perspective view determination method, apparatus, device and medium - Google Patents

Model generation method, image perspective view determination method, apparatus, device and medium

Info

Publication number
WO2021249401A1
WO2021249401A1 (PCT/CN2021/098942)
Authority
WO
WIPO (PCT)
Prior art keywords
perspective
image
point cloud
view
acquisition time
Prior art date
Application number
PCT/CN2021/098942
Other languages
English (en)
French (fr)
Inventor
李艳丽
刘冬冬
Original Assignee
北京京东乾石科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东乾石科技有限公司 filed Critical 北京京东乾石科技有限公司
Priority to KR1020227045255A priority Critical patent/KR20230015446A/ko
Priority to US17/928,553 priority patent/US20230351677A1/en
Priority to JP2022564397A priority patent/JP7461504B2/ja
Priority to EP21823023.3A priority patent/EP4131145A4/en
Publication of WO2021249401A1 publication Critical patent/WO2021249401A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/20Perspective computation
    • G06T15/205Image-based rendering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/56Particle system, point based geometry or rendering

Definitions

  • The embodiments of the present application relate to the field of data processing technology, for example, to a model generation method, an image perspective view determination method, an apparatus, a device, and a medium.
  • Point cloud mapping collects point cloud data of the scene to be mapped at each moment with a lidar device, obtains the three-dimensional coordinates of the point cloud data at each moment by surveying and mapping or by simultaneous localization and mapping (SLAM), and then projects and merges the point cloud data of multiple moments according to those three-dimensional coordinates.
  • Pure point cloud mapping only yields the three-dimensional coordinates of the point cloud data at multiple moments, so the information it provides is rather limited.
  • To address this, a camera can be mounted during point cloud mapping so that data are collected synchronously and an image perspective view is generated for the corresponding moment, allowing more applications to be developed through multi-source data fusion.
  • For example, in simulation and reconstruction the lidar device and the camera are spatio-temporally calibrated to obtain colored point cloud data; during map production the image perspective view assists in viewing the real scene; and in intelligent perception the image perspective view improves the recognition of dynamic objects such as lanes and pedestrians.
  • The related technology has the following technical problems: the acquisition of the above image perspective views is time-consuming and labor-intensive. First, a complicated synchronization system of the lidar device and the camera must be built, and the two must be spatio-temporally calibrated, a process that is often cumbersome. Second, to obtain high-quality, omnidirectional image perspective views, the camera used is often expensive; for example, a 360-degree panoramic Ladybug3 costs more than 200,000 yuan. In addition, the quality of the image perspective views collected by the camera is easily affected by environmental factors such as weather, light and shadow: the brightness of images collected in dim light is low, and jitter and blur easily occur when the vehicle moves too fast.
  • the embodiments of the present application provide a model generation method, an image perspective view determination method, device, equipment, and medium, which solve the problem that the acquisition process of the image perspective view is relatively time-consuming and labor-intensive.
  • an embodiment of the present application provides a model generation method, which may include:
  • taking the point cloud perspective view at each image acquisition time point and the image perspective view at each image acquisition time point as a set of training samples, and training the original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.
  • an embodiment of the present application also provides a method for determining an image perspective view, which may include:
  • collecting point cloud data based on the preset collection system, obtaining the coordinate data of the point cloud data and the point cloud acquisition time point, determining the pose matrix corresponding to the point cloud acquisition time point, and generating the point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;
  • an embodiment of the present application also provides a model generation device, which may include:
  • the data collection module is configured to collect point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the multiple frames of image perspective views;
  • the first generation module is configured to determine a pose matrix corresponding to each of the multiple image acquisition time points, and to generate the point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that time point;
  • the second generation module is configured to take the point cloud perspective view and the image perspective view at each image acquisition time point as a set of training samples, and to train the original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.
  • an embodiment of the present application also provides a device for determining an image perspective view, which may include:
  • the third generation module is configured to collect point cloud data based on the preset collection system, obtain the coordinate data of the point cloud data and the point cloud acquisition time point, determine the pose matrix corresponding to the point cloud acquisition time point, and generate the point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;
  • the image perspective view determination module is configured to obtain the image conversion model generated by the model generation method provided in any embodiment of this application, input the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud acquisition time point according to the output result of the image conversion model.
  • an embodiment of the present application also provides a device, which may include:
  • at least one processor;
  • a memory configured to store at least one program;
  • when the at least one program is executed by the at least one processor, the at least one processor implements the model generation method or the image perspective view determination method provided in any embodiment of the present application.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model generation method or the image perspective view determination method provided in any embodiment of the present application.
  • FIG. 1 is a flowchart of a model generation method in Embodiment 1 of the present application
  • FIG. 2 is a first schematic diagram of point cloud mapping in a model generation method in Embodiment 1 of the present application;
  • FIG. 3a is a second schematic diagram of point cloud mapping in a model generation method in Embodiment 1 of the present application;
  • 3b is a schematic diagram of a perspective view of a point cloud in a model generation method in Embodiment 1 of the present application;
  • FIG. 4a is a schematic diagram of an original neural network model used for single frame conversion in a model generation method in Embodiment 1 of the present application;
  • FIG. 4b is a schematic diagram of an original neural network model used for sequence frame conversion in a model generation method in Embodiment 1 of the present application;
  • FIG. 5 is a flowchart of an image perspective view determination method in the second embodiment of the present application.
  • Fig. 6 is a structural block diagram of a model generating device in the third embodiment of the present application.
  • FIG. 7 is a structural block diagram of a device for determining a perspective image of an image in the fourth embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a device in Embodiment 5 of the present application.
  • Fig. 1 is a flowchart of a model generation method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of generating an image conversion model for converting a point cloud perspective view into an image perspective view.
  • the method may be executed by the model generation device provided in the embodiment of the present application, and the device may be implemented by software and/or hardware, and the device may be integrated on various user terminals or servers.
  • the method of the embodiment of the present application includes the following steps:
  • S110 Collect the point cloud data and the multi-frame image perspective based on the preset collection system, to obtain coordinate data of the point cloud data, and multiple image collection time points one-to-one corresponding to the multi-frame image perspective.
  • The point cloud data are data of the scene to be mapped collected by a point cloud collection device in the preset collection system, for example point cloud data collected by a lidar scanning device, obtained by thinning (downsampling) a virtual scene, or obtained by multi-view reconstruction;
  • the image perspective view is a perspective view acquired by an image acquisition device in the preset collection system.
  • The image acquisition device may be a spherical panoramic camera, a wide-angle camera, an ordinary distortion-free perspective camera, and so on; correspondingly, the acquired image perspective view may be a spherical panoramic image, a wide-angle image, an ordinary undistorted perspective image, and so on, which is not limited here.
  • After the point cloud data are collected, a map can be built from them, as shown in Figure 2, and the coordinate data of multiple point cloud data are obtained during the mapping process.
  • The mapping method may be surveying and mapping, SLAM, and so on, which is not limited here; correspondingly, after the image perspective views are acquired, the image acquisition time point of each frame of image perspective view can be obtained, the image acquisition time point being the time point at which that image perspective view was captured.
  • The pose matrix is the matrix of the point cloud collection device at a given image acquisition time point, expressed in the coordinate system in which the coordinate data of the point cloud data are located; it includes a rotation matrix and a translation vector.
  • In practice, if the map is built by surveying and mapping, the pose matrix can be obtained from the integrated inertial navigation data; if the map is built with SLAM, the pose matrix can be provided by the SLAM algorithm.
  • After the pose matrix is obtained, the local coordinate system of the image acquisition device at the image acquisition time point (that is, at the acquisition position it occupies at that time) can be derived from the pose matrix, so that the coordinate data of the point cloud data can be transformed into that local coordinate system.
  • The point cloud perspective view at that image acquisition time point can then be obtained from the transformed coordinate data.
  • the point cloud data in the to-be-built scene obtained after point cloud mapping is shown in Figure 3a
  • the point cloud perspective view synthesized based on the point cloud data and the corresponding pose matrix is shown in Figure 3b.
  • Optionally, the above pose matrix can be determined as follows: obtain the pose trajectory of the preset collection system from the point cloud data.
  • The pose trajectory can be obtained during the mapping of the point cloud data and reflects how the pose of the preset collection system changes as it moves; the pose can include position and attitude.
  • In practice, if the map is built by surveying and mapping, the poses of the preset collection system at multiple acquisition time points can be obtained from the integrated inertial navigation unit in the system; if the map is built with SLAM, the poses at multiple acquisition time points can be obtained from the SLAM algorithm.
  • The pose trajectory is then sampled at the multiple image acquisition time points, and the pose matrix corresponding to each image acquisition time point is obtained from the sampling result.
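  • As a rough illustration of this sampling step, the sketch below interpolates a pose matrix at an image acquisition timestamp from a timestamped pose trajectory. The trajectory format, the SciPy-based interpolation and the helper name sample_pose are assumptions for illustration, not part of the patent.

```python
# Sample the pose trajectory at an image acquisition time point: linear interpolation of the
# translation and spherical interpolation (slerp) of the rotation between neighboring poses.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def sample_pose(trajectory_times, trajectory_poses, image_time):
    """trajectory_poses: (N, 4, 4) poses sorted by trajectory_times (N,); returns a 4x4 pose."""
    image_time = float(np.clip(image_time, trajectory_times[0], trajectory_times[-1]))
    i = int(np.clip(np.searchsorted(trajectory_times, image_time), 1, len(trajectory_times) - 1))
    t0, t1 = trajectory_times[i - 1], trajectory_times[i]
    w = (image_time - t0) / (t1 - t0)
    pose = np.eye(4)
    pose[:3, 3] = (1 - w) * trajectory_poses[i - 1][:3, 3] + w * trajectory_poses[i][:3, 3]
    slerp = Slerp([t0, t1], Rotation.from_matrix([trajectory_poses[i - 1][:3, :3],
                                                  trajectory_poses[i][:3, :3]]))
    pose[:3, :3] = slerp([image_time]).as_matrix()[0]        # interpolated rotation
    return pose                                              # pose matrix at the image time point
```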
  • Since each image acquisition time point has both a point cloud perspective view and an image perspective view, the two can be used as a set of training samples, with the point cloud perspective view as the actual input data and the image perspective view as the expected output data.
  • the original neural network model can be trained based on multiple sets of training samples, and an image conversion model for converting a point cloud perspective view into an image perspective view can be generated.
  • the original neural network model is any untrained convolutional neural network model that can convert a point cloud perspective view into an image perspective view.
  • A schematic diagram of an optional original neural network model is shown in Figure 4a; it is an image conversion model from a single-frame point cloud perspective view to a single-frame image perspective view.
  • The solid boxes are data layers: Mt in the data layer is the point cloud perspective view, whose dimension is H*W*C, where C is the number of attribute channels of the point cloud data (for example C=2 for intensity plus semantic information, or C=3 for color information R/G/B); It is the image perspective view, whose dimension is H*W*3, the 3 channels being the color information (R/G/B).
  • The dashed boxes are network layers, whose neurons may include convolutional layers cxx_kx_sx, the excitation layer leakyPeLU, convolutional block layers ResnetXtBlock_cxx_xx, the upsampling layer PixelShuffle, the excitation layer tanh, and so on.
  • For example, the convolutional layer c32_k3_s2 convolves with 32 convolution kernels of size 3x3 (k3) and stride 2 (s2); the other convolutional layers are named analogously. The parameter of the excitation layer leakyPeLU may be 0.2 or another value.
  • The convolutional block layer ResnetXtBlock_c256_x10 is obtained by connecting 10 ResnetXtBlock units in series; its internal convolutional layers may uniformly use 3x3 (k3) kernels with stride 2 (s2), or other kernels may be used; c256 is the number of convolution kernels.
  • PixelShuffle performs 2x upsampling.
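  • For readers who find a concrete network easier to follow, here is a minimal PyTorch-style sketch in the spirit of Figure 4a: a strided convolutional encoder, a stack of residual blocks standing in for ResnetXtBlock_c256_x10, and PixelShuffle upsampling ending in tanh. The channel counts, block design and class names are illustrative assumptions, not the patented architecture.

```python
# Minimal single-frame point-cloud-perspective -> image-perspective converter (sketch).
import torch
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Simplified residual block standing in for ResnetXtBlock_c256."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1))
    def forward(self, x):
        return x + self.body(x)

class PointCloud2Image(nn.Module):
    def __init__(self, in_channels: int = 2, num_blocks: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(                        # c32_k3_s2-style strided downsampling
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 128, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.blocks = nn.Sequential(*[ResNeXtBlock(256) for _ in range(num_blocks)])
        self.decoder = nn.Sequential(                        # PixelShuffle performs 2x upsampling
            nn.Conv2d(256, 512, 3, padding=1), nn.PixelShuffle(2), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 3, padding=1), nn.PixelShuffle(2), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 12, 3, padding=1), nn.PixelShuffle(2),   # -> 3 RGB channels
            nn.Tanh())
    def forward(self, point_cloud_view):                     # point_cloud_view: N x C x H x W
        return self.decoder(self.blocks(self.encoder(point_cloud_view)))

# Usage: convert a 2-channel (intensity + semantics) point cloud perspective view.
model = PointCloud2Image(in_channels=2)
predicted_image = model(torch.randn(1, 2, 256, 512))          # 1 x 3 x 256 x 512, values in [-1, 1]
```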
  • According to the technical solution of the embodiments of the present application, from the point cloud data and the multiple frames of image perspective views collected by the preset collection system, the coordinate data of the point cloud data and the image acquisition time points corresponding one-to-one to the frames of image perspective views can be obtained. Further, after the pose matrix corresponding to each image acquisition time point is determined, the point cloud perspective view at each image acquisition time point can be generated from that pose matrix and the coordinate data; that is, the point cloud data of the three-dimensional scene points are projected into a virtual camera at the image acquisition time point to form the point cloud perspective view. The point cloud perspective view and the image perspective view at each image acquisition time point are then used as a set of training samples, and the original neural network model is trained on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.
  • The above technical solution uses the point cloud perspective view projected from the point cloud data to guide the synthesis of the image perspective view, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of obtaining high-quality image perspective views in a simple and low-cost way.
  • In an optional technical solution, after the point cloud perspective view is generated, the pixels corresponding to the point cloud data in the point cloud perspective view can be obtained and the attribute information of the point cloud data assigned to those pixels.
  • The attribute information may be intensity information, semantic information, color information, and so on; for example, the intensity information can be obtained from the reflections of the lidar scanning device, and the semantic information can be obtained from point cloud parsing.
  • The benefit of this step is that camera imaging is the projection of the three-dimensional scene points of the scene to be mapped onto the camera film, and each pixel of the resulting image perspective view records the color information (R/G/B) of a three-dimensional scene point; correspondingly, the point cloud perspective view reconstructs the projection of the three-dimensional scene points onto the camera film, and each of its pixels records the attribute information of a three-dimensional scene point. There is therefore a strong correlation between the point cloud perspective view and the image perspective view, and this correlation improves the accuracy of synthesizing an image perspective view from a point cloud perspective view.
  • When the point cloud data in the scene to be mapped are projected onto the pixels of the point cloud perspective view, the relationship may be many-to-one. If multiple point cloud data correspond to one pixel, the attribute information of the point cloud data closest to the camera can be assigned to that pixel. This matches how the human eye views a scene: when a front three-dimensional scene point occludes the ones behind it, the eye only sees the front point (that is, the point cloud data closest to the camera among the multiple point cloud data) and cannot see the occluded points behind (that is, the remaining point cloud data projected to that pixel).
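  • A minimal sketch of this "closest point wins" assignment is given below; for each pixel it keeps the attributes of the projected point with the smallest depth, in the manner of a z-buffer. Array names and shapes are assumptions for illustration.

```python
# Assign point attributes to pixels, keeping only the point nearest the camera per pixel.
import numpy as np

def splat_attributes(pixel_uv, depths, attributes, height, width):
    """pixel_uv: (N, 2) integer pixel coords; depths: (N,); attributes: (N, C)."""
    channels = attributes.shape[1]
    image = np.zeros((height, width, channels), dtype=attributes.dtype)
    zbuffer = np.full((height, width), np.inf)
    for (u, v), depth, attr in zip(pixel_uv, depths, attributes):
        if 0 <= v < height and 0 <= u < width and depth < zbuffer[v, u]:
            zbuffer[v, u] = depth          # nearer point overwrites farther one
            image[v, u] = attr
    return image
```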
  • In an optional technical solution, the synthesis of the point cloud perspective view projects all point cloud data within a certain range around the real or virtual camera onto the camera film, thereby simulating the imaging of the real three-dimensional scene points; for example, taking the camera position as the center, all point cloud data within a circle of 500-meter radius are projected onto the camera film.
  • In other words, according to projective geometry, the point cloud perspective view is the perspective view formed by projecting real three-dimensional scene points onto the photographic film according to the perspective relationship. The image acquisition device may therefore be a preset camera or a spherical panoramic camera, where the preset camera may be a perspective camera or a wide-angle camera.
  • When the poses of the point cloud collection device and the image acquisition device are consistent, and the image perspective view is collected by the preset camera in the preset collection system, the three-dimensional coordinate data P_W_3d of the point cloud data in the world coordinate system are projected to the two-dimensional coordinate data P_C_2d(t_C) in the preset camera coordinate system collected at each image acquisition time point t_C as P_C_2d(t_C) = K_c M_W→L(t_C) P_W_3d, where M_W→L(t_C) is the pose matrix of the point cloud collection device in the world coordinate system at t_C and K_c is the intrinsic matrix of the preset camera; the point cloud perspective view at t_C is generated from the multiple P_C_2d(t_C).
  • Similarly, if the image perspective view is collected by a spherical panoramic camera, the three-dimensional scene points are projected onto a sphere, and unrolling the sphere surface by longitude and latitude gives the spherical panorama: the P_W_3d of the point cloud data are first transformed into the spherical panoramic camera coordinate system collected at t_C as P_3d = M_W→L(t_C) P_W_3d with P_3d = [x_3d, y_3d, z_3d], then mapped onto the sphere of radius R and unrolled by longitude and latitude to obtain P_C_2d(t_C); the point cloud perspective view at t_C is generated from the multiple P_C_2d(t_C).
  • Here R is the sphere radius of the spherical panoramic camera, and P_C_2d(t_C) is the two-dimensional coordinate data of the pixel onto which the point cloud data with three-dimensional coordinates P_W_3d are projected in the point cloud perspective view (that is, in the spherical panorama).
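  • The following sketch illustrates the two projection cases under simple assumptions (NumPy, 4x4 homogeneous pose matrices). The pinhole branch follows P_C_2d(t_C) = K_c M_W→L(t_C) P_W_3d from the text; the spherical branch uses a generic longitude/latitude unrolling on a sphere of radius R as a stand-in, since the exact panorama formula of the original filing is given only as a figure.

```python
# Project world-frame points into a pinhole camera or onto a spherical panorama (sketch).
import numpy as np

def project_pinhole(points_world, pose_w2l, K):
    """points_world: (N, 3); pose_w2l: (4, 4) world-to-local pose matrix; K: (3, 3) intrinsics."""
    homo = np.hstack([points_world, np.ones((len(points_world), 1))])
    local = (pose_w2l @ homo.T).T[:, :3]                     # P_3d = M_W->L(t_C) * P_W_3d
    uvw = (K @ local.T).T
    return uvw[:, :2] / uvw[:, 2:3], local[:, 2]             # pixel coordinates and depth

def project_spherical(points_world, pose_w2l, radius):
    """Stand-in panorama mapping: longitude/latitude arc lengths on a sphere of radius R."""
    homo = np.hstack([points_world, np.ones((len(points_world), 1))])
    x, y, z = (pose_w2l @ homo.T)[:3]
    dist = np.maximum(np.sqrt(x * x + y * y + z * z), 1e-9)
    lon = np.arctan2(x, z)                                   # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / dist, -1.0, 1.0))            # latitude in [-pi/2, pi/2]
    u = (lon + np.pi) * radius                               # unroll longitude on the sphere
    v = (np.pi / 2 - lat) * radius                           # unroll latitude on the sphere
    return np.stack([u, v], axis=1), dist
```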
  • In an optional technical solution, when the point cloud perspective view is used to guide the synthesis of the image perspective view, in order to preserve spatio-temporal correlation and avoid temporal jumps caused by one-to-one conversion of independent frames, the point cloud perspective views at at least two image acquisition time points and the image perspective views corresponding to them can be used together as training samples to train the original neural network model.
  • For example, the point cloud perspective view at the current image acquisition time point among the multiple image acquisition time points may be taken as a first point cloud perspective view, and the image perspective view at the current image acquisition time point as a first image perspective view;
  • the point cloud perspective view at at least one image acquisition time point before the current image acquisition time point is taken as a second point cloud perspective view, and the image perspective view corresponding to that point cloud perspective view at the at least one earlier image acquisition time point is taken as a second image perspective view;
  • there is at least one second point cloud perspective view and at least one second image perspective view, and the second point cloud perspective views correspond one-to-one to the second image perspective views;
  • the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view form a set of training samples, in which the first point cloud perspective view, the second point cloud perspective view and the second image perspective view are the actual input data and the first image perspective view is the expected output data.
  • On this basis, the original neural network model matched with the above training samples may include a point cloud convolution-excitation module, an image convolution-excitation module and a merge processing module. Training the original neural network model on multiple sets of training samples may then include: inputting the training samples into the original neural network model; processing the channel-concatenation result of the first point cloud perspective view and the second point cloud perspective view through the point cloud convolution-excitation module to obtain a point cloud feature map, and processing the second image perspective view through the image convolution-excitation module to obtain an image feature map.
  • If there are at least two second image perspective views, they can first be channel-concatenated and the concatenation result is then processed.
  • The point cloud feature map and the image feature map are merged by the merge processing module, and a third image perspective view, which is the actual output data, is generated from the merge result; the network parameters of the original neural network model are then adjusted according to the third image perspective view and the first image perspective view, for example by computing a loss function from the difference between the two and adjusting the parameters according to the result.
  • The original neural network model of this embodiment is described below with a specific example.
  • When the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view are used as a set of training samples, a matching original neural network model is shown schematically in Figure 4b; it is an image conversion model from a sequence of point cloud perspective frames to a single frame of image perspective view.
  • Compared with the model in Figure 4a, the network layer of the model in Figure 4b may additionally include the concatenation layer concat. In the data layer, Mt is the first point cloud perspective view, Mt-2 and Mt-1 are both second point cloud perspective views, It is the first image perspective view, It-2 is a second image perspective view belonging to the same image acquisition time point as Mt-2, and It-1 is also a second image perspective view belonging to the same image acquisition time point as Mt-1.
  • The dimension of the channel-concatenation result of Mt-2, Mt-1 and Mt is H*W*(3*C), and the dimension of the channel-concatenation result of It-2 and It-1 is H*W*6.
  • For example, with a time interval of 1 second between image acquisition time points, if Mt and It are the point cloud perspective view and image perspective view at the 10th second, then Mt-1 and It-1 are the point cloud perspective view and image perspective view at the 9th second, and Mt-2 and It-2 are those at the 8th second.
  • In this case, the three point cloud perspective views at seconds 8-10 and the two image perspective views at seconds 8-9 are used as the actual input data, the image perspective view at the 10th second is used as the expected output data, and they are fed together into the original neural network model for training.
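  • To make the sequence-frame setup concrete, the sketch below assembles one training sample (Mt-2, Mt-1, Mt channel-concatenated, plus It-2 and It-1 concatenated) and runs one training step of a toy two-branch model with concatenation and an L1 loss. The TwoBranchConverter class, the loss choice and the optimizer settings are illustrative assumptions, not the patented configuration.

```python
# One training step of a toy two-branch model in the spirit of Figure 4b (sketch).
import torch
import torch.nn as nn

class TwoBranchConverter(nn.Module):
    def __init__(self, cloud_channels: int, image_channels: int = 3):
        super().__init__()
        self.cloud_branch = nn.Sequential(   # point cloud convolution-excitation branch
            nn.Conv2d(3 * cloud_channels, 64, 3, padding=1), nn.LeakyReLU(0.2))
        self.image_branch = nn.Sequential(   # image convolution-excitation branch
            nn.Conv2d(2 * image_channels, 64, 3, padding=1), nn.LeakyReLU(0.2))
        self.merge = nn.Sequential(          # merge module producing the third image view
            nn.Conv2d(128, image_channels, 3, padding=1), nn.Tanh())
    def forward(self, clouds, images):
        merged = torch.cat([self.cloud_branch(clouds), self.image_branch(images)], dim=1)
        return self.merge(merged)

C, H, W = 2, 128, 256
model = TwoBranchConverter(cloud_channels=C)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

clouds = torch.randn(1, 3 * C, H, W)         # Mt-2, Mt-1, Mt concatenated: H*W*(3*C)
past_images = torch.randn(1, 6, H, W)        # It-2, It-1 concatenated: H*W*6
target_It = torch.randn(1, 3, H, W)          # first image perspective view (expected output)

optimizer.zero_grad()
third_view = model(clouds, past_images)      # actual output data
loss = nn.functional.l1_loss(third_view, target_It)
loss.backward()
optimizer.step()
```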
  • Fig. 5 is a flowchart of a method for determining a perspective image of an image provided in the second embodiment of the present application. This embodiment is applicable to the case of synthesizing an image perspective view based on point cloud data and a pre-generated image conversion model.
  • The method can be executed by the image perspective view determination device provided in the embodiments of the present application; the device can be implemented by software and/or hardware and can be integrated in various user terminals or servers.
  • the method of the embodiment of the present application includes steps S210 to S220.
  • S210 Collect point cloud data based on the preset collection system, obtain the coordinate data of the point cloud data and the point cloud acquisition time point, determine the pose matrix corresponding to the point cloud acquisition time point, and generate the point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data.
  • The point cloud acquisition time point simulates the image acquisition time point of an image perspective view; from the pose matrix corresponding to the point cloud acquisition time point and the coordinate data of the point cloud data, the point cloud perspective view at that acquisition time point can be synthesized.
  • Optionally, the pose matrix can be determined as follows: build a map from the collected point cloud data and obtain the pose trajectory of the preset collection system during mapping; then sample the pose trajectory in time order according to the point cloud acquisition time points, and obtain the pose matrix corresponding to each point cloud acquisition time point from the time-series sampling result.
  • S220 Obtain the image conversion model generated by the model generation method provided in any embodiment of this application, input the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud acquisition time point according to the output result of the image conversion model.
  • According to the technical solution of the embodiments of the present application, from the point cloud data collected by the preset collection system, the coordinate data of the point cloud data and the point cloud acquisition time point can be obtained, the point cloud acquisition time point simulating the image acquisition time point of an image perspective view. Further, after the pose matrix corresponding to the point cloud acquisition time point is determined, the point cloud perspective view at that time point can be generated from the pose matrix and the coordinate data; that is, the point cloud data of the three-dimensional scene points are projected into a virtual camera at the point cloud acquisition time point to form the point cloud perspective view. After the point cloud perspective view is input into the pre-generated image conversion model, the image perspective view at the point cloud acquisition time point can be determined from the output of the model.
  • The above technical solution uses the point cloud perspective view projected from the point cloud data to guide the synthesis of the image perspective view, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of obtaining high-quality image perspective views in a simple and low-cost way.
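  • A minimal inference sketch for this embodiment is given below, assuming the trained image conversion model has been exported to TorchScript and the point cloud perspective view is stored as an H x W x C array; the file names are hypothetical placeholders.

```python
# Predict the image perspective view at a point cloud acquisition time point (sketch).
import torch
import numpy as np

model = torch.jit.load("image_conversion_model.pt")          # hypothetical exported trained model
model.eval()
cloud_view = np.load("cloud_view_t.npy")                      # hypothetical H x W x C perspective view
tensor_view = torch.from_numpy(cloud_view).float().permute(2, 0, 1).unsqueeze(0)
with torch.no_grad():
    image_view = model(tensor_view)                           # predicted image perspective view
image_rgb = ((image_view.squeeze(0).permute(1, 2, 0) + 1) / 2).clamp(0, 1).numpy()
```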
  • Fig. 6 is a structural block diagram of a model generation device provided in Embodiment 3 of the application, and the device is configured to execute the model generation method provided in any of the foregoing embodiments.
  • This device and the model generation method of the foregoing embodiments belong to the same inventive concept.
  • the device may include: a data collection module 310, a first generation module 320, and a second generation module 330.
  • The data collection module 310 is configured to collect point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain the coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the frames of image perspective views;
  • the first generation module 320 is configured to determine the pose matrix corresponding to each of the multiple image acquisition time points, and to generate the point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that time point;
  • the second generation module 330 is configured to take the point cloud perspective view and the image perspective view at each image acquisition time point as a set of training samples, and to train the original neural network model on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.
  • the first generating module 320 may include:
  • the pose trajectory obtaining unit is set to obtain the pose trajectory of the preset acquisition system according to the point cloud data;
  • the pose matrix obtaining unit is configured to sample the pose trajectory based on multiple image acquisition time points to obtain a pose matrix corresponding to each image acquisition time point of the multiple image acquisition time points.
  • the first generation module 320 is set to:
  • if the image perspective view is collected by a preset camera in the preset collection system, the preset camera including a perspective camera or a wide-angle camera, project the three-dimensional coordinate data P_W_3d of the point cloud data in the world coordinate system to the two-dimensional coordinate data P_C_2d(t_C) in the preset camera coordinate system collected at each image acquisition time point t_C according to P_C_2d(t_C) = K_c M_W→L(t_C) P_W_3d, and generate the point cloud perspective view at t_C from the multiple P_C_2d(t_C), where M_W→L(t_C) is the pose matrix of the point cloud collection device of the preset collection system in the world coordinate system at t_C and K_c is the intrinsic matrix of the preset camera;
  • if the image perspective view is collected by a spherical panoramic camera in the preset collection system, first transform the P_W_3d of the point cloud data into the spherical panoramic camera coordinate system collected at t_C as P_3d = M_W→L(t_C) P_W_3d with P_3d = [x_3d, y_3d, z_3d], then project P_3d onto the sphere and unroll it by longitude and latitude to obtain P_C_2d(t_C), and generate the point cloud perspective view at t_C from the multiple P_C_2d(t_C), where R is the sphere radius of the spherical panoramic camera.
  • the device may also include:
  • the attribute information assignment module is set to obtain the pixel points corresponding to the point cloud data in the point cloud perspective view, and assign the attribute information of the point cloud data to the pixels.
  • the second generation module 330 is also set to:
  • the point cloud perspective images at at least two image acquisition time points and the image perspective images corresponding to the point cloud perspective images at the at least two image acquisition time points are collectively used as a set of training samples.
  • the second generating module 330 may include:
  • the first obtaining unit is configured to take the point cloud perspective view at the current image acquisition time point among the multiple image acquisition time points as the first point cloud perspective view, and the image perspective view at the current image acquisition time point as the first image perspective view;
  • the second obtaining unit is configured to take the point cloud perspective view at at least one image acquisition time point before the current image acquisition time point as the second point cloud perspective view, and the image perspective view corresponding to the point cloud perspective view at the at least one image acquisition time point before the current image acquisition time point as the second image perspective view;
  • the training sample obtaining unit is configured to take the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view as a set of training samples, in which the first point cloud perspective view, the second point cloud perspective view and the second image perspective view are the actual input data and the first image perspective view is the expected output data.
  • the second generating module 330 may further include:
  • the input unit is configured, in the case where the original neural network model includes a point cloud convolution-excitation module, an image convolution-excitation module and a merge processing module, to input the training samples into the original neural network model;
  • the feature map obtaining unit is configured to process the channel-concatenation result of the first point cloud perspective view and the second point cloud perspective view through the point cloud convolution-excitation module to obtain a point cloud feature map, and to process the second image perspective view through the image convolution-excitation module to obtain an image feature map;
  • the network parameter adjustment unit is configured to merge the point cloud feature map and the image feature map through the merge processing module, generate a third image perspective view from the merge result, and adjust the network parameters of the original neural network model according to the third image perspective view and the first image perspective view.
  • With the model generation device provided in Embodiment 3 of the present application, the data collection module obtains, from the point cloud data and the multiple frames of image perspective views collected by the preset collection system, the coordinate data of the point cloud data and the image acquisition time points corresponding one-to-one to the frames of image perspective views; after determining the pose matrix corresponding to each image acquisition time point, the first generation module generates the point cloud perspective view at each image acquisition time point from that pose matrix and the coordinate data, that is, the point cloud data of the three-dimensional scene points are projected into a virtual camera at the image acquisition time point to form the point cloud perspective view; the second generation module then takes the point cloud perspective view and the image perspective view at each image acquisition time point as a set of training samples and trains the original neural network model on multiple sets of training samples, generating an image conversion model for converting a point cloud perspective view into an image perspective view.
  • The above device uses the point cloud perspective view projected from the point cloud data to guide the synthesis of the image perspective view, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of obtaining high-quality image perspective views in a simple and low-cost way.
  • the model generation device provided by the embodiment of this application can execute the model generation method provided by any embodiment of this application, and has the corresponding functional modules for executing the method.
  • FIG. 7 is a structural block diagram of a device for determining a perspective view of an image provided in the fourth embodiment of the application, and the device is configured to execute the method for determining a perspective view of an image provided by any of the foregoing embodiments.
  • This device belongs to the same application concept as the image perspective view determination method of the foregoing embodiments.
  • the device may include: a third generation module 410 and an image perspective view determination module 420.
  • The third generation module 410 is configured to collect point cloud data based on a preset collection system, obtain the coordinate data of the point cloud data and the point cloud acquisition time point, determine the pose matrix corresponding to the point cloud acquisition time point, and generate a point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;
  • the image perspective view determination module 420 is configured to obtain the image conversion model generated by the model generation method provided in any embodiment of the present application, input the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud acquisition time point according to the output result of the image conversion model.
  • With the image perspective view determination device provided in Embodiment 4 of the present application, the third generation module obtains, from the point cloud data collected by the preset collection system, the coordinate data of the point cloud data and the point cloud acquisition time point, the point cloud acquisition time point simulating the image acquisition time point of an image perspective view; after determining the pose matrix corresponding to the point cloud acquisition time point, it generates the point cloud perspective view at that time point from the pose matrix and the coordinate data, that is, the point cloud data of the three-dimensional scene points are projected into a virtual camera at the point cloud acquisition time point to form the point cloud perspective view. The image perspective view determination module inputs the point cloud perspective view into the pre-generated image conversion model and determines the image perspective view at the point cloud acquisition time point from the model's output.
  • The above device uses the point cloud perspective view projected from the point cloud data to guide the synthesis of the image perspective view, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of obtaining high-quality image perspective views in a simple and low-cost way.
  • the apparatus for determining a perspective view of an image provided by an embodiment of the present application can execute the method for determining a perspective view of an image provided by any embodiment of the present application, and is equipped with corresponding functional modules for executing the method.
  • FIG. 8 is a schematic structural diagram of a device provided in Embodiment 5 of this application.
  • the device includes a memory 510, a processor 520, an input device 530, and an output device 540.
  • There may be at least one processor 520 in the device; in FIG. 8 one processor 520 is taken as an example. The memory 510, the processor 520, the input device 530 and the output device 540 in the device may be connected by a bus or in other ways; in FIG. 8, connection via the bus 550 is taken as an example.
  • the memory 510 can be configured to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the model generation method in the embodiment of the present application (for example, data in the model generation device).
  • the processor 520 executes various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 510, that is, implements the aforementioned model generation method or image perspective determination method.
  • the memory 510 may mainly include a program storage area and a data storage area.
  • the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the device, and the like.
  • the memory 510 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the memory 510 may include a memory remotely provided with respect to the processor 520, and these remote memories may be connected to the device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 530 may be configured to receive input digital or character information, and to generate key signal input related to user settings and function control of the device.
  • the output device 540 may include a display device such as a display screen.
  • the sixth embodiment of the present application provides a storage medium containing computer-executable instructions, which are used to execute a model generation method when the computer-executable instructions are executed by a computer processor, and the method includes:
  • A storage medium containing computer-executable instructions provided by an embodiment of the present application is not limited to the method operations described above, and can also execute related operations in the model generation method provided by any embodiment of the present application.
  • the seventh embodiment of the present application provides a storage medium containing computer-executable instructions, when the computer-executable instructions are executed by a computer processor, a method for determining a perspective view of an image is executed, and the method includes:
  • collecting point cloud data based on the preset collection system, obtaining the coordinate data of the point cloud data and the point cloud acquisition time point, determining the pose matrix corresponding to the point cloud acquisition time point, and generating the point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;
  • the present application can be implemented by software and necessary general-purpose hardware, and of course, it can also be implemented by hardware.
  • The technical solution of the present application, in essence or in the part contributing to the related technology, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk or optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

A model generation method, an image perspective view determination method, an apparatus, a device and a medium. The model generation method includes: collecting point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the multiple frames of image perspective views (S110); determining a pose matrix corresponding to each of the multiple image acquisition time points, and generating a point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that image acquisition time point (S120); and taking the point cloud perspective view at each image acquisition time point and the image perspective view at each image acquisition time point as a set of training samples, and training an original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view (S130).

Description

Model generation method, image perspective view determination method, apparatus, device and medium

This application claims priority to Chinese patent application No. 202010514388.2, filed with the Chinese Patent Office on June 8, 2020, the entire contents of which are incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of data processing technology, for example, to a model generation method, an image perspective view determination method, an apparatus, a device and a medium.

Background

With the advance and development of industries such as virtual simulation, high-definition map production, robotics and autonomous driving, point cloud mapping is being applied more and more widely. Point cloud mapping collects point cloud data of the scene to be mapped at each moment with a lidar device, obtains the three-dimensional coordinates of the point cloud data at each moment by surveying and mapping or by simultaneous localization and mapping (SLAM), and then projects and merges the point cloud data of multiple moments according to those three-dimensional coordinates.

Pure point cloud mapping only yields the three-dimensional coordinates of the point cloud data at multiple moments, so the information is rather limited. To solve this problem, a camera can be mounted during point cloud mapping so that data are collected synchronously and an image perspective view is generated for the corresponding moment, allowing more applications to be developed through multi-source data fusion. For example, in simulation and reconstruction the lidar device and the camera are spatio-temporally calibrated to obtain colored point cloud data; during map production the image perspective view assists in viewing the real scene; and in intelligent perception the image perspective view improves the recognition of dynamic objects such as lanes and pedestrians.

The related technology has the following technical problems: the acquisition of such image perspective views is time-consuming and labor-intensive. First, a complicated synchronization system of lidar device and camera must be built, and the two must be spatio-temporally calibrated, a process that is often cumbersome. Second, to obtain high-quality, omnidirectional image perspective views, the camera used is often expensive; for example, a 360-degree panoramic Ladybug3 costs more than 200,000 yuan. Moreover, the quality of the image perspective views collected by the camera is easily affected by environmental factors such as weather, light and shadow: the brightness of images collected in dim light is low, and jitter and blur easily occur when the vehicle moves too fast.
Summary

The embodiments of the present application provide a model generation method, an image perspective view determination method, an apparatus, a device and a medium, which solve the problem that the acquisition of image perspective views is time-consuming and labor-intensive.

In a first aspect, an embodiment of the present application provides a model generation method, which may include:

collecting point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the multiple frames of image perspective views;

determining a pose matrix corresponding to each of the multiple image acquisition time points, and generating a point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that image acquisition time point;

taking the point cloud perspective view at each image acquisition time point and the image perspective view at each image acquisition time point as a set of training samples, and training an original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.

In a second aspect, an embodiment of the present application further provides an image perspective view determination method, which may include:

collecting point cloud data based on a preset collection system, obtaining coordinate data of the point cloud data and a point cloud acquisition time point, determining a pose matrix corresponding to the point cloud acquisition time point, and generating a point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;

obtaining an image conversion model generated by the model generation method provided in any embodiment of the present application, inputting the point cloud perspective view into the image conversion model, and determining the image perspective view at the point cloud acquisition time point according to the output result of the image conversion model.

In a third aspect, an embodiment of the present application further provides a model generation apparatus, which may include:

a data collection module configured to collect point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the multiple frames of image perspective views;

a first generation module configured to determine a pose matrix corresponding to each of the multiple image acquisition time points, and to generate the point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that image acquisition time point;

a second generation module configured to take the point cloud perspective view at each image acquisition time point and the image perspective view at each image acquisition time point as a set of training samples, and to train an original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.

In a fourth aspect, an embodiment of the present application further provides an image perspective view determination apparatus, which may include:

a third generation module configured to collect point cloud data based on a preset collection system, obtain coordinate data of the point cloud data and a point cloud acquisition time point, determine a pose matrix corresponding to the point cloud acquisition time point, and generate a point cloud perspective view at the point cloud acquisition time point according to the pose matrix and the coordinate data;

an image perspective view determination module configured to obtain the image conversion model generated by the model generation method provided in any embodiment of the present application, input the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud acquisition time point according to the output result of the image conversion model.

In a fifth aspect, an embodiment of the present application further provides a device, which may include:

at least one processor;

a memory configured to store at least one program;

when the at least one program is executed by the at least one processor, the at least one processor implements the model generation method or the image perspective view determination method provided in any embodiment of the present application.

In a sixth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model generation method or the image perspective view determination method provided in any embodiment of the present application.
Brief Description of the Drawings

FIG. 1 is a flowchart of a model generation method in Embodiment 1 of the present application;

FIG. 2 is a first schematic diagram of point cloud mapping in a model generation method in Embodiment 1 of the present application;

FIG. 3a is a second schematic diagram of point cloud mapping in a model generation method in Embodiment 1 of the present application;

FIG. 3b is a schematic diagram of a point cloud perspective view in a model generation method in Embodiment 1 of the present application;

FIG. 4a is a schematic diagram of an original neural network model used for single-frame conversion in a model generation method in Embodiment 1 of the present application;

FIG. 4b is a schematic diagram of an original neural network model used for sequence-frame conversion in a model generation method in Embodiment 1 of the present application;

FIG. 5 is a flowchart of an image perspective view determination method in Embodiment 2 of the present application;

FIG. 6 is a structural block diagram of a model generation apparatus in Embodiment 3 of the present application;

FIG. 7 is a structural block diagram of an image perspective view determination apparatus in Embodiment 4 of the present application;

FIG. 8 is a schematic structural diagram of a device in Embodiment 5 of the present application.

Detailed Description

The present application is described in detail below with reference to the drawings and embodiments.
Embodiment 1

FIG. 1 is a flowchart of a model generation method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of generating an image conversion model for converting a point cloud perspective view into an image perspective view. The method may be executed by the model generation apparatus provided in the embodiments of the present application; the apparatus may be implemented by software and/or hardware and may be integrated on various user terminals or servers.

Referring to FIG. 1, the method of the embodiment of the present application includes the following steps.

S110: Collect point cloud data and multiple frames of image perspective views based on a preset collection system, to obtain coordinate data of the point cloud data and multiple image acquisition time points corresponding one-to-one to the multiple frames of image perspective views.

The point cloud data are data of the scene to be mapped collected by a point cloud collection device in the preset collection system, for example point cloud data collected by a lidar scanning device, a virtual-scene thinning device or a multi-view reconstruction device; the image perspective view is a perspective view acquired by an image acquisition device in the preset collection system. The image acquisition device may be a spherical panoramic camera, a wide-angle camera, an ordinary distortion-free perspective camera, and so on; correspondingly, the acquired image perspective view may be a spherical panoramic image, a wide-angle image, an ordinary undistorted perspective image, and so on, which is not limited here. After the point cloud data are collected, a map can be built from them, as shown in FIG. 2, and the coordinate data of multiple point cloud data are obtained during the mapping process; the mapping method may be surveying and mapping, SLAM, and so on, which is not limited here. Correspondingly, after the image perspective views are acquired, the image acquisition time point of each frame of image perspective view can be obtained, the image acquisition time point being the time point at which that image perspective view was captured.
S120: Determine a pose matrix corresponding to each of the multiple image acquisition time points, and generate the point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that image acquisition time point.

The pose matrix is the matrix of the point cloud collection device at a given image acquisition time point, expressed in the coordinate system in which the coordinate data of the point cloud data are located; it includes a rotation matrix and a translation vector. In practice, if the map is built by surveying and mapping, the pose matrix can be obtained from the integrated inertial navigation data; if it is built with SLAM, the pose matrix can be provided by the SLAM algorithm. Once the pose matrix is obtained, the local coordinate system of the image acquisition device at the image acquisition time point, that is, at the acquisition position it occupies at that time, can be derived from it, so that the coordinate data of the point cloud data can be transformed into that local coordinate system; the point cloud perspective view at that image acquisition time point can then be obtained from the transformed coordinate data. For example, the point cloud data of the scene to be mapped obtained after point cloud mapping are shown in FIG. 3a, and the point cloud perspective view synthesized from the point cloud data and the corresponding pose matrix is shown in FIG. 3b.

Optionally, the pose matrix can be determined as follows: obtain the pose trajectory of the preset collection system from the point cloud data; the pose trajectory can be obtained during the mapping of the point cloud data and reflects how the pose of the preset collection system changes as it moves, the pose including position and attitude. In practice, if the map is built by surveying and mapping, the poses of the preset collection system at multiple acquisition time points can be obtained from the integrated inertial navigation unit in the system; if it is built with SLAM, they can be obtained from the SLAM algorithm. The pose trajectory is then sampled at the multiple image acquisition time points, and the pose matrix corresponding to each image acquisition time point is obtained from the sampling result.
S130: Take the point cloud perspective view at each image acquisition time point and the image perspective view at each image acquisition time point as a set of training samples, and train the original neural network model based on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.

Since each image acquisition time point has both a point cloud perspective view and an image perspective view, the two can be used as a set of training samples, with the point cloud perspective view as the actual input data and the image perspective view as the expected output data; the original neural network model can thus be trained on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view.

It should be noted that the original neural network model is any untrained convolutional neural network model capable of converting a point cloud perspective view into an image perspective view. A schematic diagram of an optional original neural network model is shown in FIG. 4a; it is an image conversion model from a single-frame point cloud perspective view to a single-frame image perspective view. Illustratively, the solid boxes are data layers: Mt in the data layer is the point cloud perspective view, whose dimension is H*W*C, where C may be the number of attribute channels of the point cloud data, for example C=2 when the attribute information is intensity and semantic information, or C=3 when the attribute information is color information (R/G/B); It in the data layer is the image perspective view, whose dimension is H*W*3, the 3 being the color information (R/G/B). The dashed boxes are network layers, whose neurons may include convolutional layers cxx_kx_sx, the excitation layer leakyPeLU, convolutional block layers ResnetXtBlock_cxx_xx, the upsampling layer PixelShuffle, the excitation layer tanh, and so on. Illustratively, the convolutional layer c32_k3_s2 convolves with 32 convolution kernels of size 3x3 (k3) and stride 2 (s2); the meanings of the other convolutional layers are analogous and are not repeated here. The parameter of the excitation layer leakyPeLU may be 0.2 or another value. The convolutional block layer ResnetXtBlock_c256_x10 is obtained by connecting 10 ResnetXtBlock units in series; its internal convolutional layers may uniformly use 3x3 (k3) kernels with stride 2 (s2), or other kernels may be used; c256 is the number of convolution kernels; the meanings of the other convolutional block layers are analogous and are not repeated here. PixelShuffle may be a 2x upsampling.
According to the technical solution of the embodiments of the present application, from the point cloud data and the multiple frames of image perspective views collected by the preset collection system, the coordinate data of the point cloud data and the multiple image acquisition time points corresponding one-to-one to the frames of image perspective views can be obtained. Further, after the pose matrix corresponding to each of the multiple image acquisition time points is determined, the point cloud perspective view at each image acquisition time point can be generated from that pose matrix and the coordinate data; that is, the point cloud data of the three-dimensional scene points are projected into a virtual camera at the image acquisition time point to form the point cloud perspective view. The point cloud perspective view and the image perspective view at each image acquisition time point are then taken as a set of training samples, and the original neural network model is trained on multiple sets of training samples to generate an image conversion model for converting a point cloud perspective view into an image perspective view. This technical solution uses the point cloud perspective view projected from the point cloud data to guide the synthesis of the image perspective view, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of obtaining high-quality image perspective views in a simple and low-cost way.

In an optional technical solution, after the point cloud perspective view is generated, the pixels corresponding to the point cloud data in the point cloud perspective view can be obtained and the attribute information of the point cloud data assigned to those pixels. The attribute information may be intensity information, semantic information, color information and so on; illustratively, the intensity information can be obtained from the reflections of the lidar scanning device, and the semantic information can be obtained from point cloud parsing. The benefit of this step is that, since camera imaging is the process of projecting the three-dimensional scene points of the scene to be mapped onto the camera film, each pixel of the resulting image perspective view records the color information (R/G/B) of a three-dimensional scene point; correspondingly, the point cloud perspective view reconstructs the projection of the three-dimensional scene points onto the camera film, and each of its pixels records the attribute information of a three-dimensional scene point. This means there is a strong correlation between the point cloud perspective view and the image perspective view, and this correlation improves the accuracy of synthesizing an image perspective view from a point cloud perspective view.

Considering that the projection of the point cloud data of the scene to be mapped onto the pixels of the point cloud perspective view may be many-to-one, if multiple point cloud data correspond to one pixel, the attribute information of the point cloud data closest to the camera can be assigned to that pixel. This matches how the human eye views a scene: when a front three-dimensional scene point occludes the ones behind it, the eye only sees the front point (that is, the point cloud data closest to the camera among the multiple point cloud data) and cannot see the occluded points behind (that is, the remaining point cloud data among the multiple point cloud data).

In an optional technical solution, the synthesis of the point cloud perspective view projects all point cloud data within a certain range around the real or virtual camera onto the camera film, thereby simulating the imaging of the real three-dimensional scene points; for example, taking the camera position as the center, all point cloud data within a circle of 500-meter radius are projected onto the camera film. In other words, according to projective geometry, the point cloud perspective view is the perspective view formed by projecting real three-dimensional scene points onto the photographic film according to the perspective relationship. The image acquisition device may therefore be a preset camera or a spherical panoramic camera, the preset camera being a perspective camera or a wide-angle camera. When the poses of the point cloud collection device and the image acquisition device are consistent, generating the point cloud perspective view at each image acquisition time point according to the pose matrix and the coordinate data corresponding to that time point may include: when the image perspective view is collected by the preset camera in the preset collection system, projecting the three-dimensional coordinate data P_W_3d of the point cloud data in the world coordinate system to the two-dimensional coordinate data P_C_2d(t_C) in the preset camera coordinate system collected at the image acquisition time point t_C according to P_C_2d(t_C) = K_c M_W→L(t_C) P_W_3d, and generating the point cloud perspective view at t_C from the multiple P_C_2d(t_C), where M_W→L(t_C) is the pose matrix of the point cloud collection device of the preset collection system in the world coordinate system at t_C, K_c is the intrinsic matrix of the preset camera, and P_C_2d(t_C) is the two-dimensional coordinate data of the pixel onto which the point cloud data with three-dimensional coordinates P_W_3d are projected in the point cloud perspective view; the point cloud perspective view can therefore be generated from the two-dimensional coordinate data of the multiple pixels.
Similarly, if the image perspective view is collected by a spherical panoramic camera in the preset collection system, the three-dimensional scene points are projected onto a sphere, and unrolling the sphere surface by longitude and latitude gives the spherical panorama. The P_W_3d of the point cloud data can therefore be projected to P_C_2d(t_C) in the spherical panoramic camera coordinate system collected at t_C according to the following formulas, and the point cloud perspective view at t_C is generated from the multiple P_C_2d(t_C):

P_3d = M_W→L(t_C) P_W_3d, P_3d = [x_3d, y_3d, z_3d],

[the mapping from P_3d to P_C_2d(t_C), which projects the point onto the sphere and unrolls it by longitude and latitude, is given as a formula image in the original publication]

where R is the sphere radius of the spherical panoramic camera, and P_C_2d(t_C) is the two-dimensional coordinate data of the pixel onto which the point cloud data with three-dimensional coordinates P_W_3d are projected in the point cloud perspective view (that is, the spherical panorama).
In an optional technical solution, when the point cloud perspective view is used to guide the synthesis of the image perspective view, in order to preserve spatio-temporal correlation and avoid the temporal jumps caused by one-to-one conversion of independent frames, the point cloud perspective views at at least two image acquisition time points and the image perspective views corresponding to those point cloud perspective views can be used together as training samples to train the original neural network model. Illustratively, the point cloud perspective view at the current image acquisition time point among the multiple image acquisition time points may be taken as a first point cloud perspective view, and the image perspective view at the current image acquisition time point as a first image perspective view; the point cloud perspective view at at least one image acquisition time point before the current image acquisition time point is taken as a second point cloud perspective view, and the image perspective view corresponding to that point cloud perspective view at the at least one earlier image acquisition time point as a second image perspective view; there is at least one second point cloud perspective view and at least one second image perspective view, and they correspond one-to-one. The first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view form a set of training samples, in which the first point cloud perspective view, the second point cloud perspective view and the second image perspective view are the actual input data and the first image perspective view is the expected output data.

On this basis, the original neural network model matched with the above training samples may include a point cloud convolution-excitation module, an image convolution-excitation module and a merge processing module. Training the original neural network model on multiple sets of training samples may then include: inputting the training samples into the original neural network model; processing the channel-concatenation result of the first point cloud perspective view and the second point cloud perspective view through the point cloud convolution-excitation module to obtain a point cloud feature map, and processing the second image perspective view through the image convolution-excitation module to obtain an image feature map (of course, if there are at least two second image perspective views, they can first be channel-concatenated and the concatenation result is then processed); merging the point cloud feature map and the image feature map through the merge processing module, and generating a third image perspective view, which is the actual output data, from the merge result; and adjusting the network parameters of the original neural network model according to the third image perspective view and the first image perspective view, for example computing a loss function from the difference between the two and adjusting the parameters according to the result.

The original neural network model of this embodiment is illustrated below with a specific example. Illustratively, when the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view are used as a set of training samples, a matching original neural network model is shown schematically in FIG. 4b; it is an image conversion model from a sequence of point cloud perspective frames to a single frame of image perspective view. Compared with the original neural network model shown in FIG. 4a, the neurons of the network layer of the model in FIG. 4b may additionally include the concatenation layer concat. In the data layer, Mt is the first point cloud perspective view, Mt-2 and Mt-1 are both second point cloud perspective views, It is the first image perspective view, It-2 is a second image perspective view belonging to the same image acquisition time point as Mt-2, and It-1 is also a second image perspective view belonging to the same image acquisition time point as Mt-1. In addition, the dimension of the channel-concatenation result of Mt-2, Mt-1 and Mt is H*W*(3*C), and the dimension of the channel-concatenation result of It-2 and It-1 is H*W*6.

Illustratively, taking a time interval of 1 second between image acquisition time points as an example, if Mt and It are the point cloud perspective view and the image perspective view at the 10th second, then Mt-1 and It-1 are the point cloud perspective view and the image perspective view at the 9th second, and Mt-2 and It-2 are those at the 8th second. In this case the three point cloud perspective views at seconds 8-10 and the two image perspective views at seconds 8-9 are the actual input data, the image perspective view at the 10th second is the expected output data, and they are input together into the original neural network model for model training.

It should be noted that after the image conversion model is obtained by training the original neural network model shown in FIG. 4b, at the application stage of the image conversion model every frame of image perspective view is unknown, which means the third frame of image perspective view cannot be predicted from the first three frames of point cloud perspective views and the first two frames of image perspective views. To solve this problem, one option is, when training the original neural network model, to set the first two frames of point cloud perspective views to empty values, random numbers and so on, and start training from the third frame of point cloud perspective view; in this way, at the application stage of the image conversion model, the first two frames of image perspective views can likewise be set directly to empty values, random numbers and so on, and prediction starts from the third frame of image perspective view.
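A small sketch of the workaround just described, assuming fixed tensor shapes: when the preceding image perspective views do not exist (the first frames of a sequence, or at inference time before predictions are available), they are filled with zeros or random noise so the sequence model always receives an input of the same size. The helper name build_sequence_input and the fill strategies are assumptions for illustration.

```python
# Assemble a sequence-frame input, replacing missing past frames with placeholders.
import torch

def build_sequence_input(cloud_views, image_views, t, fill="zeros"):
    """cloud_views: list of C x H x W tensors; image_views: list of 3 x H x W tensors or None."""
    def placeholder(shape):
        return torch.zeros(shape) if fill == "zeros" else torch.rand(shape)
    C, H, W = cloud_views[t].shape
    clouds = [cloud_views[k] if k >= 0 else placeholder((C, H, W)) for k in (t - 2, t - 1, t)]
    images = [image_views[k] if k >= 0 and image_views[k] is not None else placeholder((3, H, W))
              for k in (t - 2, t - 1)]
    return torch.cat(clouds, dim=0), torch.cat(images, dim=0)   # (3*C) x H x W and 6 x H x W
```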
Embodiment 2
Fig. 5 is a flowchart of an image perspective view determining method provided in Embodiment 2 of the present application. This embodiment is applicable to the case of synthesizing image perspective views from point cloud data and a pre-generated image conversion model. The method can be performed by the image perspective view determining apparatus provided by the embodiments of the present application; the apparatus may be implemented in software and/or hardware and may be integrated on various user terminals or servers.
Referring to Fig. 5, the method of this embodiment of the present application includes steps S210 to S220.
S210: Collect point cloud data by means of a preset collection system to obtain the coordinate data of the point cloud data and point cloud collection time points, determine the pose matrix corresponding to each point cloud collection time point, and generate the point cloud perspective view at the point cloud collection time point from the pose matrix and the coordinate data.
Here, the point cloud collection time points simulate the image collection time points of the image perspective views. From the pose matrix corresponding to a point cloud collection time point and the coordinate data of the point cloud data, the point cloud perspective view at that point cloud collection time point can be synthesized. Optionally, the pose matrix can be determined as follows: build a map from the collected point cloud data and obtain the pose trajectory of the preset collection system during mapping; then, for example, sample the pose trajectory in time at the point cloud collection time points, and obtain the pose matrix corresponding to each point cloud collection time point from the sampling result.
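A sketch of that time sampling, assuming the trajectory is stored as timestamped 4x4 poses and that interpolating translations (with the nearest rotation) between trajectory nodes is acceptable; neither the storage format nor the interpolation rule is prescribed by this application.

```python
import numpy as np

def sample_pose_trajectory(traj_times, traj_poses, query_times):
    """traj_poses: list of 4x4 pose matrices along the mapping trajectory,
    ordered by traj_times. Returns one pose per query (collection) time point."""
    sampled = []
    for t in query_times:
        i = np.searchsorted(traj_times, t)
        if i == 0 or i == len(traj_times):
            sampled.append(traj_poses[min(i, len(traj_poses) - 1)])
            continue
        t0, t1 = traj_times[i - 1], traj_times[i]
        w = (t - t0) / (t1 - t0)
        pose = traj_poses[i - 1].copy()
        # Interpolate the translation; keep the nearer rotation for simplicity.
        pose[:3, 3] = (1 - w) * traj_poses[i - 1][:3, 3] + w * traj_poses[i][:3, 3]
        if w > 0.5:
            pose[:3, :3] = traj_poses[i][:3, :3]
        sampled.append(pose)
    return sampled
```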
S220: Obtain the image conversion model generated by the model generation method provided in any embodiment of the present application, feed the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud collection time point from the output of the image conversion model.
With the above technical solution, first, the preset collection system only needs to be equipped with a point cloud collection device and no expensive image collection device, so the cost is low; second, the image perspective view at the same collection time point can be predicted simply by feeding the point cloud perspective view into the trained image conversion model, without any spatio-temporal calibration, so the operation is simple; moreover, by improving the quality of the training samples, high-quality image perspective views can be obtained from the image conversion model.
In the technical solution of this embodiment of the present application, based on the point cloud data collected by the preset collection system, the coordinate data of the point cloud data and the point cloud collection time points can be obtained, the point cloud collection time points simulating the image collection time points of image perspective views. Then, after the pose matrix corresponding to a point cloud collection time point is determined, the point cloud perspective view at that time point can be generated from the pose matrix and the coordinate data; that is, the point cloud data of three-dimensional scene points is projected into a virtual camera located at the point cloud collection time point to form the point cloud perspective view. After the point cloud perspective view is fed into the pre-generated image conversion model, the image perspective view at the point cloud collection time point can be determined from the output of the image conversion model. This technical solution can guide the synthesis of image perspective views from the point cloud perspective views projected out of point cloud data, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of acquiring high-quality image perspective views in a simple and low-cost manner.
Embodiment 3
Fig. 6 is a structural block diagram of the model generation apparatus provided in Embodiment 3 of the present application; the apparatus is configured to perform the model generation method provided by any of the above embodiments. This apparatus belongs to the same inventive concept as the model generation methods of the above embodiments; for details not described exhaustively in the embodiment of the model generation apparatus, reference may be made to the above embodiments of the model generation method. Referring to Fig. 6, the apparatus may include: a data collection module 310, a first generation module 320 and a second generation module 330.
The data collection module 310 is configured to collect point cloud data and multiple frames of image perspective views by means of a preset collection system, and obtain the coordinate data of the point cloud data and multiple image collection time points in one-to-one correspondence with the multiple frames of image perspective views.
The first generation module 320 is configured to determine the pose matrix corresponding to each of the multiple image collection time points, and generate the point cloud perspective view at each image collection time point from the pose matrix corresponding to that image collection time point and the coordinate data corresponding to that image collection time point.
The second generation module 330 is configured to take the point cloud perspective view and the image perspective view at each image collection time point as one set of training samples, train an original neural network model on multiple sets of training samples, and generate an image conversion model for converting point cloud perspective views into image perspective views.
Optionally, the first generation module 320 may include:
a pose trajectory obtaining unit, configured to obtain the pose trajectory of the preset collection system from the point cloud data; and
a pose matrix obtaining unit, configured to sample the pose trajectory at the multiple image collection time points to obtain the pose matrix corresponding to each of the multiple image collection time points.
Optionally, the first generation module 320 is configured to:
if the image perspective views are collected by a preset camera of the preset collection system, the preset camera including a perspective camera or a wide-angle camera, project the three-dimensional coordinate data P_W_3d of the point cloud data in the world coordinate system to the two-dimensional coordinate data P_C_2d(t_C) in the preset camera coordinate system at each image collection time point t_C according to the following formula, and generate the point cloud perspective view at t_C from the multiple P_C_2d(t_C):
P_C_2d(t_C) = K_c · M_W→L(t_C) · P_W_3d,
where M_W→L(t_C) is the pose matrix, in the world coordinate system, of the point cloud collection device of the preset collection system at t_C, and K_c is the intrinsic matrix of the preset camera;
if the image perspective views are collected by a spherical panoramic camera of the preset collection system, project P_W_3d of the point cloud data to P_C_2d(t_C) in the coordinate system of the spherical panoramic camera at t_C according to the following formulas, and generate the point cloud perspective view at t_C from the multiple P_C_2d(t_C):
P_3d = M_W→L(t_C) · P_W_3d, with P_3d = [x_3d, y_3d, z_3d],
[The unfolding of P_3d into P_C_2d(t_C) is given as an image formula (PCTCN2021098942-appb-000002) in the original publication and is not reproduced here,]
where R is the sphere radius of the spherical panoramic camera.
Optionally, the apparatus may further include:
an attribute information assignment module, configured to obtain the pixels in the point cloud perspective view that correspond to the point cloud data and assign the attribute information of the point cloud data to those pixels.
Optionally, the second generation module 330 is further configured to:
take the point cloud perspective views at at least two image collection time points and the image perspective views corresponding to the point cloud perspective views at the at least two image collection time points jointly as one set of training samples.
Optionally, the second generation module 330 may include:
a first obtaining unit, configured to take the point cloud perspective view at the current image collection time point among the multiple image collection time points as the first point cloud perspective view, and the image perspective view at the current image collection time point as the first image perspective view;
a second obtaining unit, configured to take the point cloud perspective view at at least one image collection time point before the current image collection time point as the second point cloud perspective view, and the image perspective view corresponding to the point cloud perspective view at the at least one image collection time point before the current image collection time point as the second image perspective view; and
a training sample obtaining unit, configured to take the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view as one set of training samples, where the first point cloud perspective view, the second point cloud perspective view and the second image perspective view are the actual input data, and the first image perspective view is the expected output data.
Optionally, the second generation module 330 may further include:
an input unit, configured, with the original neural network model including a point cloud convolution-activation module, an image convolution-activation module and a merge module, to feed the training samples into the original neural network model;
a feature map obtaining unit, configured to process the channel-wise concatenation of the first point cloud perspective view and the second point cloud perspective view through the point cloud convolution-activation module to obtain a point cloud feature map, and to process the second image perspective view through the image convolution-activation module to obtain an image feature map; and
a network parameter adjustment unit, configured to merge the point cloud feature map and the image feature map through the merge module, generate a third image perspective view from the merged result, and adjust the network parameters of the original neural network model according to the third image perspective view and the first image perspective view.
With the model generation apparatus provided by Embodiment 3 of the present application, the data collection module obtains, from the point cloud data and the multiple frames of image perspective views collected by the preset collection system, the coordinate data of the point cloud data and multiple image collection time points in one-to-one correspondence with the multiple frames of image perspective views. Then, after determining the pose matrix corresponding to each of the multiple image collection time points, the first generation module can generate the point cloud perspective view at each image collection time point from the pose matrix corresponding to that time point and the coordinate data; that is, the point cloud data of three-dimensional scene points is projected into a virtual camera located at the image collection time point to form the point cloud perspective view. The second generation module then takes the point cloud perspective view and the image perspective view at each image collection time point as one set of training samples and trains an original neural network model on multiple sets of training samples to generate an image conversion model for converting point cloud perspective views into image perspective views. This apparatus can guide the synthesis of image perspective views from the point cloud perspective views projected out of point cloud data, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of acquiring high-quality image perspective views in a simple and low-cost manner.
The model generation apparatus provided by the embodiments of the present application can perform the model generation method provided by any embodiment of the present application and has the functional modules corresponding to the method.
It is worth noting that, in the above embodiment of the model generation apparatus, the units and modules included are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized; in addition, the names of the functional units are only for ease of mutual distinction and are not intended to limit the protection scope of the present application.
Embodiment 4
Fig. 7 is a structural block diagram of the image perspective view determining apparatus provided in Embodiment 4 of the present application; the apparatus is configured to perform the image perspective view determining method provided by any of the above embodiments. This apparatus belongs to the same application concept as the image perspective view determining methods of the above embodiments; for details not described exhaustively in the embodiment of the image perspective view determining apparatus, reference may be made to the above embodiments of the image perspective view determining method. Referring to Fig. 7, the apparatus may include: a third generation module 410 and an image perspective view determining module 420.
The third generation module 410 is configured to collect point cloud data by means of a preset collection system, obtain the coordinate data of the point cloud data and the point cloud collection time points, determine the pose matrix corresponding to each point cloud collection time point, and generate the point cloud perspective view at the point cloud collection time point from the pose matrix and the coordinate data.
The image perspective view determining module 420 is configured to obtain the image conversion model generated by the model generation method provided in any embodiment of the present application, feed the point cloud perspective view into the image conversion model, and determine the image perspective view at the point cloud collection time point from the output of the image conversion model.
With the image perspective view determining apparatus provided by Embodiment 4 of the present application, the third generation module obtains, from the point cloud data collected by the preset collection system, the coordinate data of the point cloud data and the point cloud collection time points, the point cloud collection time points simulating the image collection time points of image perspective views; moreover, after determining the pose matrix corresponding to a point cloud collection time point, it can generate the point cloud perspective view at that time point from the pose matrix and the coordinate data, that is, project the point cloud data of three-dimensional scene points into a virtual camera located at the point cloud collection time point to form the point cloud perspective view. After feeding the point cloud perspective view into the pre-generated image conversion model, the image perspective view determining module can determine the image perspective view at the point cloud collection time point from the output of the image conversion model. This apparatus guides the synthesis of image perspective views from the point cloud perspective views projected out of point cloud data, which solves the problem that acquiring image perspective views is time-consuming and labor-intensive, and achieves the effect of acquiring high-quality image perspective views in a simple and low-cost manner.
The image perspective view determining apparatus provided by the embodiments of the present application can perform the image perspective view determining method provided by any embodiment of the present application and has the functional modules corresponding to the method.
It is worth noting that, in the above embodiment of the image perspective view determining apparatus, the units and modules included are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized; in addition, the names of the functional units are only for ease of mutual distinction and are not intended to limit the protection scope of the present application.
Embodiment 5
Fig. 8 is a structural schematic diagram of a device provided in Embodiment 5 of the present application. As shown in Fig. 8, the device includes a memory 510, a processor 520, an input apparatus 530 and an output apparatus 540. The number of processors 520 in the device may be at least one; in Fig. 8, one processor 520 is taken as an example. The memory 510, the processor 520, the input apparatus 530 and the output apparatus 540 in the device may be connected by a bus or in other ways; in Fig. 8, connection by a bus 550 is taken as an example.
As a computer-readable storage medium, the memory 510 may be configured to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the model generation method in the embodiments of the present application (for example, the data collection module 310, the first generation module 320 and the second generation module 330 in the model generation apparatus), or the program instructions/modules corresponding to the image perspective view determining method in the embodiments of the present application (for example, the third generation module 410 and the image perspective view determining module 420 in the image perspective view determining apparatus). By running the software programs, instructions and modules stored in the memory 510, the processor 520 executes the various functional applications and data processing of the device, that is, implements the above model generation method or image perspective view determining method.
The memory 510 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the device, etc. In addition, the memory 510 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some examples, the memory 510 may include memories disposed remotely relative to the processor 520, and these remote memories may be connected to the device via a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network and combinations thereof.
The input apparatus 530 may be configured to receive input digital or character information and generate key signal input related to user settings and function control of the device. The output apparatus 540 may include a display device such as a display screen.
Embodiment 6
Embodiment 6 of the present application provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a model generation method, the method including:
collecting point cloud data and multiple frames of image perspective views by means of a preset collection system to obtain the coordinate data of the point cloud data and multiple image collection time points in one-to-one correspondence with the multiple frames of image perspective views;
determining the pose matrix corresponding to each of the multiple image collection time points, and generating the point cloud perspective view at each image collection time point from the pose matrix corresponding to that image collection time point and the coordinate data; and
taking the point cloud perspective view and the image perspective view at each image collection time point as one set of training samples, training an original neural network model on multiple sets of training samples, and generating an image conversion model for converting point cloud perspective views into image perspective views.
Of course, for the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the method operations described above and can also perform related operations in the model generation method provided by any embodiment of the present application.
Embodiment 7
Embodiment 7 of the present application provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform an image perspective view determining method, the method including:
collecting point cloud data by means of a preset collection system to obtain the coordinate data of the point cloud data and point cloud collection time points, determining the pose matrix corresponding to each point cloud collection time point, and generating the point cloud perspective view at the point cloud collection time point from the pose matrix and the coordinate data; and
obtaining the image conversion model generated by the model generation method provided in any embodiment of the present application, feeding the point cloud perspective view into the image conversion model, and determining the image perspective view at the point cloud collection time point from the output of the image conversion model.
From the above description of the implementations, those skilled in the art can clearly understand that the present application can be implemented by means of software together with the necessary general-purpose hardware, and of course can also be implemented by hardware. Based on such an understanding, the technical solutions of the present application, in essence or in the part contributing to the related art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the embodiments of the present application.

Claims (12)

  1. A model generation method, comprising:
    collecting point cloud data and multiple frames of image perspective views by means of a preset collection system to obtain coordinate data of the point cloud data and multiple image collection time points in one-to-one correspondence with the multiple frames of image perspective views;
    determining a pose matrix corresponding to each of the multiple image collection time points, and generating a point cloud perspective view at each image collection time point according to the pose matrix corresponding to the image collection time point and the coordinate data; and
    taking the point cloud perspective view at each image collection time point and the image perspective view at the image collection time point as one set of training samples, and training an original neural network model based on multiple sets of training samples to generate an image conversion model for converting the point cloud perspective view into the image perspective view.
  2. The method according to claim 1, wherein determining the pose matrix corresponding to each of the multiple image collection time points comprises:
    obtaining a pose trajectory of the preset collection system according to the point cloud data; and
    sampling the pose trajectory based on the multiple image collection time points to obtain the pose matrix corresponding to each of the multiple image collection time points.
  3. The method according to claim 1, wherein generating the point cloud perspective view at each image collection time point according to the pose matrix corresponding to the image collection time point and the coordinate data comprises:
    in a case where the image perspective views are collected by a preset camera of the preset collection system, the preset camera comprising a perspective camera or a wide-angle camera, projecting three-dimensional coordinate data P_W_3d of the point cloud data in a world coordinate system to two-dimensional coordinate data P_C_2d(t_C) in a preset camera coordinate system at each image collection time point t_C according to the following formula, and generating the point cloud perspective view at t_C according to multiple P_C_2d(t_C):
    P_C_2d(t_C) = K_c · M_W→L(t_C) · P_W_3d,
    wherein M_W→L(t_C) is the pose matrix, in the world coordinate system, of a point cloud collection device of the preset collection system at t_C, and K_c is an intrinsic matrix of the preset camera; and
    in a case where the image perspective views are collected by a spherical panoramic camera of the preset collection system, projecting P_W_3d of the point cloud data to P_C_2d(t_C) in a spherical panoramic camera coordinate system at t_C according to the following formulas, and generating the point cloud perspective view at t_C according to multiple P_C_2d(t_C):
    P_3d = M_W→L(t_C) · P_W_3d, with P_3d = [x_3d, y_3d, z_3d],
    [the unfolding of P_3d into P_C_2d(t_C) is given as an image formula (PCTCN2021098942-appb-100001) in the original publication and is not reproduced here,]
    wherein R is a sphere radius of the spherical panoramic camera.
  4. The method according to claim 1, further comprising:
    obtaining pixels in the point cloud perspective view that correspond to the point cloud data, and assigning attribute information of the point cloud data to the pixels.
  5. The method according to claim 1, further comprising: taking the point cloud perspective views at at least two image collection time points and the image perspective views corresponding to the point cloud perspective views at the at least two image collection time points jointly as one set of training samples.
  6. The method according to claim 5, wherein taking the point cloud perspective views at at least two image collection time points and the image perspective views corresponding to the point cloud perspective views at the at least two image collection time points jointly as one set of training samples comprises:
    taking the point cloud perspective view at a current image collection time point among the multiple image collection time points as a first point cloud perspective view, and the image perspective view at the current image collection time point as a first image perspective view;
    taking the point cloud perspective view at at least one image collection time point before the current image collection time point as a second point cloud perspective view, and the image perspective view corresponding to the point cloud perspective view at the at least one image collection time point before the current image collection time point as a second image perspective view; and
    taking the first point cloud perspective view, the second point cloud perspective view, the first image perspective view and the second image perspective view as one set of training samples, wherein the first point cloud perspective view, the second point cloud perspective view and the second image perspective view are actual input data, and the first image perspective view is expected output data.
  7. The method according to claim 6, wherein the original neural network model comprises a point cloud convolution-activation module, an image convolution-activation module and a merge module, and training the original neural network model based on the multiple sets of training samples comprises:
    feeding the training samples into the original neural network model;
    processing, by the point cloud convolution-activation module, a channel concatenation result of the first point cloud perspective view and the second point cloud perspective view to obtain a point cloud feature map, and processing, by the image convolution-activation module, the second image perspective view to obtain an image feature map; and
    merging, by the merge module, the point cloud feature map and the image feature map, generating a third image perspective view according to a merging result, and adjusting network parameters of the original neural network model according to the third image perspective view and the first image perspective view.
  8. An image perspective view determining method, comprising:
    collecting point cloud data by means of a preset collection system to obtain coordinate data of the point cloud data and a point cloud collection time point, determining a pose matrix corresponding to the point cloud collection time point, and generating a point cloud perspective view at the point cloud collection time point according to the pose matrix and the coordinate data; and
    obtaining an image conversion model generated by the model generation method according to any one of claims 1-7, feeding the point cloud perspective view into the image conversion model, and determining an image perspective view at the point cloud collection time point according to an output of the image conversion model.
  9. A model generation apparatus, comprising:
    a data collection module, configured to collect point cloud data and multiple frames of image perspective views by means of a preset collection system, and obtain coordinate data of the point cloud data and multiple image collection time points in one-to-one correspondence with the multiple frames of image perspective views;
    a first generation module, configured to determine a pose matrix corresponding to each of the multiple image collection time points, and generate a point cloud perspective view at each image collection time point according to the pose matrix corresponding to the image collection time point and the coordinate data; and
    a second generation module, configured to take the point cloud perspective view at each image collection time point and the image perspective view at the image collection time point as one set of training samples, and train an original neural network model based on multiple sets of training samples to generate an image conversion model for converting the point cloud perspective view into the image perspective view.
  10. An image perspective view determining apparatus, comprising:
    a generation module, configured to collect point cloud data by means of a preset collection system, obtain coordinate data of the point cloud data and a point cloud collection time point, determine a pose matrix corresponding to the point cloud collection time point, and generate a point cloud perspective view at the point cloud collection time point according to the pose matrix and the coordinate data; and
    an image perspective view determining module, configured to obtain an image conversion model generated by the model generation method according to any one of claims 1-7, feed the point cloud perspective view into the image conversion model, and determine an image perspective view at the point cloud collection time point according to an output of the image conversion model.
  11. A device, comprising:
    at least one processor; and
    a memory configured to store at least one program,
    wherein, when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the model generation method according to any one of claims 1-7, or the image perspective view determining method according to claim 8.
  12. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model generation method according to any one of claims 1-7, or the image perspective view determining method according to claim 8.
PCT/CN2021/098942 2020-06-08 2021-06-08 模型生成方法、图像透视图确定方法、装置、设备及介质 WO2021249401A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020227045255A KR20230015446A (ko) 2020-06-08 2021-06-08 모델 생성 방법, 이미지 투시도 결정 방법, 장치, 설비 및 매체
US17/928,553 US20230351677A1 (en) 2020-06-08 2021-06-08 Model Generation Method and Apparatus, Image Perspective Determining Method and Apparatus, Device, and Medium
JP2022564397A JP7461504B2 (ja) 2020-06-08 2021-06-08 モデル生成方法、画像透視図の決定方法、装置、ディバイス及び媒体
EP21823023.3A EP4131145A4 (en) 2020-06-08 2021-06-08 MODEL GENERATING METHOD AND APPARATUS, METHOD AND APPARATUS FOR DETERMINING IMAGE PERSPECTIVE, APPARATUS AND MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010514388.2 2020-06-08
CN202010514388.2A CN113763231B (zh) 2020-06-08 2020-06-08 模型生成方法、图像透视图确定方法、装置、设备及介质
