WO2021136386A1

WO2021136386A1 - Data processing method, terminal, and server

Info

Publication number: WO2021136386A1
Application number: PCT/CN2020/141440
Authority: WO
Inventors: 黄山; 谭凯; 王硕; 杜斯亮; 方伟
Original assignee: 华为技术有限公司
Priority date: 2019-12-31
Filing date: 2020-12-30
Publication date: 2021-07-08
Also published as: CN113132717A

Abstract

Disclosed in embodiments of the present application is a data processing method, used for performing three-dimensional reconstruction of a target scene according to a panoramic image. The method in the embodiments of the present application comprises: the terminal obtains a panoramic image sequence, wherein the panoramic image sequence comprises a plurality of panoramic images obtained by photographing a target scene from different orientations, and the plurality of panoramic images comprise a first panoramic image and a second panoramic image photographed continuously; when the overlapping degree of the second panoramic image is greater than or equal to a first threshold, the terminal sends to a server a target panoramic image sequence comprising the first panoramic image and the second panoramic image, wherein the overlapping degree of the second panoramic image is a proportion of an overlapping area between the second panoramic image and the first panoramic image to the second panoramic image, the target panoramic image sequence is part of the plurality of panoramic images, and the target panoramic image sequence is used for three-dimensional reconstruction of the target scene.

Description

Data processing method, terminal and server

This application claims priority to Chinese patent application No. 201911417063.6, filed with the Chinese Patent Office on December 31, 2019 and entitled "Data processing method, terminal and server", the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the field of image measurement technology, and in particular to a data processing method, terminal and server.

Background

During the construction of a base station site, the site must be surveyed to obtain information including the dimensions of the site; the size, model, and location of the equipment in the site; and the connection relationships and relative positions between the equipment. Image data of the site is collected to obtain clear and complete site information, digitize the site, and provide data support for subsequent site design and operation and maintenance.

In the prior art, an operator holds a camera and shoots frame images based on central projection in the surveyed scene, then uploads the frame images to a server after collection is complete. The server calculates the image poses based on the frame images, thereby realizing site digitization.

Because a single frame image has a limited field of view, the captured frame images must overlap sufficiently for the image poses to be calculated. This places high skill requirements on image collectors, and the proportion of unqualified frame images is relatively high. When the captured frame images are unqualified, the image collectors must return to the station to collect them again, which is time-consuming and labor-intensive.

Summary of the invention

The embodiments of this application provide a data processing method for three-dimensional reconstruction of a target scene based on panoramic images, which can reduce the skill requirements on image collectors, improve the success rate of three-dimensional reconstruction, and avoid image collectors repeatedly going to the station to collect data.

The first aspect of the present application provides a data processing method, including: a terminal acquires a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images shot of a target scene in different poses, and the plurality of panoramic images includes a first panoramic image and a second panoramic image that are shot continuously; in the case that the degree of overlap of the second panoramic image is greater than or equal to a first threshold, the terminal sends a target panoramic image sequence including the first panoramic image and the second panoramic image to a server, where the degree of overlap of the second panoramic image is the proportion of the overlapping area of the second panoramic image and the first panoramic image in the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for the three-dimensional reconstruction of the target scene.

In the data processing method provided by the embodiment of the present application, after the terminal acquires the panoramic image sequence, it can detect the degree of overlap between the first panoramic image and the second panoramic image obtained by continuous shooting. When the degree of overlap is greater than or equal to the first threshold, the target panoramic image sequence including the first panoramic image and the second panoramic image may be sent to the server for the three-dimensional reconstruction of the target scene. Since the information collected by a panoramic image is comprehensive, the skill requirements for the image collector can be reduced. In addition, by screening the panoramic images, the terminal can reduce the proportion of unqualified images and avoid the image collector repeatedly going to the station to collect data.
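The overlap check described above can be illustrated with a minimal sketch. It assumes that the overlapping area and the area of the second panoramic image are already available as pixel counts; how the overlapping area is measured is not specified here. The function names and the threshold values used in the usage note are hypothetical and are not taken from the application.

```python
def overlap_degree(overlap_area_px: int, second_image_area_px: int) -> float:
    """Proportion of the second panoramic image covered by its overlap with the first."""
    if second_image_area_px <= 0:
        raise ValueError("second image area must be positive")
    return overlap_area_px / second_image_area_px


def should_send(overlap_area_px: int, second_image_area_px: int,
                first_threshold: float) -> bool:
    """Return True when the overlap degree reaches the (assumed) first threshold."""
    return overlap_degree(overlap_area_px, second_image_area_px) >= first_threshold
```

For example, under these assumptions, should_send(75, 100, 0.5) accepts the image pair, while should_send(25, 100, 0.5) rejects it.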

In a possible implementation of the first aspect, the second panoramic image is an image continuously captured after the first panoramic image is captured.

In the data processing method provided by the embodiments of this application, the second panoramic image is an image taken after the first panoramic image. Collecting a panoramic image sequence requires the terminal to shoot continuously for a period of time. In this solution, once the second panoramic image is captured, its overlap with the preceding first panoramic image is detected, so whether the overlap degree of the second panoramic image meets the preset requirement is known immediately. The image collector can therefore quickly determine whether the currently captured image is qualified and correct it on the spot when it is not, which avoids re-acquiring the panoramic image sequence because an unqualified overlap is found only after the entire sequence has been shot.

In a possible implementation of the first aspect, the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.

The data processing method provided by the embodiments of the present application detects specific markers appearing in multiple panoramic images of a panoramic image sequence, and determines that the number of specific markers in the panoramic image sequence is greater than or equal to a preset second threshold. It should be noted that, if multiple specific markers are set in the target scene, the number of each marker in the panoramic image sequence can be detected separately, and the number of each marker can be determined to be greater than or equal to the preset threshold. Optionally, the threshold for the number of each marker is the same. Since the number of specific markers in the panoramic image sequence is greater than or equal to the preset threshold, the position of the specific marker in the target scene can be determined according to the panoramic image sequence and used for image size calibration, which improves the accuracy of three-dimensional modeling.
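The per-marker count check can be sketched as follows. The sketch assumes that a marker detector (not shown) has already produced, for each panoramic image, a list of marker identifiers, and it interprets "the number of a marker" as the number of images in which that marker appears. Both the interpretation and the function names are assumptions made for illustration.

```python
from collections import Counter


def marker_counts(detections_per_image):
    """Count, for each marker id, the number of images in which it was detected."""
    counts = Counter()
    for marker_ids in detections_per_image:
        counts.update(set(marker_ids))  # count each marker at most once per image
    return counts


def markers_sufficient(detections_per_image, expected_markers, second_threshold):
    """True if every expected marker appears in at least second_threshold images."""
    counts = marker_counts(detections_per_image)
    return all(counts[m] >= second_threshold for m in expected_markers)
```

With detections [["A", "B"], ["A"], ["A", "B", "B"]], marker "A" is counted three times and marker "B" twice, so a threshold of 2 is satisfied for both markers and a threshold of 3 is not.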

In a possible implementation of the first aspect, the target panoramic image sequence further includes a third panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration. The method further includes: the terminal determines the location range of the specific marker in the third panoramic image; the terminal sends the location range to the server, and the location range is used to determine the location of the specific marker in the third panoramic image.

In the data processing method provided by the embodiment of the application, the terminal can detect the position range of the specific marker in the third panoramic image in which the specific marker is captured, and send the position range to the server for determining the precise position of the specific marker. This avoids detecting the specific marker over the entire third panoramic image and can reduce the amount of calculation.

In a possible implementation of the first aspect, the error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold, the camera pose of the target panoramic image sequence is determined by pose recovery according to the image points with the same name (that is, corresponding image points in different images) of the target panoramic image sequence, and the image points with the same name are image points of the image pairs in the target panoramic image sequence whose overlap degree meets a first preset condition.

In the data processing method provided by the embodiments of the present application, the terminal can perform camera pose estimation and determine a panoramic image sequence with an error less than or equal to a preset threshold as the target panoramic image sequence, thereby improving the success rate of three-dimensional reconstruction of the target scene.

In a possible implementation of the first aspect, the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.

In the data processing method provided by the embodiments of this application, the image points with the same name are determined by projecting the image pairs in the target panoramic image sequence whose overlap degree meets the first preset condition onto the three-dimensional spherical surface. Grid division combined with motion statistics characteristics makes it possible to eliminate mismatches quickly and thereby improve the stability of the matching.
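The projection of a panoramic image onto a three-dimensional spherical surface can be illustrated with a minimal sketch. It assumes that the panoramic image is stored in equirectangular form (image width spans 360 degrees of longitude and image height spans 180 degrees of latitude), which the text above does not state; the axis convention and the function name are likewise assumptions. The sketch covers only the projection step and does not implement GMS matching.

```python
import math


def pixel_to_unit_sphere(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a point (x, y, z) on the unit sphere."""
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude, -pi at the left edge
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude, +pi/2 at the top edge
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Under this convention the image center maps to the forward direction (0, 0, 1), and every output has unit length, so matched points from two images can be compared directly on the sphere.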

In a possible implementation of the first aspect, the camera pose error of the target panoramic image sequence is the spherical distance between two points: the point formed by projecting an object point in the target scene onto a three-dimensional spherical surface according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of that object point in the target panoramic image sequence onto the three-dimensional spherical surface.

In the data processing method provided by the embodiments of the present application, the coordinates of a feature point (object point) in the world coordinate system are back-projected to image point coordinates on the image, and the distance between the back-projected point and the image point with the same name that corresponds to the object point in the image can be used to measure the camera pose error. Calculating the camera pose error on the three-dimensional spherical surface can reduce the amount of calculation.
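A minimal sketch of this error measure is given below. It assumes that the camera pose is represented by a 3x3 world-to-camera rotation matrix and a camera center in world coordinates, and that the spherical distance is the great-circle angle on the unit sphere; the representation and the function names are assumptions for illustration, not details stated in the application.

```python
import math


def normalize(p):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in p))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / n for c in p)


def project_to_sphere(object_point, rotation, camera_center):
    """Project a world-space object point onto the camera's unit sphere."""
    d = [object_point[i] - camera_center[i] for i in range(3)]
    cam = [sum(rotation[r][c] * d[c] for c in range(3)) for r in range(3)]
    return normalize(cam)


def spherical_distance(p, q):
    """Great-circle angle in radians between two directions."""
    p, q = normalize(p), normalize(q)
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error
```

The pose error for one object point is then spherical_distance(project_to_sphere(X, R, C), s), where s is the observed image point already converted to the sphere. A pose would be accepted, under these assumptions, when this angle stays at or below the third threshold.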

In a possible implementation manner of the first aspect, the method further includes: the terminal sends a camera pose of the target panoramic image sequence to the server, where the camera pose is used to achieve three-dimensional reconstruction of the target scene.

In the data processing method provided by the embodiments of the present application, the terminal can send the camera pose of the target panoramic image sequence to the server to realize the three-dimensional reconstruction of the target scene. In this solution, after the server obtains the camera pose sent by the terminal, it can use the camera pose as the initial pose for calculation, which reduces the amount of calculation and improves the speed of three-dimensional reconstruction.

In a possible implementation of the first aspect, the target panoramic image sequence further satisfies a second preset condition; the second preset condition includes at least one of the following: the blur degree of the panoramic image satisfies a third preset condition; the exposure of the panoramic image satisfies a fourth preset condition; and the proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.

In the data processing method provided by the embodiments of the present application, the terminal can filter the acquired panoramic images according to a variety of possible combinations of preset conditions, including image quality indicators such as blurriness, exposure, and the proportion of invalid areas of the image. The target image sequence selected in this way satisfies the preset conditions and has a higher success rate in achieving three-dimensional reconstruction of the target scene.
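The screening step can be sketched as follows. The sketch assumes that a blur score, a mean brightness value (used here as a stand-in for exposure), and an invalid-area ratio have already been computed for each panoramic image by separate components that are not shown. All field names and threshold values are hypothetical. For simplicity the sketch requires all three conditions, although the text above allows any combination of them.

```python
from dataclasses import dataclass


@dataclass
class PanoramaQuality:
    blur_score: float       # hypothetical sharpness score; higher means sharper
    mean_brightness: float  # hypothetical exposure proxy in the range 0..255
    invalid_ratio: float    # fraction of pixels in pedestrian, vehicle, or sky areas


def passes_quality(q, min_blur_score=100.0, brightness_range=(40.0, 220.0),
                   fifth_threshold=0.5):
    """Check the three assumed quality conditions for one panoramic image."""
    sharp_enough = q.blur_score >= min_blur_score
    well_exposed = brightness_range[0] <= q.mean_brightness <= brightness_range[1]
    mostly_valid = q.invalid_ratio <= fifth_threshold
    return sharp_enough and well_exposed and mostly_valid


def filter_sequence(qualities):
    """Return the indices of the panoramic images that pass the quality check."""
    return [i for i, q in enumerate(qualities) if passes_quality(q)]
```

Images whose indices are not returned by filter_sequence would be excluded from the target panoramic image sequence or re-shot on site.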

A second aspect of the embodiments of the present application provides a data processing method, including: a server receives a panoramic image sequence sent by a terminal, where the panoramic image sequence includes a plurality of panoramic images sequentially shot of a target scene in different poses; the server determines the camera pose of the panoramic image sequence according to the image points with the same name of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to a first threshold.

In the data processing method provided by the embodiment of the present application, after receiving the panoramic image sequence sent by the terminal, the server determines a camera pose of the panoramic image sequence whose error is less than or equal to the first threshold. In this way, the server has a higher success rate in achieving three-dimensional reconstruction based on the panoramic images.

In a possible implementation of the second aspect, the image points with the same name are image points of the image pairs in the panoramic image sequence whose overlap degree meets a preset condition.

In the data processing method provided by the embodiments of the present application, image pairs whose overlap degree meets a preset condition are used for image matching to determine points with the same name, which can improve calculation efficiency.

In a possible implementation of the second aspect, the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.

In the data processing method provided by the embodiment of the application, the image pairs in the target panoramic image sequence whose overlap degree meets the first preset condition are projected onto the three-dimensional spherical surface to determine the image points with the same name. Grid division combined with motion statistics characteristics makes it possible to eliminate mismatches quickly and thereby improve the stability of the matching.

In a possible implementation of the second aspect, the server detects an invalid area in the panoramic image, where the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area; the image points with the same name are image points outside the invalid area in the panoramic image.

In the data processing method provided by the embodiments of the present application, the server detects invalid areas in a panoramic image, searches for pixels with the same name in the effective image area after removing the invalid areas, and performs image matching, which can improve the efficiency of determining pixels with the same name.

In a possible implementation of the second aspect, the panoramic image sequence includes a panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration; the server determining the camera pose of the panoramic image sequence according to the image points with the same name of the panoramic image sequence includes: the server determines the camera pose of the panoramic image according to the image points with the same name and the position of the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.

In the data processing method provided by the embodiments of the present application, the panoramic image sequence includes a panoramic image in which a specific marker for image size calibration is captured. Determining the camera pose of the panoramic image from both the image points with the same name and the position of the specific marker can improve the calculation accuracy.

In a possible implementation of the second aspect, the server receives the location range of the specific marker of the panoramic image from the terminal; the server determines the location of the specific marker from the location range of the specific marker.

In the data processing method provided by the embodiment of the application, the server receives the position range of the specific marker sent by the terminal and determines the precise position of the specific marker within that position range. This avoids detecting the specific marker over the entire panoramic image and can reduce the amount of calculation.
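The saving from restricting the search to the received position range can be illustrated with a minimal sketch. It assumes that the position range is an axis-aligned rectangle (left, top, right, bottom) in pixel coordinates, with right and bottom exclusive, and that a per-pixel marker response score is available as a nested list; the rectangle format, the score map, and the function name are all hypothetical.

```python
def refine_marker_position(score_map, position_range):
    """Find the highest-scoring pixel inside the position range only."""
    left, top, right, bottom = position_range
    height, width = len(score_map), len(score_map[0])
    # Clamp the range to the image so an oversized range cannot index out of bounds.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        raise ValueError("position range does not intersect the image")
    best = max((score_map[y][x], x, y)
               for y in range(top, bottom)
               for x in range(left, right))
    return best[1], best[2]  # (x, y) of the strongest response in the range
```

Only the pixels inside the rectangle are visited, so the cost scales with the size of the position range rather than with the size of the whole panoramic image, and strong responses elsewhere in the image are ignored.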

In a possible implementation of the second aspect, the method further includes: the server receives a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal; the server determines a second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.

In the data processing method provided by the embodiments of the present application, the server receives the camera pose of the panoramic image sequence from the terminal and can use that camera pose as the initial pose for calculation, which reduces the amount of calculation and improves the speed of three-dimensional reconstruction.

In a possible implementation of the second aspect, the panoramic image sequence satisfies a first preset condition, and the first preset condition includes at least one of the following: the degree of overlap of a second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is captured continuously after a first panoramic image is captured, and the degree of overlap of the second panoramic image is the proportion of the overlapping area between the second panoramic image and the first panoramic image in the second panoramic image; the number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific markers; the blur degree of the panoramic image satisfies a second preset condition; the exposure of the panoramic image satisfies a third preset condition; and the error of the camera pose of the panoramic image sequence is less than or equal to a third threshold, where the camera pose of the panoramic image sequence is determined by pose recovery according to the image points with the same name of the panoramic image sequence, and the image points with the same name are image points of the image pairs in the target panoramic image sequence whose overlap degree meets a preset condition.

In the data processing method provided by the embodiments of this application, the server can filter the acquired panoramic image sequence according to a variety of possible combinations of preset conditions, including image quality indicators such as blurriness and exposure, the overlap of continuously captured images, and the number of specific markers. The target image sequence selected in this way satisfies the preset conditions and has a higher success rate in achieving three-dimensional reconstruction of the target scene.

A third aspect of the embodiments of the present application provides a terminal, including: an acquisition module, configured to acquire a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images taken of a target scene in different poses, and the plurality of panoramic images includes a first panoramic image and a second panoramic image taken continuously; and a sending module, configured to send, to a server, a target panoramic image sequence including the first panoramic image and the second panoramic image when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, where the degree of overlap of the second panoramic image is the proportion of the overlapping area of the second panoramic image and the first panoramic image in the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for the three-dimensional reconstruction of the target scene.

In a possible implementation manner of the third aspect, the second panoramic image is an image continuously captured after the first panoramic image is captured.

In a possible implementation of the third aspect, the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.

In a possible implementation of the third aspect, the target panoramic image sequence further includes a third panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration. The terminal further includes: a determining module, configured to determine the location range of the specific marker in the third panoramic image; the sending module is further configured to send the location range to the server, and the location range is used to determine the location of the specific marker in the third panoramic image.

In a possible implementation of the third aspect, the error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold, the camera pose of the target panoramic image sequence is determined by pose recovery according to the image points with the same name of the target panoramic image sequence, and the image points with the same name are image points of the image pairs in the target panoramic image sequence whose overlap degree meets a first preset condition.

In a possible implementation manner of the third aspect, the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.

In a possible implementation of the third aspect, the camera pose error of the target panoramic image sequence is the spherical distance between two points: the point formed by projecting an object point in the target scene onto a three-dimensional spherical surface according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of that object point in the target panoramic image sequence onto the three-dimensional spherical surface.

In a possible implementation of the third aspect, the sending module is further configured to send the camera pose of the target panoramic image sequence to the server, where the camera pose is used to realize the three-dimensional reconstruction of the target scene.

In a possible implementation manner of the third aspect, the target panoramic image sequence further satisfies a second preset condition; the second preset condition includes at least one of the following: the blur degree of the panoramic image satisfies a third preset condition; the exposure of the panoramic image satisfies a fourth preset condition; and the proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.

The fourth aspect of the embodiments of the present application provides a server, which is characterized by comprising: a receiving module, configured to receive a panoramic image sequence sent by a terminal, where the panoramic image sequence includes a plurality of panoramic images sequentially shot of a target scene in different poses; and a determining module, configured to determine the camera pose of the panoramic image sequence according to the image points with the same name of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to a first threshold.

In a possible implementation of the fourth aspect, the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.

In a possible implementation manner of the fourth aspect, the server further includes: a detection module, configured to detect an invalid area in the panoramic image, where the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area; the image points with the same name are image points outside the invalid area in the panoramic image.

In a possible implementation of the fourth aspect, the panoramic image sequence includes a panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration; the determining module is further configured to determine the camera pose of the panoramic image according to the image points with the same name and the position of the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.

In a possible implementation of the fourth aspect, the receiving module is further configured to receive the position range of the specific marker of the panoramic image from the terminal; the determining module is further configured to determine the position of the specific marker from the position range of the specific marker.

In a possible implementation of the fourth aspect, the receiving module is further configured to receive a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal; the determining module is further configured to determine a second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.

In a possible implementation of the fourth aspect, the panoramic image sequence satisfies a first preset condition, and the first preset condition includes at least one of the following: the degree of overlap of a second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is captured continuously after a first panoramic image is captured, and the degree of overlap of the second panoramic image is the proportion of the overlapping area between the second panoramic image and the first panoramic image in the second panoramic image; the number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific markers; the blur degree of the panoramic image satisfies a second preset condition; the exposure of the panoramic image satisfies a third preset condition; and the error of the camera pose of the panoramic image sequence is less than or equal to a third threshold, where the camera pose of the panoramic image sequence is determined by pose recovery according to the image points with the same name of the panoramic image sequence, and the image points with the same name are image points of the image pairs in the target panoramic image sequence whose overlap degree meets a preset condition.

The fifth aspect of the embodiments of the present application provides a terminal, which is characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions. The processor is used to call the program instructions to execute the method in any one of the foregoing first aspect and various possible implementation manners.

The sixth aspect of the embodiments of the present application provides a server, which is characterized in that it includes a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions. The processor is used to call the program instructions to execute the method in any one of the foregoing second aspect and various possible implementation manners.

A seventh aspect of the embodiments of the present application provides a data processing device, which is characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions. The processor is used to call the program instructions to execute the method in any one of the foregoing first aspect, second aspect and various possible implementation manners.

The eighth aspect of the embodiments of the present application provides a computer program product containing instructions, which is characterized in that, when the instructions are run on a computer, the computer is caused to execute the method in any one of the foregoing first aspect, second aspect and various possible implementation manners.

The ninth aspect of the embodiments of the present application provides a computer-readable storage medium, including instructions, which is characterized in that, when the instructions are run on a computer, the computer is caused to execute the method in any one of the foregoing first aspect, second aspect and various possible implementation manners.

A tenth aspect of the embodiments of the present application provides a chip including a processor. The processor is used to read and execute the computer program stored in a memory to execute the method in any possible implementation manner of any one of the foregoing aspects. Optionally, the chip includes the memory, and the processor is connected to the memory through a circuit or a wire. Further optionally, the chip further includes a communication interface, and the processor is connected to the communication interface. The communication interface is used to receive data and/or information that needs to be processed. The processor obtains the data and/or information from the communication interface, processes the data and/or information, and outputs the processing result through the communication interface. The communication interface can be an input and output interface.

It can be seen from the above technical solutions that the embodiments of the present application have the following advantages:

In the data processing method provided by the embodiments of the present application, because a panoramic image captures comprehensive information, the skill requirements for the image collector can be reduced. After acquiring the panoramic image sequence, the terminal selects the panoramic images whose overlap degree meets the preset overlap degree threshold and sends them to the server for three-dimensional reconstruction of the target scene. By screening the panoramic images, the terminal can reduce the image failure rate and avoid the image collector having to return to the site repeatedly to collect data.

The terminal acquires a panoramic image sequence captured by continuous shooting. In this solution, as soon as the second panoramic image is collected, its overlap with the preceding first panoramic image is detected, so that whether the overlap degree of the second panoramic image satisfies the preset requirement is known immediately. This allows the image collector to quickly determine whether the currently captured image is qualified, makes it convenient to re-shoot immediately when the captured image is unqualified, and avoids re-acquiring the panoramic image sequence because an unqualified overlap degree is discovered only after the entire sequence has been shot.

The server obtains the screened panoramic images that meet the preset overlap degree threshold, which can improve the success rate of three-dimensional reconstruction and avoid the image collector having to return to the site repeatedly to collect data.

Description of the drawings

FIG. 1 is a schematic diagram of a survey scene in an embodiment of the application;

FIG. 2 is a schematic diagram of an embodiment of a data processing method in an embodiment of the application;

FIG. 3 is a schematic diagram of the coordinate system of the camera pose calculation in an embodiment of the application;

FIG. 4 is a schematic diagram of the image semantic segmentation recognition result in an embodiment of the application;

FIG. 5 is a schematic diagram of another embodiment of a data processing method in an embodiment of the application;

FIG. 6 is a schematic diagram of an embodiment of a terminal in an embodiment of the application;

FIG. 7 is a schematic diagram of an embodiment of a server in an embodiment of the application;

FIG. 8 is a schematic diagram of another embodiment of a terminal in an embodiment of this application;

FIG. 9 is a schematic diagram of another embodiment of a server in an embodiment of the application;

FIG. 10 is a schematic diagram of an embodiment of a data processing device in an embodiment of the application.

Detailed description

The following describes the embodiments of the present application with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, rather than all the embodiments. A person of ordinary skill in the art knows that with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.

The terms "first", "second", etc. in the description and claims of the application and the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that the data used in this way can be interchanged under appropriate circumstances so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusions. For example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to the steps or modules that are clearly listed, and may include other steps or modules that are not clearly listed or that are inherent to the process, method, product, or device. The naming or numbering of steps appearing in this application does not mean that the steps in the method flow must be executed in the time or logical sequence indicated by the naming or numbering. The execution order of the named or numbered process steps can be changed according to the technical purpose to be achieved, as long as the same or similar technical effects can be achieved.

For ease of understanding, the following briefly introduces some technical terms involved in this application:

Panoramic image: In a broad sense, a panoramic image is a wide-angle image, that is, an image with a large angle of view. In the embodiments of this application, the term specifically refers to an image with a horizontal viewing angle of 360 degrees and a vertical viewing angle of 180 degrees. A panoramic image can be realized by different projection methods, including equiangular projection, equirectangular projection, orthogonal projection, and equal-area projection, which are not specifically limited here. Since a panoramic image needs to project as large a three-dimensional scene as possible onto a limited two-dimensional image plane for presentation and storage, the panoramic images produced by all of these projection methods have large image distortions.

Panoramic camera: used to collect panoramic images. The panoramic camera is equipped with at least two fisheye lenses, or multiple wide-angle lenses, or multiple ordinary lenses, which are not specifically limited here. The images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited.

Panoramic image sequence: a series of panoramic images obtained sequentially and continuously on a target at different times and different positions. This application mainly relates to a series of panoramic images obtained by continuous shooting of a scene or site.

Image overlap area: refers to the image area that contains the same object in two images. The ratio of the image overlap area to the entire image is the degree of overlap.

Image pose: refers to the camera pose when the image is taken. The camera pose refers to the position and orientation of the camera in space, which can be regarded as the translation and rotation of the camera from the original reference position to the current position.

Image matching: From two images with overlapping information, extract the feature points of each image and the feature vector corresponding to the feature point as a descriptor, and use the feature vector to determine the image point corresponding to the same object point in the two images. Multiple image points corresponding to the same object point in different images are called image points with the same name.

Grid-based motion statistics (GMS): A fast and highly robust feature matching method that uses motion smoothness as a statistic to perform matching within local areas.

SFM: structure from motion, a technology to determine the sparse point cloud of the subject and the pose (position and posture) of the image from multiple images.

Binocular measurement: a method in which the camera poses corresponding to two images are known, and the three-dimensional coordinates of the object points corresponding to the points are determined by measuring the image points with the same name in the two images.

Frame image: the central-projection image taken by an ordinary mobile phone or SLR camera.

Relative orientation: restore or determine the relative relationship of the image pair during photography.

PnP (perspective-n-point) algorithm: an algorithm for calculating the pose of the camera from n known pairs of spatial 3D points and image 2D points.

Forward intersection: Determine the three-dimensional coordinates of the object point corresponding to the image point with the same name based on the known poses of the two images and the image point with the same name.

To understand a site, image information of the site is collected and three-dimensional reconstruction is performed based on that image information, thereby digitizing the site. This can provide data support for subsequent design or operation and maintenance, such as the size of the site and the size, model, location, connection relationships, and relative positions of the equipment in the site.

The operator holds a frame camera, takes frame images in the scene under test, uses the SFM algorithm in computer vision to calculate the camera pose at the time each image was taken, and completes data collection with binocular measurement. Because the field of view of a single frame image is limited, the collected images must contain enough repeated targets to meet the requirements of pose calculation, and binocular measurement of a target in the scene requires at least two images that capture the target. Ordinary data collectors are prone to gaps in shooting coverage because of insufficient expertise or experience.

Because calculating the pose of an image sequence from image information is computationally intensive and time-consuming, and the computing performance of the terminal is limited, the image collector usually collects images through the terminal and uploads all collected images directly to the server, where the calculation is performed centrally. If a captured image is found to be unqualified during the server calculation, the image collector needs to return to the site to shoot again, which is time-consuming and laborious.

When the server performs three-dimensional reconstruction based on ordinary frame images and the scene has a single texture or few objects, the acquired feature points are very similar to one another or too few to be tracked continuously, and the reconstruction fails.

The embodiment of the application provides a data processing method for collecting panoramic images that meet the requirements of site digitization, which can improve the quality of the collected panoramic images and reduce the rate of image collectors returning to the site.

Please refer to FIG. 1, which is a schematic diagram of a survey scene in an embodiment of this application.

In the image acquisition method provided by the embodiments of the application, panoramic images are acquired at a site through an image acquisition device. First, a specific marker is placed in the shooting scene in advance. Common specific markers include targets or benchmarks: the distance between the two ends of a benchmark is known, and the distances and angles between targets are known. Figure 1 shows a specific marker that includes 3 targets. The distance between the No. 1 target and the No. 2 target is 1 meter, the distance between the No. 1 target and the No. 3 target is also 1 meter, and the angle between the line connecting the No. 1 and No. 2 targets and the line connecting the No. 1 and No. 3 targets is 90 degrees. In this way, the 3 targets define a plane. Optionally, the No. 1 target can be used as the origin of the coordinate system of the plane, so that the size of the captured image can be calibrated.

Then the image collector uses the image acquisition device to shoot images at individual points. Optionally, in order to capture the specific marker, the shooting point selected by the image collector can be a certain distance away from the specific marker, for example about one and a half steps away. The distance can vary with the shooting scene and is not specifically limited here. Optionally, the image collector collects images according to the "step by step" rule. Figure 1 shows the "step by step" shooting trajectory in this scene. A step refers to one step of the image collector, whether walking, running, or moving in some other way, and the specific size of the step is not limited here. It should be noted that, to achieve three-dimensional reconstruction, the set of images shot in a scene needs to cover the entire scene. A scene here can refer to the inside of a room, or to an outdoor area such as a circle around the target subject. The specific scene is not limited.

The following describes the data processing method provided by the embodiment of the present application. Please refer to FIG. 2, which is a schematic diagram of an embodiment of the data processing method in the embodiment of the present application.

It should be noted that the three-dimensional reconstruction of the target scene can be realized in various forms through the data processing method provided in the embodiments of the present application:

1. The terminal can obtain a panoramic image and perform image pose estimation. For the implementation process, please refer to step 201 to step 208. After step 208, three-dimensional reconstruction can be directly performed according to the camera pose;

2. The terminal uploads the acquired panoramic image to the server, and the server performs image pose estimation. For this, refer to step 201 and steps 209 to 213. The terminal directly uploads the acquired panoramic image sequence to the server without going through the image screening in steps 202 to 208;

3. The terminal obtains the panoramic image and performs image screening, uploads the panoramic image that meets the preset conditions to the server, and the server implements the image pose estimation. The specific implementation form is not limited here. In the following, the third implementation method is taken as an example for detailed introduction.

201. The terminal acquires a panoramic image;

The image acquisition device is used to collect panoramic images. It may be a terminal equipped with a panoramic camera, or a panoramic camera with a communication connection to the terminal, and the details are not limited here. In the embodiments of the present application, a smart terminal with an external panoramic camera is taken as an example for description. The terminal can connect to the panoramic camera through Type-C, Bluetooth, or Wi-Fi, and control the panoramic camera to shoot through a client.

The panoramic camera takes a panoramic image and sends the panoramic image to the terminal. The terminal obtains the panoramic image, and then processes the panoramic image, realizes image pre-detection, and determines whether the captured panoramic image sequence meets the requirements.

In order to calculate the image pose, at least two panoramic images need to be obtained. Optionally, a panoramic camera is used to shoot a panoramic image sequence of the site scene that includes multiple panoramic images. The number of panoramic images included in the panoramic image sequence is not limited here.

202. The terminal performs image quality detection on the panoramic image;

The objective image quality indicators include focus, noise, color, exposure, sharpness, etc. This embodiment does not limit the number and types of objective quality indicators selected by the terminal for image quality detection.

Optionally, the terminal detects whether the exposure and blur of the panoramic image meet preset requirements. The following are introduced separately:

1) Image exposure calculation;

There are multiple exposure calculation methods. Optionally, the image is converted from RGB space to HSV color space to obtain the hue, saturation, and brightness of each pixel, and the image exposure is calculated from the saturation and brightness. Optionally, the exposure can also be calculated directly from the grayscale histogram of the image. The calculation method of the exposure is not specifically limited here.

After the exposure of the captured image is obtained, it can be compared with the preset exposure threshold to determine whether it is within the threshold range. If it is within the threshold range, the image exposure is appropriate; otherwise the image is determined to be overexposed or underexposed. Optionally, the terminal can prompt the user that the captured image has abnormal exposure. It should be noted that the specific range of the exposure threshold is not limited here.

In this embodiment, the gray-scale histogram analysis algorithm is used as an example for introduction. The exposure threshold is set to 1.0. If the calculated value is greater than 1.0, the exposure is considered abnormal.

Optionally, if the image exposure is unqualified, the user may be prompted to delete the image and take another shot.

2) Calculation of image blur degree;

There are many methods for calculating the degree of blur, including the Tenengrad gradient method, the Laplacian gradient method, gradient variance, and the Sobel blur detection algorithm. The specific method is not limited here.

After the blur degree value of the captured image is calculated, it can be compared with the preset blur degree threshold to determine whether the blur degree of the captured image is within the threshold range. If it is within the threshold range, the blur degree of the image is qualified; otherwise it is unqualified. Optionally, the terminal can prompt the user that the captured image is blurred. The specific range of the blur degree threshold is not limited here. For example, the blur degree threshold is set to 6.0, and if the calculated value is greater than the threshold, the image is considered to be blurred.

Optionally, if the image blur is unqualified, the user may be prompted to delete the image and take another shot.
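As a rough illustration of one of the options listed above, a Laplacian-based blur check could be sketched as follows. This is a minimal sketch under stated assumptions, not the calculation used in the embodiment: the names `laplacian_variance` and `is_blurred`, the 4-neighbour Laplacian kernel, and the convention that a low variance indicates blur are all assumptions. The score in the embodiment (where a value above 6.0 means blurred) evidently uses a different scale.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of equal-length rows of grayscale values.
    A sharp image has strong edges and therefore a high variance.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)


def is_blurred(gray, min_variance):
    """Assumed convention: blurred when the variance falls below `min_variance`."""
    return laplacian_variance(gray) < min_variance
```

A terminal-side check would call `is_blurred` on a downsampled grayscale copy of each panoramic image and prompt a reshoot when it returns `True`; the value of `min_variance` would have to be tuned on real data.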

Since the terminal can detect the image quality after acquiring the panoramic image, the detection result can be fed back to the image collector in time, which helps to improve the qualification rate of the panoramic image.

203. The terminal performs invalid region detection on the panoramic image.

The invalid area is the part of the image that has no value for the target scene, such as moving objects, for example data collectors or road vehicles. The terminal presets the type of the invalid area, for example by defining moving objects in the captured image as the invalid area, and calculates the proportion of the invalid area in the captured image.

Optionally, an image recognition method is used to identify the invalid area in the captured image, and the proportion of the invalid area in the captured image is calculated and compared with a preset invalid area threshold. If it is within the threshold range, the image is qualified; otherwise the image is unqualified. Alternatively, a threshold for the valid area can be preset and the valid area ratio of the image calculated instead; the details are not described here.

Exemplarily, the invalid area threshold is 70%, and the MobileNet model is used to perform semantic segmentation on the panoramic image, identify the extent of the moving-object regions in the image, and calculate the proportion of each image occupied by those regions. If the proportion is greater than 70%, the proportion of the invalid area of the image is determined to be unqualified.

Optionally, if the invalid area of the panoramic image is greater than the preset invalid area threshold, the user may be prompted to delete the image and take another shot.
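The ratio test described in this step can be sketched as follows, assuming the segmentation model has already produced a per-pixel boolean mask in which `True` marks an invalid (for example, moving-object) pixel. The function names and the mask representation are illustrative assumptions; only the 70% threshold comes from the example above.

```python
def invalid_area_ratio(mask):
    """Fraction of pixels flagged as invalid in a per-pixel boolean mask."""
    total = sum(len(row) for row in mask)
    invalid = sum(1 for row in mask for flag in row if flag)
    return invalid / total


def invalid_area_ok(mask, threshold=0.70):
    """Qualified unless the invalid proportion is greater than the threshold."""
    return invalid_area_ratio(mask) <= threshold
```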

204. The terminal detects the degree of overlap of two consecutive images;

In order to reduce the amount of calculation in the panoramic image matching step on the mobile phone, the overlap degree of the continuously shot panoramic image sequence can be detected to judge whether the preset overlap degree threshold is met. If the threshold is met, the image overlap degree is qualified; if not, the panoramic image overlap degree is unqualified and another shot is needed.

Exemplarily, the preset overlap degree threshold is 30%. After acquiring the second panoramic image, the terminal performs overlap detection against the previously acquired first panoramic image. If the overlap degree of the two images is greater than or equal to 30%, the overlap of the second image is determined to meet the requirement; otherwise the user is prompted to take the second image again. Similarly, after acquiring the third image, the terminal performs overlap detection between the third image and the second image to determine whether the third image meets the overlap requirement.
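The shot-by-shot check can be sketched as a small gate that compares each new image with the last accepted one. This is only an illustrative sketch: the class name `SequenceGate` and the pluggable `overlap_fn` are assumptions, and how the overlap degree itself is measured on real panoramas is left to that function. The 30% default and the "greater than or equal to" comparison follow the example above.

```python
class SequenceGate:
    """Accepts images one at a time and rejects shots with too little overlap."""

    def __init__(self, overlap_fn, threshold=0.30):
        # overlap_fn(previous, current) returns the proportion of `current`
        # that overlaps `previous`, as a value between 0 and 1.
        self.overlap_fn = overlap_fn
        self.threshold = threshold
        self.accepted = []

    def offer(self, image):
        """Return True if the image is accepted, False if it should be reshot."""
        if self.accepted:
            overlap = self.overlap_fn(self.accepted[-1], image)
            if overlap < self.threshold:
                return False
        self.accepted.append(image)
        return True
```

Because a rejected shot is never appended, the retake is compared against the same predecessor, which matches the idea of re-shooting immediately instead of discovering the problem after the whole sequence is captured.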

It should be noted that step 205 is an optional step, which may or may not be performed, and the details are not limited here.

205. The terminal performs specific marker statistics on the panoramic image sequence;

The specific markers in the image can be used as control information. A threshold is predefined for the number of times the specific markers appear in the panoramic images, to ensure that the finally calculated image poses can be used for binocular measurement or vector modeling. The number threshold is, for example, 2 or 3, and the specific value is not limited here.

After acquiring the panoramic image sequence of a scene, the terminal counts the specific markers that appear in all panoramic images of the sequence. There may be multiple specific markers. If the number of times each specific marker appears in the panoramic image sequence is greater than or equal to the preset number threshold, the panoramic image sequence is qualified; otherwise, the entire panoramic image sequence is unqualified, and the user is prompted that the photographing data of the specific markers is insufficient and needs to be shot again. Optionally, according to the detection result, the user is prompted whether the shooting control information meets the requirements.

First, the terminal needs to identify the specific markers in each panoramic image. Optionally, the MobileNet model is used for image recognition to identify the specific markers in each image and determine whether the number of markers meets the preset number threshold for specific markers. Optionally, in order to increase the calculation speed, specific marker recognition and invalid region recognition can be combined. For example, after a panoramic image is obtained, the MobileNet model is used to identify the invalid region and the specific markers at the same time.

Then, the terminal counts the number of specific markers in the panoramic image sequence. Exemplarily, if the specific markers are 3 targets, namely target No. 1, target No. 2, and target No. 3, the terminal needs to perform recognition on each panoramic image, determine the type and number of targets in it, and then count the types and numbers of all targets in the panoramic image sequence. For example, counting the targets in 15 images gives 4 occurrences of target No. 1, 3 of target No. 2, and 5 of target No. 3, and the preset threshold for the number of specific markers is 2. Because the numbers of targets No. 1, No. 2, and No. 3 are all greater than the preset threshold of 2, the number of specific markers in the panoramic image sequence is judged to be qualified. If the number of target No. 1, target No. 2, or target No. 3 is less than 2, the panoramic image sequence is unqualified, and the user can be prompted to shoot again.

Optionally, in order to increase the calculation speed, the invalid area and the specific markers can be identified at the same time in step 203, and the type and quantity of the specific markers in each panoramic image recorded. In this step it is then only necessary to check whether the number of specific markers meets the requirement.
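The counting in this step can be sketched as follows, assuming each panoramic image has already been reduced by the recognition model to a list of marker labels. The function names and the string labels are hypothetical. The threshold of 2 and the "greater than or equal to" rule follow the description above.

```python
from collections import Counter


def count_markers(per_image_markers):
    """Count, for each marker, the number of images in which it appears."""
    counts = Counter()
    for markers in per_image_markers:
        counts.update(set(markers))  # a marker counts at most once per image
    return counts


def markers_ok(counts, required, min_count=2):
    """Qualified only if every required marker reaches the threshold."""
    return all(counts[marker] >= min_count for marker in required)
```

With 15 images in which the three targets appear 4, 3, and 5 times, `markers_ok` returns `True` for a threshold of 2; if any one target appears in fewer than 2 images, the whole sequence is rejected.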

206. Image retrieval;

In order to reduce the amount of calculation in the panoramic image matching step of the mobile phone, image retrieval can be performed on the panoramic image sequence, the overlap relationship between each panoramic image and other images in the panoramic image sequence can be determined, and the image pair used for subsequent image matching can be determined.

Optionally, the degree of overlap between the images is determined after the images have been reduced in size. For example, each image is reduced so that its pixel count in both the horizontal and vertical directions is less than 2000, and image retrieval is then performed on the reduced panoramic images, which reduces the amount of calculation.
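One way to choose the reduced size is sketched below. The function name `reduced_size` and the choice of uniform scaling are assumptions; the description only requires that both dimensions end up below 2000 pixels.

```python
def reduced_size(width, height, limit=2000):
    """Scale (width, height) uniformly so that both sides are below `limit`."""
    longest = max(width, height)
    if longest < limit:
        return width, height  # already small enough, leave unchanged
    scale = (limit - 1) / longest
    return max(1, int(width * scale)), max(1, int(height * scale))
```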

Optionally, an overlap degree threshold is preset, and for each image, the images whose overlap degree with it is higher than the threshold are selected for image matching. For example, if the overlap degrees of the second to fourth images and the sixth image with the first image are higher than the overlap degree threshold, the second to fourth images and the sixth image are each matched with the first image, that is, four image pairs are determined.

Optionally, for each image, the 5 images with the highest degree of overlap with it are selected for image matching. For example, if the second to sixth images are the 5 images with the highest degree of overlap with the first image, the second to sixth images are each matched with the first image.
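The two pair-selection strategies described above can be sketched as follows, given a precomputed matrix `overlap[i][j]` of overlap degrees. The function names are hypothetical, and treating the matrix as symmetric is a simplifying assumption of this sketch.

```python
def pairs_by_threshold(overlap, threshold):
    """All image pairs (i, j), i < j, whose overlap is above the threshold."""
    n = len(overlap)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if overlap[i][j] > threshold]


def pairs_by_top_k(overlap, k=5):
    """For each image, pair it with the k images that overlap it the most."""
    n = len(overlap)
    pairs = set()
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: overlap[i][j], reverse=True)
        for j in others[:k]:
            pairs.add((min(i, j), max(i, j)))  # store each pair once
    return sorted(pairs)
```

The threshold variant produces a variable number of pairs per image, while the top-k variant bounds the matching cost at roughly k pairs per image regardless of how densely the scene was shot.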

207. Image matching;

Image matching is performed on the image pairs obtained by the image retrieval in step 206 to determine the image points with the same name in each image pair.

There are many algorithms for image feature point matching, including the scale-invariant feature transform (SIFT) algorithm, the speeded up robust features (SURF) algorithm, and the ORB (oriented FAST and rotated BRIEF) algorithm. The ORB algorithm is a fast feature point extraction and description algorithm that consists of two parts: feature point extraction and feature point description. Feature extraction is developed from the FAST (features from accelerated segment test) algorithm, and feature point description is improved from the BRIEF (binary robust independent elementary features) feature description algorithm. This application does not limit the specific type of matching algorithm. The following takes an improved ORB algorithm as an example.

The traditional matching algorithm uses a matching strategy based on the central projection model of an ordinary frame image. This embodiment proposes a matching strategy based on a three-dimensional spherical coordinate system, which searches for image points with the same name in the panoramic spherical space. This application improves on the traditional ORB algorithm: after ORB feature points are extracted, the image is projected onto a three-dimensional sphere, the sphere is divided into grids, and the image points with the same name are determined by grid-based motion statistics (GMS). Grid division and motion statistics allow wrong matches to be eliminated quickly, which improves the stability of the matching.

Optionally, use the RANSAC-based quadratic polynomial model to eliminate mismatched points, optimize the matching results, and achieve rapid and highly reliable matching of panoramic images.

208. The terminal performs camera pose recovery;

There are many ways of estimating the camera pose, including structure from motion (SFM) technology and simultaneous localization and mapping (SLAM) technology, which are not specifically limited here. This embodiment takes camera pose recovery based on SFM technology as an example.

This application improves the existing SFM algorithm and, based on the characteristics of panoramic images, proposes relative orientation, PnP, and forward intersection algorithms that operate on three-dimensional spherical coordinates. The camera pose corresponding to each panoramic image is calculated. If the error of the camera pose is less than or equal to the preset threshold, the calculation is successful; if the error of the camera pose is greater than the preset threshold, the calculation fails. Optionally, if the calculation fails, the user may be prompted to retake the panoramic image sequence.

According to the panoramic image sequence, the coordinates of the feature points in the world coordinate system are calculated, together with the camera pose corresponding to each single panoramic image, which includes the coordinates of the image shooting center in the world coordinate system and the transformation matrix between the world coordinate system and the three-dimensional spherical coordinate system whose origin is the shooting center. The coordinates of a feature point in the world coordinate system are back-projected to image point coordinates on the image, and the distance between these coordinates and the coordinates of the image point with the same name corresponding to the object point in the image can be used to measure the camera pose error. Optionally, if the standard deviation of the distances between multiple back-projected feature points and their corresponding image points is less than or equal to the threshold, it is determined that the camera pose calculation of the image is successful.

Please refer to FIG. 3, which is a schematic diagram of a coordinate system for calculating a camera pose in an embodiment of the application.

The coordinate systems involved in Figure 3 include the world coordinate system O-XYZ, the three-dimensional spherical coordinate system o-p0p1p2, and the image plane coordinate system uv.

Here, point P represents an object point in the world coordinate system, and $[X\ Y\ Z]$ are the coordinates of P in the world coordinate system. Point o is the shooting center of the image, and its coordinates in the world coordinate system are $[X_S\ Y_S\ Z_S]$. Point p represents the image space point of the object point P in the spherical projection, and its coordinates in the three-dimensional spherical coordinate system are $[p_0\ p_1\ p_2]$. Point p′ represents the image point of the object point P in the panoramic image, and its coordinates in the image plane coordinate system are $[u\ v]$.

In addition, R represents a transformation matrix between the world coordinate system and the three-dimensional spherical coordinate system.

Taking the PnP algorithm as an example, when calculating the error of the camera pose parameters $[X_S\ Y_S\ Z_S]$ and the matrix $R$ of a panoramic image, for a known object point $[X\ Y\ Z]$, the pose parameters are used to calculate its corresponding coordinates $[p_0'\ p_1'\ p_2']$ in the three-dimensional spherical coordinate system and its coordinates $[u'\ v']$ in the image plane coordinate system. That is, the value of $[u'\ v']$ is calculated according to formulas (1) and (2) and compared with the input value of $[u\ v]$, and the distance between $[u'\ v']$ and $[u\ v]$ is calculated.

The distances for all object points are averaged. If the average value is less than the threshold T, the calculated pose is considered valid. The threshold T may be, for example, 6 pixels.

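$$
\begin{aligned}
u' &= \pi - \operatorname{atan2}(p_0',\, p_2') \\
u' &= \begin{cases} u' - 2\pi, & \text{if } u' > \pi \\ u', & \text{otherwise} \end{cases} \\
v' &= \operatorname{atan}(p_1',\, r_0)
\end{aligned}
\tag{2}
$$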

Among them, λ represents the scale factor.
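Formula (2) can be read as the following conversion from spherical coordinates to the two image angles. This is a sketch under two assumptions that the text does not confirm: the two-argument `atan` in the expression for v′ is interpreted as `atan2`, and `r0`, which is not defined in the text, is taken to be the length of the projection of the point onto the p0-p2 plane. The function name `sphere_to_angles` is hypothetical.

```python
import math


def sphere_to_angles(p0, p1, p2):
    """Convert spherical-frame coordinates to (u', v') as in formula (2)."""
    u = math.pi - math.atan2(p0, p2)
    if u > math.pi:
        u -= 2 * math.pi  # wrap into the range (-pi, pi]
    r0 = math.hypot(p0, p2)  # assumption: r0 is not defined in the text
    v = math.atan2(p1, r0)   # assumption: two-argument atan read as atan2
    return u, v
```

The result is in radians. Converting it to pixel coordinates for comparison with a threshold such as 6 pixels would additionally require the image width and height, which this sketch leaves out.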

Because the above calculation process is complex, this application provides a simplified algorithm. The following describes the method for calculating the camera pose error in this embodiment. When calculating the camera pose error, the object points corresponding to the feature points are back-projected into the three-dimensional spherical coordinate system, and the corresponding image points in the image are also projected into the three-dimensional spherical coordinate system. When three-dimensional coordinates are used for the calculation, an arc distance threshold between two points, or between a point and a line, in three-dimensional space is defined to determine whether the camera pose calculation is successful. This is introduced below:

Preset an arc-length threshold T′ on the three-dimensional sphere, and directly compare with it the cosine of the angle between [p₀′ p₁′ p₂′] and the [p₀ p₁ p₂] calculated from [u v] on the sphere, where the threshold is T′ = cos(T/f). Since [p₀ p₁ p₂] is calculated from the [u v] of each image point according to formula (3) only once, the calculation is simplified and the speed is improved.

Here, f is the principal distance, that is, the distance from the optical center to the imaging surface.
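The spherical check can be sketched as follows. This is a minimal illustration, assuming that f is expressed in pixels (so that T/f is an angle in radians) and that the cosines are averaged over all points in the same way the pixel distances are averaged in the two-dimensional check; the function name `pose_is_valid` and its parameters are hypothetical.

```python
import math


def pose_is_valid(predicted, observed, t_pixels, f_pixels):
    """Accept the pose if the mean cosine between paired directions reaches T' = cos(T / f).

    predicted: list of [p0' p1' p2'] vectors computed from the pose.
    observed:  list of [p0 p1 p2] vectors computed once from [u v].
    """
    def unit(vec):
        norm = math.sqrt(sum(c * c for c in vec))
        return [c / norm for c in vec]

    cos_threshold = math.cos(t_pixels / f_pixels)
    cosines = []
    for a, b in zip(predicted, observed):
        ua, ub = unit(a), unit(b)
        cosines.append(sum(x * y for x, y in zip(ua, ub)))
    mean_cos = sum(cosines) / len(cosines)
    # A larger cosine means a smaller angle, so the comparison is ">=".
    return mean_cos >= cos_threshold
```

Because cosine decreases monotonically on [0, π], comparing cosines avoids computing the angle (an arccos per point) while giving the same decision as comparing arc lengths.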

Similarly, when calculating the relative pose between two images, the calculation is performed directly in the three-dimensional spherical coordinate system, as in formula (4),

Represents the image space coordinates of the p1 pixel of the first image,

Represents the image space coordinates of the image point p2 with the same name of the pixel p1 in the second image, and R′ represents the relative rotation matrix between the two images,

Indicates the position of the optical center of the first image in the world coordinate system,

Indicates the position of the second image in the world coordinate system. When calculating the residual, write (4) equivalently as formula (5), namely

A, B, C are calculated by formula (5). Due to the error of pose calculation, there are errors in the calculation of A, B, and C. The actual calculated value is A', B', C', and the calculated error value is

The error value is compared with the preset threshold T′; optionally, T′ = sin(T/f), where T is, for example, 4 pixels. If the average residual over all image point pairs is less than the threshold, the calculated relative pose between the two images is accepted.

among them,

If the camera pose error calculated in this step for the panoramic image sequence meets the threshold requirement, the initial pose calculation between images is complete, the camera pose corresponding to each panoramic image and the sparse point cloud of the shooting scene are obtained, and the panoramic image sequence is qualified. Optionally, if the calculation fails, the user is prompted to shoot again.

In this step, based on the characteristics of the panoramic image, the SFM network is constructed using the spherical coordinates of the three-dimensional image space. The two-dimensional image coordinates are no longer used directly for relative orientation and PnP calculations; instead, they are converted to three-dimensional spherical coordinates for calculation, which can reduce the amount of calculation.

In addition, whereas the traditional SFM algorithm performs local optimization every time an image is added, this step can add multiple images at one time for local optimization when a preset condition based on the number of same-name image point pairs between images is met. The preset condition can be that the number of same-name image point pairs between an uncalculated image and the calculated images is greater than a threshold, such as 15, and that the PnP calculation result is valid. This reduces the number of bundle adjustment (beam method) optimizations, bundle adjustment being an optimization method for accurately determining the position and attitude of an image. The amount of calculation and the calculation time are thereby reduced, and the user experience is improved.
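The batch-addition condition can be sketched as a simple filter. This is an illustrative sketch only; the function name `select_images_to_add` and the tuple layout are hypothetical, and the default of 15 is the example threshold mentioned in the text.

```python
def select_images_to_add(candidates, min_pairs=15):
    """Return the ids of uncalculated images that can be added in one batch.

    candidates: iterable of (image_id, num_same_name_pairs, pnp_valid) tuples,
    where num_same_name_pairs counts same-name image point pairs shared with
    the already calculated images and pnp_valid is the PnP validity flag.
    """
    return [
        image_id
        for image_id, num_pairs, pnp_valid in candidates
        if num_pairs > min_pairs and pnp_valid
    ]
```

All images returned by one call would be added to the network together, followed by a single local optimization instead of one optimization per image.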

Optionally, if the pose estimation is only performed through the terminal, the three-dimensional reconstruction of the target scene can be performed according to the camera pose obtained in this step.

209. The terminal uploads the panoramic image sequence to the server;

The terminal sends the panoramic image sequence to the server, and the communication method is not limited.

Optionally, the terminal sends the detection result of the specific marker to the server.

Optionally, the terminal sends the camera pose and the sparse point cloud of the shooting scene corresponding to each panoramic image to the server.

210. The server performs interference area detection;

The server receives the panoramic image sequence sent by the terminal and separately detects the interference area in each panoramic image. An interference area contains interference information that does not need to be used for site digitization. The interference area can be predefined; for example, the areas of moving objects and the sky in the image can be set as interference areas. It should be noted that the definition of the interference area and that of the invalid area in step 203 may be the same or different, and the specific definition is not limited here.

Optionally, the interference area in the image is recognized by an image recognition method, and a mask picture of each image is generated according to the recognition result, in which the non-interference area is retained;

Optionally, because of the projection distortion of the panoramic image, semantic segmentation has a low recognition rate when an image recognition algorithm is applied directly. In this embodiment, since the photographer usually appears in the lower area of the image, to facilitate identifying interference objects in the lower area, each image point can be converted to the three-dimensional spherical coordinate system and the image can be rotated so that the lower area of the original panoramic image, that is, the main area to be recognized, is moved to the equatorial area of the sphere. Since the equatorial area of the sphere has the smallest distortion after conversion into a two-dimensional image, the recognition rate of image semantic segmentation can be improved.
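The rotation of the lower area toward the equator can be sketched with a nearest-neighbour remap of an equirectangular image. This is a minimal sketch under assumed conventions, not the mapping of formula (3): longitude runs across the image width, latitude runs from +90° at the top row to −90° at the bottom row, the y axis points up, and all function names are hypothetical.

```python
import math


def pixel_to_sphere(u, v, width, height):
    """Equirectangular pixel centre -> unit direction (assumed convention, y up)."""
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi
    return (math.cos(lat) * math.sin(lon), math.sin(lat), math.cos(lat) * math.cos(lon))


def sphere_to_pixel(direction, width, height):
    """Unit direction -> fractional equirectangular pixel coordinates."""
    x, y, z = direction
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y)))
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return u, v


def rotate_x(direction, angle):
    """Rotate a direction about the x axis by `angle` radians."""
    x, y, z = direction
    c, s = math.cos(angle), math.sin(angle)
    return (x, c * y - s * z, s * y + c * z)


def rotate_panorama(image, pitch_down):
    """Remap a panorama (list of rows) so that content `pitch_down` radians
    below the horizon on the front meridian lands on the equator."""
    height, width = len(image), len(image[0])
    out = [[None] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            # Tilt each output ray downward to find the source ray it samples.
            src = rotate_x(pixel_to_sphere(u, v, width, height), pitch_down)
            su, sv = sphere_to_pixel(src, width, height)
            row = min(height - 1, max(0, round(sv)))
            out[v][u] = image[row][round(su) % width]
    return out
```

A rigid rotation moves only part of the lower area to the equator (here, the part around the front meridian), so in practice several rotations may be needed to cover the whole lower area.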

Exemplarily, the DeepNet network model is used to perform semantic segmentation, and at the same time, matching interference regions, such as the sky, pedestrians, etc., are identified. Optionally, an image segmentation (graph cut) algorithm is used to optimize the regions marked as pedestrians to further improve the segmentation accuracy. Finally, according to the recognition results, a corresponding matching mask image is produced. In the mask image, pixels with a gray level of 0 represent interference areas, and pixels with a gray level of 255 represent non-interference areas. When extracting image feature points, the input image and the corresponding mask image are input at the same time, and the feature points are not extracted in the area of the mask image where the gray level is 0.
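The use of the mask image during feature extraction can be sketched as follows. The text suppresses extraction inside masked areas; this sketch instead filters already detected points, which is an equivalent stand-in chosen for brevity. The function name `filter_keypoints` and the data layout are hypothetical.

```python
def filter_keypoints(keypoints, mask):
    """Keep only keypoints that fall on non-interference pixels.

    keypoints: iterable of (x, y) pixel coordinates.
    mask: 2D list of grey levels, 0 = interference area, 255 = non-interference area.
    """
    kept = []
    for x, y in keypoints:
        row, col = int(round(y)), int(round(x))
        inside = 0 <= row < len(mask) and 0 <= col < len(mask[0])
        if inside and mask[row][col] == 255:
            kept.append((x, y))
    return kept
```

Points outside the image bounds are dropped as well, since no mask value exists for them.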

Please refer to FIG. 4, which is a schematic diagram of an image semantic segmentation recognition result in an embodiment of the application.

The left image is the input image, and the right image is the predicted semantic segmentation recognition result. The area A represents the sky, the area B represents the effective area, the area C represents the feature marker, and the area D represents the interference area formed by moving objects such as pedestrians.

211. The server detects a specific marker;

The server determines the number and precise location of the specific marker in the image. The specified information in the specific marker is used as the control information to ensure that the scale of the coordinate system where the image pose is calculated is correct.

Optionally, in step 205, the terminal has preliminarily identified the specific marker through pre-detection. Therefore, in this step, the server can, according to the specific-marker detection result uploaded by the terminal, perform detection in the local area of the identified specific marker to determine the precise location of the specific marker.

Optionally, use the DeepNet network model to identify the target area. In addition, in order to increase the detection speed, the identification of the specific marker may be performed at the same time as the identification of the invalid area is performed in step 210.

Optionally, in order to determine the precise position of the specific marker, the partial area containing the specific marker in the panoramic image is reprojected to the central projection plane to obtain a frame image containing the specific marker, thereby reducing image distortion. Then, a target detection algorithm (such as Yolo v3) is used to locate a specific marker. Exemplarily, if the specific marker is the target, binarize the frame image containing the target, analyze the result of the binarization, determine the target number, and use a circular detector to extract the center point of the target position.
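Reprojection of a local panorama area onto a central projection plane can be sketched as a per-pixel mapping from the frame image back to the panorama. This is a sketch under assumed conventions (square frame image, equirectangular panorama with longitude across the width and latitude from +90° at the top to −90° at the bottom); the function name and all parameters are hypothetical.

```python
import math


def frame_pixel_to_panorama(x, y, size, fov, yaw, pitch, pano_w, pano_h):
    """Map pixel (x, y) of a size-by-size central-projection frame image to
    fractional pixel coordinates (u, v) in the panorama.

    fov is the horizontal field of view in radians; yaw and pitch (radians)
    give the viewing direction of the frame centre, positive pitch looking up.
    """
    focal = 0.5 * size / math.tan(0.5 * fov)
    # Ray through the pixel centre in the view frame: x right, y up, z forward.
    rx = x + 0.5 - 0.5 * size
    ry = 0.5 * size - (y + 0.5)
    rz = focal
    # Pitch about the x axis, then yaw about the y axis.
    cp, sp = math.cos(pitch), math.sin(pitch)
    ry, rz = cp * ry + sp * rz, -sp * ry + cp * rz
    cy, sy = math.cos(yaw), math.sin(yaw)
    rx, rz = cy * rx + sy * rz, -sy * rx + cy * rz
    lon = math.atan2(rx, rz)
    lat = math.atan2(ry, math.hypot(rx, rz))
    u = (lon + math.pi) / (2.0 * math.pi) * pano_w - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * pano_h - 0.5
    return u, v
```

Sampling the panorama at the returned (u, v) for every frame pixel yields the frame image; the yaw and pitch would be chosen so that the frame is centred on the marker position range reported by the pre-detection.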

Exemplarily, a series of images is obtained by rotating the original panoramic image in the three-dimensional spherical coordinate system. The partial area of the panoramic image that contains a specific marker, that is, a target, can be reprojected to the central projection plane to obtain a frame image containing the target, which reduces the image distortion of the target area and facilitates precise positioning of the target. In addition, the lower area of the original panoramic image, that is, the main area to be identified that may contain interference, can be rotated to the equatorial area of the sphere and projected to the central projection plane. Because the equatorial area of the sphere has the smallest deformation when transformed into a two-dimensional image, the recognition rate of image semantic segmentation can be improved. It is also possible to move the image collector holding the panoramic camera to the middle of the image, where the collector is easier to recognize.

212. The server performs image matching;

The server obtains a pair of images with a high degree of overlap according to the image retrieval, or, in step 209, the server obtains the detection result of the image pair sent by the terminal, determines the overlap relationship between the images, and then determines the image pair for image matching.

Based on the panoramic images and the mask images obtained by the interference area recognition in step 210, feature points are extracted from each image and feature matching is performed on each image pair; optionally, an ORB feature matching algorithm is used for matching. Optionally, if the server obtains the image matching result sent by the terminal in step 209, the image matching result can be used as the initial value of the matching, thereby reducing the matching search range and improving the matching speed.

213. The server performs camera pose recovery;

The server performs camera pose optimization based on the matching result obtained in step 212, and accurately determines the camera pose of each panoramic image and the sparse point cloud of the target scene according to the specific marker extraction result obtained in step 211.

Optionally, if the server obtains the pose information pre-detected by the terminal in step 209, the pose information may be used as the initial value of the pose for detection, so as to reduce the amount of calculation and increase the speed of pose calculation.

214. The server realizes site digitization;

The server stores the pose information and the sparse point cloud of the panoramic image sequence, and then performs stereo measurement and vector modeling based on the panoramic image, thereby realizing the digitization of the site.

In the data processing method provided by the embodiments of this application, the terminal obtains the panoramic images, performs pre-detection, and sends the panoramic images that meet the preset conditions to the server for the digitization of the site. Because the information collected in a panoramic image is comprehensive and the terminal filters images using preset conditions, the rate of image failure can be reduced and image collectors do not need to return to the site repeatedly to collect data.

The data processing method provided in the embodiments of this application takes into account that collectors may lack professional skills, and directly collects panoramic images for data processing. During collection, the quality of the collected images, such as exposure and blur, is checked as they are captured to ensure the imaging quality of each image. After collection is completed, specific marker information is detected on the mobile phone, and preprocessing, image matching, and image pose estimation are performed to ensure that the position and attitude at the time of shooting can be accurately determined for the captured images, avoiding a second on-site collection.

Aiming at the problem of large distortion of the panoramic image, the existing SFM algorithm is improved, and the SFM construction based on the spherical coordinate of the image is adopted. The coordinate of the panoramic image is directly transferred to the spherical coordinate for relative orientation and PnP solution.

In view of the limited computing performance of the terminal, the embodiment of this application accelerates and optimizes image matching, including reduction of the panoramic images and image retrieval, and proposes a spherical grid-based motion statistics method to determine same-name ORB feature points. It also uses a RANSAC-based quadratic polynomial model to eliminate mismatched points and optimize the matching results, achieving fast and highly reliable matching of panoramic images. In addition, the traditional SFM construction process, which adds only one image at a time, is improved: this solution adds multiple images to the already constructed network at a time, which speeds up network construction and reduces calculation time. This solves the problem that the mobile phone cannot construct the panoramic image SFM network, and allows the mobile phone pre-detection to complete within a time acceptable to the user.

The data processing method provided in the embodiments of this application can be applied to terminal devices of varying performance. To test the performance of the mobile phone pre-detection algorithm, mobile phones with different performance levels were selected for testing, including high-end devices (such as the Huawei mate20pro), mid-range devices (such as the Huawei p10), and low-end devices (such as the Honor 9). The larger the number of images, the longer the calculation time. Therefore, 30 CV60 camera images, a number commonly processed in the survey scene, are taken as an example for statistics, and the calculation times are shown in Table 1. All times are within 20 minutes, which is within the acceptable range for the collection operator.

Table 1 Mobile phone pre-detection time

After testing, after adding the mobile phone pre-detection function, the rate of secondary visits has been reduced from 30% to 0, completely eliminating secondary visits. After the server algorithm is optimized, the synthesis success rate can reach 90%, which is significantly higher than the existing success rate of 60%.

Please refer to FIG. 5, which is a schematic diagram of another embodiment of the data processing method in the embodiment of this application;

This application collects 360-degree panoramic image data for site digital information collection, with the work divided between the mobile phone side and the background server side. The mobile phone side collects panoramic images and performs image quality detection, image preprocessing, and image synthesis pre-detection, specifically the improved SFM pose calculation. Through this pre-detection, it is determined whether the camera position at the time each image was captured can be correctly estimated. If the requirements are met, the image data is passed to the background server. The server side performs high-precision image synthesis, including receiving the image and pose data sent by the mobile phone, identifying the interference areas of the images, calculating the pose after eliminating the interference areas, and combining the pose data sent by the mobile phone to optimize the image pose calculation, so as to accurately determine the position and attitude parameters of the images.

The data processing method provided in the embodiments of the present application has been introduced above, and the terminal and server that implement the data processing method will be introduced below.

Please refer to FIG. 6, which is a schematic diagram of an embodiment of a terminal in an embodiment of this application;

The embodiment of the present application provides a terminal, including:

The acquiring module 601 is configured to acquire a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images shot of a target scene in different poses, and the plurality of panoramic images includes a first panoramic image and a second panoramic image that are continuously shot;

The sending module 602 is configured to send a target panoramic image sequence including the first panoramic image and the second panoramic image to the server when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, where the degree of overlap of the second panoramic image is the proportion of the overlapping area of the second panoramic image and the first panoramic image in the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for three-dimensional reconstruction of the target scene.
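The overlap-degree condition can be illustrated with a short sketch. The function names are hypothetical, and how the overlapping area itself is measured is outside this sketch; only the ratio and the threshold comparison come from the definition given here.

```python
def overlap_degree(overlap_area, second_image_area):
    """Proportion of the second panoramic image covered by its overlap with the first."""
    if second_image_area <= 0:
        raise ValueError("second_image_area must be positive")
    return overlap_area / second_image_area


def meets_overlap_condition(overlap_area, second_image_area, first_threshold):
    """True when the overlap degree is greater than or equal to the first threshold."""
    return overlap_degree(overlap_area, second_image_area) >= first_threshold
```

For instance, an overlap of 600 pixels in a 1000-pixel second image gives an overlap degree of 0.6, which meets a first threshold of 0.5 (a value chosen here purely for illustration).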

Optionally, the second panoramic image is an image continuously captured after the first panoramic image is captured.

Optionally, the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.

Optionally, the target panoramic image sequence further includes a third panoramic image in which a specific marker is captured, the specific marker is a marker set in the target scene for image size calibration, and the terminal also includes:

The determining module 603 is configured to determine the position range of the specific marker in the third panoramic image;

The sending module 602 is further configured to send the position range to the server, where the position range is used to determine the position of the specific marker in the third panoramic image.

Optionally, the error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold, the camera pose of the target panoramic image sequence is determined according to the same-name image points of the target panoramic image sequence, and a same-name image point is an image point of an image pair whose overlap degree meets a first preset condition in the target panoramic image sequence.

Optionally, the same-name image points are obtained by projecting the image pair onto a three-dimensional sphere and applying the grid-based motion statistics (GMS) method.

Optionally, the camera pose error of the target panoramic image sequence is the spherical distance between two points: the point formed by projecting an object point in the target scene onto a three-dimensional sphere according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of that object point in the target panoramic image sequence onto the three-dimensional sphere.

Optionally, the sending module 602 is further configured to:

Sending the camera pose of the target panoramic image sequence to the server, where the camera pose is used to achieve three-dimensional reconstruction of the target scene.

Optionally, the target panoramic image sequence also satisfies a second preset condition;

The second preset condition includes at least one of the following:

The blur degree of the panoramic image satisfies a third preset condition; and,

The exposure of the panoramic image satisfies a fourth preset condition; and,

The proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes a captured area outside the target scene and includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.

Please refer to FIG. 7, which is a schematic diagram of an embodiment of a server in an embodiment of this application;

The server provided in the embodiment of the present application includes:

The receiving module 701 is configured to receive a panoramic image sequence sent by a terminal, and the panoramic image sequence includes a plurality of panoramic images sequentially shot of a target scene in different poses;

The determining module 702 is configured to determine the camera pose of the panoramic image sequence according to the same-name image points of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to the first threshold.

Optionally, the server further includes:

The detection module 703 is configured to detect an invalid area in the panoramic image, where the invalid area includes a captured area outside the target scene and includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area; a same-name image point is an image point outside the invalid area in the panoramic image.

Optionally, the panoramic image sequence includes a panoramic image shot with a specific marker, and the specific marker is a marker set in the target scene for image size calibration;

The determining module 702 is further configured to determine the camera pose of the panoramic image according to the position of the image point of the same name and the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.

Optionally, the receiving module 701 is further configured to receive a position range of a specific marker of the panoramic image from the terminal;

The determining module 702 is further configured to determine the position of the specific marker from the range of the position of the specific marker.

Optionally, the receiving module 701 is further configured to receive a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal;

The determining module 702 is further configured to determine a second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.

Optionally, the panoramic image sequence satisfies a first preset condition,

The first preset condition includes at least one of the following:

The degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is continuously captured after the first panoramic image is captured, and the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image; and,

The number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific marker identifications; and,

The blur degree of the panoramic image satisfies the second preset condition; and,

The exposure of the panoramic image satisfies the third preset condition; and,

The error of the camera pose of the panoramic image sequence is less than or equal to the third threshold, where the camera pose of the panoramic image sequence is determined according to the same-name image points of the panoramic image sequence, and the same-name image points are the image points of the image pairs whose overlap degree meets the preset condition in the target panoramic image sequence.

Please refer to FIG. 8, which is a schematic diagram of another embodiment of a terminal in an embodiment of this application;

FIG. 8 shows a block diagram of a part of the structure of a terminal provided in an embodiment of the present application. The terminal includes: an image acquisition unit 1710, a sensor 1730, a display unit 1740, an input unit 1750, a memory 1760, a processor 1770, and a power supply 1780. Those skilled in the art can understand that the terminal structure shown in FIG. 8 does not constitute a limitation on the terminal, and may include more or fewer components than shown in the figure, or a combination of certain components, or different component arrangements.

The following describes each component of the terminal in detail with reference to Figure 8:

The image acquisition unit 1710 is used to acquire a panoramic image; in the embodiment of the present application, it is used to acquire an image of a target scene. The image acquisition unit 1710 can acquire a panoramic image through a panoramic camera. The panoramic camera is provided with at least two fisheye lenses, or multiple wide-angle lenses, or multiple ordinary lenses; the details are not limited here. The images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited. The image acquisition unit can be connected to the panoramic camera via a wired or wireless connection, including TypeC, Bluetooth, or wifi; the specific connection form is not limited here.

The display unit 1740 may be used to display information input by the user or information provided to the user and various menus of the terminal, including displaying the collected panoramic image. The display unit 1740 may include a display panel 1741. Optionally, the display panel 1741 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 1751 can cover the display panel 1741. When the touch panel 1751 detects a touch operation on or near it, it transmits the operation to the processor 1770 to determine the type of the touch event, and the processor 1770 then provides corresponding visual output on the display panel 1741 according to the type of the touch event. Although in FIG. 8 the touch panel 1751 and the display panel 1741 are used as two independent components to realize the input and output functions of the terminal, in some embodiments the touch panel 1751 and the display panel 1741 can be integrated to realize the input and output functions of the terminal.

The input unit 1750 can be used to receive input digital or character information, and to generate key signal input related to user settings and function control of the terminal. Specifically, the input unit 1750 may include a touch panel 1751 and other input devices 1752. The touch panel 1751, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 1751 with any suitable object or accessory such as a finger or a stylus), and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1751 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 1770, and can receive and execute the commands sent by the processor 1770. In addition, the touch panel 1751 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1751, the input unit 1750 may also include other input devices 1752. Specifically, the other input devices 1752 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons and switch buttons), a trackball, a mouse, and a joystick.

The memory 1760 may be used to store software programs and modules. The processor 1770 executes various functional applications and data processing of the terminal by running the software programs and modules stored in the memory 1760. The memory 1760 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data (such as audio data and a phone book) created by the use of the terminal. In addition, the memory 1760 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.

The processor 1770 is the control center of the terminal. It uses various interfaces and lines to connect the various parts of the entire terminal, and performs the various functions of the terminal and processes data by running or executing the software programs and/or modules stored in the memory 1760 and calling the data stored in the memory 1760, thereby monitoring the terminal as a whole. Optionally, the processor 1770 may include one or more processing units; preferably, the processor 1770 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 1770.

The terminal also includes a power supply 1780 (such as a battery) for supplying power to various components. Preferably, the power supply may be logically connected to the processor 1770 through a power management system, so that functions such as charging, discharging, and power management are realized through the power management system.

Although not shown, optionally, the terminal may include an audio circuit, which includes a speaker and a microphone, and may provide an audio interface between the user and the terminal.

Although not shown, optionally, the terminal can include a wireless fidelity (WiFi) module. WiFi is a short-distance wireless transmission technology. Through the WiFi module, the terminal can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband Internet access.

Although not shown, optionally, the terminal may further include a radio frequency (RF) circuit.

Although not shown, the terminal may also include a Bluetooth module, etc., which will not be repeated here.

Although not shown, optionally, the terminal may also include a GPS module.

Although not shown, optionally, the terminal may also include at least one sensor, such as a magnetometer, an inertial measurement unit gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and other sensors, which will not be repeated here.

In the embodiment of the present application, the processor 1770 included in the terminal also has the function of implementing the foregoing data processing methods.

Please refer to FIG. 9, which is a schematic diagram of another embodiment of the server in the embodiment of this application;

The server provided in this embodiment may be an independent computer device or a virtual machine VM. The virtual machine may run on one computer device or be located on multiple computer devices. The virtual machine can also be a computing and transmission resource that does not depend on an independent computer device, but is divided from a resource pool. Different processors on one computer device or different processors on multiple computer devices are not specifically limited here.

The server 1800 may vary considerably depending on its configuration or performance, and may include one or more processors 1801 and a memory 1802, where the memory 1802 stores programs or data.

Among them, the memory 1802 may be volatile storage or non-volatile storage. Optionally, the processor 1801 is one or more central processing units (CPU), and the CPU may be a single-core CPU or a multi-core CPU. The processor 1801 may communicate with the memory 1802, and execute a series of instructions in the memory 1802 on the server 1800.

The server 1800 also includes one or more wired or wireless network interfaces 1803, such as an Ethernet interface.

Optionally, although not shown in FIG. 9, the server 1800 may also include one or more power supplies and one or more input/output interfaces, which can be used to connect a display, a mouse, a keyboard, a touch screen device, a sensor device, or the like. The input/output interface is an optional component, which may or may not exist, and is not limited here.

For the process executed by the processor 1801 in the server 1800 in this embodiment, reference may be made to the method process described in the foregoing method embodiment, and details are not described herein.

Refer to FIG. 10, which is a schematic diagram of an embodiment of a data processing device in the embodiments of this application.

The panoramic camera is provided with at least two fisheye lenses, or multiple wide-angle lenses, or multiple ordinary lenses, which are not specifically limited here. The images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited.

The terminal may be a mobile phone, a tablet computer, a personal digital assistant (PDA), or the like, which is not specifically limited here.

The terminal hardware module includes a processor, a memory, and a communication interface. The communication interface is used to establish a communication connection with the panoramic camera over any form of wired or wireless connection, such as Type-C, Bluetooth, or WiFi, which is not specifically limited here. The memory is a storage medium used to store data and code. The processor is used to execute the code and may be, for example, an ARM processor.

The terminal software module includes a camera control module and a camera pre-detection module. The camera control module is used to control the panoramic camera to take pictures and to transfer the images from the panoramic camera to the mobile phone. The camera pre-detection module contains the algorithm and software proposed in this application, which perform quality detection on the captured images to determine whether the images meet the requirements for image pose calculation.
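The pre-detection step can be illustrated with a minimal Python sketch. It is not the algorithm of this application; it only combines the quantities named in the claims (degree of overlap, blur, exposure, proportion of invalid area) into a single pass/fail decision. The names `FrameMetrics`, `overlap_degree`, and `passes_pre_detection`, the boolean-mask input, and the default threshold values are all hypothetical, and requiring every check to pass is an assumption (the claims only require at least one of the quality conditions).

```python
from dataclasses import dataclass


@dataclass
class FrameMetrics:
    """Per-image quality measurements (hypothetical container)."""
    overlap_degree: float  # overlapping area with the previous image / area of this image
    blur_ok: bool          # blur degree meets its preset condition
    exposure_ok: bool      # exposure meets its preset condition
    invalid_ratio: float   # share of pedestrian / road vehicle / sky pixels


def overlap_degree(overlap_mask):
    """Return the fraction of the second image covered by the overlapping area.

    overlap_mask is a 2D list of booleans over the second image's pixels,
    True where the pixel also appears in the first image.
    """
    total = sum(len(row) for row in overlap_mask)
    if total == 0:
        return 0.0
    covered = sum(1 for row in overlap_mask for px in row if px)
    return covered / total


def passes_pre_detection(m, overlap_threshold=0.3, invalid_threshold=0.5):
    """Return True if the image may be kept in the target sequence.

    Both thresholds are placeholder values, not values from the application.
    """
    return (
        m.overlap_degree >= overlap_threshold
        and m.blur_ok
        and m.exposure_ok
        and m.invalid_ratio <= invalid_threshold
    )
```

In such a design, an image that fails the check would be retaken on site rather than uploaded, so that only images usable for pose calculation reach the server.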

The server may be an independent computer device or a virtual machine (VM). The virtual machine may run on one computer device or be located on multiple computer devices. The virtual machine can also be a computing and transmission resource that does not depend on an independent computer device, but is allocated from a resource pool. The server may use different processors on one computer device or different processors on multiple computer devices, which is not specifically limited here.

The server hardware module includes: processor, memory and communication interface. The memory may be volatile storage or non-volatile storage, and programs or data are stored in the memory.

The software running on the server includes an optimized image pose calculation module, a database, and modeling software. The optimized image pose calculation module uses the improved algorithm proposed in this application to improve the image pose calculation and the accuracy of the calculated pose. The database is used to store images, image pose parameters, survey data, modeling data, and the like. The modeling software uses the images and their pose information in the database to model the scene for subsequent simulations. In addition, although not shown in the figure, the server can also include binocular ranging software to calculate the image pose and measure survey information of the target, such as its location and size.
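The camera pose error referred to in the claims is a spherical distance between two points on a three-dimensional sphere: the projection of an object point according to the camera pose, and the observed image point converted onto the sphere. The following Python sketch shows one way such an error could be computed. The equirectangular pixel mapping, the pose convention p_cam = R * p_world + t, and the function names are assumptions made for illustration; the application does not fix them.

```python
import math


def pixel_to_sphere(u, v, width, height):
    """Map an equirectangular pixel to a unit vector (assumed convention)."""
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude in [-pi/2, pi/2]
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )


def project_to_sphere(point, rotation, translation):
    """Move a world point into the camera frame and normalize it to the unit sphere.

    rotation is a 3x3 matrix given as nested lists and translation a 3-vector,
    with p_cam = R * p_world + t (assumed pose convention).
    """
    cam = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    norm = math.sqrt(sum(c * c for c in cam))
    if norm == 0.0:
        raise ValueError("object point coincides with the camera centre")
    return tuple(c / norm for c in cam)


def spherical_distance(a, b):
    """Great-circle distance in radians between two unit vectors."""
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    dot = sum(x * y for x, y in zip(a, b))
    return math.atan2(math.sqrt(sum(c * c for c in cross)), dot)
```

The error for one observation would then be `spherical_distance(project_to_sphere(obj, R, t), pixel_to_sphere(u, v, w, h))`, which can be compared against the pose error threshold. The `atan2` form is used instead of `acos` of the dot product because it stays numerically stable for very small angles.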

Those skilled in the art can clearly understand that, for the convenience and conciseness of the description, the specific working process of the above-described system, device, and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method can be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, or the like) execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Claims

A data processing method, characterized in that it comprises:

The terminal acquires a panoramic image sequence, the panoramic image sequence includes a plurality of panoramic images shot on a target scene in different poses, and the plurality of panoramic images includes a first panoramic image and a second panoramic image that are continuously shot;

In the case that the degree of overlap of the second panoramic image is greater than or equal to a first threshold, the terminal sends a target panoramic image sequence including the first panoramic image and the second panoramic image to a server, wherein the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for the three-dimensional reconstruction of the target scene.
The method according to claim 1, wherein the second panoramic image is an image continuously captured after the first panoramic image is captured.
The method according to claim 1 or 2, wherein the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.
The method according to any one of claims 1 to 3, wherein the target panoramic image sequence further includes a third panoramic image in which a specific marker is captured, the specific marker is a marker set in the target scene for image size calibration, and the method further includes:

Determining, by the terminal, the position range of the specific marker in the third panoramic image;

The terminal sends the position range to the server, where the position range is used to determine the position of the specific marker in the third panoramic image.
The method according to any one of claims 1 to 4, characterized in that:

The error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold, the camera pose of the target panoramic image sequence is determined by performing pose restoration according to same-name image points of the target panoramic image sequence, and the same-name image points are image points of an image pair whose degree of overlap meets a first preset condition in the target panoramic image sequence.
The method according to claim 5, wherein the same-name image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying a grid-based motion statistics (GMS) method.
The method according to claim 5 or 6, characterized in that:

The camera pose error of the target panoramic image sequence is the spherical distance between the point formed by projecting an object point in the target scene onto a three-dimensional spherical surface according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of the object point in the target panoramic image sequence onto the three-dimensional spherical surface.
The method according to any one of claims 4 to 7, wherein the method further comprises:

The terminal sends the camera pose of the target panoramic image sequence to the server, where the camera pose is used to achieve three-dimensional reconstruction of the target scene.
The method according to any one of claims 1 to 8, wherein the target panoramic image sequence further satisfies a second preset condition;

The second preset condition includes at least one of the following:

The blur degree of the panoramic image satisfies a third preset condition; and,

The exposure of the panoramic image satisfies a fourth preset condition; and,

The proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, the invalid area includes an area outside the captured target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.
A data processing method, characterized in that it comprises:

The server receives a panoramic image sequence sent by the terminal, where the panoramic image sequence includes a plurality of panoramic images sequentially shot of the target scene in different poses;

The server determines the camera pose of the panoramic image sequence according to same-name image points of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, and the error of the camera pose of the panoramic image sequence is less than or equal to a first threshold.
The method according to claim 10, wherein the same-name image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying a grid-based motion statistics (GMS) method.
The method according to claim 10 or 11, wherein the method further comprises:

The server detects an invalid area in the panoramic image, the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area;

The same-name image points are image points outside the invalid area in the panoramic image.
The method according to any one of claims 10 to 12, wherein the panoramic image sequence includes a panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration;

The server determining the camera pose of the panoramic image sequence according to the same-name image points of the panoramic image sequence includes:

The server determines the camera pose of the panoramic image according to the same-name image points and the position of the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.
The method according to claim 13, wherein the method further comprises:

Receiving, by the server, the position range of the specific marker of the panoramic image from the terminal;

The server determines the position of the specific marker from the position range of the specific marker.
The method according to any one of claims 10 to 14, wherein the method further comprises:

The server receives a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal;

The server determines the second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.
The method according to any one of claims 10 to 15, wherein the panoramic image sequence satisfies a first preset condition,

The first preset condition includes at least one of the following:

The degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, the second panoramic image is continuously captured after the first panoramic image is captured, and the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image; and,

The number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific marker identifications; and,

The blur degree of the panoramic image satisfies the second preset condition; and,

The exposure of the panoramic image satisfies the third preset condition.
A terminal, characterized in that it comprises:

An acquiring module, configured to acquire a panoramic image sequence, the panoramic image sequence including a plurality of panoramic images shot of a target scene in different poses, the plurality of panoramic images including a first panoramic image and a second panoramic image that are continuously shot;

A sending module, configured to send a target panoramic image sequence including the first panoramic image and the second panoramic image to a server when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, wherein the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for three-dimensional reconstruction of the target scene.
The terminal according to claim 17, wherein the second panoramic image is an image continuously captured after the first panoramic image is captured.
The terminal according to claim 17 or 18, wherein the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.
The terminal according to any one of claims 17 to 19, wherein the target panoramic image sequence further includes a third panoramic image in which a specific marker is captured, the specific marker is a marker set in the target scene for image size calibration, and the terminal further includes:

A determining module, configured to determine the position range of the specific marker in the third panoramic image;

The sending module is further configured to send the position range to the server, where the position range is used to determine the position of the specific marker in the third panoramic image.
The terminal according to any one of claims 17 to 20, wherein:

The error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold, the camera pose of the target panoramic image sequence is determined by performing pose restoration according to same-name image points of the target panoramic image sequence, and the same-name image points are image points of an image pair whose degree of overlap meets a first preset condition in the target panoramic image sequence.
The terminal according to claim 21, wherein the same-name image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying a grid-based motion statistics (GMS) method.
The terminal according to claim 21 or 22, wherein the camera pose error of the target panoramic image sequence is the spherical distance between the point formed by projecting an object point in the target scene onto a three-dimensional spherical surface according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of the object point in the target panoramic image sequence onto the three-dimensional spherical surface.
The terminal according to any one of claims 20 to 23, wherein the sending module is further configured to:

Sending the camera pose of the target panoramic image sequence to the server, where the camera pose is used to achieve three-dimensional reconstruction of the target scene.
The terminal according to any one of claims 17 to 24, wherein the target panoramic image sequence further satisfies a second preset condition;

The second preset condition includes at least one of the following:

The blur degree of the panoramic image satisfies a third preset condition; and,

The exposure of the panoramic image satisfies a fourth preset condition; and,

The proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, the invalid area includes an area outside the captured target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.
A server, characterized in that it comprises:

A receiving module, configured to receive a panoramic image sequence sent by a terminal, the panoramic image sequence including a plurality of panoramic images sequentially shot of a target scene in different poses;

A determining module, configured to determine the camera pose of the panoramic image sequence according to same-name image points of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to a first threshold.
The server according to claim 26, wherein the same-name image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying a grid-based motion statistics (GMS) method.
The server according to claim 26 or 27, wherein the server further comprises:

A detection module, configured to detect an invalid area in the panoramic image, where the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area; the same-name image points are image points outside the invalid area in the panoramic image.
The server according to any one of claims 26 to 28, wherein the panoramic image sequence includes a panoramic image in which a specific marker is captured, and the specific marker is a marker set in the target scene for image size calibration;

The determining module is further configured to determine the camera pose of the panoramic image according to the same-name image points and the position of the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.
The server according to claim 29, wherein:

The receiving module is further configured to receive the position range of the specific marker of the panoramic image from the terminal;

The determining module is also used to determine the position of the specific marker from the range of the position of the specific marker.
The server according to any one of claims 26 to 30, wherein:

The receiving module is further configured to receive a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal;

The determining module is further configured to determine a second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.
The server according to any one of claims 26 to 31, wherein the panoramic image sequence satisfies a first preset condition,

The first preset condition includes at least one of the following:

The degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, the second panoramic image is continuously captured after the first panoramic image is captured, and the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image; and,

The number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific marker identifications; and,

The blur degree of the panoramic image satisfies the second preset condition; and,

The exposure of the panoramic image satisfies the third preset condition; and,

The error of the camera pose of the panoramic image sequence is less than or equal to the third threshold, the camera pose of the panoramic image sequence is determined according to the same-name image points of the panoramic image sequence, and the same-name image points are the image points of the image pair whose degree of overlap meets the preset condition in the target panoramic image sequence.
A terminal, characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is used to call the program instructions to execute the method according to any one of claims 1 to 9.
A server, characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is used to call the program instructions to execute the method according to any one of claims 10 to 16.
A computer program product containing instructions, characterized in that, when the instructions are run on a computer, the computer is caused to execute the method according to any one of claims 1 to 16.
A computer-readable storage medium, comprising instructions, characterized in that, when the instructions are run on a computer, the computer is caused to execute the method according to any one of claims 1 to 16.