WO2021136386A1 - Data processing method, terminal and server - Google Patents

Data processing method, terminal and server

Info

Publication number
WO2021136386A1
Authority
WO
WIPO (PCT)
Prior art keywords
panoramic image
image
panoramic
target
image sequence
Prior art date
Application number
PCT/CN2020/141440
Other languages
English (en)
Chinese (zh)
Inventor
黄山
谭凯
王硕
杜斯亮
方伟
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2021136386A1



Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 17/00 - Diagnosis, testing or measuring for television systems or their details
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformation in the plane of the image
    • G06T 3/08
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 17/00 - Diagnosis, testing or measuring for television systems or their details
    • H04N 17/004 - Diagnosis, testing or measuring for digital television systems
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H04N 23/698 - Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture

Definitions

  • This application relates to the field of image measurement technology, and in particular to a data processing method, terminal and server.
  • information about the survey site is required, including the site size, and the size, model, and location of the equipment in the site, as well as the connection relationships and relative positions between the equipment.
  • an operator holds a camera and shoots frame images based on center projection in the scene under test, and uploads the frame images to a server after completing the collection; the server calculates the image poses from the frame images, thereby realizing site digitization.
  • the embodiments of this application provide a data processing method for 3D reconstruction of a target scene based on panoramic images, which can reduce the skill requirements on image collectors, improve the success rate of 3D reconstruction, and avoid image collectors repeatedly returning to the site to collect data.
  • the first aspect of the present application provides a data processing method, including: a terminal acquires a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images shot of a target scene in different poses, and the plurality of panoramic images include a first panoramic image and a second panoramic image that are shot consecutively; in a case that the degree of overlap of the second panoramic image is greater than or equal to a first threshold, the terminal sends a target panoramic image sequence including the first panoramic image and the second panoramic image to the server,
  • the degree of overlap of the second panoramic image is the proportion of the overlapping area of the second panoramic image and the first panoramic image in the second panoramic image
  • the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for the three-dimensional reconstruction of the target scene.
  • after the terminal acquires the panoramic image sequence, it can detect the degree of overlap between the first panoramic image and the second panoramic image obtained by continuous shooting, and when the degree of overlap is greater than or equal to the first threshold,
  • the target panoramic image sequence including the first panoramic image and the second panoramic image may be sent to the server for the three-dimensional reconstruction of the target scene. Since the information collected by the panoramic image is comprehensive, the skill requirements for the image collector can be reduced.
  • the terminal can reduce the unqualified rate of images by screening the panoramic images, and avoid the image collectors from repeatedly going to the station to collect data.
  • the second panoramic image is an image continuously captured after the first panoramic image is captured.
  • the second panoramic image is an image taken after the first panoramic image.
  • the terminal collecting the panoramic image sequence requires a period of continuous shooting.
  • overlap detection is performed between the second panoramic image and the preceding first panoramic image as soon as the second panoramic image is captured, so whether the overlap degree of the second panoramic image meets the preset requirement can be known instantly. The image collector can thus quickly determine whether the current captured image is qualified and re-shoot immediately when it is not, avoiding re-acquisition of the whole panoramic image sequence caused by unqualified overlap being discovered only after the entire group has been shot.
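The instant-feedback screening described above can be sketched as follows. This is a hypothetical Python illustration: the overlap estimator itself is out of scope here, and the threshold value is an assumption, since the patent only specifies comparison against a preset first threshold.

```python
OVERLAP_THRESHOLD = 0.6  # assumed value of the "first threshold"

def screen_capture(overlaps, threshold=OVERLAP_THRESHOLD):
    """Given the overlap degree of each newly shot frame with its
    predecessor, return (accepted, rejected) index lists so the
    collector can re-shoot rejected frames immediately."""
    accepted, rejected = [], []
    for i, ov in enumerate(overlaps):
        # keep the frame only if it overlaps enough with its predecessor
        (accepted if ov >= threshold else rejected).append(i)
    return accepted, rejected
```

During capture, a rejected index tells the collector to re-shoot that frame on the spot rather than discovering the gap only after the whole sequence has been uploaded.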
  • the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are landmarks set in the target scene for image size calibration.
  • the data processing method provided by the embodiments of the present application detects specific markers appearing in the multiple panoramic images of a panoramic image sequence, and determines that the number of specific markers in the panoramic image sequence is greater than or equal to a preset second threshold. It should be noted that if the specific marker set in the target scene consists of multiple markers, the number of occurrences of each marker in the panoramic image sequence can be detected separately, and each marker's count can be required to be greater than or equal to a preset threshold. Optionally, the quantity threshold is the same for every marker. Since the number of specific markers in the panoramic image sequence is greater than or equal to the preset threshold, the position of the specific marker in the target scene can be determined from the panoramic image sequence and used for image size calibration, improving the accuracy of 3D modeling.
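The per-marker count check described above can be sketched in a few lines. The marker IDs and the threshold value below are placeholders; the patent only requires that each marker's count meets a preset (optionally shared) threshold.

```python
from collections import Counter

def markers_sufficient(detections, min_count=3):
    """detections: list of marker IDs observed across the panoramic
    image sequence. Returns True only if every distinct marker appears
    at least min_count times (same threshold for each marker, matching
    the optional variant described above)."""
    counts = Counter(detections)
    return bool(counts) and all(c >= min_count for c in counts.values())
```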
  • the target panoramic image sequence further includes a third panoramic image captured with a specific marker, and the specific marker is a landmark set in the target scene for image size calibration.
  • the method further includes: the terminal determines the location range of the specific landmark in the third panoramic image; the terminal sends the location range to the server, and the location range is used to determine the location range in the third panoramic image The location of specific markers.
  • the terminal can detect the position range of the specific marker in the third panoramic image captured with the specific marker, and send the position range to the server for determining the precise position of the specific marker, avoiding detection of the specific marker over the entire third panoramic image, which can reduce the amount of calculation.
  • the error of the camera pose of the target panoramic image sequence is less than or equal to the third threshold, and the camera pose of the target panoramic image sequence is determined by pose recovery based on the image points of the target panoramic image sequence with the same name.
  • the image point with the same name is the image point of the image pair whose overlap degree meets the first preset condition in the target panoramic image sequence.
  • the terminal can perform camera pose estimation and determine a panoramic image sequence with an error less than or equal to a preset threshold as the target panoramic image sequence, thereby improving the success rate of three-dimensional reconstruction of the target scene.
  • the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the data processing method provided by the embodiments of this application determines the image points with the same name by projecting the image pairs whose overlap degree meets the first preset condition onto a three-dimensional spherical surface; mismatches can be quickly eliminated through grid division and motion statistics, improving the stability of matching.
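The grid-based idea can be illustrated with a much-simplified toy version. The real GMS algorithm (available in OpenCV contrib as `matchGMS`) additionally handles rotated and scaled grids and uses a statistical threshold; the cell size and support count below are placeholders.

```python
from collections import Counter

def gms_filter(matches, cell=100, min_support=3):
    """matches: list of ((x1, y1), (x2, y2)) putative correspondences.
    Counts how many matches share the same (source-cell, target-cell)
    pair; matches in well-supported cell pairs are kept, while isolated
    ones (likely mismatches) are discarded."""
    def key(m):
        return ((int(m[0][0] // cell), int(m[0][1] // cell)),
                (int(m[1][0] // cell), int(m[1][1] // cell)))
    support = Counter(key(m) for m in matches)
    return [m for m in matches if support[key(m)] >= min_support]
```

Correct matches tend to move coherently, so their cell pairs accumulate support, while a spurious match usually lands in a cell pair of its own and is dropped.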
  • the camera pose error of the target panoramic image sequence is the spherical distance between the point formed by projecting an object point in the target scene onto a three-dimensional sphere according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of that object point in the target panoramic image sequence onto the three-dimensional sphere.
  • conventionally, the coordinates of a feature point in the world coordinate system are back-projected to image point coordinates, and the distance between those coordinates and the coordinates of the corresponding image point with the same name can be used as the reprojection error; calculating the camera pose error on a three-dimensional sphere instead can reduce the amount of calculation.
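A minimal sketch of a spherical reprojection error, under an assumed equirectangular pixel-to-angle convention (the patent does not fix one):

```python
import math

def pixel_to_sphere(u, v, width, height):
    """Map an equirectangular pixel to a unit-sphere direction.
    Longitude spans [-pi, pi) over the width and latitude
    [pi/2, -pi/2] over the height (a common convention; assumed here)."""
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def spherical_error(p, q):
    """Great-circle (spherical) distance between two unit vectors,
    used as the reprojection error on the unit sphere."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    return math.acos(dot)
```

The error for one observation is then `spherical_error` between the direction of the projected object point and the direction of its same-name image point, with no division by depth or distortion model needed on the sphere.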
  • the method further includes: the terminal sends a camera pose of the target panoramic image sequence to the server, where the camera pose is used to achieve three-dimensional reconstruction of the target scene.
  • the terminal can send the camera pose of the target panoramic image sequence to the server to realize the three-dimensional reconstruction of the target scene.
  • after the server obtains the camera pose sent by the terminal, it can use the camera pose as the initial pose for calculation, reducing the amount of calculation and improving the speed of three-dimensional reconstruction.
  • the target panoramic image sequence further satisfies a second preset condition;
  • the second preset condition includes at least one of the following: the blur degree of the panoramic image satisfies a third preset condition; the exposure of the panoramic image satisfies a fourth preset condition; and the proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes captured areas outside the target scene and includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.
  • the terminal can filter the acquired panoramic images according to a variety of possible combinations of preset conditions, including image quality indicators such as blur degree, exposure, and the proportion of invalid image area, and filter out the panoramic images that fail these checks.
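The combined screening can be sketched as a single predicate. The metric definitions and thresholds below are assumptions (e.g. the blur score could be a variance-of-Laplacian value, the exposure a mean-intensity fraction); the patent only requires that each indicator satisfies its preset condition.

```python
def passes_quality_checks(blur_score, exposure, invalid_ratio,
                          blur_min=100.0, exp_range=(0.2, 0.8),
                          invalid_max=0.5):
    """Return True when a panoramic image meets all three assumed
    quality conditions: sharp enough, reasonably exposed, and not
    dominated by invalid areas (pedestrians, vehicles, sky)."""
    return (blur_score >= blur_min
            and exp_range[0] <= exposure <= exp_range[1]
            and invalid_ratio <= invalid_max)
```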
  • a second aspect of the embodiments of the present application provides a data processing method, including: a server receives a panoramic image sequence sent by a terminal, where the panoramic image sequence includes a plurality of panoramic images sequentially shot of a target scene in different poses; and the server determines the camera pose of the panoramic image sequence according to the image points of the panoramic image sequence with the same name, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to the first threshold.
  • after the server receives the panoramic image sequence sent by the terminal, it determines a camera pose whose error is less than or equal to the first threshold; in this way, the server has a higher success rate in achieving three-dimensional reconstruction based on the panoramic images.
  • the image point with the same name is an image point of an image pair whose overlap degree meets a preset condition in the panoramic image sequence
  • image pairs whose overlap degree meets a preset condition are used for image matching to determine points with the same name, which can improve calculation efficiency.
  • the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the data processing method provided by the embodiments of the application projects the image pairs whose overlap degree meets the first preset condition onto a three-dimensional spherical surface to determine the image points with the same name; mismatches can be quickly eliminated through grid division and motion statistics, improving the stability of matching.
  • the server detects an invalid area in the panoramic image, the invalid area includes a captured area outside the target scene, and the invalid area includes at least one of the following: pedestrian area, road Vehicle area and sky area; the image point with the same name is the image point outside the invalid area in the panoramic image.
  • the server detects invalid areas in a panoramic image, searches for pixels with the same name in the effective image area after removing the invalid areas, and performs image matching, which can improve the efficiency of determining pixels with the same name.
  • the panoramic image sequence includes a panoramic image captured with a specific marker, and the specific marker is a marker set in the target scene for image size calibration;
  • the server determining the camera pose of the panoramic image sequence according to the image point of the panoramic image sequence with the same name includes: the server determines the camera pose of the panoramic image according to the position of the image point of the same name and the specific marker, and the camera pose is used for Realize the three-dimensional reconstruction of the target scene.
  • the panoramic image sequence includes a panoramic image captured with a specific marker for image size calibration.
  • determining the camera pose of the panoramic image from the image points with the same name and the position of the specific marker can improve the calculation accuracy.
  • the server receives the location range of the specific marker of the panoramic image from the terminal; the server determines the location of the specific marker from the location range of the specific marker.
  • the server receives the position range of the specific marker sent by the terminal and determines the precise position of the specific marker within that range, avoiding detection of the specific marker over the entire panoramic image, which can reduce the amount of calculation.
  • the method further includes: the server receives a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the panoramic image determined by the terminal The camera pose of the sequence; the server determines the second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.
  • the server receives the camera pose of the panoramic image sequence from the terminal, and the camera pose can be used as the initial pose for calculation, which reduces the amount of calculation and improves the three-dimensional reconstruction speed.
  • the panoramic image sequence satisfies a first preset condition
  • the first preset condition includes at least one of the following: the degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is continuously captured after the first panoramic image and its degree of overlap is the proportion of the overlapping area between the second panoramic image and the first panoramic image in the second panoramic image; the number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific markers; the blur degree of the panoramic image satisfies the second preset condition; the exposure of the panoramic image satisfies the third preset condition; and the error of the camera pose of the panoramic image sequence is less than or equal to the third threshold, where the camera pose of the panoramic image sequence is determined by pose recovery based on the image points of the panoramic image sequence with the same name, the image points with the same name being image points of image pairs whose overlap degree meets a preset condition.
  • the server can filter the acquired panoramic image sequence according to a variety of possible combinations of preset conditions, including image quality indicators such as blur degree and exposure, the overlap of continuously captured images, the number of specific markers, and so on; the selected target image sequence that satisfies the preset conditions gives a higher success rate for the three-dimensional reconstruction of the target scene.
  • a third aspect of the embodiments of the present application provides a terminal, including: an acquisition module, configured to acquire a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images taken of a target scene in different poses, and the plurality of panoramic images include a first panoramic image and a second panoramic image taken consecutively; and a sending module, configured to send a target panoramic image sequence including the first panoramic image and the second panoramic image to the server when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, where the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image; the target panoramic image sequence is a part of the plurality of panoramic images and is used for the three-dimensional reconstruction of the target scene.
  • the second panoramic image is an image continuously captured after the first panoramic image is captured.
  • the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific marker is a landmark set in the target scene for image size calibration.
  • the target panoramic image sequence further includes a third panoramic image captured with a specific marker, and the specific marker is a landmark set in the target scene for image size calibration.
  • the terminal further includes: a determining module, configured to determine the location range of the specific landmark in the third panoramic image; the sending module is further configured to send the location range to the server, where the location range is used to determine the location of the specific marker in the third panoramic image.
  • the error of the camera pose of the target panoramic image sequence is less than or equal to the third threshold, and the camera pose of the target panoramic image sequence is determined by pose recovery based on the image points of the target panoramic image sequence with the same name.
  • the image point with the same name is the image point of the image pair whose overlap degree meets the first preset condition in the target panoramic image sequence.
  • the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the camera pose error of the target panoramic image sequence is the spherical distance between the point formed by projecting an object point in the target scene onto a three-dimensional sphere according to the camera pose of the target panoramic image sequence, and the point formed by converting the image point of that object point in the target panoramic image sequence onto the three-dimensional sphere.
  • the sending module is further configured to send the camera pose of the target panoramic image sequence to the server, where the camera pose is used to realize the three-dimensional reconstruction of the target scene.
  • the target panoramic image sequence further satisfies a second preset condition;
  • the second preset condition includes at least one of the following: the blur degree of the panoramic image satisfies a third preset condition; the exposure of the panoramic image satisfies a fourth preset condition; and the proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes captured areas outside the target scene and includes at least one of the following: a pedestrian area, a road vehicle area, and a sky area.
  • the fourth aspect of the embodiments of the present application provides a server, comprising: a receiving module, configured to receive a panoramic image sequence sent by a terminal, the panoramic image sequence including a plurality of panoramic images sequentially shot of a target scene in different poses; and a determining module, configured to determine the camera pose of the panoramic image sequence according to the image points of the panoramic image sequence with the same name, so as to achieve the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to the first threshold.
  • the image points with the same name are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the server further includes: a detection module configured to detect an invalid area in the panoramic image, the invalid area includes a captured area outside the target scene, and the invalid area includes At least one of the following: pedestrian area, road vehicle area, and sky area; the image point with the same name is an image point outside the invalid area in the panoramic image.
  • the panoramic image sequence includes a panoramic image captured with a specific marker, and the specific marker is a marker set in the target scene for image size calibration;
  • the determining module is also used to determine the camera pose of the panoramic image according to the position of the image point with the same name and the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.
  • the receiving module is further configured to receive the position range of the specific marker of the panoramic image from the terminal; the determining module is further configured to determine the location of the specific marker from that position range.
  • the receiving module is further configured to receive a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the panoramic image determined by the terminal The camera pose of the sequence; the determining module is also used to determine the second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.
  • the panoramic image sequence satisfies a first preset condition
  • the first preset condition includes at least one of the following: the degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is continuously captured after the first panoramic image and its degree of overlap is the proportion of the overlapping area between the second panoramic image and the first panoramic image in the second panoramic image; the number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific markers; the blur degree of the panoramic image satisfies the second preset condition; the exposure of the panoramic image satisfies the third preset condition; and the error of the camera pose of the panoramic image sequence is less than or equal to the third threshold, where the camera pose of the panoramic image sequence is determined by pose recovery based on the image points of the panoramic image sequence with the same name, the image points with the same name being image points of image pairs whose overlap degree meets a preset condition.
  • the fifth aspect of the embodiments of the present application provides a terminal, which is characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions.
  • the processor is used to call the program instructions to execute the method in any one of the foregoing first aspect and various possible implementation manners.
  • the sixth aspect of the embodiments of the present application provides a server, which is characterized in that it includes a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions.
  • the processor is used to call the program instructions to execute the method in any one of the foregoing second aspect and various possible implementation manners.
  • a seventh aspect of the embodiments of the present application provides a data processing device, which is characterized by comprising a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes program instructions
  • the processor is used to call the program instructions to execute the method in any one of the foregoing first aspect, second aspect and various possible implementation manners.
  • the eighth aspect of the embodiments of the present application provides a computer program product containing instructions, which, when run on a computer, causes the computer to execute any one of the methods in the foregoing first aspect, second aspect, and various possible implementation manners.
  • the ninth aspect of the embodiments of the present application provides a computer-readable storage medium, including instructions, which, when run on a computer, cause the computer to execute any one of the methods in the foregoing first aspect, second aspect, and various possible implementation manners.
  • a tenth aspect of the embodiments of the present application provides a chip including a processor.
  • the processor is used to read and execute the computer program stored in the memory to execute the method in any possible implementation manner of any one of the foregoing aspects.
  • the chip may include a memory, and the processor is connected to the memory through a circuit or a wire.
  • the chip further includes a communication interface, and the processor is connected to the communication interface.
  • the communication interface is used to receive data and/or information that needs to be processed.
  • the processor obtains the data and/or information from the communication interface, processes the data and/or information, and outputs the processing result through the communication interface.
  • the interface can be an input and output interface.
  • the skill requirements for the image collector can be reduced.
  • after the terminal acquires the panoramic image sequence, it filters out the panoramic images whose degree of overlap meets the preset overlap threshold and sends them to the server for 3D reconstruction of the target scene. By screening the panoramic images, the terminal can reduce the image failure rate and prevent image collectors from repeatedly returning to the site to collect data.
  • the terminal acquires a panoramic image sequence for a period of continuous shooting.
  • overlap detection between the second panoramic image and the preceding first panoramic image is performed as soon as the second panoramic image is collected, so whether the overlap degree of the second panoramic image satisfies the preset requirement can be known instantly. The image collector can thus quickly determine whether the current captured image is qualified and re-shoot immediately when it is not, avoiding re-acquisition of the whole panoramic image sequence caused by unqualified overlap being discovered only after the entire group has been shot.
  • the server obtains the filtered panoramic images that meet the preset overlap threshold, which can improve the success rate of 3D reconstruction and prevent image collectors from repeatedly going to the station to collect data.
  • Figure 1 is a schematic diagram of a survey scene in an embodiment of the application
  • FIG. 2 is a schematic diagram of an embodiment of a data processing method in an embodiment of the application
  • FIG. 3 is a schematic diagram of the coordinate system of the camera pose calculation in an embodiment of the application.
  • FIG. 5 is a schematic diagram of another embodiment of a data processing method in an embodiment of the application.
  • FIG. 6 is a schematic diagram of an embodiment of a terminal in an embodiment of the application.
  • FIG. 7 is a schematic diagram of an embodiment of a server in an embodiment of the application.
  • FIG. 8 is a schematic diagram of another embodiment of a terminal in an embodiment of this application.
  • FIG. 9 is a schematic diagram of another embodiment of a server in an embodiment of the application.
  • FIG. 10 is a schematic diagram of an embodiment of a data processing device in an embodiment of the application.
  • the embodiments of this application provide a data processing method for 3D reconstruction of a target scene based on panoramic images, which can reduce the skill requirements on image collectors, improve the success rate of 3D reconstruction, and avoid image collectors repeatedly returning to the site to collect data.
  • Panoramic image: in a broad sense, a panoramic image refers to a wide-angle image, that is, an image with a larger angle of view.
  • the embodiments of this application specifically refer to images with a horizontal viewing angle of 360 degrees and a vertical viewing angle of 180 degrees.
• the panoramic image can be realized by different projection methods, including equiangular projection, equirectangular projection, orthographic projection and equal-area projection, etc., which are not specifically limited here. Since a panoramic image needs to project as large a three-dimensional scene as possible into a limited two-dimensional image plane for presentation and storage, the panoramic images produced by the various projection methods all have large image distortions.
  • Panoramic camera used to collect panoramic images.
  • the panoramic camera is equipped with at least two fisheye lenses, or multiple wide-angle lenses, or multiple ordinary lenses, which are not specifically limited here.
  • the images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited.
  • Panoramic image sequence a series of panoramic images obtained sequentially and continuously on a target at different times and different positions. This application mainly relates to a series of panoramic images obtained by continuous shooting of a scene or site.
  • Image overlap area refers to the image area that contains the same object in two images.
  • the ratio of the image overlap area to the entire image is the degree of overlap.
  • Image pose refers to the camera pose when the image is taken.
• the camera pose refers to the position of the camera in space and the orientation of the camera, which can be regarded as the translation and rotation transformations of the camera from an original reference position to the current position.
  • Image matching From two images with overlapping information, extract the feature points of each image and the feature vector corresponding to the feature point as a descriptor, and use the feature vector to determine the image point corresponding to the same object point in the two images. Multiple image points corresponding to the same object point in different images are called image points with the same name.
  • GMS Grid-based motion statistics
  • SFM structure from motion, a technology to determine the sparse point cloud of the subject and the pose (position and posture) of the image from multiple images.
• Binocular measurement: a method in which, with the camera poses corresponding to two images known, the three-dimensional coordinates of an object point are determined by measuring its image points with the same name in the two images.
• Frame image: the central projection image taken by an ordinary mobile phone or SLR camera.
  • Relative orientation restore or determine the relative relationship of the image pair during photography.
• PnP: perspective-n-point
  • Forward intersection Determine the three-dimensional coordinates of the object point corresponding to the image point with the same name based on the known poses of the two images and the image point with the same name.
• site information: by collecting image information of the site and performing three-dimensional reconstruction based on it, site digitization is realized, which can provide data support for subsequent design or operation and maintenance, such as the size of the site and the size, model, location, connection relationships and relative locations of the equipment in the site.
• the operator holds a frame camera, takes frame images in the surveyed scene, uses the SFM algorithm from computer vision to calculate the camera pose at shooting time, and completes data collection with binocular measurement. Because a single frame image has a limited field of view, the collected images must contain sufficient repeated targets to satisfy the pose calculation, and binocular measurement of a target in the scene requires at least two images containing that target; ordinary data collectors are therefore prone to shooting gaps due to insufficient professionalism or inexperience.
• the image collector collects images through the terminal and directly uploads all the collected images to the server side, where the calculation is performed centrally. Once a captured image is found to be unqualified during the server-side calculation, the image collector needs to go to the station again to shoot, which is time-consuming and laborious.
  • the server is based on the three-dimensional reconstruction of ordinary frame images.
• the acquired feature points are very similar, or sufficient feature points cannot be continuously tracked, and the reconstruction fails.
  • the embodiment of the application provides a data processing method for collecting panoramic images that meet the requirements of site digitization, which can improve the quality of the collected panoramic images and reduce the rate of image collectors returning to the site.
  • Figure 1 is a schematic diagram of a survey scene in an embodiment of this application.
  • a panoramic image is acquired in a site through an image acquisition device.
  • a specific marker is placed in the shooting scene in advance.
• Common specific markers include targets and benchmarks, where the distance between the two ends of a benchmark is known, and the distances and angles between targets are known.
• Figure 1 shows a specific marker comprising 3 targets, where the distance between the No. 1 target and the No. 2 target is 1 meter, the distance between the No. 1 target and the No. 3 target is also 1 meter, and the angle between the lines connecting the No. 1 target to the No. 2 and No. 3 targets is 90 degrees. In this way, the 3 targets can define a plane.
  • the No. 1 target can be used as the origin of the coordinate system of the plane, so that the size of the captured image can be calibrated.
  • the image collector uses the image acquisition device to take single-point image shooting.
  • the shooting point selected by the image collector can be a certain distance away from the specific landmark, for example, about one and a half steps away.
  • the details can be changed according to the different shooting scenes, and there is no specific limitation here.
  • the image collection personnel perform image collection according to the "step by step” rule.
  • Figure 1 shows the "step by step” shooting trajectory in this scene.
  • a step refers to a step of the image collector, which can be walking or running or moving in some way.
• the specific size of a step is not limited here. It should be noted that, in order to achieve three-dimensional reconstruction, the set of images shot in a scene needs to cover the entire scene.
  • a scene here can refer to the inside of a room, or outdoors, and a circle around the target subject. The specific scene is not limited.
  • FIG. 2 is a schematic diagram of an embodiment of the data processing method in the embodiment of the present application.
  • the terminal can obtain a panoramic image and perform image pose estimation.
  • image pose estimation please refer to step 201 to step 208.
  • three-dimensional reconstruction can be directly performed according to the camera pose;
  • the terminal uploads the acquired panoramic image to the server, and the server performs image pose estimation. You can refer to steps 201, 209 to 213.
• in this case, the terminal directly uploads the acquired panoramic image sequence to the server without performing the image screening in steps 202 to 208;
  • the terminal obtains the panoramic image and performs image screening, uploads the panoramic image that meets the preset conditions to the server, and the server implements the image pose estimation.
  • the specific implementation form is not limited here.
  • the third implementation method is taken as an example for detailed introduction.
  • the terminal acquires a panoramic image
  • the image acquisition device is used to collect panoramic images, and it may be a terminal equipped with a panoramic camera, or a panoramic camera with a communication connection with the terminal, and the details are not limited here.
  • the smart terminal plus the peripheral panoramic camera is taken as an example for description.
• the terminal can connect to the panoramic camera through Type-C, Bluetooth or Wi-Fi, and control the panoramic camera to shoot through the client.
  • the panoramic camera takes a panoramic image and sends the panoramic image to the terminal.
  • the terminal obtains the panoramic image, and then processes the panoramic image, realizes image pre-detection, and determines whether the captured panoramic image sequence meets the requirements.
  • a panoramic camera is used to shoot a set of panoramic image sequences for the site scene, including multiple panoramic images.
• the number of panoramic images included in the panoramic image sequence is not limited here.
  • the terminal performs image quality detection on the panoramic image
  • the objective image quality indicators include focus, noise, color, exposure, sharpness, etc. This embodiment does not limit the number and types of objective quality indicators selected by the terminal for image quality detection.
  • the terminal detects whether the exposure and blur of the panoramic image meet preset requirements.
  • the following are introduced separately:
  • the exposure degree can also be calculated by directly counting the grayscale histogram of the image, and the calculation method of the exposure degree is not specifically limited here.
• after the exposure of the captured image is obtained, it can be compared with a preset exposure threshold to determine whether it is within the threshold range. If so, the image exposure is appropriate; otherwise the image is determined to be overexposed or underexposed. Optionally, the terminal can prompt the user that the captured image has abnormal exposure. It should be noted that the specific range of the exposure threshold is not limited here.
  • the gray-scale histogram analysis algorithm is used as an example for introduction.
  • the exposure threshold is set to 1.0. If the calculated value is greater than 1.0, the exposure is considered abnormal.
  • the user may be prompted to delete the image and take another shot.
• after the blur degree value of the captured image is calculated, it can be compared with a preset blur degree threshold to determine whether the blur degree of the captured image is within the threshold range. If so, the blur degree of the image is qualified; otherwise it is unqualified. Optionally, the terminal can prompt the user that the captured image is blurred.
  • the specific range of the blurriness threshold is not limited here. For example, the blurriness threshold is set to 6.0, and if it is greater than the threshold, the image is considered to be blurred.
  • the user may be prompted to delete the image and take another shot.
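The exposure and blur checks above can be sketched as follows. The text does not give the exact metric formulas, so the histogram-based exposure score and the Laplacian-variance sharpness proxy below, as well as the threshold handling, are illustrative assumptions that merely mirror the described compare-against-threshold flow (note the text's blurriness value grows with blur, whereas Laplacian variance shrinks with it).

```python
import numpy as np

def exposure_score(gray):
    """Toy exposure metric from the grayscale histogram: ratio of
    near-black/near-white pixels to mid-tone pixels. The exact formula
    is not given in the text; this is one plausible choice."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    extreme = hist[:10].sum() + hist[246:].sum()
    return extreme / max(hist[10:246].sum(), 1)

def blur_score(gray):
    """Sharpness proxy: variance of a Laplacian response. Lower variance
    suggests a blurrier image."""
    g = gray.astype(np.float64)
    lap = (-4 * g[1:-1, 1:-1] + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return lap.var()

# Hypothetical threshold mirroring the 1.0 value mentioned in the text.
EXPOSURE_THRESHOLD = 1.0

def exposure_ok(gray):
    return exposure_score(gray) <= EXPOSURE_THRESHOLD
```

A real implementation would tune both metrics and thresholds against the camera used on site.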
  • the terminal can detect the image quality after acquiring the panoramic image, the detection result can be fed back to the image collector in time, which helps to improve the qualification rate of the panoramic image.
  • the terminal performs invalid region detection on the panoramic image.
• the invalid area is the part of the image that is of no value for the target scene, such as moving objects, for example data collectors or road vehicles.
  • the terminal presets the type of the invalid area, for example, defines the moving object in the captured image as the invalid area, and calculates the proportion of the invalid area in the captured image.
• specifically, an image recognition method is used to identify the invalid area in the captured image, the proportion of the invalid area in the captured image is calculated and compared with a preset invalid area threshold; if it is within the threshold range, the image is qualified, otherwise the image is unqualified.
  • the threshold of the effective area can be preset, and the effective area ratio of the image can be calculated, and the details are not described here.
  • the invalid area threshold is 70%
  • the MobileNet model is used to perform semantic segmentation on the panoramic image, identify the range of motion areas in the image, and count the proportion of the motion area of each image in the image. If it is greater than 70%, then determine the The proportion of the invalid area of the image is unqualified.
  • the user may be prompted to delete the image and take another shot.
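A minimal sketch of the invalid-area ratio check, assuming the semantic segmentation step yields a per-pixel label map. The 70% limit follows the text; the label ids are hypothetical.

```python
import numpy as np

# Hypothetical label ids for the segmentation output; the text defines
# moving objects (e.g. people, vehicles) as the invalid area.
MOVING_LABELS = {1, 2}
INVALID_AREA_THRESHOLD = 0.70   # 70% limit from the text

def invalid_area_ratio(label_map, moving_labels=MOVING_LABELS):
    """Fraction of pixels whose segmentation label marks a moving object."""
    return np.isin(label_map, list(moving_labels)).mean()

def invalid_area_ok(label_map):
    return invalid_area_ratio(label_map) <= INVALID_AREA_THRESHOLD
```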
  • the terminal detects the degree of overlap of two consecutive images
• the overlap degree of the continuously shot panoramic image sequence can be detected to judge whether the preset overlap degree threshold is met. If the threshold is met, the image overlap degree is qualified; if not, the panoramic image overlap degree is unqualified and the image needs to be re-shot.
• for example, the preset overlap degree threshold is 30%. After acquiring the second panoramic image, the terminal performs overlap detection against the acquired first panoramic image; if the overlap degree of the two images is greater than or equal to 30%, it is determined that the overlap of the second image meets the requirement, otherwise the user is prompted to re-take the second image. Similarly, after acquiring the third image, the terminal performs overlap detection between the third image and the second image to determine whether the third image meets the overlap requirement.
  • step 205 is an optional step, which may or may not be performed, and the details are not limited here.
  • the terminal performs specific marker statistics on the panoramic image sequence
• the specific markers in the image can be used as control information, and a threshold for the number of times each specific marker appears in the panoramic image sequence is predefined, to ensure that binocular measurement or vector modeling can be performed from the finally calculated image poses.
  • the number threshold is, for example, 2 or 3, and the specific value is not limited here.
• after acquiring a panoramic image sequence of a scene, the terminal counts the specific markers appearing in all panoramic images of the sequence; there may be multiple specific markers. If the number of appearances of each specific marker in the panoramic image sequence is greater than or equal to the preset number threshold, the panoramic image sequence is qualified; otherwise, the entire group of panoramic images is unqualified, and the user is prompted that the captured data of the specific markers is insufficient and re-shooting is needed. Optionally, whether the shooting control information meets the requirements is prompted according to the detection result.
  • the terminal needs to identify the specific marker in each panoramic image.
• the MobileNet model is used for image recognition to identify the specific marker in each image, and to determine whether the number of markers meets the preset threshold for the number of specific markers.
  • specific marker recognition and invalid region recognition can be combined. For example, after a panoramic image is obtained, the MobileNet model is used to simultaneously identify the invalid region and the specific marker.
  • the terminal counts the number of specific markers in the panoramic image sequence.
• for example, if the specific markers are 3 targets, namely target No. 1, target No. 2 and target No. 3, the terminal needs to perform recognition on each panoramic image, determine the type and number of targets, and then count the types and numbers of all targets in the panoramic image sequence. For example, counting the targets in 15 images gives 4 occurrences of target No. 1, 3 of target No. 2 and 5 of target No. 3, while the preset threshold for the number of specific markers is 2. Because the counts of targets No. 1, No. 2 and No. 3 are all greater than or equal to the preset threshold 2, the number of specific markers of the panoramic image sequence is judged to be qualified. If the count of target No. 1, No. 2 or No. 3 were less than 2, the panoramic image sequence would be unqualified, and the user can be prompted to shoot again.
• optionally, in step 203 the invalid area and the specific markers can be identified at the same time, and the type and quantity of the specific markers of each panoramic image recorded; then, in this step, it only needs to be determined whether the number of specific markers meets the requirement.
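The per-sequence marker statistics reduce to a counting step. A sketch, where the marker names and the per-image detection format are hypothetical while the threshold logic follows the text:

```python
from collections import Counter

MARKER_COUNT_THRESHOLD = 2  # preset number threshold from the text

def sequence_markers_ok(detections_per_image,
                        expected_markers=("target1", "target2", "target3")):
    """detections_per_image: one list of recognized marker ids per panoramic
    image (e.g. the recognition output recorded in the invalid-area step).
    The sequence qualifies only if every expected marker appears at least
    MARKER_COUNT_THRESHOLD times across the whole sequence."""
    counts = Counter(m for dets in detections_per_image for m in dets)
    return all(counts[m] >= MARKER_COUNT_THRESHOLD for m in expected_markers)
```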
  • image retrieval can be performed on the panoramic image sequence, the overlap relationship between each panoramic image and other images in the panoramic image sequence can be determined, and the image pair used for subsequent image matching can be determined.
  • the degree of overlap between the images is determined after the images are reduced according to a certain method.
• for example, the images are reduced so that the numbers of pixels in the horizontal and vertical directions of the image frame are less than 2000; performing image retrieval on the reduced panoramic images can then reduce the amount of calculation.
• specifically, an overlap degree threshold can be preset, and the images whose overlap with each image is higher than the threshold are determined as the images for image matching. For example, if the overlap degrees of the second to fourth images and the sixth image with the first image are higher than the overlap threshold, it is determined that the second to fourth images and the sixth image are each matched with the first image, that is, four image pairs are determined.
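The retrieval step above can be sketched as two small helpers: a scale factor that keeps both image dimensions below 2000 pixels, and pair selection from a pairwise-overlap matrix. The overlap values are assumed to come from the retrieval stage; their computation is not shown.

```python
import numpy as np

MAX_RETRIEVAL_SIZE = 2000  # both dimensions kept below this, per the text

def retrieval_scale(width, height):
    """Scale factor that shrinks an image so both dimensions fall below
    MAX_RETRIEVAL_SIZE, reducing the retrieval cost (never upscales)."""
    return min(1.0, (MAX_RETRIEVAL_SIZE - 1) / max(width, height))

def matching_pairs(overlap, threshold):
    """overlap: symmetric NxN matrix of pairwise overlap estimates.
    Returns the index pairs (i, j), i < j, kept for image matching."""
    n = overlap.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if overlap[i, j] >= threshold]
```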
  • Image matching is performed according to the image pair acquired by the image retrieval in step 207 to determine the image point with the same name in the image pair.
• the ORB algorithm is a fast feature point extraction and description algorithm, mainly divided into two parts: feature point extraction and feature point description. Feature extraction is developed from the FAST (features from accelerated segment test) algorithm, and feature point description is improved based on the BRIEF (binary robust independent elementary features) feature description algorithm. This application does not limit the specific type of matching algorithm; the following takes the improved ORB algorithm as an example.
  • the traditional matching algorithm is a matching strategy based on the center projection model of a common frame image.
  • This embodiment proposes a matching strategy based on a three-dimensional spherical coordinate system to search for an image point with the same name in a panoramic spherical space.
• this application improves on the traditional ORB algorithm: after ORB feature points are extracted, the image is projected onto a three-dimensional sphere, the sphere is divided into grids, and the image points with the same name are determined by grid-based motion statistics (GMS). Through grid division and motion statistics, wrong matches can be quickly eliminated, improving the stability of the matching.
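The spherical matching strategy relies on mapping equirectangular pixels to the unit sphere and then binning matches into spherical grid cells for GMS-style statistics. A sketch, with illustrative grid sizes:

```python
import numpy as np

def pixel_to_sphere(u, v, width, height):
    """Map an equirectangular pixel to a unit vector on the panoramic
    sphere (360-degree horizontal, 180-degree vertical field of view)."""
    lon = (u / width) * 2.0 * np.pi - np.pi       # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v / height) * np.pi      # latitude in [pi/2, -pi/2]
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def sphere_grid_cell(p, n_lon=20, n_lat=10):
    """Index of the lon/lat grid cell containing unit vector p, so that
    GMS-style motion statistics can be accumulated per spherical cell
    rather than per planar cell. Grid sizes are illustrative."""
    lon = np.arctan2(p[1], p[0])
    lat = np.arcsin(np.clip(p[2], -1.0, 1.0))
    i = min(int((lon + np.pi) / (2 * np.pi) * n_lon), n_lon - 1)
    j = min(int((np.pi / 2 - lat) / np.pi * n_lat), n_lat - 1)
    return i, j
```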
  • the terminal performs camera pose recovery
  • camera pose estimation There are many ways of camera pose estimation, including SFM (structure from motion) technology, or simultaneous positioning and map construction (simultaneous localization and mapping, SLAM) technology, etc., which are not specifically limited here.
• in this embodiment, camera pose recovery based on SFM technology is introduced as an example.
  • This application improves the existing SFM algorithm, and proposes algorithms based on relative orientation, PnP, and forward intersection based on three-dimensional spherical coordinates based on the characteristics of panoramic images.
  • the coordinates of the feature points in the world coordinate system are calculated, and the camera pose corresponding to a single panoramic image, including the coordinates of the image shooting center in the world coordinate system, and the world coordinate system and the shooting center It is the transformation matrix between the three-dimensional spherical coordinate system of the origin.
• the coordinates of a feature point in the world coordinate system are back-projected to image point coordinates on the image, and the distance between these coordinates and the image point with the same name corresponding to the object point can be used to measure the camera pose error.
• if the standard deviation of the distances between the back-projected feature points and their corresponding image points is less than or equal to the threshold, it is determined that the camera pose calculation of the image is successful.
  • FIG. 3 is a schematic diagram of a coordinate system for calculating a camera pose in an embodiment of the application.
  • the coordinate systems involved in Figure 3 include: the world coordinate system is O-XYZ, the three-dimensional spherical coordinate system o-p0p1p2, and the image plane coordinate system uv.
  • P point represents an object point in the world coordinate system
  • [X Y Z] is the coordinate of the object point P in the world coordinate system
  • Point o is the shooting center of the image
• the coordinates of point o in the world coordinate system are [X_S Y_S Z_S].
  • Point p represents the image space point of the object point P in the spherical projection
• the coordinates of point p in the three-dimensional spherical coordinate system are [p_0 p_1 p_2].
• Point p' represents the image point of the object point P in the panoramic image, and the coordinates of point p' in the image plane coordinate system are [u, v].
  • R represents a transformation matrix between the world coordinate system and the three-dimensional spherical coordinate system.
• when evaluating the error of the camera pose parameters [X_S Y_S Z_S] and the R matrix of the panoramic image, for a known object point [X Y Z], the pose parameters are used to calculate its corresponding coordinates [P_0′ P_1′ P_2′] in the three-dimensional spherical coordinate system and [u′ v′] in the image plane coordinate system, that is, the value of [u′ v′] is calculated according to formulas (1) and (2). The value of [u′ v′] is compared with the input value [u v], the distance between [u′ v′] and [u v] is calculated, and the average distance over all object points is counted. If the average value is less than the threshold T, the calculated pose is considered valid.
  • the threshold T may be, for example, 6 pixels.
• λ represents the scale factor.
  • this application provides a simplified algorithm.
  • the following describes the method for calculating the camera pose error in this embodiment.
• the object points corresponding to the feature points are back-projected to the three-dimensional spherical coordinate system, and the image points of the image are likewise projected to the three-dimensional spherical coordinate system.
• in the three-dimensional coordinate calculation, an arc distance threshold between points, or between a point and a line, in the three-dimensional space is defined to determine whether the camera pose calculation is successful. This is introduced below:
• f is the principal distance, that is, the distance from the optical center to the imaging plane.
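The simplified test can be sketched on unit bearing vectors: the object point is rotated into the camera's three-dimensional spherical frame and compared with the observed bearing by arc distance, avoiding the projection back to the image plane. The acceptance threshold and the mean-error aggregation below are assumptions patterned on the image-plane test described earlier.

```python
import numpy as np

def bearing_from_world(X, Xs, R):
    """Unit bearing vector of object point X seen from camera centre Xs,
    expressed in the camera's spherical coordinate system via rotation R."""
    d = R @ (np.asarray(X, float) - np.asarray(Xs, float))
    return d / np.linalg.norm(d)

def arc_error(p_obs, p_proj):
    """Arc distance (radians) on the unit sphere between the observed
    bearing of an image point and the reprojected object point."""
    return np.arccos(np.clip(np.dot(p_obs, p_proj), -1.0, 1.0))

def pose_ok(points, bearings, Xs, R, arc_threshold):
    """Accept the pose when the mean arc distance over all tie points
    stays below the threshold (threshold value is an assumption)."""
    errs = [arc_error(b, bearing_from_world(X, Xs, R))
            for X, b in zip(points, bearings)]
    return float(np.mean(errs)) <= arc_threshold
```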
  • the initial pose calculation between images is completed, and the camera pose corresponding to each panoramic image and the sparse point cloud of the shooting scene are obtained.
  • the panoramic image sequence is qualified.
  • the user is prompted to take another shot.
  • the SFM network is constructed based on the spherical coordinates of the three-dimensional image space.
• the two-dimensional image coordinates are no longer used for relative orientation and PnP calculations; instead, they are converted to three-dimensional spherical coordinates for calculation, which can reduce the amount of calculation.
• this step can be based on the number of pairs of image points with the same name between the images.
• the setting condition can be that the number of pairs of image points with the same name between an uncalculated image and the calculated images is greater than a threshold, such as 15, for the PnP calculation result to be considered valid. In this way, the number of bundle adjustment optimizations can be reduced (bundle adjustment is a method of accurately determining the position and posture of an image using optimization), which reduces the amount of calculation, shortens the calculation time, and improves the user experience.
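The expansion condition can be sketched as follows; summing tie points over all already-posed images is one plausible reading of the condition, and the data layout is hypothetical.

```python
TIE_POINT_THRESHOLD = 15  # minimum same-name point pairs for a valid PnP, per the text

def next_batch(uncalculated, calculated, tie_counts):
    """Select every not-yet-posed image sharing more than TIE_POINT_THRESHOLD
    same-name image points with the already-posed set, so that several images
    can be added to the SFM network per iteration instead of one.
    tie_counts[(a, b)] holds the tie-point count between images a and b."""
    batch = []
    for img in uncalculated:
        links = sum(tie_counts.get((img, c), tie_counts.get((c, img), 0))
                    for c in calculated)
        if links > TIE_POINT_THRESHOLD:
            batch.append(img)
    return batch
```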
  • the three-dimensional reconstruction of the target scene can be performed according to the camera pose obtained in this step.
  • the terminal uploads the panoramic image sequence to the server
  • the terminal sends the panoramic image sequence to the server, and the communication method is not limited.
  • the terminal sends the detection result of the specific marker to the server.
  • the terminal sends the camera pose and the sparse point cloud of the shooting scene corresponding to each panoramic image to the server.
  • the server performs interference area detection
  • the server receives the panoramic image sequence sent by the terminal and separately detects the interference area in each panoramic image.
  • the interference area is interference information that does not need to be used for site digitization.
• the interference area can be predefined; for example, the moving objects and the sky area in the image are set as the interference area. It should be noted that the definitions of the interference area and of the invalid area in step 203 may be the same or different, and the specific definitions are not limited here.
• the interference area in the image is recognized by an image recognition method, and according to the recognition result a mask picture of each image is generated in which the non-interference area is preserved;
• the recognition rate of semantic segmentation is low if the image recognition algorithm is used directly on the panoramic image.
• each image point of the image can be converted to the three-dimensional spherical coordinate system, and the image rotated so that the lower area of the original panoramic image, that is, the main area to be recognized, is rotated to the equatorial area of the sphere. Since the equatorial area of the sphere has the smallest image distortion after being converted back into a two-dimensional image, the recognition rate of image semantic segmentation can be improved.
  • the DeepNet network model is used to perform semantic segmentation, and at the same time, matching interference regions, such as the sky, pedestrians, etc., are identified.
  • an image segmentation (graph cut) algorithm is used to optimize the regions marked as pedestrians to further improve the segmentation accuracy.
  • a corresponding matching mask image is produced.
  • pixels with a gray level of 0 represent interference areas
  • pixels with a gray level of 255 represent non-interference areas.
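Producing the mask picture from a segmentation label map might look like this; the label ids are hypothetical, while the gray-level convention (0 = interference, 255 = non-interference) is the one stated above.

```python
import numpy as np

# Hypothetical segmentation label ids; the text marks sky and moving
# objects such as pedestrians as interference.
INTERFERENCE_LABELS = {1, 3}   # e.g. 1 = sky, 3 = pedestrian

def make_mask(label_map):
    """Grayscale mask: 0 for interference pixels, 255 elsewhere."""
    mask = np.where(np.isin(label_map, list(INTERFERENCE_LABELS)), 0, 255)
    return mask.astype(np.uint8)
```

Downstream feature extraction can then simply skip pixels whose mask value is 0.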
  • FIG. 4 is a schematic diagram of an image semantic segmentation recognition result in an embodiment of the application.
  • the left image is the input image
  • the right image is the predicted semantic segmentation recognition result.
  • the area A represents the sky
  • the area B represents the effective area
  • the area C represents the feature marker
  • the area D represents the interference area formed by moving objects such as pedestrians.
  • the server detects a specific marker
  • the server determines the number and precise location of the specific marker in the image.
  • the specified information in the specific marker is used as the control information to ensure that the scale of the coordinate system where the image pose is calculated is correct.
• in step 205, the terminal has preliminarily identified the specific marker through pre-detection. Therefore, in this step, the server can perform detection in the local area of the identified specific marker according to the pre-detection result uploaded by the terminal, to determine the precise location of the specific marker.
  • the identification of the specific marker may be performed at the same time as the identification of the invalid area is performed in step 210.
  • the partial area containing the specific marker in the panoramic image is reprojected to the central projection plane to obtain a frame image containing the specific marker, thereby reducing image distortion.
  • a target detection algorithm such as Yolo v3
• if the specific marker is a target, the frame image containing the target is binarized, the binarization result is analyzed to determine the target number, and a circular detector is used to extract the center point of the target position.
  • a series of images is obtained based on the rotation transformation of the original panoramic image on the three-dimensional spherical coordinate system.
• if the panoramic image contains a specific marker, that is, a target, the partial area of the target can be reprojected to the central projection plane to obtain a frame image containing the target, which reduces the image distortion of the target area and facilitates precise positioning of the target.
• the lower area of the original panoramic image, that is, the main area to be identified, which may contain interference, can be rotated to the equator area of the sphere and projected to the central projection plane, where the image deformation is minimal, which can improve the recognition rate of image semantic segmentation.
  • the server performs image matching
• the server obtains image pairs with a high degree of overlap according to image retrieval, or the server obtains the image pair detection result sent by the terminal in step 209, determines the overlap relationship between the images, and then determines the image pairs for image matching.
• based on the panoramic images and the mask images obtained by the interference area recognition in step 210, feature points are extracted from each image and feature matching is performed on the image pairs.
  • an orb feature matching algorithm is used for matching.
  • the server obtains the image matching result sent by the terminal in step 209, the image matching result can be used as the initial value of the matching, thereby reducing the matching search range and improving the matching speed.
  • the server performs camera pose recovery
  • the server performs camera pose optimization based on the matching result obtained in step 212, and accurately determines the camera pose of each panoramic image and the sparse point cloud of the target scene according to the specific marker extraction result obtained in step 211.
  • the pose information may be used as the initial value of the pose for detection, so as to reduce the amount of calculation and increase the speed of pose calculation.
  • the server realizes site digitization
  • the server stores the pose information and the sparse point cloud of the panoramic image sequence, and then performs stereo measurement and vector modeling based on the panoramic image, thereby realizing the digitization of the site.
  • the terminal obtains the panoramic images, performs pre-detection, and sends the panoramic images that meet the preset conditions to the server for site digitization. Because a panoramic image captures comprehensive information, and because the terminal filters images against the preset conditions, the image failure rate is reduced and collectors are spared repeated trips to the site to re-collect data.
  • the data processing method provided in the embodiments of this application accounts for collectors' limited professional skills by directly collecting panoramic images for data processing. During collection, the quality of each captured image, such as exposure and blur, is checked on the fly to ensure the imaging quality of the image itself. After collection, the specific-marker information is detected on the mobile phone, and preprocessing, image matching, and image pose estimation are computed, ensuring that the capture-time position and pose of the multiple captured images can be accurately determined and avoiding a second on-site collection.
  • the existing SFM algorithm is improved: SFM construction is based on the spherical coordinates of the image.
  • the coordinates of the panoramic image are transferred directly to spherical coordinates for relative orientation and the PnP solution.
  • the embodiment of this application accelerates and optimizes image matching, including downscaling of panoramic images and image retrieval, and proposes a spherical-grid-based motion statistics method to determine ORB feature points of the same name.
  • a RANSAC-based quadratic polynomial model eliminates mismatched points and optimizes the matching results, realizing fast and highly reliable matching of panoramic images.
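The spherical-grid motion-statistics idea can be sketched as follows, assuming a simple longitude/latitude grid; published GMS uses denser grids and statistical acceptance thresholds, so this only illustrates the clustering principle that true matches concentrate in the same cell pair while mismatches scatter.

```python
import numpy as np
from collections import Counter

def grid_cell(vec, n_lon=8, n_lat=4):
    """Assign a unit direction on the sphere to a longitude/latitude cell."""
    lon = np.arctan2(vec[1], vec[0])              # in [-pi, pi]
    lat = np.arcsin(np.clip(vec[2], -1.0, 1.0))   # in [-pi/2, pi/2]
    i = min(int((lon + np.pi) / (2.0 * np.pi) * n_lon), n_lon - 1)
    j = min(int((lat + np.pi / 2.0) / np.pi * n_lat), n_lat - 1)
    return i, j

def gms_filter(matches, min_support=3):
    """Grid-based motion statistics in the spirit of GMS: keep a match only
    when enough other matches map the same cell of image A to the same cell
    of image B."""
    cell_pairs = [(grid_cell(a), grid_cell(b)) for a, b in matches]
    support = Counter(cell_pairs)
    return [m for m, p in zip(matches, cell_pairs) if support[p] >= min_support]
```

Working on the sphere rather than the image plane is what makes the statistic meaningful for panoramas, since equirectangular pixel neighborhoods are distorted near the poles.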
  • the traditional SFM construction process, which adds only one image at a time, is improved: this solution adds multiple images to the already-constructed network at a time, which speeds up network construction and reduces calculation time. This overcomes the problem that a mobile phone otherwise cannot perform panoramic-image SFM network construction, so that pre-detection on the phone completes within a time acceptable to the user.
  • the data processing method provided in the embodiments of this application can be applied to terminal devices with multiple performances.
  • Table 1: in order to test the performance of the mobile-phone pre-detection algorithm, phones of different performance tiers were selected for testing, including high-end devices (such as the Huawei Mate 20 Pro), mid-range devices (such as the Huawei P10), and low-end devices (such as the Honor 9).
  • the rate of secondary visits has been reduced from 30% to 0, completely eliminating secondary visits.
  • after the server algorithm is optimized, the synthesis success rate can reach 90%, significantly higher than the existing success rate of 60%.
  • FIG. 5 is a schematic diagram of another embodiment of the data processing method in the embodiment of this application.
  • This application collects 360-degree panoramic image data for site digital-information collection, comprising two parts: the mobile-phone side and the background-server side.
  • the mobile-phone side collects panoramic images and performs image quality detection, image preprocessing, and image-synthesis pre-detection, specifically the improved SFM pose calculation. This pre-detection determines whether the camera position at which each image was captured can be correctly estimated. If the requirements are met, the image data is passed to the background server.
  • the server side performs high-precision image synthesis, including receiving the image and pose data sent by the mobile phone, identifying interference areas in the images, calculating the pose with the interference areas eliminated, and combining the pose data sent by the phone to optimize the pose calculation and accurately determine the position and pose parameters of each image.
  • FIG. 6 is a schematic diagram of an embodiment of a terminal in an embodiment of this application.
  • the embodiment of the present application provides a terminal, including:
  • the acquiring module 601 is configured to acquire a panoramic image sequence, where the panoramic image sequence includes a plurality of panoramic images shot of a target scene in different poses, and the plurality of panoramic images includes a first panoramic image and a second panoramic image that are continuously shot;
  • the sending module 602 is configured to send a target panoramic image sequence including the first panoramic image and the second panoramic image to the server when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, where the degree of overlap of the second panoramic image is the proportion of the overlapping area of the second panoramic image and the first panoramic image within the second panoramic image, the target panoramic image sequence is a part of the plurality of panoramic images, and the target panoramic image sequence is used for three-dimensional reconstruction of the target scene.
  • the second panoramic image is an image continuously captured after the first panoramic image is captured.
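A hedged sketch of the terminal-side overlap gate. The bounding-box proxy for the overlapping area and the 0.5 default threshold are assumptions for illustration; the patent fixes only that an overlap degree is compared against a first threshold, not how the overlapping area is computed.

```python
import numpy as np

def overlap_degree(matched_pts_b, img_w, img_h):
    """Rough overlap estimate: the axis-aligned bounding box of the feature
    points in the second image that also appear in the first image, as a
    fraction of the second image's area (a hypothetical proxy)."""
    if len(matched_pts_b) == 0:
        return 0.0
    pts = np.asarray(matched_pts_b, dtype=float)
    w = pts[:, 0].max() - pts[:, 0].min()
    h = pts[:, 1].max() - pts[:, 1].min()
    return (w * h) / (img_w * img_h)

def should_send(second_image_pts, img_w, img_h, first_threshold=0.5):
    """Terminal-side gate: forward the image pair to the server only when
    the overlap degree reaches the first threshold."""
    return overlap_degree(second_image_pts, img_w, img_h) >= first_threshold
```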
  • the number of specific markers in the target panoramic image sequence is greater than or equal to a second threshold, and the specific markers are markers set in the target scene for image size calibration.
  • the target panoramic image sequence further includes a third panoramic image shot with a specific marker, the specific marker being a marker set in the target scene for image size calibration, and the terminal also includes:
  • the determining module 603 is configured to determine the position range of the specific marker in the third panoramic image
  • the sending module 602 is further configured to send the position range to the server, where the position range is used to determine the position of the specific marker in the third panoramic image.
  • the error of the camera pose of the target panoramic image sequence is less than or equal to a third threshold; the camera pose of the target panoramic image sequence is determined according to the same-named image points of the target panoramic image sequence,
  • where a same-named image point is an image point of an image pair whose degree of overlap meets a first preset condition in the target panoramic image sequence.
  • the same-named image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the camera-pose error of the target panoramic image sequence is the spherical distance between the point formed by projecting an object point of the target scene onto a three-dimensional spherical surface according to the camera pose of the target panoramic image sequence, and the point formed by converting the corresponding image point of the target panoramic image sequence onto the three-dimensional spherical surface.
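This spherical-distance error can be written down directly. The sketch below assumes the pose is parameterized by a rotation matrix and a camera center (one common convention, not mandated by the patent) and returns the great-circle distance in radians between the reprojected and observed directions.

```python
import numpy as np

def spherical_reprojection_error(object_point, cam_center, R, observed_dir):
    """Camera-pose error on the sphere: great-circle distance between
    (1) the unit direction of a scene point projected through the estimated
    pose (R, cam_center) and (2) the unit direction of the observed image
    point converted onto the three-dimensional sphere."""
    d = R @ (np.asarray(object_point, float) - np.asarray(cam_center, float))
    d = d / np.linalg.norm(d)
    cos_angle = np.clip(np.dot(d, observed_dir), -1.0, 1.0)
    return float(np.arccos(cos_angle))  # radians = arc length on the unit sphere
```

On a unit sphere the angle and the arc length coincide, which is why an angular residual is a natural error metric for panoramic SFM.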
  • the sending module 602 is further configured to send the target panoramic image sequence when it also satisfies a second preset condition;
  • the second preset condition includes at least one of the following:
  • the blur degree of the panoramic image satisfies a third preset condition
  • the exposure of the panoramic image satisfies a fourth preset condition
  • the proportion of the invalid area of the panoramic image is less than or equal to a fifth threshold, where the invalid area includes areas outside the captured target scene and includes at least one of the following: a pedestrian area, a road-vehicle area, and a sky area.
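Minimal, assumed-threshold versions of the blur and exposure checks. The patent does not specify the metrics, so a Laplacian-variance blur score and a mean-intensity exposure gate are used here as common stand-ins; the numeric bounds are illustrative tuning knobs.

```python
import numpy as np

def blur_score(gray):
    """Variance of a 4-neighbour discrete Laplacian on a grayscale image;
    a low score suggests few sharp edges, i.e. a blurred frame."""
    lap = (-4.0 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def exposure_ok(gray, low=30.0, high=225.0):
    """Mean-intensity gate rejecting frames that look under- or
    over-exposed; the bounds are illustrative, not from the patent."""
    return low <= float(gray.mean()) <= high
```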
  • FIG. 7 is a schematic diagram of an embodiment of a server in an embodiment of this application.
  • the receiving module 701 is configured to receive a panoramic image sequence sent by a terminal, and the panoramic image sequence includes a plurality of panoramic images sequentially shot of a target scene in different poses;
  • the determining module 702 is configured to determine the camera pose of the panoramic image sequence according to the same-named image points of the panoramic image sequence, so as to realize the three-dimensional reconstruction of the target scene, where the error of the camera pose of the panoramic image sequence is less than or equal to the first threshold.
  • the same-named image points are obtained by projecting the image pair onto a three-dimensional spherical surface and applying the grid-based motion statistics (GMS) method.
  • the server further includes:
  • the detection module 703 is configured to detect an invalid area in the panoramic image, where the invalid area includes areas outside the captured target scene and includes at least one of the following: a pedestrian area, a road-vehicle area, and a sky area
  • the image point with the same name is the image point outside the invalid area in the panoramic image.
  • the panoramic image sequence includes a panoramic image shot with a specific marker, and the specific marker is a marker set in the target scene for image size calibration;
  • the determining module 702 is further configured to determine the camera pose of the panoramic image according to the position of the image point of the same name and the specific marker, and the camera pose is used to realize the three-dimensional reconstruction of the target scene.
  • the receiving module 701 is further configured to receive a position range of a specific marker of the panoramic image from the terminal;
  • the determining module 702 is further configured to determine the position of the specific marker from the range of the position of the specific marker.
  • the receiving module 701 is further configured to receive a first camera pose of the panoramic image sequence from the terminal, where the first camera pose is the camera pose of the panoramic image sequence determined by the terminal
  • the determining module 702 is further configured to determine a second camera pose of the panoramic image sequence according to the first camera pose, and the accuracy of the second camera pose is higher than that of the first camera pose.
  • the panoramic image sequence satisfies a first preset condition
  • the first preset condition includes at least one of the following:
  • the degree of overlap of the second panoramic image in the panoramic image sequence is greater than or equal to a preset second threshold, where the second panoramic image is continuously captured after the first panoramic image, and the degree of overlap of the second panoramic image is the ratio of the overlapping area of the second panoramic image and the first panoramic image to the second panoramic image; and,
  • the number of specific markers in the panoramic image sequence is greater than or equal to a preset threshold for the number of specific marker identifications
  • the blur degree of the panoramic image satisfies the second preset condition
  • the exposure of the panoramic image satisfies the third preset condition.
  • the error of the camera pose of the panoramic image sequence is less than or equal to the third threshold, and the camera pose of the panoramic image sequence is determined according to the same-named image points of the panoramic image sequence, where the same-named image points are the image points of image pairs whose degree of overlap meets the preset condition in the target panoramic image sequence.
  • FIG. 8 is a schematic diagram of another embodiment of a terminal in an embodiment of this application.
  • FIG. 8 shows a block diagram of a part of the structure of a terminal provided in an embodiment of the present application.
  • the terminal includes: an image acquisition unit 1710, a sensor 1730, a display unit 1740, an input unit 1750, a memory 1760, a processor 1770, and a power supply 1780.
  • the terminal structure shown in FIG. 8 does not constitute a limitation on the terminal, and may include more or fewer components than shown in the figure, or a combination of certain components, or different component arrangements.
  • the image acquisition unit 1710 is used to acquire a panoramic image. In the embodiment of the present application, it is used to acquire an image of a target scene.
  • the image acquisition unit 1710 can acquire a panoramic image through a panoramic camera.
  • the panoramic camera is provided with at least two fisheye lenses, multiple wide-angle lenses, or multiple ordinary lenses; the details are not limited here.
  • the images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited.
  • the image acquisition unit can be connected to the panoramic camera via a wired or wireless connection, including USB Type-C, Bluetooth, or Wi-Fi, etc.
  • the specific connection form is not limited here.
  • the display unit 1740 may be used to display information input by the user or provided to the user, as well as the various menus of the terminal, including displaying the collected panoramic images.
  • the display unit 1740 may include a display panel 1741.
  • the display panel 1741 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 1751 can cover the display panel 1741. When the touch panel 1751 detects a touch operation on or near it, it transmits the operation to the processor 1770 to determine the type of touch event, and the processor 1770 then provides the corresponding visual output on the display panel 1741 according to that type.
  • the touch panel 1751 and the display panel 1741 are used as two independent components to realize the input and output functions of the terminal, but in some embodiments the touch panel 1751 and the display panel 1741 can be integrated to realize the input and output functions of the terminal.
  • the input unit 1750 can be used to receive input digital or character information, and generate key signal input related to user settings and function control of the terminal.
  • the input unit 1750 may include a touch panel 1751 and other input devices 1752.
  • the touch panel 1751, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 1751 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program.
  • the touch panel 1751 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 1770, and can receive and execute commands sent by the processor 1770.
  • the touch panel 1751 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 1750 may also include other input devices 1752.
  • the other input device 1752 may include, but is not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the memory 1760 may be used to store software programs and modules.
  • the processor 1770 executes various functional applications and data processing of the terminal by running the software programs and modules stored in the memory 1760.
  • the memory 1760 may mainly include a program storage area and a data storage area.
  • the program storage area may store an operating system and the application programs required by at least one function (such as a sound-playback function or an image-playback function); the data storage area may store data created through use of the terminal (such as audio data or a phone book).
  • the memory 1760 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic-disk storage device, a flash-memory device, or another non-volatile solid-state storage device.
  • the processor 1770 is the control center of the terminal. It connects the various parts of the entire terminal through various interfaces and lines, and executes the terminal's functions and processes data by running or executing software programs and/or modules stored in the memory 1760 and calling data stored in the memory 1760, thereby monitoring the terminal as a whole.
  • the processor 1770 may include one or more processing units; preferably, the processor 1770 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and so on, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 1770.
  • the terminal also includes a power supply 1780 (such as a battery) for supplying power to the various components.
  • the power supply may be logically connected to the processor 1770 through a power management system, so that functions such as charging, discharging, and power management are realized through the power management system.
  • the terminal may include an audio circuit, which includes a speaker and a microphone and may provide an audio interface between the user and the terminal.
  • the terminal can include a wireless fidelity (WiFi) module.
  • WiFi is a short-distance wireless transmission technology.
  • through the WiFi module, the terminal can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband Internet access.
  • the terminal may further include a radio frequency (RF) circuit.
  • the terminal may also include a Bluetooth module, etc., which will not be repeated here.
  • the terminal may also include a GPS module.
  • the terminal may also include at least one sensor, such as a magnetometer, an inertial-measurement-unit gyroscope, a barometer, a hygrometer, a thermometer, or an infrared sensor, which will not be repeated here.
  • the processor 1770 included in the terminal also has the function of implementing the foregoing data processing methods.
  • FIG. 9 is a schematic diagram of another embodiment of the server in the embodiment of this application.
  • the server provided in this embodiment may be an independent computer device or a virtual machine VM.
  • the virtual machine may run on one computer device or be located on multiple computer devices.
  • alternatively, the virtual machine may not depend on an independent computer device, but instead be computing and transmission resources divided from a resource pool across different processors on one computer device or on multiple computer devices; this is not specifically limited here.
  • the server 1800 may have relatively large differences due to different configurations or performances, and may include one or more processors 1801 and a memory 1802, and the memory 1802 stores programs or data.
  • the memory 1802 may be volatile storage or non-volatile storage.
  • the processor 1801 is one or more central processing units (CPU), and the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 1801 may communicate with the memory 1802, and execute a series of instructions in the memory 1802 on the server 1800.
  • the server 1800 also includes one or more wired or wireless network interfaces 1803, such as an Ethernet interface.
  • the server 1800 may also include one or more power supplies, and one or more input/output interfaces that can be used to connect a display, a mouse, a keyboard, a touch-screen device, a sensor device, etc.
  • the input/output interface is an optional component, which may or may not exist, and is not limited here.
  • FIG. 10 is a schematic diagram of an embodiment of a data processing device in an embodiment of the application.
  • the panoramic camera is provided with at least two fisheye lenses, or multiple wide-angle lenses, or multiple ordinary lenses, which are not specifically limited here.
  • the images collected by each lens can be converted into panoramic images through stitching technology, and the specific stitching method is not limited.
  • Terminals include mobile phones, tablet computers, personal digital assistants (PDAs), and the like, and are not specifically limited here.
  • the terminal hardware module includes: processor, memory and communication interface.
  • the communication interface is used to realize a communication connection with the panoramic camera, covering various forms of wired or wireless connection, including USB Type-C, Bluetooth, or Wi-Fi, etc.
  • Memory: a storage medium used to store data and code.
  • Processor: a processor used to execute code, such as an ARM processor.
  • the terminal software module includes: a camera control module and a camera pre-detection module.
  • Camera control module: used to control the panoramic camera to take pictures and to transfer images from the panoramic camera to the mobile phone.
  • Camera pre-detection algorithm and software: the algorithm and software proposed in this application, used to perform quality detection on the captured images and determine whether an image meets the requirements for image pose calculation.
  • the server may be an independent computer device or a virtual machine VM.
  • the virtual machine may run on one computer device or be located on multiple computer devices.
  • alternatively, the virtual machine may not depend on an independent computer device, but instead be computing and transmission resources divided from a resource pool across different processors on one computer device or on multiple computer devices; this is not specifically limited here.
  • the server hardware module includes: processor, memory and communication interface.
  • the memory may be volatile storage or non-volatile storage, and programs or data are stored in the memory.
  • the software running on the server includes an optimized image-pose calculation module, a database, and modeling software. Optimized image-pose calculation module: the improved algorithm proposed in this application, used to improve the image-pose calculation and the accuracy of the pose solution.
  • Database: used to store images, image pose parameters, survey data, modeling data, and the like.
  • Modeling software: uses the images and their pose information in the database to model the scene for subsequent simulations.
  • the server can also include binocular ranging software, which uses the calculated image poses to measure survey information such as target location and size.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical functional division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium.
  • a computer device which may be a personal computer, a server, or a network device, etc.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

Embodiments of the present application relate to a data processing method used to perform three-dimensional reconstruction of a target scene from panoramic images. The method in the embodiments of the present application comprises the following steps: a terminal obtains a panoramic image sequence, the panoramic image sequence comprising a plurality of panoramic images obtained by photographing a target scene in different poses, and the plurality of panoramic images comprising a first panoramic image and a second panoramic image photographed in succession; when the degree of overlap of the second panoramic image is greater than or equal to a first threshold, the terminal sends to a server a target panoramic image sequence comprising the first panoramic image and the second panoramic image, the degree of overlap of the second panoramic image being the proportion of the overlapping area between the second panoramic image and the first panoramic image within the second panoramic image, the target panoramic image sequence being part of the plurality of panoramic images, and the target panoramic image sequence being used for three-dimensional reconstruction of the target scene.
PCT/CN2020/141440 2019-12-31 2020-12-30 Data processing method, terminal, and server WO2021136386A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911417063.6 2019-12-31
CN201911417063.6A CN113132717A (zh) Data processing method, terminal, and server

Publications (1)

Publication Number Publication Date
WO2021136386A1 true WO2021136386A1 (fr) 2021-07-08

Family

ID=76687125

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141440 WO2021136386A1 (fr) 2019-12-31 2020-12-30 Data processing method, terminal, and server

Country Status (2)

Country Link
CN (1) CN113132717A (fr)
WO (1) WO2021136386A1 (fr)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113724131A (zh) * 2021-09-02 2021-11-30 北京有竹居网络技术有限公司 Information processing method, apparatus, and electronic device
CN116258812A (zh) * 2021-12-10 2023-06-13 杭州海康威视数字技术股份有限公司 Object model building method and apparatus, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030142882A1 (en) * 2002-01-28 2003-07-31 Gabriel Beged-Dov Alignment of images for stitching
CN104539890A (zh) * 2014-12-18 2015-04-22 苏州阔地网络科技有限公司 Target tracking method and system
CN105427369A (zh) * 2015-11-25 2016-03-23 努比亚技术有限公司 Mobile terminal and method for generating a three-dimensional image thereof
CN105809664A (zh) * 2014-12-31 2016-07-27 北京三星通信技术研究有限公司 Method and device for generating three-dimensional images
CN106157241A (zh) * 2015-04-22 2016-11-23 无锡天脉聚源传媒科技有限公司 Panoramic image stitching method and device
CN106331685A (zh) * 2016-11-03 2017-01-11 Tcl集团股份有限公司 3D panoramic image acquisition method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872113B (zh) * 2010-06-07 2014-03-19 中兴通讯股份有限公司 Panoramic photo shooting method and device
US20130215239A1 (en) * 2012-02-21 2013-08-22 Sen Wang 3d scene model from video
CN106657910B (zh) * 2016-12-22 2018-10-09 国网浙江省电力公司杭州供电公司 Panoramic video monitoring method for an electric power substation
CN107578373A (zh) * 2017-05-27 2018-01-12 深圳先进技术研究院 Panoramic image stitching method, terminal device, and computer-readable storage medium
CN109995985B (zh) * 2017-12-29 2021-06-29 深圳市优必选科技有限公司 Robot-based panoramic image shooting method and apparatus, and robot
CN110321048B (zh) * 2018-03-30 2022-11-01 阿里巴巴集团控股有限公司 Three-dimensional panoramic scene information processing and interaction method and apparatus


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113592960A (zh) * 2021-08-18 2021-11-02 易思维(杭州)科技有限公司 Method for screening images containing specific features from multiple images
CN113592960B (zh) * 2021-08-18 2024-03-01 易思维(杭州)科技股份有限公司 Method for screening images containing specific features from multiple images
CN113989450A (zh) * 2021-10-27 2022-01-28 北京百度网讯科技有限公司 Image processing method and apparatus, electronic device, and medium
CN113989450B (zh) * 2021-10-27 2023-09-26 北京百度网讯科技有限公司 Image processing method and apparatus, electronic device, and medium
CN116843824A (zh) * 2023-03-17 2023-10-03 瞰景科技发展(上海)有限公司 Real-time three-dimensional model reconstruction method, apparatus, and system
CN117745216A (zh) * 2023-12-18 2024-03-22 江苏省测绘研究所 Dynamic tracking method for natural-resource element guarantee progress based on node time-series panoramas

Also Published As

Publication number Publication date
CN113132717A (zh) 2021-07-16

Similar Documents

Publication Publication Date Title
WO2021136386A1 (fr) Data processing method, terminal, and server
WO2020259248A1 (fr) Pose determination method and device based on depth information, and medium and electronic apparatus
WO2021115071A1 (fr) Three-dimensional reconstruction method and apparatus for monocular endoscope image, and terminal device
CN110427917B (zh) Method and apparatus for detecting key points
CN109978755B (zh) Panoramic image synthesis method, apparatus, device, and storage medium
US7733404B2 (en) Fast imaging system calibration
US10915998B2 (en) Image processing method and device
JP2020509506A (ja) Method, apparatus, device, and storage medium for determining camera pose information
US10645364B2 (en) Dynamic calibration of multi-camera systems using multiple multi-view image frames
WO2022095596A1 (fr) Image alignment method, image alignment apparatus, and terminal device
CN111127524A (zh) Trajectory tracking and three-dimensional reconstruction method, system, and apparatus
CN111307039A (zh) Object length recognition method and apparatus, terminal device, and storage medium
WO2022160857A1 (fr) Image processing method and apparatus, computer-readable storage medium, and electronic device
JP2016212784A (ja) Image processing apparatus and image processing method
JPWO2016208404A1 (ja) Information processing apparatus and method, and program
CN110120012B (zh) Video stitching method based on synchronized key-frame extraction with binocular cameras
CN113838151B (zh) Camera calibration method, apparatus, device, and medium
CN110163914B (zh) Vision-based positioning
WO2022247126A1 (fr) Visual localization method and apparatus, and device, medium, and program
CN112073640B (zh) Method, apparatus, and system for acquiring pose for panoramic information collection
US9135715B1 (en) Local feature cameras for structure from motion (SFM) problems with generalized cameras
JP2006113832A (ja) Stereo image processing apparatus and program
JP2015032256A (ja) Image processing apparatus and database construction apparatus therefor
JP2014038566A (ja) Image processing apparatus
CN113298871B (зh) Map generation method, positioning method and system thereof, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20910583

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20910583

Country of ref document: EP

Kind code of ref document: A1