WO2020259360A1 - Positioning method and apparatus, terminal, and storage medium - Google Patents

Positioning method and apparatus, terminal, and storage medium

Info

Publication number
WO2020259360A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
key frame
sample
processed
Prior art date
Application number
PCT/CN2020/096488
Other languages
English (en)
French (fr)
Inventor
金珂
杨宇尘
陈岩
方攀
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2020259360A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Definitions

  • This application relates to indoor positioning technology, and in particular, though not exclusively, to a positioning method and apparatus, a terminal, and a storage medium.
  • In the related art, the position of the user operating a camera is determined by identifying the person and the background in the image collected by the camera: the background is matched against a pre-determined indoor map of the building to determine the corresponding indoor position of the background, and the user's indoor position is then confirmed from the background position, realizing indoor positioning of the user. Positioning the user in this way, based on the person, the screen background, or fixed objects in the image, has poor positioning robustness.
  • The embodiments of the present application provide a positioning method and apparatus, a terminal, and a storage medium, in order to solve at least one problem existing in the related art.
  • The embodiment of the present application provides a positioning method. The method includes: extracting a first image feature of an image to be processed; matching, according to the first image feature, a second image feature from the image features of key frame images stored in a preset map; and determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed.
  • the first image feature of the image to be processed includes: identification information and two-dimensional (2-Dimensional, 2D) position information of the feature point of the image to be processed;
  • the second image feature includes: two-dimensional location information, three-dimensional (3-Dimensional, 3D) location information, and identification information of feature points of the key frame image.
  • The three-dimensional location information of the feature points of the key frame image is obtained by mapping the two-dimensional location information of the feature points of the key frame image into the three-dimensional coordinate system where the preset map is located.
  • The extracting the first image feature of the image to be processed includes: extracting a feature point set of the image to be processed; and determining the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
  • The matching the second image feature from the image features of the key frame images stored in the preset map according to the first image feature includes: respectively determining the ratios of different sample feature points in the feature point set to obtain a first ratio vector; obtaining a second ratio vector, where the second ratio vector is the ratio of the multiple sample feature points among the feature points included in the key frame image; and matching a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
  • the matching the second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector includes:
  • according to the first ratio vector and the second ratio vector, determining, from the image features of the key frame image, similar image features whose similarity with the first image feature is greater than a second threshold; determining the similar key frame images to which the similar image features belong to obtain a set of similar key frame images; and selecting, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  • the selecting a second image feature whose similarity with the first image feature meets a preset similarity threshold from the image features of the similar key frame images includes:
  • determining the time difference between the acquisition times of at least two similar key frame images, and the differences in similarity between the image features of the at least two similar key frame images and the first image feature; combining similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold to obtain a joint frame image; and selecting, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets the preset similarity threshold.
  • the selecting a second image feature whose similarity with the first image feature meets a preset similarity threshold from the image features of the joint frame image includes:
  • determining the sum of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature; determining the joint frame image with the largest sum as the target joint frame image with the highest similarity to the image to be processed; and selecting, according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets the preset similarity threshold from the image features of the target joint frame image.
  • Before the determining the location information of the image capture device used to capture the image to be processed according to the first image feature and the second image feature, the method further includes: determining the image containing the second image feature as a matching frame image of the image to be processed; and determining target Euclidean distances, between any two feature points contained in the matching frame image, that are less than a first threshold, to obtain a target Euclidean distance set.
  • The determining the location information of the image capture device used to capture the image to be processed according to the first image feature and the second image feature includes: if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determining the location information of the image acquisition device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
  • Before the collecting of the image in the current scene, the method further includes: selecting key frame images meeting preset conditions from a sample image library to obtain a key frame image set; extracting the image features of each key frame image to obtain a key image feature set; extracting feature points of the sample images to obtain a sample feature point set containing different feature points; determining the ratio of each sample feature point in the key frame images to obtain a ratio vector set; and storing the ratio vector set and the key image feature set to obtain the preset map.
  • the method before selecting key frame images meeting preset conditions from the sample image library to obtain the key frame image set, the method further includes:
  • selecting a preset number of corner points from the sample image, where the corner points are pixels in the sample image that differ from a preset number of pixels in a preset surrounding area; if the number of identical corner points contained in two sample images with adjacent acquisition times is greater than or equal to a sixth threshold, determining that the scene corresponding to the sample images is a continuous scene; and if the number of identical corner points contained in two sample images with adjacent acquisition times is less than the sixth threshold, determining that the scene corresponding to the sample images is a discrete scene.
  • If the scene corresponding to the sample images is a discrete scene, the key frame images are selected from the sample image library according to an input selection instruction; if the scene corresponding to the sample images is a continuous scene, the key frame images are selected from the sample image library according to a preset frame rate or disparity.
  • the determining the ratio of each sample feature point in the key frame image to obtain the ratio vector set includes:
  • determining a first average number according to a first number of sample images contained in the sample image library and a first number of times the i-th sample feature point appears in the sample image library, where the first average number is used to indicate the average number of times the i-th sample feature point appears in each sample image; determining a second average number according to a second number of times the i-th sample feature point appears in the j-th key frame image and a second quantity of sample feature points contained in the j-th key frame image, where the second average number is used to indicate the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; and obtaining the ratio of each sample feature point in the key frame image according to the first average number and the second average number, to obtain the ratio vector set.
  • An embodiment of the present application provides a positioning device, which includes: a first extraction module, a first matching module, and a first determination module, wherein:
  • the first extraction module is configured to extract the first image feature of the image to be processed
  • the first matching module is configured to match the second image feature from the image features of the key frame images stored in the preset map according to the first image feature;
  • the first determining module is configured to determine location information of an image capture device used to capture the image to be processed based on the first image feature and the second image feature.
  • the first image feature of the image to be processed includes: identification information and two-dimensional position information of the feature point of the image to be processed;
  • the second image feature includes: two-dimensional location information, three-dimensional location information, and identification information of the feature points of the key frame image.
  • the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image in the three-dimensional coordinate system where the preset map is located.
  • the first extraction module includes:
  • the first extraction sub-module is configured to extract the feature point set of the image to be processed
  • the first determining submodule is configured to determine the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
  • the first matching module includes:
  • the second determining submodule is configured to respectively determine the ratios of different sample feature points in the feature point set to obtain a first ratio vector
  • the first obtaining submodule is configured to obtain a second ratio vector, where the second ratio vector is the ratio of the plurality of sample feature points among the feature points contained in the key frame image;
  • the first matching sub-module is configured to match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector and the second ratio vector.
  • the first matching submodule includes:
  • the first determining unit is configured to determine, according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold from the image features of the key frame image;
  • the second determining unit is configured to determine similar key frame images to which the similar image features belong to obtain a set of similar key frame images
  • the first selection unit is configured to select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  • the first selection unit includes:
  • the first determining subunit is configured to determine the time difference between the acquisition times of at least two similar key frame images, and the respective differences in similarity between the image features of the at least two similar key frame images and the first image feature;
  • the first joint subunit is configured to combine similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold to obtain a joint frame image;
  • the first selection subunit is configured to select, from the image features of the joint frame image, a second image feature whose similarity to the first image feature meets a preset similarity threshold.
  • the first selection subunit is further configured to determine the sum of the similarity between the image feature of each key frame image contained in the multiple joint frame images and the first image feature;
  • the joint frame image with the largest sum is determined as the target joint frame image with the highest similarity to the image to be processed; according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, From the image features of the target joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold is selected.
  • the device further includes:
  • a second determining module configured to determine the image containing the second image feature as a matching frame image of the image to be processed
  • the third determining module is configured to determine target Euclidean distances, each being a Euclidean distance between two feature points contained in the matching frame image that is less than a first threshold, to obtain a target Euclidean distance set.
  • the first determining module includes:
  • the third determining submodule is configured to, if the number of target Euclidean distances included in the target Euclidean distance set is greater than a fifth threshold, based on the three-dimensional position information of the feature point of the key frame image corresponding to the second image feature and the The two-dimensional location information of the feature points of the image to be processed corresponding to the first image feature determines the location information of the image acquisition device.
  • the device further includes:
  • the first selection module is configured to select key frame images meeting preset conditions from the sample image library to obtain a set of key frame images
  • the second extraction module is configured to extract the image features of each key frame image to obtain a key image feature set
  • the third extraction module is configured to extract feature points of the sample image to obtain a sample feature point set containing different feature points
  • the fourth determining module is configured to determine the ratio of each sample feature point in the key frame image to obtain a set of ratio vectors
  • the first storage module is configured to store the ratio vector set and the key image feature set to obtain the preset map.
  • the device further includes:
  • the second selection module is configured to select a preset number of corner points from the sample image; wherein the corner points are pixels in the sample image that are different from the preset number of pixels in the preset area;
  • a fifth determining module configured to determine that the scene corresponding to the sample image is a continuous scene if the number of the same corner points contained in two sample images adjacent to the acquisition time is greater than or equal to a sixth threshold;
  • the sixth determining module is configured to determine that the scene corresponding to the sample image is a discrete scene if the number of identical corner points contained in two sample images adjacent to the acquisition time is less than a sixth threshold.
  • the first selection module includes:
  • the first selection submodule is configured to, if the scene corresponding to the sample image is a discrete scene, select the key frame image from the sample image library according to the input selection instruction;
  • the second selection sub-module is configured to select a key frame image from the sample image library according to a preset frame rate or disparity if the scene corresponding to the sample image is a continuous scene.
  • the fourth determining module includes:
  • the fourth determining sub-module is configured to determine the first average number of times according to the first number of sample images contained in the sample image library and the first number of times the i-th sample feature point appears in the sample image library; wherein, the The first average number is used to indicate the average number of times the i-th sample feature point appears in each sample image;
  • the fifth determining submodule is configured to, based on the second number of times the i-th sample feature point appears in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image, Determine the second average number; wherein, the second average number is used to indicate the ratio of the i-th sample feature points to the sample feature points contained in the j-th key frame image;
  • the sixth determining submodule is configured to obtain the ratio of the sample feature points in the key frame image according to the first average number and the second average number, and obtain the ratio vector set.
  • An embodiment of the present application provides a terminal including a memory and a processor; the memory stores a computer program that can run on the processor, and the processor implements the steps of the positioning method when executing the program.
  • the embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the above positioning method are implemented.
  • The embodiments of the present application provide a positioning method and device, a terminal, and a storage medium. First, the first image feature of the image to be processed is extracted; then, according to the first image feature, a second image feature is matched from the image features of the key frame images stored in a preset map; finally, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed is determined. In this way, for any image to be processed, by matching its image features with the image features of the key frame images in the preset map, the matching frame image in the preset map can be obtained, thereby realizing the positioning of the image acquisition device without relying on fixed objects in the image.
  • FIG. 1 is a schematic diagram of the implementation process of a positioning method according to an embodiment of the application;
  • FIG. 2A is a schematic diagram of the implementation process of the positioning method according to an embodiment of the application;
  • FIG. 2B is a schematic diagram of the implementation process of creating a preset map according to an embodiment of the application;
  • FIG. 2C is a schematic diagram of another implementation process of the positioning method according to the embodiment of this application;
  • FIG. 3 is a schematic diagram of another implementation process of the positioning method according to the embodiment of the application;
  • FIG. 4 is a schematic diagram of the structure of a ratio vector in an embodiment of the application;
  • FIG. 5A is an application scene diagram for determining a matching frame image according to an embodiment of the application;
  • FIG. 5B is a schematic structural diagram of determining location information of a collection device according to an embodiment of the application;
  • FIG. 6 is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application.
  • FIG. 1 is a schematic diagram of the implementation process of the positioning method according to the embodiment of the application. As shown in FIG. 1, the method includes the following steps:
  • Step S101 Extract the first image feature of the image to be processed.
  • the first image feature includes: identification information and 2D position information of the feature point of the image to be processed.
  • In step S101, first, the feature points of the image to be processed are extracted; then, the identification information of the feature points and the 2D position information of the feature points in the image to be processed are determined.
  • the identification information can be understood as the descriptor information that can uniquely identify the feature point.
  • Step S102 According to the first image feature, the second image feature is matched from the image feature of the key frame image stored in the preset map.
  • the second image feature includes: 2D location information, 3D location information, and identification information of the feature points of the key frame image.
  • the step S102 can be understood as selecting a second image feature with a higher degree of matching with the first image feature from the image features of the key frame image stored in the preset map.
  • Step S103 According to the first image feature and the second image feature, determine location information of an image capture device used to capture the image to be processed.
  • the location information of the image acquisition device is determined based on the 3D location information of the feature points of the key frame image corresponding to the second image feature and the 2D location information of the feature points of the image to be processed corresponding to the first image feature .
  • Specifically, the 2D position information of the feature points of the image to be processed is converted into 3D position information, and the 3D position information is then compared with the 3D position information of the feature points of the key frame image in the three-dimensional coordinate system of the preset map, to determine the position information of the image acquisition device.
  • When the image acquisition device is positioned, both the 2D position information and the 3D position information of the image acquisition device can be obtained; that is, both the planar spatial position and the stereo spatial position of the image acquisition device can be obtained.
  • In the embodiment of the present application, the image feature is first extracted; then, the second image feature matching that image feature is selected from the image features of the key frame images in the preset map; finally, the position information of the feature points of the two image features is used to realize the positioning of the image acquisition device. The method thus does not need to rely on fixed objects in the image, nor on the network, thereby improving positioning accuracy and positioning robustness.
  • FIG. 2A is a schematic diagram of the implementation process of the positioning method according to the embodiment of the present application. As shown in FIG. 2A, the method includes the following steps:
  • Step S201 Extract the feature point set of the image to be processed.
  • the feature points of the image to be processed are extracted to obtain a feature point set.
  • Step S202 Determine the identification information of each feature point in the feature point set and the 2D position information of each feature point in the image to be processed.
  • the descriptor information (identification information) of the feature point is determined, and the 2D position information can be considered as the 2D coordinates of the feature point.
  • Steps S201 and S202 give a way to achieve "extract the first image feature of the image to be processed"; in this way, the 2D coordinates of each feature point of the image to be processed and the descriptor information of each feature point are obtained. A minimal sketch follows.
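  • As a minimal sketch of steps S201 and S202, assuming an OpenCV-based pipeline (the patent does not name a specific detector; ORB is used here as a stand-in, with its binary descriptors playing the role of the identification information):

```python
import cv2

def extract_first_image_feature(image_path: str):
    """Sketch of steps S201-S202: feature point set plus per-point 2D position
    and identification (descriptor) information. ORB is an assumption."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # 150 feature points matches the empirical value quoted later in the text.
    orb = cv2.ORB_create(nfeatures=150)
    keypoints, descriptors = orb.detectAndCompute(img, None)
    positions_2d = [kp.pt for kp in keypoints]  # 2D position information
    return positions_2d, descriptors            # descriptors: identification information
```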
  • Step S203 Determine the ratios of different sample feature points in the feature point set, respectively, to obtain a first ratio vector.
  • the preset bag-of-words model includes a plurality of different sample feature points and the ratio of the plurality of sample feature points to the feature points contained in the key frame image.
  • the first ratio vector may be determined according to the number of sample images, the number of times the sample feature points appear in the sample images, the number of times the sample feature points appear in the image to be processed, and the total number of sample feature points that appear in the image to be processed.
  • Step S204 Obtain a second ratio vector.
  • The second ratio vector is the ratio of the multiple sample feature points among the feature points contained in the key frame image; the second ratio vector is pre-stored in a preset bag-of-words model, so when the image features of the image to be processed need to be matched, the second ratio vector is obtained from the preset bag-of-words model.
  • the determination process of the second ratio vector is similar to the determination process of the first ratio vector, and both can be determined by using formula (1); and the dimensions of the first ratio vector and the second ratio vector are the same.
  • Step S205 According to the first image feature, the first ratio vector and the second ratio vector, a second image feature is matched from the image features of the key frame image.
  • step S205 can be implemented through the following process:
  • the first step according to the first ratio vector and the second ratio vector, from the image features of the key frame image, determine similar image features whose similarity with the first image feature is greater than a second threshold.
  • Here, the first ratio vector v1 of the image to be processed is compared with the second ratio vector v2 of each key frame image; the similarity between each key frame image and the image to be processed is calculated from these two vectors as shown in formula (2), and the similar key frame images whose similarity is greater than or equal to the second threshold are screened out to obtain a set of similar key frame images.
  • the second step is to determine the similar key frame images to which the similar image features belong to obtain a set of similar key frame images.
  • the third step is to select a second image feature whose similarity to the first image feature meets a preset similarity threshold from the image features of the similar key frame images.
  • For example, to select the second image feature with the highest similarity to the first image feature: first, determine the time difference between the acquisition times of at least two similar key frame images, and the differences in similarity between the image features of the at least two similar key frame images and the first image feature; then, combine similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold to obtain a joint frame image. That is, multiple similar key frame images that are close in acquisition time and close in similarity to the image to be processed are selected, indicating that these key frame images may be consecutive pictures. Such multiple similar key frame images are combined to form a joint frame image (which may also be called an island), yielding multiple joint frame images. Finally, from the image features of the joint frame images, the second image feature whose similarity with the first image feature meets the preset similarity threshold is selected.
  • Specifically, the sum of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature is determined separately, as shown in formula (3); in this way, the multiple joint frame images are scored one by one. Then, the joint frame image with the largest sum of similarities is determined as the target joint frame image with the highest similarity to the image to be processed. Finally, according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets the preset similarity threshold is selected from the image features of the target joint frame image.
  • Since the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed can uniquely identify those feature points, based on these two pieces of identification information the second image feature with the highest similarity to the first image feature can be selected very accurately from the image features of the target joint frame image. This ensures the accuracy of matching the first image feature of the image to be processed with the second image feature, and ensures that the selected second image feature is extremely similar to the first image feature.
  • The above steps S203 to S205 give a way to realize "according to the first image feature, the second image feature is matched from the image features of the key frame image stored in the preset map". In this way, a preset bag-of-words model is used to retrieve the second image feature matching the first image feature from the image features of the key frame image, which ensures the similarity between the second image feature and the first image feature.
  • Step S206 Determine the image containing the second image feature as a matching frame image of the image to be processed.
  • the key frame image containing the second image feature indicates that the key frame image is very similar to the image to be processed, so the key frame image is used as the matching frame image of the image to be processed.
  • Step S207 Determine the target Euclidean distances, between any two feature points included in the matching frame image, that are less than a first threshold, to obtain a target Euclidean distance set.
  • Here, the Euclidean distance between any two feature points contained in the matching frame image is first determined, and the Euclidean distances less than the first threshold are then selected as target Euclidean distances to obtain the target Euclidean distance set. Processing one feature point of the image to be processed yields one target Euclidean distance set; processing multiple feature points of the image to be processed yields multiple Euclidean distance sets.
  • Determining the target Euclidean distance less than the first threshold can also be regarded as first determining the minimum Euclidean distance among multiple Euclidean distances, and then judging whether the minimum Euclidean distance is less than the first threshold; if so, the minimum Euclidean distance is the target Euclidean distance, and the target Euclidean distance set is then the set with the smallest Euclidean distances among the multiple Euclidean distance sets. A sketch of this selection follows.
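  • A sketch of this minimum-distance screening, under the assumption that the identification information is a real-valued descriptor vector and distances are computed between the descriptors of the image to be processed and those of the matching frame image (the function and variable names are illustrative):

```python
import numpy as np

def build_target_euclidean_set(desc_query: np.ndarray,
                               desc_match: np.ndarray,
                               first_threshold: float):
    """For each feature point of the image to be processed, keep the smallest
    Euclidean distance to the matching frame's feature points only if it is
    below the first threshold (steps S206-S207, illustrative sketch)."""
    target_set = []
    for i, d in enumerate(desc_query.astype(np.float32)):
        dists = np.linalg.norm(desc_match.astype(np.float32) - d, axis=1)
        j = int(np.argmin(dists))          # the minimum-distance group
        if dists[j] < first_threshold:     # threshold judgment
            target_set.append((i, j, float(dists[j])))
    return target_set
```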
  • Step S208 If the number of target Euclidean distances included in the target Euclidean distance set is greater than a fifth threshold, determine the position information of the image acquisition device based on the 3D position information of the feature points of the key frame image corresponding to the second image feature and the 2D position information of the feature points of the image to be processed corresponding to the first image feature.
  • If the number of target Euclidean distances contained in the target Euclidean distance set is greater than the fifth threshold, the number of target Euclidean distances is large enough, meaning there are enough feature points matching the first image feature and the key frame image is sufficiently similar to the image to be processed.
  • Here, the 3D position information of the feature points of the key frame image and the 2D position information of the feature points of the image to be processed corresponding to the first image feature are used as the input of the Perspective-n-Point (PnP) pose-solving algorithm: from the 2D position information (for example, 2D coordinates) of the feature points in the current frame of the image to be processed, their 3D position information (for example, 3D coordinates) in the current coordinate system is first obtained, and the position information of the image acquisition device is then solved from the 3D position information of the feature points of the key frame image in the map coordinate system and the 3D position information of the feature points in the current frame of the image to be processed in the current coordinate system.
  • The above steps S206 to S208 give a way to realize "determine the location information of the image capture device used to capture the image to be processed according to the first image feature and the second image feature". In this way, the 2D and 3D position information of the key frame image are considered at the same time, and both position and attitude can be provided in the positioning result, so the positioning accuracy of the image acquisition device is improved.
  • In the embodiment of the present application, the image to be processed is obtained through the image acquisition device, the constructed preset map is loaded, and the preset bag-of-words model is used to retrieve the matching frame image corresponding to the image to be processed. The 2D position information of the feature points of the image to be processed and the 3D position information of the feature points of the key frame image are used as the input of the PnP algorithm to obtain the precise pose of the current camera in the map, achieving the positioning purpose. In this way, positioning is achieved through key frame images: the position and attitude of the image acquisition device in the map coordinate system are obtained, the accuracy of the positioning result is improved, and there is no need to rely on external base station equipment, giving low cost and strong robustness. A sketch of the pose-solving step is given below.
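  • A minimal sketch of the pose-solving step, assuming OpenCV's PnP solver and a known camera intrinsic matrix K (calibration details are not specified in the text): points_3d are the 3D positions of the matched key frame feature points in the map coordinate system, and points_2d are the 2D positions of the corresponding feature points in the image to be processed.

```python
import cv2
import numpy as np

def locate_camera(points_3d: np.ndarray, points_2d: np.ndarray, K: np.ndarray):
    """Solve the camera pose from 3D map points and their 2D projections."""
    ok, rvec, tvec = cv2.solvePnP(points_3d.astype(np.float64),
                                  points_2d.astype(np.float64),
                                  K, None)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation from map frame to camera frame
    position = -R.T @ tvec       # camera position in the map coordinate system
    return position.ravel(), R   # position and attitude
```

  • In practice a robust variant such as cv2.solvePnPRansac would typically be used to reject outlier matches before the final pose solve.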
  • FIG. 2B is a schematic diagram of the implementation process of creating a preset map according to the embodiment of the application. As shown in FIG. 2B, the method includes the following steps:
  • Step S221 From the sample image library, select key frame images meeting preset conditions to obtain a set of key frame images.
  • In the first step, a preset number of corner points are selected from the sample image; the corner points are pixels in the sample image that are significantly different from a preset number of surrounding pixels. For example, 150 corner points are selected.
  • In the second step, if the number of identical corner points contained in two sample images with adjacent acquisition times is greater than or equal to the sixth threshold, it is determined that the scene corresponding to the sample images is a continuous scene. Two sample images with adjacent acquisition times can also be understood as two consecutive sample images: the number of identical corner points contained in the two sample images is determined, and the larger this number, the higher the correlation between the two sample images and the more likely the images come from a continuous scene. Continuous scenes are, for example, a single indoor environment such as a bedroom, a living room, or a single meeting room.
  • In the third step, if the number of identical corner points contained in two sample images with adjacent acquisition times is less than the sixth threshold, it is determined that the scene corresponding to the sample images is a discrete scene.
  • If the scene corresponding to the sample images is a discrete scene, the key frame images are selected from the sample image library according to an input selection instruction; if the scene corresponding to the sample images is a continuous scene, the key frame images are selected from the sample image library according to a preset frame rate or disparity.
  • Step S222 Extract the image features of each key frame image to obtain a key image feature set.
  • the image feature of the key frame includes: 2D location information, 3D location information of the feature point of the key frame image, and identification information that can uniquely identify the feature point.
  • Step S223 Determine the ratio of each sample feature point in the key frame image to obtain a ratio vector set.
  • Here, the different sample feature points and the ratio vector set are stored in the preset bag-of-words model, so that the preset bag-of-words model can be used to retrieve the matching frame image of the image to be processed from the key frame images.
  • the step S223 can be implemented through the following process:
  • In the first step, the first average number is determined according to the first number of sample images contained in the sample image library and the first number of times the i-th sample feature point appears in the sample image library. The first average number is used to indicate the average frequency with which the i-th sample feature point appears in each sample image; for example, if the first number of sample images is N and the first number of times the i-th sample feature point appears in the sample image library is n_i, the first average number idf(i) can be obtained by formula (1).
  • In the second step, the second average number is determined according to the second number of times the i-th sample feature point appears in the j-th key frame image and the second quantity of sample feature points contained in the j-th key frame image. The second average number is used to indicate the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; from these two quantities, the second average number tf(i, I_t) can be obtained by formula (1).
  • Finally, according to the first average number and the second average number, the ratio of each sample feature point in the key frame image is obtained: multiplying the first average number by the second average number according to formula (1) gives the entries of the ratio vector, and thus the ratio vector set.
  • Step S224 Store the ratio vector set and the key image feature set to obtain the preset map.
  • Here, the ratio vector set corresponding to the key frame images and the key image feature set are stored in the preset map, so that when the image acquisition device is positioned, the stored ratio vector set can be compared, via the preset bag-of-words model, with the ratio vector set corresponding to the image to be processed, to determine from the key image feature set a matching frame image that is highly similar to the image to be processed.
  • FIG. 2C is a schematic diagram of another implementation process of the positioning method according to the embodiment of the application. As shown in FIG. 2C, the method includes the following steps:
  • Step S231 From the sample image library, select key frame images meeting preset conditions to obtain a set of key frame images.
  • Step S232 Extract the image features of each key frame image to obtain a key image feature set.
  • Step S233 Extract the feature points of the sample image to obtain a sample feature point set containing different feature points.
  • Step S234 Determine the ratio of each sample feature point in the key frame image to obtain a ratio vector set.
  • Step S235 Store the ratio vector set and the key image feature set to obtain the preset map.
  • Steps S231 to S235 complete the process of creating the preset map: the image features and ratio vector set of the key frame images are stored in the preset map, so that the image features of the key frame images can be searched, according to the ratio vector set, for a second image feature that matches the image feature of the image to be processed.
  • Step S236 Load a preset map, and extract the first image feature of the image to be processed.
  • Step S237 According to the first image feature, the second image feature is matched from the image features of the key frame image stored in the preset map.
  • Step S238 Determine location information of an image capture device configured to capture the image to be processed based on the first image feature and the second image feature.
  • the above steps S236 to S238 give the process of realizing the positioning of the image acquisition device.
  • In these steps, the second image feature that is highly similar to the first image feature is matched from the key frame images stored in the preset map, and the 2D location information and 3D location information of the two image features are then used to determine the location information of the collection device.
  • In the embodiment of the present application, the 2D and 3D position information of the key frame image are used at the same time to ensure the accuracy of the positioning result of the acquisition device; the positioning success rate is high, the robustness is strong, and no external base station equipment needs to be introduced during the positioning process, which reduces the cost.
  • FIG. 3 is a schematic diagram of another implementation process of the positioning method according to the embodiment of the application. As shown in FIG. 3, the method includes the following steps:
  • Step S301 According to the scene to which the sample image belongs, different key frame image selection methods are selected.
  • the scene to which the sample image belongs includes: discrete scene or continuous scene;
  • the key frame image selection methods include manual selection and automatic selection.
  • Manual selection requires the map creator to manually select the key frame images to be included in the map, while automatic selection automatically selects images as key frame images according to the frame rate or parallax.
  • 150 FAST feature corner points are extracted from each key frame image. The ratio of the same corner points in two consecutive key frame images is defined as the corner tracking rate.
  • In the embodiment of the present application, a scene with an ordered key frame image sequence and an average corner tracking rate greater than 30% is defined as a continuous scene; otherwise it is a discrete scene.
  • The key frame image selection method for continuous scenes uses automatic selection; the key frame image selection method for discrete scenes uses manual selection.
  • The continuous scene is suitable for a single indoor environment, such as a bedroom, a living room, or a single meeting room; a discrete scene is more suitable for multiple indoor environments, such as multiple rooms in a building or multiple meeting rooms on one floor.
  • The key frame image selection strategies of continuous and discrete scenes differ, as do their applicable scenes. In this way, for indoor discrete or continuous scenes, different key frame image selection methods are used to extract image features for map construction, so that the positioning process does not depend on external base station equipment, and has low cost, high positioning accuracy, and strong robustness. A sketch of the corner-tracking test is given below.
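  • A sketch of the corner-tracking-rate test, assuming FAST corners and a simple nearest-pixel criterion for "same corner" between consecutive images (the 150-corner count and 30% threshold come from the text; the pixel tolerance is an illustrative assumption):

```python
import cv2
import numpy as np

def corner_tracking_rate(img_a, img_b, n_corners=150, tol=2.0):
    """Fraction of FAST corners in img_a that reappear (within tol pixels)
    in img_b; matching 'same corner' by nearest pixel is an assumption."""
    fast = cv2.FastFeatureDetector_create()
    pts_a = np.array([kp.pt for kp in fast.detect(img_a, None)[:n_corners]])
    pts_b = np.array([kp.pt for kp in fast.detect(img_b, None)[:n_corners]])
    if len(pts_a) == 0 or len(pts_b) == 0:
        return 0.0
    same = sum(1 for p in pts_a
               if np.min(np.linalg.norm(pts_b - p, axis=1)) < tol)
    return same / len(pts_a)

def is_continuous_scene(rates):
    # Average corner tracking rate > 30% -> continuous scene; else discrete.
    return float(np.mean(rates)) > 0.30
```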
  • Step S302 Use the camera to collect key frame images.
  • the camera may be a monocular camera or a binocular camera.
  • In step S303, the image features in the key frame image are extracted in real time during the acquisition process.
  • image feature extraction is a process of interpretation and annotation of key frame images.
  • In step S303, it is necessary to extract the 2D position information, 3D position information, and identification information (i.e., the descriptor information) of the feature points of the key frame image; the 3D position information of the feature points of the key frame image is obtained by mapping the 2D position information of those feature points into the three-dimensional coordinate system where the preset map is located.
  • First, 150 feature points are extracted (150 is an empirical value: too few feature points give a high tracking failure rate, and too many affect the efficiency of the algorithm), and these are used for image tracking; the descriptor of each feature point is extracted for feature point matching. Then the 3D position information (i.e., depth information) of the feature points is calculated by the triangulation method, to be used in determining the location of the acquisition camera. A sketch of the triangulation step follows.
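  • A sketch of the triangulation step, assuming two calibrated views with known 3x4 projection matrices (the text names triangulation but not a specific implementation; OpenCV's routine stands in here):

```python
import cv2
import numpy as np

def triangulate_feature_points(P1, P2, pts1, pts2):
    """pts1, pts2: 2xN arrays of matched feature point pixel coordinates in
    two key frame images; P1, P2: their 3x4 projection matrices."""
    pts_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous points
    return (pts_h[:3] / pts_h[3]).T                    # Nx3 3D position information
```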
  • Step S304 Determine the ratio of each sample feature point in the key frame image in real time during the acquisition process to obtain a ratio vector.
  • Step S304 can be understood as follows: during the acquisition of the key frame images, the ratio vector of the current key frame image is extracted in real time.
  • The bag-of-words model is described in the form of a vocabulary tree.
  • The bag-of-words model includes a sample image database 41, which is the root node of the vocabulary tree, and sample images 42, 43, and 44, which are leaf nodes; sample feature points 1 to 3 are different sample feature points in sample image 42, sample feature points 4 to 6 are different sample feature points in sample image 43, and sample feature points 7 to 9 are different sample feature points in sample image 44.
  • Here, w is the total number of distinct sample feature points extracted from the sample images of the bag-of-words model, so there are w sample feature points in the model in total. Each sample feature point scores each key frame image, with a score value that is a floating-point number from 0 to 1, so each key frame image can be represented by a w-dimensional floating-point vector; this w-dimensional vector is the ratio vector output by the bag-of-words model. The scoring process is shown in formula (1), whose symbols are defined as follows:
  • N is the number of sample images (i.e., the first number); n_i is the number of times the sample feature point w_i appears in the sample image library (i.e., the first number of times); I_t is the image collected at time t; n_{iI_t} is the number of times the sample feature point w_i appears in the key frame image I_t collected at time t (i.e., the second number of times); and n_{I_t} is the total number of sample feature points appearing in the key frame image I_t (i.e., the second quantity).
  • In this way, the w-dimensional floating-point vector of each key frame image, that is, the ratio vector, is obtained; the ratio vector can also be used as the feature information of the preset bag-of-words model. A sketch of the scoring follows.
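  • Formula (1) is not reproduced legibly in this text; the sketch below assumes the standard TF-IDF weighting used by vocabulary-tree bag-of-words models (e.g., DBoW2), which is consistent with the symbol definitions above.

```python
import math

def ratio_vector_entry(N: int, n_i: int, n_i_It: int, n_It: int) -> float:
    """One entry of the w-dimensional ratio vector for key frame image I_t.
    N: number of sample images; n_i: occurrences of sample feature point w_i
    in the sample image library; n_i_It: occurrences of w_i in I_t;
    n_It: total sample feature points appearing in I_t."""
    if n_i == 0 or n_It == 0:
        return 0.0
    tf = n_i_It / n_It        # second average number
    idf = math.log(N / n_i)   # first average number (assumed log form)
    # In practice the full vector is normalized so entries lie in [0, 1].
    return tf * idf
```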
  • the above steps S301 to S304 construct an offline preset map that depends on the key frame image.
  • The preset map stores the image features of the key frame images in a binary format (including the 2D location information, 3D location information, and identification information, for example 2D coordinates, 3D coordinates, and descriptor information) on the local device; when the image acquisition device needs to be positioned, the preset map is loaded and used.
  • Step S305 Load the constructed preset map.
  • Step S306 Use the camera to perform image collection to obtain an image to be processed.
  • Step S307 in the process of acquiring the image to be processed, extract the first image feature in the current frame of the image to be processed in real time.
  • Extracting the first image feature in the current frame of the image to be processed in real time is similar to step S303, except that the 3D position information of the image to be processed does not need to be determined, because the subsequent PnP algorithm does not require the 3D position information of the image to be processed as input.
  • In step S308, the matching frame image of the current frame of the image to be processed in the preset map is retrieved through the bag-of-words model.
  • Retrieving the matching frame image of the current frame of the image to be processed in the preset map through the bag-of-words model can be understood as using the feature information of the bag-of-words model, that is, the ratio vector set, to retrieve the matching frame image of the current frame of the image to be processed in the preset map.
  • the step S308 can be implemented through the following process:
  • the first step is to find the similarity between the current frame of the image to be processed and each key frame image.
  • the calculation method of the similarity s(v 1 , v 2 ) is shown in formula (2).
  • Here, v1 and v2 respectively represent the first ratio vector of each sample feature point contained in the bag-of-words model in the current frame of the image to be processed, and the second ratio vector of each sample feature point in the key frame image. If the bag-of-words model contains w sample feature points, then the first ratio vector and the second ratio vector are both w-dimensional vectors (see the sketch after these steps).
  • By screening the key frame images, the similar key frame images whose similarity reaches the second threshold are selected to form a set of similar key frame images.
  • In the second step, similar key frame images whose time stamp difference is less than the third threshold and whose similarity difference is less than the fourth threshold are selected from the set of similar key frame images and joined together to obtain a joint frame image (also called an island).
  • The second step can be understood as selecting, within the set of similar key frame images, similar key frame images with close timestamps and close similarity scores and combining them into islands; in this way, the set of similar key frame images is divided into multiple joint frame images (i.e., multiple islands).
  • Generally, within a joint frame image, the ratio between the similarity of the first key frame image and that of the last key frame image is very small; this similarity ratio is shown in formula (3).
  • The third step is to respectively determine the sum of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature, as shown in formula (4).
  • The fourth step is to determine the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed, and to find, from the target joint frame image, the matching frame image with the highest similarity to the current frame of the image to be processed.
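  • Formula (2) is likewise garbled in this extraction; the sketch below assumes the L1 similarity score commonly used for such w-dimensional ratio vectors (e.g., in DBoW2), s(v1, v2) = 1 - 0.5 * || v1/|v1| - v2/|v2| ||_1.

```python
import numpy as np

def bow_similarity(v1: np.ndarray, v2: np.ndarray) -> float:
    """Similarity between the first and second ratio vectors; returns a value
    in [0, 1], higher meaning more similar (assumed L1 score form)."""
    a = v1 / np.linalg.norm(v1, ord=1)  # L1-normalize the w-dimensional vectors
    b = v2 / np.linalg.norm(v2, ord=1)
    return 1.0 - 0.5 * float(np.abs(a - b).sum())
```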
  • Step S309 Use the PnP algorithm to determine the location information of the current camera in the map coordinate system.
  • step S309 can be implemented through the following steps:
  • In the first step, the N-th feature point F_CN of the current frame X_C of the image to be processed is traversed against all feature points of the matching frame image X_3, and the Euclidean distances between F_CN and the feature points of the matching frame image are determined. As shown in FIG. 5A, the current frame X_C 51 of the image to be processed has a matching frame image X_3 52 that matches it.
  • In the second step, the group with the smallest Euclidean distance (i.e., the candidate for the target Euclidean distance set) is selected for threshold judgment: if it is less than the first threshold, it is determined to be a target Euclidean distance and joins the target Euclidean distance set; otherwise it does not, and the process jumps back to the first step until all feature points of X_C have been traversed, after which the third step is entered. For example, as shown in FIG. 5A, by comparing multiple Euclidean distances, a set of minimum Euclidean distance combinations {F1, F2, F3} is obtained.
  • In the third step, the target Euclidean distance set, which can be expressed as {F1, F2, F3}, is formed. If the number of elements in the target Euclidean distance set is greater than the fifth threshold, the process proceeds to the fourth step; otherwise the algorithm ends and the location information of the matching frame X_3 is output.
  • In the fourth step, the input of the PnP algorithm is the 3D coordinates of the feature points in the key frame image and the 2D coordinates of the feature points in the current frame of the image to be processed, and the output of the algorithm is the pose of the current frame of the image to be processed in the map coordinate system.
  • The PnP algorithm does not directly obtain the camera pose matrix from the sequence of matching pairs; instead, it first obtains the 3D coordinates of the feature points of the current frame of the image to be processed in the current coordinate system from their 2D coordinates, and then calculates the camera pose from the 3D coordinates of the feature points in the map coordinate system and the 3D coordinates of the feature points of the current frame in the current coordinate system.
  • The solution of the PnP algorithm starts from the law of cosines. Suppose the center of the current coordinate system is the point O, and A, B, and C are three feature points in the current frame of the image to be processed, as shown in FIG. 5B.
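  • The cosine-law equations are not reproduced in this extraction; the standard P3P system for feature points A, B, C viewed from the optical center O, which this passage appears to reference, is:

```latex
OA^2 + OB^2 - 2 \, OA \cdot OB \cos\langle a, b\rangle = AB^2
OB^2 + OC^2 - 2 \, OB \cdot OC \cos\langle b, c\rangle = BC^2
OA^2 + OC^2 - 2 \, OA \cdot OC \cos\langle a, c\rangle = AC^2
```

  • Here a, b, and c are the viewing rays toward A, B, and C, whose pairwise angles are known from the 2D coordinates and the camera intrinsics; solving for the depths OA, OB, and OC yields the 3D coordinates of the feature points in the current coordinate system, after which the transformation between the map coordinate system and the current coordinate system gives the camera pose.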
  • the location of the collection device is determined through the transformation from the map coordinate system to the current coordinate system.
  • In the above steps S305 to S309, the constructed offline map is loaded for the image to be processed collected by the image acquisition device, the matching frame image of the image to be processed is retrieved from the key frame images in the preset map through the bag-of-words model, and finally the PnP algorithm is used to solve the precise pose of the current camera in the map, determining the device's position and attitude in the map coordinate system. The positioning result is thus more accurate, does not rely on external base station equipment, and offers low cost and strong robustness.
  • In the embodiment of the present application, the 2D coordinates and 3D coordinates of the key frame image are considered at the same time, and the 3D coordinates of the acquisition device can be provided in the positioning result, which improves the positioning accuracy. During the mapping and positioning process, there is no need to introduce other external base station equipment, so the cost is low; and there is no need to introduce algorithms with high error rates such as object recognition, so the positioning success rate is high and the robustness is strong.
  • The embodiment of the present application provides a positioning device, which includes the modules it comprises and the units comprised by each module; these can be implemented by a processor in a computer device, or, of course, by a specific logic circuit.
  • the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • Fig. 6 is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application.
  • the device 600 includes: a first extraction module 601, a first matching module 602, and a first determination module 603, wherein:
  • the first extraction module 601 is configured to extract first image features of the image to be processed
  • the first matching module 602 is configured to match the second image feature from the image features of the key frame image stored in the preset map according to the first image feature;
  • the first determining module 603 is configured to determine the location information of the image capture device used to capture the image to be processed according to the first image feature and the second image feature.
  • the first image feature of the image to be processed includes: identification information and two-dimensional position information of the feature point of the image to be processed;
  • the second image feature includes: two-dimensional location information, three-dimensional location information, and identification information of the feature points of the key frame image.
  • the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image in the three-dimensional coordinate system where the preset map is located.
  • the first extraction module 601 includes:
  • the first extraction sub-module is configured to extract the feature point set of the image to be processed
  • the first determining submodule is configured to determine the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
  • the first matching module 602 includes:
  • the second determining submodule is configured to respectively determine the ratios of different sample feature points in the feature point set to obtain a first ratio vector
  • the first obtaining submodule is configured to obtain a second ratio vector, where the second ratio vector is the ratio of the plurality of sample feature points among the feature points contained in the key frame image;
  • the first matching sub-module is configured to match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector and the second ratio vector.
  • the first matching submodule includes:
  • the first determining unit is configured to determine, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
  • the second determining unit is configured to determine the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images;
  • the first selection unit is configured to select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  • the first selection unit includes:
  • the first determining subunit is configured to determine the time differences between the capture times of at least two similar key frame images, and the similarity differences between the image features of the at least two similar key frame images and the first image feature, respectively;
  • the first joint subunit is configured to combine similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold to obtain a joint frame image;
  • the first selection subunit is configured to select, from the image features of the joint frame image, a second image feature whose similarity to the first image feature meets a preset similarity threshold.
  • the first selection subunit is further configured to: determine the sums of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature; determine the joint frame image with the largest sum as the target joint frame image with the highest similarity to the image to be processed; and select, from the image features of the target joint frame image and according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  • the device further includes:
  • a second determining module configured to determine the image containing the second image feature as a matching frame image of the image to be processed
  • the third determining module is configured to determine, between any two feature points contained in the matching frame image, the target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
  • the first determining module 603 includes:
  • the third determining submodule is configured to, if the number of target Euclidean distances included in the target Euclidean distance set is greater than a fifth threshold, determine the location information of the image capture device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
  • the device further includes:
  • the first selection module is configured to select key frame images meeting preset conditions from the sample image library to obtain a set of key frame images
  • the second extraction module is configured to extract the image features of each key frame image to obtain a key image feature set
  • the third extraction module is configured to extract feature points of the sample image to obtain a sample feature point set containing different feature points
  • the fourth determining module is configured to determine the ratio of each sample feature point in the key frame image to obtain a set of ratio vectors
  • the first storage module is configured to store the ratio vector set and the key image feature set to obtain the preset map.
  • the device further includes:
  • the second selection module is configured to select a preset number of corner points from the sample image; wherein the corner points are pixels in the sample image that are different from the preset number of pixels in the preset area;
  • a fifth determining module configured to determine that the scene corresponding to the sample image is a continuous scene if the number of the same corner points contained in two sample images adjacent to the acquisition time is greater than or equal to a sixth threshold;
  • the sixth determining module is configured to determine that the scene corresponding to the sample image is a discrete scene if the number of identical corner points contained in two sample images adjacent to the acquisition time is less than a sixth threshold.
  • the first selection module includes:
  • the first selection submodule is configured to, if the scene corresponding to the sample image is a discrete scene, select the key frame image from the sample image library according to the input selection instruction;
  • the second selection sub-module is configured to select a key frame image from the sample image library according to a preset frame rate or disparity if the scene corresponding to the sample image is a continuous scene.
  • the fourth determining module includes:
  • the fourth determining sub-module is configured to determine the first average count according to the first number of sample images contained in the sample image library and the first count of occurrences of the i-th sample feature point in the sample image library, wherein the first average count is used to indicate the average number of times the i-th sample feature point appears in each sample image;
  • the fifth determining submodule is configured to determine the second average count based on the second count of occurrences of the i-th sample feature point in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image, wherein the second average count is used to indicate the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image;
  • the sixth determining submodule is configured to obtain the ratio of the sample feature points in the key frame image according to the first average count and the second average count, to obtain the ratio vector set.
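  • As an illustration of the first and second average counts described above, the following is a minimal Python sketch of how the ratio vector of one key frame could be computed; the function and parameter names are hypothetical illustrations, not part of the patent disclosure.

    import math
    from collections import Counter

    def ratio_vector(frame_word_ids, library_counts, num_sample_images, vocab_size):
        """Ratio vector of one key frame: for each sample feature point i,
        multiply the first average count idf(i) by the second average
        count tf(i, I_t)."""
        v = [0.0] * vocab_size
        counts = Counter(frame_word_ids)   # second count: occurrences of word i in this frame
        total = sum(counts.values())       # second number: sample feature points in this frame
        for i, n_i_frame in counts.items():
            idf = math.log(num_sample_images / library_counts[i])  # first average count
            tf = n_i_frame / total                                 # second average count
            v[i] = tf * idf
        return v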
  • It should be noted that, if the above positioning method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part of them contributing to the related technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a device containing the storage medium to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the positioning method provided in the foregoing embodiment are implemented.
  • In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • The functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • Those of ordinary skill in the art will understand that all or part of the steps of the foregoing method embodiments may be accomplished by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the foregoing method embodiments. The aforementioned storage media include various media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disk.
  • Alternatively, if the above-mentioned integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part of them contributing to the related technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a device to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage media include media that can store program code, such as removable storage devices, ROMs, magnetic disks, or optical discs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A positioning method, a positioning device, a terminal, and a storage medium. The method includes: extracting a first image feature of an image to be processed (S101); matching a second image feature from image features of key frame images stored in a preset map according to the first image feature (S102); and determining, according to the first image feature and the second image feature, location information of an image capture device used to capture the image to be processed (S103).

Description

Positioning method and device, terminal, storage medium
Cross-reference to related applications
The embodiments of this application are filed on the basis of, and claim priority to, the Chinese patent application with application number 201910579358.7 filed on June 28, 2019, the entire contents of which are incorporated into the embodiments of this application by reference.
Technical field
This application relates to indoor positioning technology, including but not limited to a positioning method and device, a terminal, and a storage medium.
Background
In the related technology, in an indoor scene, the position of the user operating a camera is determined by recognizing the person and the background in the image captured by the camera: the background is matched against a pre-measured indoor map of the building to determine the corresponding indoor position of the background, and the user's indoor position is then confirmed from the position of the background, thereby realizing indoor positioning of the user. Positioning the user operating the camera in this way, based on the person, the scene background, or fixed objects in the image, has poor positioning robustness.
Summary
In view of this, the embodiments of the present application provide a positioning method and device, a terminal, and a storage medium to solve at least one problem existing in the related technology.
The technical solutions of the embodiments of the present application are implemented as follows:
The embodiments of the present application provide a positioning method, the method including:
extracting a first image feature of an image to be processed;
matching a second image feature from image features of key frame images stored in a preset map according to the first image feature; and
determining, according to the first image feature and the second image feature, location information of an image capture device used to capture the image to be processed.
In the above method, the first image feature of the image to be processed includes: identification information and two-dimensional (2D) position information of the feature points of the image to be processed;
the second image feature includes: two-dimensional position information, three-dimensional (3D) position information, and identification information of the feature points of the key frame image.
In the above method, the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image into the three-dimensional coordinate system in which the preset map is located.
In the above method, extracting the first image feature of the image to be processed includes:
extracting a feature point set of the image to be processed; and
determining the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
In the above method, matching a second image feature from the image features of the key frame images stored in the preset map according to the first image feature includes:
respectively determining the ratios of different sample feature points in the feature point set, to obtain a first ratio vector;
obtaining a second ratio vector, the second ratio vector being the ratio of the multiple sample feature points among the feature points contained in the key frame image; and
matching a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
In the above method, matching a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector includes:
determining, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
determining the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images; and
selecting, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above method, selecting, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold includes:
determining the time differences between the capture times of at least two similar key frame images, and the similarity differences between the image features of the at least two similar key frame images and the first image feature, respectively;
joining similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold, to obtain a joint frame image; and
selecting, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above method, selecting, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold includes:
respectively determining the sums of the similarities between the image features of each key frame image contained in multiple joint frame images and the first image feature;
determining the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed; and
selecting, from the image features of the target joint frame image and according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above method, before determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed, the method further includes:
determining the image containing the second image feature as the matching frame image of the image to be processed; and
determining, between any two feature points contained in the matching frame image, the target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
In the above method, determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed includes:
if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determining the location information of the image capture device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
In the above method, before capturing the image of the current scene, the method further includes:
selecting key frame images meeting a preset condition from a sample image library, to obtain a key frame image set;
extracting the image features of each key frame image to obtain a key image feature set;
extracting the feature points of the sample images to obtain a sample feature point set containing different feature points;
determining the ratio of each sample feature point in the key frame images to obtain a ratio vector set; and
storing the ratio vector set and the key image feature set to obtain the preset map.
In the above method, before selecting key frame images meeting a preset condition from the sample image library to obtain the key frame image set, the method further includes:
selecting a preset number of corner points from the sample image, where the corner points are pixels in the sample image that differ from a preset number of pixels in a preset area;
if the number of identical corner points contained in two sample images with adjacent capture times is greater than or equal to a sixth threshold, determining that the scene corresponding to the sample images is a continuous scene; and
if the number of identical corner points contained in two sample images with adjacent capture times is less than the sixth threshold, determining that the scene corresponding to the sample images is a discrete scene.
In the above method, selecting key frame images meeting a preset condition from the sample image library to obtain a key frame image set includes:
if the scene corresponding to the sample images is a discrete scene, selecting key frame images from the sample image library according to an input selection instruction; and
if the scene corresponding to the sample images is a continuous scene, selecting key frame images from the sample image library according to a preset frame rate or disparity.
In the above method, determining the ratio of each sample feature point in the key frame images to obtain a ratio vector set includes:
determining a first average count according to the first number of sample images contained in the sample image library and the first count of occurrences of the i-th sample feature point in the sample image library, where the first average count indicates the average number of times the i-th sample feature point appears in each sample image;
determining a second average count according to the second count of occurrences of the i-th sample feature point in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image, where the second average count indicates the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; and
obtaining the ratios of the sample feature points in the key frame images according to the first average count and the second average count, to obtain the ratio vector set.
The embodiments of the present application provide a positioning device, the device including a first extraction module, a first matching module, and a first determination module, where:
the first extraction module is configured to extract a first image feature of an image to be processed;
the first matching module is configured to match a second image feature from the image features of key frame images stored in a preset map according to the first image feature; and
the first determination module is configured to determine, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed.
In the above device, the first image feature of the image to be processed includes: identification information and two-dimensional position information of the feature points of the image to be processed;
the second image feature includes: two-dimensional position information, three-dimensional position information, and identification information of the feature points of the key frame image.
In the above device, the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image into the three-dimensional coordinate system in which the preset map is located.
In the above device, the first extraction module includes:
a first extraction sub-module, configured to extract a feature point set of the image to be processed; and
a first determining sub-module, configured to determine the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
In the above device, the first matching module includes:
a second determining sub-module, configured to respectively determine the ratios of different sample feature points in the feature point set, to obtain a first ratio vector;
a first obtaining sub-module, configured to obtain a second ratio vector, the second ratio vector being the ratio of the multiple sample feature points among the feature points contained in the key frame image; and
a first matching sub-module, configured to match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
In the above device, the first matching sub-module includes:
a first determining unit, configured to determine, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
a second determining unit, configured to determine the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images; and
a first selection unit, configured to select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the first selection unit includes:
a first determining subunit, configured to determine the time differences between the capture times of at least two similar key frame images, and the similarity differences between the image features of the at least two similar key frame images and the first image feature, respectively;
a first joining subunit, configured to join similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold, to obtain a joint frame image; and
a first selection subunit, configured to select, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the first selection subunit is further configured to: respectively determine the sums of the similarities between the image features of each key frame image contained in multiple joint frame images and the first image feature; determine the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed; and select, from the image features of the target joint frame image and according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the device further includes:
a second determining module, configured to determine the image containing the second image feature as the matching frame image of the image to be processed; and
a third determining module, configured to determine, between any two feature points contained in the matching frame image, the target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
In the above device, the first determination module includes:
a third determining sub-module, configured to, if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determine the location information of the image capture device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
In the above device, the device further includes:
a first selection module, configured to select key frame images meeting a preset condition from a sample image library, to obtain a key frame image set;
a second extraction module, configured to extract the image features of each key frame image to obtain a key image feature set;
a third extraction module, configured to extract the feature points of the sample images to obtain a sample feature point set containing different feature points;
a fourth determining module, configured to determine the ratio of each sample feature point in the key frame images to obtain a ratio vector set; and
a first storage module, configured to store the ratio vector set and the key image feature set to obtain the preset map.
In the above device, the device further includes:
a second selection module, configured to select a preset number of corner points from the sample image, where the corner points are pixels in the sample image that differ from a preset number of pixels in a preset area;
a fifth determining module, configured to, if the number of identical corner points contained in two sample images with adjacent capture times is greater than or equal to a sixth threshold, determine that the scene corresponding to the sample images is a continuous scene; and
a sixth determining module, configured to, if the number of identical corner points contained in two sample images with adjacent capture times is less than the sixth threshold, determine that the scene corresponding to the sample images is a discrete scene.
In the above device, the first selection module includes:
a first selection sub-module, configured to, if the scene corresponding to the sample images is a discrete scene, select key frame images from the sample image library according to an input selection instruction; and
a second selection sub-module, configured to, if the scene corresponding to the sample images is a continuous scene, select key frame images from the sample image library according to a preset frame rate or disparity.
In the above device, the fourth determining module includes:
a fourth determining sub-module, configured to determine a first average count according to the first number of sample images contained in the sample image library and the first count of occurrences of the i-th sample feature point in the sample image library, where the first average count indicates the average number of times the i-th sample feature point appears in each sample image;
a fifth determining sub-module, configured to determine a second average count according to the second count of occurrences of the i-th sample feature point in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image, where the second average count indicates the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; and
a sixth determining sub-module, configured to obtain the ratios of the sample feature points in the key frame images according to the first average count and the second average count, to obtain the ratio vector set.
The embodiments of the present application provide a terminal, including a memory and a processor, the memory storing a computer program executable on the processor, where the processor implements the steps of the above positioning method when executing the program.
The embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above positioning method are implemented.
The embodiments of the present application provide a positioning method and device, a terminal, and a storage medium, in which a first image feature of an image to be processed is first extracted; a second image feature is then matched from the image features of the key frame images stored in a preset map according to the first image feature; and finally the location information of the image capture device used to capture the image to be processed is determined according to the first image feature and the second image feature. In this way, for any image to be processed, the matching frame image in the preset map can be obtained by matching its image features against the image features of the key frame images in the preset map, thereby realizing positioning of the image capture device without relying on fixed objects in the image.
Brief description of the drawings
Fig. 1 is a schematic flowchart of an implementation of a positioning method according to an embodiment of the application;
Fig. 2A is a schematic flowchart of an implementation of a positioning method according to an embodiment of the application;
Fig. 2B is a schematic flowchart of an implementation of creating a preset map according to an embodiment of the application;
Fig. 2C is a schematic flowchart of another implementation of a positioning method according to an embodiment of the application;
Fig. 3 is a schematic flowchart of yet another implementation of a positioning method according to an embodiment of the application;
Fig. 4 is a schematic structural diagram of a ratio vector according to an embodiment of the application;
Fig. 5A is an application scenario diagram of determining a matching frame image according to an embodiment of the application;
Fig. 5B is a schematic structural diagram of determining the location information of a capture device according to an embodiment of the application;
Fig. 6 is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application.
Detailed description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application.
An embodiment of the present application provides a positioning method. Fig. 1 is a schematic flowchart of an implementation of the positioning method according to an embodiment of the application. As shown in Fig. 1, the method includes the following steps:
Step S101: extract a first image feature of an image to be processed.
Here, the first image feature includes the identification information and 2D position information of the feature points of the image to be processed. In step S101, the feature points of the image to be processed are first extracted; then the identification information of each feature point and its 2D position information in the image to be processed are determined, where the identification information of a feature point can be understood as descriptor information that uniquely identifies that feature point.
Step S102: match a second image feature from the image features of the key frame images stored in a preset map according to the first image feature.
Here, the second image feature includes the 2D position information, 3D position information, and identification information of the feature points of the key frame image. The preset map stores the key image feature set of the key frame images and the ratio vector set corresponding to the ratio of each sample feature point in the key frame images. Step S102 can be understood as selecting, from the image features of the key frame images stored in the preset map, a second image feature with a high degree of matching with the first image feature.
Step S103: determine, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed.
Here, the location information of the image capture device is determined based on the 3D position information of the feature points of the key frame image corresponding to the second image feature and the 2D position information of the feature points of the image to be processed corresponding to the first image feature. For example, in the three-dimensional coordinate space in which the image capture device is located, the 2D position information of the feature points of the image to be processed is first converted into 3D position information, and this 3D position information is then compared with the 3D position information of the feature points of the key frame image in the three-dimensional coordinate system of the preset map, to determine the location information of the image capture device. Since both the 2D and the 3D position information of the feature points are considered, positioning the image capture device yields both its 2D position information and its 3D position information; in other words, both its planar position and its position in three-dimensional space can be obtained.
In this embodiment of the present application, for a captured image to be processed, image features are first extracted; a second image feature matching them is then selected from the image features of the key frame images in the preset map; and finally, by relating the position information of the feature points of the two image features, the image capture device can be positioned without relying on fixed objects in the image or on the network, which improves positioning accuracy and robustness.
An embodiment of the present application provides a positioning method. Fig. 2A is a schematic flowchart of an implementation of the positioning method according to an embodiment of the application. As shown in Fig. 2A, the method includes the following steps:
Step S201: extract a feature point set of the image to be processed.
Here, the feature points of the image to be processed are extracted to obtain a feature point set.
Step S202: determine the identification information of each feature point in the feature point set and the 2D position information of each feature point in the image to be processed.
Here, for each feature point in the feature point set, the descriptor information (i.e., the identification information) of the feature point is determined; the 2D position information can be regarded as the 2D coordinates of the feature point.
Steps S201 and S202 above provide a way of implementing "extracting a first image feature of an image to be processed", in which the 2D coordinates and descriptor information of each feature point of the image to be processed are obtained.
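By way of illustration only, steps S201 and S202 could be realized with an off-the-shelf detector. The sketch below uses OpenCV's ORB detector (an assumption; the patent does not mandate a specific detector), with the descriptors serving as the identification information and the keypoint coordinates as the 2D position information.

    import cv2

    def extract_first_image_feature(image_path, max_points=150):
        """Extract the feature point set of the image to be processed:
        2D positions plus descriptors used as identification information."""
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        orb = cv2.ORB_create(nfeatures=max_points)
        keypoints, descriptors = orb.detectAndCompute(img, None)
        positions_2d = [kp.pt for kp in keypoints]   # (x, y) pixel coordinates
        return positions_2d, descriptors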
Step S203: respectively determine the ratios of different sample feature points in the feature point set, to obtain a first ratio vector.
Here, the multiple sample feature points are distinct from each other. The preset bag-of-words model contains multiple different sample feature points and the ratios of these sample feature points among the feature points contained in the key frame images. The first ratio vector can be determined from the number of sample images, the number of times a sample feature point appears in the sample images, the number of times the sample feature point appears in the image to be processed, and the total number of sample feature points appearing in the image to be processed.
Step S204: obtain a second ratio vector.
Here, the second ratio vector is the ratio of the multiple sample feature points among the feature points contained in the key frame image. The second ratio vector is stored in advance in the preset bag-of-words model, so when the image features of the image to be processed need to be matched, the second ratio vector is obtained from the preset bag-of-words model. The second ratio vector is determined in the same way as the first ratio vector, both using formula (1); moreover, the first ratio vector and the second ratio vector have the same dimensionality.
Step S205: match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
Here, step S205 can be implemented through the following process:
First, according to the first ratio vector and the second ratio vector, determine, from the image features of the key frame images, similar image features whose similarity with the first image feature is greater than a second threshold.
Here, the first ratio vector v_1 of the image to be processed is compared one by one with the second ratio vector v_2 of each key frame image; computing formula (2) on these two ratio vectors gives the similarity between each key frame image and the image to be processed, so that similar key frame images whose similarity is greater than or equal to the second threshold can be filtered out, giving a set of similar key frame images.
Second, determine the similar key frame images to which the similar image features belong, to obtain the set of similar key frame images.
Third, select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
Here, from the image features contained in the similar key frame images, the second image feature with the highest similarity to the first image feature is selected. For example, first, the time differences between the capture times of at least two similar key frame images, and the similarity differences between the image features of the at least two similar key frame images and the first image feature, are determined; then, similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold are joined to obtain joint frame images. That is, multiple similar key frame images whose capture times are close and whose similarities to the image to be processed are close are selected, indicating that these key frame images may form a continuous sequence, so such similar key frame images are joined together into a joint frame image (which may also be called an island), giving multiple joint frame images. Finally, from the image features of the joint frame images, a second image feature whose similarity with the first image feature meets the preset similarity threshold is selected. For example, the sums of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature are first determined one by one (as shown in formula (4)); the joint frame image with the largest sum of similarities is then determined as the target joint frame image with the highest similarity to the image to be processed; finally, according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, the second image feature whose similarity with the first image feature meets the preset similarity threshold is selected from the image features of the target joint frame image. Since the identification information of the feature points of the target joint frame image and of the image to be processed each uniquely identify their feature points, the second image feature with the highest similarity to the first image feature can be selected from the image features of the target joint frame image very accurately on the basis of these two pieces of identification information. This guarantees the accuracy of matching a second image feature to the first image feature of the image to be processed and ensures an extremely high similarity between the selected second image feature and the first image feature.
Steps S203 to S205 above provide a way of implementing "matching a second image feature from the image features of the key frame images stored in the preset map according to the first image feature", in which the second image feature matching the first image feature is retrieved from the image features of the key frame images using the preset bag-of-words model, guaranteeing the similarity between the second image feature and the first image feature.
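The joining of similar key frame images described above can be sketched as follows; this is a simplified illustration under assumed data structures (frames sorted by capture time), not the patent's reference implementation.

    def group_into_islands(similar_frames, max_time_diff, max_sim_diff):
        """Join similar key frames whose capture times and similarity scores
        are both close, yielding joint frame images ("islands").

        similar_frames: list of (timestamp, similarity, frame_id), sorted by timestamp.
        """
        islands, current = [], []
        for frame in similar_frames:
            if (current
                    and frame[0] - current[-1][0] < max_time_diff
                    and abs(frame[1] - current[-1][1]) < max_sim_diff):
                current.append(frame)
            else:
                if current:
                    islands.append(current)
                current = [frame]
        if current:
            islands.append(current)
        # The island with the largest summed similarity is the target joint frame image.
        best = max(islands, key=lambda isl: sum(f[1] for f in isl))
        return islands, best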
Step S206: determine the image containing the second image feature as the matching frame image of the image to be processed.
Here, a key frame image containing the second image feature is very similar to the image to be processed, so this key frame image is used as the matching frame image of the image to be processed.
Step S207: determine, between any two feature points contained in the matching frame image, the target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
Here, first, the Euclidean distances between any two feature points contained in the matching frame image are determined; then, the Euclidean distances less than the first threshold are selected from them as target Euclidean distances, giving a target Euclidean distance set. Processing one feature point of the image to be processed yields one target Euclidean distance set, so processing multiple feature points of the image to be processed yields multiple Euclidean distance sets. The target Euclidean distance less than the first threshold can also be understood as follows: the smallest of the multiple Euclidean distances is determined first, and it is then judged whether this smallest Euclidean distance is less than the first threshold; if so, this smallest Euclidean distance is determined as the target Euclidean distance, so that the target Euclidean distance set is the set with the smallest Euclidean distances among the multiple Euclidean distance sets.
Step S208: if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determine the location information of the image capture device based on the 3D position information of the feature points of the key frame image corresponding to the second image feature and the 2D position information of the feature points of the image to be processed corresponding to the first image feature.
Here, if the number of target Euclidean distances contained in the target Euclidean distance set is greater than the fifth threshold, the number of target Euclidean distances is sufficiently large, meaning that there are enough feature points matching the first image feature and that the similarity between this key frame image and the image to be processed is sufficiently high. Then, the 3D position information of the feature points of the key frame image and the 2D position information of the feature points of the image to be processed corresponding to the first image feature are used as the input of a Perspective-n-Point (PnP) algorithm: the 3D position information (e.g., 3D coordinates) of the feature points in the current coordinate system is first solved from their 2D position information (e.g., 2D coordinates) in the current frame of the image to be processed, and the location information of the image capture device is then solved from the 3D position information of the feature points of the key frame image in the map coordinate system and the 3D position information of the feature points of the current frame in the current coordinate system.
Steps S206 to S208 above provide a way of implementing "determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed", in which the 2D and 3D position information of the key frame image is considered at the same time, so that both position and attitude can be provided in the positioning result, improving the positioning accuracy of the image capture device.
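As an illustrative sketch of step S208 only: the pose could be solved with OpenCV's solvePnP (an assumption on tooling; the patent derives the PnP computation itself in a later embodiment). The undistorted-camera assumption is marked in a comment.

    import cv2
    import numpy as np

    def locate_capture_device(points_3d_map, points_2d_current, camera_matrix):
        """Solve the camera pose from the 3D map points of the matching
        key frame and the matched 2D feature points of the current frame."""
        object_points = np.asarray(points_3d_map, dtype=np.float64)
        image_points = np.asarray(points_2d_current, dtype=np.float64)
        dist_coeffs = np.zeros(5)                  # assumption: undistorted camera
        ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                                      camera_matrix, dist_coeffs)
        if not ok:
            return None
        R, _ = cv2.Rodrigues(rvec)                 # rotation: map -> camera frame
        position_in_map = (-R.T @ tvec).ravel()    # camera centre in map coordinates
        return position_in_map, R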
In this embodiment of the present application, an image to be processed is obtained by the image capture device, the constructed preset map is loaded, and the matching frame image corresponding to the image to be processed is retrieved using the preset bag-of-words model; finally, the 2D position information of the feature points of the image to be processed and the 3D position information of the feature points of the key frame image are used as the input of the PnP algorithm to obtain the precise pose of the current camera in the map, achieving the positioning goal. In this way, positioning is achieved through the key frame images alone, yielding the position and attitude of the image capture device in the map coordinate system, improving the accuracy of the positioning result without relying on external base-station equipment, at low cost and with strong robustness.
An embodiment of the present application provides a positioning method. Fig. 2B is a schematic flowchart of an implementation of creating a preset map according to an embodiment of the application. As shown in Fig. 2B, the method includes the following steps:
Step S221: select key frame images meeting a preset condition from a sample image library, to obtain a key frame image set.
Here, it is first determined whether the scene corresponding to the sample images is a continuous scene or a discrete scene, through the following process:
First, select a preset number of corner points from the sample image, where a corner point is a pixel in the sample image that differs significantly from a preset number of surrounding pixels; for example, 150 corner points are selected.
Second, if the number of identical corner points contained in two sample images with adjacent capture times is greater than or equal to a sixth threshold, determine that the scene corresponding to the sample images is a continuous scene. Two sample images with adjacent capture times can also be understood as two consecutive sample images; the number of identical corner points contained in the two sample images is examined, and the larger this number, the higher the correlation between the two sample images, indicating that they come from a continuous scene. A continuous scene is, for example, a single indoor environment, such as a bedroom, a living room, or a single meeting room.
Third, if the number of identical corner points contained in two sample images with adjacent capture times is less than the sixth threshold, determine that the scene corresponding to the sample images is a discrete scene. The smaller the number of identical corner points contained in the two sample images, the lower their correlation, indicating that they come from a discrete scene. A discrete scene covers, for example, multiple indoor environments, such as multiple rooms in a building or multiple meeting rooms on one floor.
Then, if the scene corresponding to the sample images is a discrete scene, key frame images are selected from the sample image library according to an input selection instruction; that is, if the sample images belong to a discrete scene, the multiple sample images do not correspond to a single scene, so the user selects the key frame images manually, which guarantees the validity of the selected key frame images in different environments.
If the scene corresponding to the sample images is a continuous scene, key frame images are selected from the sample image library according to a preset frame rate or disparity; that is, if the sample images belong to a continuous scene, the multiple sample images correspond to the same scene, so sample images meeting a preset frame rate or preset disparity are automatically selected as key frame images, which both guarantees the validity of the selected key frame images and improves the efficiency of key frame selection.
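A minimal sketch of the automatic selection rule for continuous scenes follows; the fixed gap is a hypothetical stand-in for the preset frame rate, and a disparity test could be used instead.

    def select_keyframes_continuous(frames, frame_gap=15):
        """Keep every frame_gap-th frame of a continuous scene; a disparity
        check against the last kept key frame could replace the fixed gap."""
        return frames[::frame_gap]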
Step S222: extract the image features of each key frame image to obtain a key image feature set.
Here, the image features of a key frame include the 2D position information and 3D position information of the feature points of the key frame image and the identification information that uniquely identifies each feature point. The key image feature set is obtained so that a second image feature highly similar to the first image feature can be matched from it, giving the corresponding matching frame image.
Step S223: determine the ratio of each sample feature point in the key frame images, to obtain a ratio vector set.
Here, after the ratio vector set is obtained, the different sample feature points and the ratio vector set are stored in the preset bag-of-words model, so that the matching frame image of the image to be processed can be retrieved from the key frame images using the preset bag-of-words model. Step S223 can be implemented through the following process:
First, determine a first average count according to the first number of sample images contained in the sample image library and the first count of occurrences of the i-th sample feature point in the sample image library. The first average count indicates the average number of times the i-th sample feature point appears in each sample image; for example, if the first number of sample images is $N$ and the first count of occurrences of the i-th sample feature point in the sample image library is $n_i$, the first average count $\mathrm{idf}(i) = \log(N/n_i)$ is obtained via formula (1).
Second, determine a second average count according to the second count of occurrences of the i-th sample feature point in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image. The second average count indicates the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; for example, if the second count is $n_{iI_t}$ and the second number is $n_{I_t}$, the second average count $\mathrm{tf}(i, I_t) = n_{iI_t}/n_{I_t}$ is obtained via formula (1).
Finally, obtain the ratio of each sample feature point in the key frame images according to the first average count and the second average count, to obtain the ratio vector set. For example, according to formula (1), multiplying the first average count by the second average count gives the ratio vector component $v_t^i = \mathrm{tf}(i, I_t)\times\mathrm{idf}(i)$.
Step S224: store the ratio vector set and the key image feature set to obtain the preset map.
Here, the ratio vector set and the key image feature set corresponding to the key frame images are stored in the preset map, so that, when the image capture device is positioned, this ratio vector set can be compared with the ratio vector set of the image to be processed determined using the preset bag-of-words model, in order to determine, from the key image feature set, the matching frame image highly similar to the image to be processed.
In this embodiment of the present application, different key frame selection methods are used for discrete and continuous scenes of the sample images, guaranteeing the validity of the selected key frame images; image features are then extracted from the key frame images to construct the preset map, guaranteeing the accuracy of the preset map.
An embodiment of the present application provides a positioning method. Fig. 2C is a schematic flowchart of another implementation of the positioning method according to an embodiment of the application. As shown in Fig. 2C, the method includes the following steps:
Step S231: select key frame images meeting a preset condition from a sample image library, to obtain a key frame image set.
Step S232: extract the image features of each key frame image to obtain a key image feature set.
Step S233: extract the feature points of the sample images to obtain a sample feature point set containing different feature points.
Step S234: determine the ratio of each sample feature point in the key frame images, to obtain a ratio vector set.
Step S235: store the ratio vector set and the key image feature set to obtain the preset map.
Steps S231 to S235 above complete the creation of the preset map: the image features of the key frame images and the ratio vector set are stored in the preset map, so that a second image feature matching the image features of the image to be processed can be searched out of the image features of the key frame images on the basis of the ratio vector set.
Step S236: load the preset map and extract the first image feature of the image to be processed.
Here, the preset map needs to be loaded before the image capture device is positioned.
Step S237: match a second image feature from the image features of the key frame images stored in the preset map according to the first image feature.
Step S238: determine, according to the first image feature and the second image feature, the location information of the image capture device configured to capture the image to be processed.
Steps S236 to S238 above give the process of positioning the image capture device: a second image feature highly similar to the first image feature is matched from the key frame images stored in the preset map, and the location information of the capture device is then finally determined using the 2D position information and 3D position information in these two image features.
In this embodiment of the present application, the 2D and 3D position information of the key frame images is used at the same time, guaranteeing the accuracy of the positioning result for the capture device, with a high positioning success rate and strong robustness; moreover, no other external base-station equipment needs to be introduced in the positioning process, reducing cost.
An embodiment of the present application provides a positioning method. Fig. 3 is a schematic flowchart of yet another implementation of the positioning method according to an embodiment of the application. As shown in Fig. 3, the method includes the following steps:
Step S301: select a key frame selection method according to the scene to which the sample images belong.
Here, the scene to which the sample images belong is either a discrete scene or a continuous scene. Key frame selection methods fall into two classes, manual and automatic: manual selection requires the map builder to select by hand the key frame images to be included in the map, while automatic selection picks images as key frames automatically according to frame rate or disparity. During feature extraction for key frame images, 150 FAST corner points are extracted from each key frame image, and the proportion of identical corner points shared by two consecutive key frame images is defined as the corner tracking rate. In the embodiments of the present application, an ordered key frame sequence whose average corner tracking rate is greater than 30% is defined as a continuous scene; otherwise the scene is a discrete scene. Automatic selection is used for key frames in continuous scenes, and manual selection in discrete scenes. Continuous scenes suit single indoor environments such as bedrooms, living rooms, or a single meeting room; discrete scenes are more suitable for multiple indoor environments, such as multiple rooms in a building or multiple meeting rooms on one floor. During map construction, the key frame selection strategies of continuous and discrete scenes differ, as do their applicable scenes. In this way, for discrete or continuous indoor scenes, image features are extracted for map construction through different key frame selection methods, so the positioning process does not depend on external base-station equipment, the cost is low, the positioning accuracy is high, and the robustness is strong.
Step S302: capture key frame images with a camera.
Here, the camera may be a monocular camera or a binocular (stereo) camera.
Step S303: extract the image features of the key frame images in real time during capture.
Here, image feature extraction is a process of interpreting and annotating the key frame images. In step S303, the 2D position information, 3D position information, and identification information (i.e., descriptor information) of the feature points of the key frame images need to be extracted, where the 3D position information of the feature points of a key frame image is obtained by mapping their 2D position information into the three-dimensional coordinate system in which the preset map is located. For example, 150 2D feature points are extracted from each key frame image (150 is an empirical value: too few feature points gives a high tracking failure rate, while too many affects the efficiency of the algorithm) for image tracking, and descriptors are extracted for these feature points for feature point matching; the 3D position information (i.e., depth information) of the feature points is then computed by triangulation and used to determine the position of the capturing camera.
Step S304: during capture, determine in real time the ratio of each sample feature point in the key frame images, to obtain a ratio vector.
Here, step S304 can be understood as extracting, for the current frame during key frame capture, the ratio vector of that key frame image in real time. As shown in Fig. 4, the bag-of-words model is described in the form of a vocabulary tree: it includes the sample image library 41 as the root node of the vocabulary tree, and the sample images 42, 43, and 44 as leaf nodes; sample feature points 1 to 3 are the distinct sample feature points of sample image 42, sample feature points 4 to 6 those of sample image 43, and sample feature points 7 to 9 those of sample image 44. Suppose the bag-of-words model contains $w$ kinds of sample feature points, i.e., $w$ is the number of feature point types extracted from the sample images of the bag-of-words model, so the model contains $w$ sample feature points in total. Each sample feature point scores the key frame image with a floating-point value between 0 and 1, so each key frame image can be represented by a $w$-dimensional floating-point vector; this $w$-dimensional vector is the ratio vector $v_t$ output by the bag-of-words model. The scoring process is shown in formula (1):

$$v_t^i = \mathrm{tf}(i, I_t)\times\mathrm{idf}(i),\qquad \mathrm{idf}(i)=\log\frac{N}{n_i},\qquad \mathrm{tf}(i,I_t)=\frac{n_{iI_t}}{n_{I_t}} \tag{1}$$

where $N$ is the number of sample images (the first number), $n_i$ is the number of times the sample feature point $w_i$ appears in the sample images (the first count), $I_t$ is the image captured at time $t$, $n_{iI_t}$ is the number of times the sample feature point $w_i$ appears in the key frame image $I_t$ captured at time $t$ (the second count), and $n_{I_t}$ is the total number of sample feature points appearing in the key frame image $I_t$ (the second number). Scoring by the sample feature points yields the $w$-dimensional floating-point vector of each key frame image, i.e., the ratio vector, and this ratio vector can also be used as the feature information of the preset bag-of-words model.
Steps S301 to S304 above construct an offline preset map that depends on the key frame images; this preset map stores the image features of the key frame images (including 2D position information, 3D position information, and identification information, e.g., 2D coordinates, 3D coordinates, and descriptor information) in binary format on the local device, and is loaded for use when the image capture device needs to be positioned.
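A minimal sketch of such local binary storage follows; pickle is an assumed serialization choice, not the patent's prescribed format.

    import pickle

    def save_preset_map(path, keyframe_features, ratio_vectors):
        """Persist the offline map (key frame image features and ratio
        vectors) in a binary format on the local device."""
        with open(path, "wb") as f:
            pickle.dump({"features": keyframe_features,
                         "ratio_vectors": ratio_vectors}, f)

    def load_preset_map(path):
        with open(path, "rb") as f:
            return pickle.load(f)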
Step S305: load the constructed preset map.
Step S306: capture images with the camera to obtain the image to be processed.
Step S307: during capture of the image to be processed, extract in real time the first image feature of its current frame.
Here, extracting the first image feature of the current frame of the image to be processed in real time is similar to step S303, but the 3D position information of the image to be processed does not need to be determined, because it does not need to be provided in the subsequent PnP algorithm.
Step S308: retrieve, through the bag-of-words model, the matching frame image of the current frame of the image to be processed in the preset map.
Here, retrieving the matching frame image of the current frame of the image to be processed in the preset map through the bag-of-words model can be understood as using the feature information of the bag-of-words model, i.e., the ratio vector set, to retrieve the matching frame image of the current frame in the preset map.
Step S308 can be implemented through the following process (a sketch of the similarity score follows the first step):
First, look up the similarity between the current frame of the image to be processed and each key frame image; the similarity $s(v_1, v_2)$ is computed as shown in formula (2):

$$s(v_1, v_2) = 1 - \frac{1}{2}\left\|\frac{v_1}{\|v_1\|} - \frac{v_2}{\|v_2\|}\right\| \tag{2}$$

where $v_1$ and $v_2$ respectively denote the first ratio vector of each sample feature point of the bag-of-words model in the current frame of the image to be processed and the second ratio vector of each sample feature point in a key frame image. If the bag-of-words model contains $w$ kinds of sample feature points, both the first ratio vector and the second ratio vector are $w$-dimensional vectors. The key frame images whose similarity reaches the second threshold are filtered out as similar key frame images, forming the set of similar key frame images.
Second, in the set of similar key frame images, join together the similar key frame images whose timestamp difference is less than the third threshold and whose similarity difference is less than the fourth threshold, to obtain joint frame images (also called islands).
Here, this second step can be understood as joining together, within the set of similar key frame images, the similar key frame images whose timestamps are close and whose similarity matching scores are close; each such group is called an island, so the set of similar key frame images is divided into multiple joint frame images (i.e., multiple islands). The ratio between the similarities of the first key frame image and the last key frame image of a joint frame image is very small; this similarity ratio $\eta$ is shown in formula (3):

$$\eta(v_t, v_{t_j}) = \frac{s(v_t, v_{t_j})}{s(v_t, v_{t-\Delta t})} \tag{3}$$

where $s(v_t, v_{t_j})$ and $s(v_t, v_{t-\Delta t})$ respectively denote the similarities of two consecutive key frame images with the current frame of the image to be processed.
Third, respectively determine the sum of the similarities between the image features of each key frame image contained in the multiple joint frame images and the first image feature, as shown in formula (4):

$$H = \sum_{j}\eta(v_t, v_{t_j}) \tag{4}$$

Fourth, determine the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed, and find, within the target joint frame image, the matching frame image with the highest similarity to the current frame of the image to be processed.
Step S309: use the PnP algorithm to determine the location information of the current camera in the map coordinate system.
Here, step S309 can be implemented through the following steps (a sketch of the matching in the first three steps is given after the third step):
First, for the N-th feature point F_CN of the current frame X_C of the image to be processed, traverse all feature points of the matching frame image X_3 and determine the Euclidean distances between any two feature points of the matching frame image. As shown in Fig. 5A, the current frame of the image to be processed is X_C 51, and the matching frame image matched to the current frame is X_3 52. The Euclidean distance between feature points X_0 53 and X_1 54 is computed, giving the Euclidean distance F_0 501; the Euclidean distance between feature points X_1 54 and X_2 55 is computed, giving the Euclidean distance F_1 502; the Euclidean distance between feature points X_2 55 and X_3 52 is computed, giving the Euclidean distance F_2 503; and the Euclidean distance between feature points X_C 51 and X_4 56 is computed, giving the Euclidean distance F_3 504.
Second, select the group with the smallest Euclidean distances (i.e., the target Euclidean distance set) for threshold judgment: if a distance is less than the first threshold, it is determined as a target Euclidean distance and the target Euclidean distance set is formed; otherwise no target Euclidean distance set is formed, and the procedure jumps back to the first step until all feature points of X_C have been traversed, after which the third step is entered. For example, as shown in Fig. 5A, comparing the multiple Euclidean distances yields a group of smallest Euclidean distances {F_1, F_2, F_3}.
Third, form the target Euclidean distance set, which can be expressed as {F_1, F_2, F_3}; if the number of elements of the target Euclidean distance set is greater than the fifth threshold, go to the fourth step; otherwise the algorithm ends and the location information of the matching frame X_3 is output.
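The nearest-neighbour matching with a distance threshold and a count check can be sketched as follows; this is an illustration under assumed inputs (descriptor matrices), not the patent's reference implementation.

    import numpy as np

    def match_descriptors(desc_current, desc_match, dist_threshold, min_matches):
        """For each descriptor of the current frame, find its nearest
        neighbour in the matching frame; keep it only if the Euclidean
        distance is below the first threshold."""
        a = np.asarray(desc_current, dtype=np.float32)
        b = np.asarray(desc_match, dtype=np.float32)
        target_set = []
        for i, d in enumerate(a):
            dists = np.linalg.norm(b - d, axis=1)   # distances to all candidates
            j = int(np.argmin(dists))               # smallest Euclidean distance
            if dists[j] < dist_threshold:
                target_set.append((i, j, float(dists[j])))
        # Proceed to the pose solve only if enough matches survive.
        return target_set if len(target_set) > min_matches else None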
Fourth, based on the target Euclidean distance set, call the PnP functions to solve the location information of X_C in the map coordinate system. The PnP algorithm proceeds as follows:
The input of the PnP algorithm is the 3D coordinates of the feature points in the key frame image and the 2D coordinates of the feature points in the current frame of the image to be processed; its output is the position of the current frame of the image to be processed in the map coordinate system.
The PnP algorithm does not obtain the camera pose matrix directly from the sequence of matched pairs; instead, it first solves the 3D coordinates of the feature points of the current frame in the current coordinate system from their 2D coordinates, and then solves the camera pose from the 3D coordinates in the map coordinate system and the 3D coordinates of the feature points of the current frame in the current coordinate system. The solution of the PnP algorithm starts from the law of cosines. Let the centre of the current coordinate system be the point O, and let A, B, and C be three feature points in the current frame of the image to be processed, as shown in Fig. 5B:
According to the law of cosines, the relations between A, B, and C are as shown in formula (5):

$$\begin{aligned}
OA^2 + OB^2 - 2\,OA\cdot OB\cos\langle a,b\rangle &= AB^2\\
OB^2 + OC^2 - 2\,OB\cdot OC\cos\langle b,c\rangle &= BC^2\\
OA^2 + OC^2 - 2\,OA\cdot OC\cos\langle a,c\rangle &= AC^2
\end{aligned} \tag{5}$$

Eliminating variables by dividing through by $OC^2$, and letting $x = OA/OC$ and $y = OB/OC$, gives formula (6):

$$\begin{aligned}
x^2 + y^2 - 2xy\cos\langle a,b\rangle &= AB^2/OC^2\\
y^2 + 1 - 2y\cos\langle b,c\rangle &= BC^2/OC^2\\
x^2 + 1 - 2x\cos\langle a,c\rangle &= AC^2/OC^2
\end{aligned} \tag{6}$$

Next, substituting with $t = AB^2/OC^2$, $v = BC^2/AB^2$, and $w = AC^2/AB^2$ gives formula (7):

$$\begin{aligned}
x^2 + y^2 - 2xy\cos\langle a,b\rangle - t &= 0\\
y^2 + 1 - 2y\cos\langle b,c\rangle - vt &= 0\\
x^2 + 1 - 2x\cos\langle a,c\rangle - wt &= 0
\end{aligned} \tag{7}$$

Substituting the first equation of (7) into the other two yields formulas (8) and (9), respectively:

$$(1-w)x^2 - w\,y^2 - 2x\cos\langle a,c\rangle + 2wxy\cos\langle a,b\rangle + 1 = 0 \tag{8}$$

$$(1-v)y^2 - v\,x^2 - 2y\cos\langle b,c\rangle + 2vxy\cos\langle a,b\rangle + 1 = 0 \tag{9}$$

Since the 2D coordinates of A, B, and C are known, $w$, $v$, $\cos\langle a,c\rangle$, $\cos\langle b,c\rangle$, and $\cos\langle a,b\rangle$ are all known quantities, so the only unknowns are $x$ and $y$. Their values can be obtained from formulas (8) and (9), after which OA, OB, and OC can be solved, as shown in formula (10):

$$OC = \frac{AB}{\sqrt{x^2 + y^2 - 2xy\cos\langle a,b\rangle}},\qquad OA = x\cdot OC,\qquad OB = y\cdot OC \tag{10}$$

Finally, the 3D coordinates of the three feature points A, B, and C in the current three-dimensional coordinate system can be obtained, each via formula (11):

$$A = OA\cdot\hat{a},\qquad B = OB\cdot\hat{b},\qquad C = OC\cdot\hat{c} \tag{11}$$

where $\hat{a}$, $\hat{b}$, and $\hat{c}$ are the unit direction vectors from O towards A, B, and C.
After the 3D coordinates of the three feature points A, B, and C in the current three-dimensional coordinate system are obtained, the position of the capture device is determined through the transformation from the map coordinate system to the current coordinate system.
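One standard way to realize this transformation, given the same feature points expressed in both the map frame and the current camera frame, is a rigid alignment (the Kabsch algorithm); this is an illustrative sketch, not the patent's prescribed computation.

    import numpy as np

    def align_map_to_camera(points_map, points_cam):
        """Rigid transform (R, t) taking map-frame points onto the same
        points in the current camera frame; the camera position in the
        map is then -R^T t."""
        pm = np.asarray(points_map, dtype=np.float64)
        pc = np.asarray(points_cam, dtype=np.float64)
        cm, cc = pm.mean(axis=0), pc.mean(axis=0)        # centroids
        H = (pm - cm).T @ (pc - cc)                      # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # fix reflection
        R = Vt.T @ D @ U.T
        t = cc - R @ cm
        camera_position_in_map = -R.T @ t
        return R, t, camera_position_in_map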
Steps S305 to S309 above load the constructed offline map for the image to be processed captured by the image capture device, retrieve the matching frame image of the image to be processed among the key frame images of the preset map through the bag-of-words model, and finally use the PnP algorithm to solve the precise pose of the current camera in the map, determining the position and attitude of the device in the map coordinate system. The positioning result is therefore highly accurate, does not depend on external base-station equipment, is low in cost, and is robust.
In the embodiments of this application, the 2D coordinates and 3D coordinates of the key frame images are considered at the same time, and the 3D coordinates of the capture device can be provided in the positioning result, improving positioning accuracy. No other external base-station equipment needs to be introduced during mapping and positioning, so the cost is low; and no error-prone algorithms such as object recognition need to be introduced, so the positioning success rate is high and the robustness is strong.
The embodiments of the present application provide a positioning device. The modules included in the device, and the units included in each module, can be implemented by a processor in a computer device, or by a specific logic circuit; in implementation, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
Fig. 6 is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application. As shown in Fig. 6, the device 600 includes a first extraction module 601, a first matching module 602, and a first determination module 603, where:
the first extraction module 601 is configured to extract a first image feature of an image to be processed;
the first matching module 602 is configured to match a second image feature from the image features of key frame images stored in a preset map according to the first image feature; and
the first determination module 603 is configured to determine, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed.
In the above device, the first image feature of the image to be processed includes: identification information and two-dimensional position information of the feature points of the image to be processed;
the second image feature includes: two-dimensional position information, three-dimensional position information, and identification information of the feature points of the key frame image.
In the above device, the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image into the three-dimensional coordinate system in which the preset map is located.
In the above device, the first extraction module 601 includes:
a first extraction sub-module, configured to extract a feature point set of the image to be processed; and
a first determining sub-module, configured to determine the identification information of each feature point in the feature point set and the two-dimensional position information of each feature point in the image to be processed.
In the above device, the first matching module 602 includes:
a second determining sub-module, configured to respectively determine the ratios of different sample feature points in the feature point set, to obtain a first ratio vector;
a first obtaining sub-module, configured to obtain a second ratio vector, the second ratio vector being the ratio of the multiple sample feature points among the feature points contained in the key frame image; and
a first matching sub-module, configured to match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
In the above device, the first matching sub-module includes:
a first determining unit, configured to determine, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
a second determining unit, configured to determine the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images; and
a first selection unit, configured to select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the first selection unit includes:
a first determining subunit, configured to determine the time differences between the capture times of at least two similar key frame images, and the similarity differences between the image features of the at least two similar key frame images and the first image feature, respectively;
a first joining subunit, configured to join similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold, to obtain a joint frame image; and
a first selection subunit, configured to select, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the first selection subunit is further configured to: respectively determine the sums of the similarities between the image features of each key frame image contained in multiple joint frame images and the first image feature; determine the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed; and select, from the image features of the target joint frame image and according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
In the above device, the device further includes:
a second determining module, configured to determine the image containing the second image feature as the matching frame image of the image to be processed; and
a third determining module, configured to determine, between any two feature points contained in the matching frame image, the target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
In the above device, the first determination module 603 includes:
a third determining sub-module, configured to, if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determine the location information of the image capture device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
In the above device, the device further includes:
a first selection module, configured to select key frame images meeting a preset condition from a sample image library, to obtain a key frame image set;
a second extraction module, configured to extract the image features of each key frame image to obtain a key image feature set;
a third extraction module, configured to extract the feature points of the sample images to obtain a sample feature point set containing different feature points;
a fourth determining module, configured to determine the ratio of each sample feature point in the key frame images to obtain a ratio vector set; and
a first storage module, configured to store the ratio vector set and the key image feature set to obtain the preset map.
In the above device, the device further includes:
a second selection module, configured to select a preset number of corner points from the sample image, where the corner points are pixels in the sample image that differ from a preset number of pixels in a preset area;
a fifth determining module, configured to, if the number of identical corner points contained in two sample images with adjacent capture times is greater than or equal to a sixth threshold, determine that the scene corresponding to the sample images is a continuous scene; and
a sixth determining module, configured to, if the number of identical corner points contained in two sample images with adjacent capture times is less than the sixth threshold, determine that the scene corresponding to the sample images is a discrete scene.
In the above device, the first selection module includes:
a first selection sub-module, configured to, if the scene corresponding to the sample images is a discrete scene, select key frame images from the sample image library according to an input selection instruction; and
a second selection sub-module, configured to, if the scene corresponding to the sample images is a continuous scene, select key frame images from the sample image library according to a preset frame rate or disparity.
In the above device, the fourth determining module includes:
a fourth determining sub-module, configured to determine a first average count according to the first number of sample images contained in the sample image library and the first count of occurrences of the i-th sample feature point in the sample image library, where the first average count indicates the average number of times the i-th sample feature point appears in each sample image;
a fifth determining sub-module, configured to determine a second average count according to the second count of occurrences of the i-th sample feature point in the j-th key frame image and the second number of sample feature points contained in the j-th key frame image, where the second average count indicates the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; and
a sixth determining sub-module, configured to obtain the ratios of the sample feature points in the key frame images according to the first average count and the second average count, to obtain the ratio vector set.
The description of the above device embodiments is similar to the description of the method embodiments above and has beneficial effects similar to those of the method embodiments. For technical details not disclosed in the device embodiments of this application, please refer to the description of the method embodiments of this application.
It should be noted that, in the embodiments of this application, if the above positioning method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part of them contributing to the related technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a device containing the storage medium to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
Correspondingly, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the positioning method provided in the above embodiments are implemented.
It should be pointed out here that the descriptions of the above storage medium and device embodiments are similar to the description of the method embodiments above, with similar beneficial effects. For technical details not disclosed in the storage medium and device embodiments of this application, please refer to the description of the method embodiments of this application.
It should be understood that reference throughout the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, the appearances of "in one embodiment" or "in an embodiment" in various places throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the size of the sequence numbers of the above processes does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application. The above serial numbers of the embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.
It should be noted that, in this document, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the sentence "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of this application.
In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be accomplished by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage media include various media that can store program code, such as removable storage devices, read-only memories (ROM), magnetic disks, or optical disks.
Alternatively, if the above integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part of them contributing to the related technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a device to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage media include media that can store program code, such as removable storage devices, ROMs, magnetic disks, or optical discs.
The above are only implementations of the present application, but the scope of protection of the present application is not limited thereto. Any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed in the present application, which should all be covered by the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.
Industrial applicability
In the embodiments of this application, a first image feature of an image to be processed is first extracted; a second image feature is then matched from the image features of the key frame images stored in a preset map according to the first image feature; and finally the location information of the image capture device configured to capture the image to be processed is determined according to the first image feature and the second image feature. In this way, for any image to be processed, the matching frame image in the preset map can be obtained by matching its image features against the image features of the key frame images in the preset map, thereby realizing positioning of the image capture device without relying on fixed objects in the image.

Claims (20)

  1. A positioning method, wherein the method comprises:
    extracting a first image feature of an image to be processed;
    matching a second image feature from image features of key frame images stored in a preset map according to the first image feature; and
    determining, according to the first image feature and the second image feature, location information of an image capture device used to capture the image to be processed.
  2. The method according to claim 1, wherein the first image feature of the image to be processed comprises: identification information and two-dimensional position information of feature points of the image to be processed;
    the second image feature comprises: two-dimensional position information, three-dimensional position information, and identification information of feature points of the key frame image; and
    the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image into the three-dimensional coordinate system in which the preset map is located.
  3. The method according to claim 2, wherein extracting the first image feature of the image to be processed comprises:
    extracting a feature point set of the image to be processed; and
    determining identification information of each feature point in the feature point set and two-dimensional position information of each feature point in the image to be processed.
  4. The method according to claim 1, wherein matching a second image feature from the image features of the key frame images stored in the preset map according to the first image feature comprises:
    respectively determining ratios of different sample feature points in the feature point set, to obtain a first ratio vector;
    obtaining a second ratio vector, the second ratio vector being the ratio of the multiple sample feature points among the feature points contained in the key frame image; and
    matching a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
  5. The method according to claim 4, wherein matching a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector comprises:
    determining, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
    determining the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images; and
    selecting, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  6. The method according to claim 5, wherein selecting, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold comprises:
    determining time differences between capture times of at least two similar key frame images, and similarity differences between the image features of the at least two similar key frame images and the first image feature, respectively;
    joining similar key frame images whose time difference is less than a third threshold and whose similarity difference is less than a fourth threshold, to obtain a joint frame image; and
    selecting, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  7. The method according to claim 6, wherein selecting, from the image features of the joint frame image, a second image feature whose similarity with the first image feature meets a preset similarity threshold comprises:
    respectively determining sums of similarities between the image features of each key frame image contained in multiple joint frame images and the first image feature;
    determining the joint frame image with the largest sum of similarities as the target joint frame image with the highest similarity to the image to be processed; and
    selecting, from the image features of the target joint frame image and according to the identification information of the feature points of the target joint frame image and the identification information of the feature points of the image to be processed, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  8. The method according to any one of claims 1 to 7, wherein before determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed, the method further comprises:
    determining the image containing the second image feature as a matching frame image of the image to be processed; and
    determining, between any two feature points contained in the matching frame image, target Euclidean distances that are less than a first threshold, to obtain a target Euclidean distance set.
  9. The method according to claim 8, wherein determining, according to the first image feature and the second image feature, the location information of the image capture device used to capture the image to be processed comprises:
    if the number of target Euclidean distances contained in the target Euclidean distance set is greater than a fifth threshold, determining the location information of the image capture device based on the three-dimensional position information of the feature points of the key frame image corresponding to the second image feature and the two-dimensional position information of the feature points of the image to be processed corresponding to the first image feature.
  10. The method according to any one of claims 1 to 7, wherein before capturing the image of the current scene, the method further comprises:
    selecting key frame images meeting a preset condition from a sample image library, to obtain a key frame image set;
    extracting the image features of each key frame image to obtain a key image feature set;
    extracting feature points of the sample images to obtain a sample feature point set containing different feature points;
    determining the ratio of each sample feature point in the key frame images to obtain a ratio vector set; and
    storing the ratio vector set and the key image feature set to obtain the preset map.
  11. The method according to claim 10, wherein before selecting key frame images meeting a preset condition from the sample image library to obtain the key frame image set, the method further comprises:
    selecting a preset number of corner points from the sample image;
    if the number of identical corner points contained in two sample images with adjacent capture times is greater than or equal to a sixth threshold, determining that the scene corresponding to the sample images is a continuous scene; and
    if the number of identical corner points contained in two sample images with adjacent capture times is less than the sixth threshold, determining that the scene corresponding to the sample images is a discrete scene.
  12. The method according to claim 11, wherein selecting key frame images meeting a preset condition from the sample image library to obtain a key frame image set comprises:
    if the scene corresponding to the sample images is a discrete scene, selecting key frame images from the sample image library according to an input selection instruction; and
    if the scene corresponding to the sample images is a continuous scene, selecting key frame images from the sample image library according to a preset frame rate or disparity.
  13. The method according to claim 11, wherein determining the ratio of each sample feature point in the key frame images to obtain a ratio vector set comprises:
    determining a first average count according to a first number of sample images contained in the sample image library and a first count of occurrences of the i-th sample feature point in the sample image library, wherein the first average count indicates the average number of times the i-th sample feature point appears in each sample image;
    determining a second average count according to a second count of occurrences of the i-th sample feature point in the j-th key frame image and a second number of sample feature points contained in the j-th key frame image, wherein the second average count indicates the proportion of the i-th sample feature point among the sample feature points contained in the j-th key frame image; and
    obtaining the ratios of the sample feature points in the key frame images according to the first average count and the second average count, to obtain the ratio vector set.
  14. A positioning device, wherein the device comprises a first extraction module, a first matching module, and a first determination module, wherein:
    the first extraction module is configured to extract a first image feature of an image to be processed;
    the first matching module is configured to match a second image feature from image features of key frame images stored in a preset map according to the first image feature; and
    the first determination module is configured to determine, according to the first image feature and the second image feature, location information of an image capture device used to capture the image to be processed.
  15. The device according to claim 14, wherein the first image feature of the image to be processed comprises: identification information and two-dimensional position information of feature points of the image to be processed;
    the second image feature comprises: two-dimensional position information, three-dimensional position information, and identification information of feature points of the key frame image; and
    the three-dimensional position information of the feature points of the key frame image is obtained by mapping the two-dimensional position information of the feature points of the key frame image into the three-dimensional coordinate system in which the preset map is located.
  16. The device according to claim 15, wherein the first extraction module comprises:
    a first extraction sub-module, configured to extract a feature point set of the image to be processed; and
    a first determining sub-module, configured to determine identification information of each feature point in the feature point set and two-dimensional position information of each feature point in the image to be processed.
  17. The device according to claim 15, wherein the first matching module comprises:
    a second determining sub-module, configured to respectively determine ratios of different sample feature points in the feature point set, to obtain a first ratio vector;
    a first obtaining sub-module, configured to obtain a second ratio vector, the second ratio vector being the ratio of the multiple sample feature points among the feature points contained in the key frame image; and
    a first matching sub-module, configured to match a second image feature from the image features of the key frame image according to the first image feature, the first ratio vector, and the second ratio vector.
  18. The device according to claim 17, wherein the first matching sub-module comprises:
    a first determining unit, configured to determine, from the image features of the key frame image and according to the first ratio vector and the second ratio vector, similar image features whose similarity with the first image feature is greater than a second threshold;
    a second determining unit, configured to determine the similar key frame images to which the similar image features belong, to obtain a set of similar key frame images; and
    a first selection unit, configured to select, from the image features of the similar key frame images, a second image feature whose similarity with the first image feature meets a preset similarity threshold.
  19. A terminal, comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 13 when executing the program.
  20. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 13.
PCT/CN2020/096488 2019-06-28 2020-06-17 Positioning method and device, terminal, and storage medium WO2020259360A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910579358.7 2019-06-28
CN201910579358.7A CN112150548B (zh) Positioning method and device, terminal, and storage medium

Publications (1)

Publication Number Publication Date
WO2020259360A1 true WO2020259360A1 (zh) 2020-12-30

Family

ID=73891534

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096488 WO2020259360A1 (zh) 2019-06-28 2020-06-17 定位方法及装置、终端、存储介质

Country Status (2)

Country Link
CN (1) CN112150548B (zh)
WO (1) WO2020259360A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076443A (zh) * 2021-03-05 2021-07-06 Oppo广东移动通信有限公司 Positioning method and device, electronic equipment, and storage medium
CN113657164A (zh) * 2021-07-15 2021-11-16 美智纵横科技有限责任公司 Method and device for calibrating a target object, cleaning equipment, and storage medium
WO2022247548A1 (zh) * 2021-05-27 2022-12-01 上海商汤智能科技有限公司 Positioning method and device, electronic equipment, and storage medium
CN117557599A (zh) * 2024-01-12 2024-02-13 上海仙工智能科技有限公司 3D moving object tracking method and system, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180150961A1 (en) * 2016-06-30 2018-05-31 Daqri, Llc Deep image localization
CN108269278A (zh) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 Scene modeling method and device
CN108596976A (zh) * 2018-04-27 2018-09-28 腾讯科技(深圳)有限公司 Relocation method, device, equipment, and storage medium for camera pose tracking
CN108629843A (zh) * 2017-03-24 2018-10-09 成都理想境界科技有限公司 Method and equipment for implementing augmented reality
CN108955718A (zh) * 2018-04-10 2018-12-07 中国科学院深圳先进技术研究院 Visual odometer, positioning method thereof, robot, and storage medium
CN109544615A (zh) * 2018-11-23 2019-03-29 深圳市腾讯信息技术有限公司 Image-based relocation method, device, terminal, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427263A (zh) * 2015-12-21 2016-03-23 努比亚技术有限公司 Method and terminal for implementing image registration
CN107610108B (zh) * 2017-09-04 2019-04-26 腾讯科技(深圳)有限公司 Image processing method and device
CN107888828B (zh) * 2017-11-22 2020-02-21 杭州易现先进科技有限公司 Spatial positioning method and device, electronic equipment, and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180150961A1 (en) * 2016-06-30 2018-05-31 Daqri, Llc Deep image localization
CN108269278A (zh) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 Scene modeling method and device
CN108629843A (zh) * 2017-03-24 2018-10-09 成都理想境界科技有限公司 Method and equipment for implementing augmented reality
CN108955718A (zh) * 2018-04-10 2018-12-07 中国科学院深圳先进技术研究院 Visual odometer, positioning method thereof, robot, and storage medium
CN108596976A (zh) * 2018-04-27 2018-09-28 腾讯科技(深圳)有限公司 Relocation method, device, equipment, and storage medium for camera pose tracking
CN109544615A (zh) * 2018-11-23 2019-03-29 深圳市腾讯信息技术有限公司 Image-based relocation method, device, terminal, and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076443A (zh) * 2021-03-05 2021-07-06 Oppo广东移动通信有限公司 Positioning method and device, electronic equipment, and storage medium
WO2022247548A1 (zh) * 2021-05-27 2022-12-01 上海商汤智能科技有限公司 Positioning method and device, electronic equipment, and storage medium
CN113657164A (zh) * 2021-07-15 2021-11-16 美智纵横科技有限责任公司 Method and device for calibrating a target object, cleaning equipment, and storage medium
CN117557599A (zh) * 2024-01-12 2024-02-13 上海仙工智能科技有限公司 3D moving object tracking method and system, and storage medium
CN117557599B (zh) * 2024-01-12 2024-04-09 上海仙工智能科技有限公司 3D moving object tracking method and system, and storage medium

Also Published As

Publication number Publication date
CN112150548A (zh) 2020-12-29
CN112150548B (zh) 2024-03-29

Similar Documents

Publication Publication Date Title
WO2021057797A1 Positioning method and device, terminal, and storage medium
WO2020259360A1 Positioning method and device, terminal, and storage medium
WO2020259361A1 Map update method and device, terminal, and storage medium
CN110568447B (zh) Visual positioning method, device, and computer-readable medium
CN109947975B (zh) Image retrieval device, image retrieval method, and setting screen used therein
WO2021057744A1 Positioning method and device, equipment, and storage medium
WO2020259481A1 Positioning method and device, electronic equipment, and readable storage medium
CN107369183A (zh) Graph-optimization-SLAM-based tracking registration method and system for MAR
US9626585B2 (en) Composition modeling for photo retrieval through geometric image segmentation
CN110738703B (zh) Positioning method and device, terminal, and storage medium
US9288636B2 (en) Feature selection for image based location determination
CN111291768A (zh) Image feature matching method and device, equipment, and storage medium
Shen et al. Image-based indoor place-finder using image to plane matching
CN112419388A (zh) Depth detection method and device, electronic equipment, and computer-readable storage medium
WO2022237026A1 Plane information detection method and system
CN105740777A (zh) Information processing method and device
CN113408590A (zh) Scene recognition method, training method, device, electronic equipment, and program product
US9188444B2 (en) 3D object positioning in street view
WO2015069560A1 (en) Image based location determination
EP3300020A1 (en) Image based location determination
US20150134689A1 (en) Image based location determination
JP2023510945A Scene identification method and device therefor, intelligent device, storage medium, and computer program
CN110209881B (zh) Video search method, device, and storage medium
JP6244887B2 Information processing device, image search method, and program
Kim et al. Semantic Descriptors into Representation for Robust Indoor Visual Place Recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20832706

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20832706

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20832706

Country of ref document: EP

Kind code of ref document: A1