WO2021129145A1 - Image feature point filtering method and terminal - Google Patents

Image feature point filtering method and terminal

Info

Publication number
WO2021129145A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature points
quality score
feature
original image
Prior art date
Application number
PCT/CN2020/125271
Other languages
English (en)
French (fr)
Inventor
邹李兵
张一凡
李保明
宁越
Original Assignee
歌尔股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 歌尔股份有限公司
Priority to US17/595,079 (US12051233B2)
Publication of WO2021129145A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • G06F18/2113Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • This application relates to the technical field of image processing, and in particular to a method for filtering image feature points and a terminal.
  • Point-cloud-based motion state estimation, 3D reconstruction, and visual positioning are at the core of technologies such as SLAM (Simultaneous Localization And Mapping), especially in the field of SFM (Structure From Motion).
  • Good visual feature points are conducive to improving positioning accuracy and maintaining continuous and stable motion tracking.
  • However, in the prior art the number of feature points is large and the quality of the feature points is poor, so it is difficult to quickly relocalize after tracking is lost, which affects positioning efficiency and accuracy.
  • In view of the above problems, this application is proposed to provide an image feature point filtering method and terminal that overcome the above problems or at least partially solve them.
  • The feature points are quantitatively scored by a neural network model, and low-quality feature points are effectively filtered out, which saves storage space and improves positioning performance.
  • According to one aspect of this application, a method for filtering image feature points is provided, including:
  • setting quality scores for the feature points extracted from an image, and training a neural network model according to the feature points and their quality scores; wherein, during training, the quality score of a feature point of the current frame image is jointly determined by the feature point's quality score in the current frame and its quality scores in the tracking frames, the tracking frames including a preset number of consecutive frames after the tracked current frame; after a filtering is started, acquiring a frame of original image and extracting its feature points; inputting the original image and its feature points into the neural network model, obtaining and outputting the quality scores corresponding to the feature points of the original image; and filtering the feature points of the original image according to those quality scores and a preset filtering rule.
  • According to another aspect of this application, a terminal is provided, including an image feature point filtering device, and the image feature point filtering device includes:
  • a model training module, used to set quality scores for the feature points extracted from an image and train a neural network model according to the feature points and their quality scores, wherein, during training, the quality score of a feature point of the current frame image is jointly determined by the feature point's quality score in the current frame and its quality scores in the tracking frames, the tracking frames including a preset number of consecutive frames after the tracked current frame;
  • a feature extraction module, used to obtain a frame of original image and extract the feature points of the original image after a filtering is started;
  • a score determination module, used to input the original image and its feature points into the neural network model, obtain the quality scores corresponding to the feature points of the original image, and output those quality scores;
  • a filtering module, used to filter the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
  • According to yet another aspect, a computer-readable storage medium is provided, which stores a computer program that causes a computer to execute the image feature point filtering method of the foregoing method embodiments.
  • With the image feature point filtering method and terminal, on the one hand, by setting quality scores for feature points and using a neural network to quantitatively score them, low-quality feature points are effectively filtered out and the number of feature points is reduced, thereby reducing the demand for computing and storage resources; on the other hand, the scoring method that combines the current frame and the tracking frames considers not only the current weight of a feature point but also its long-term benefit, thereby improving the filtering performance.
  • Fig. 1 shows a schematic flowchart of a method for filtering image feature points according to an embodiment of the present application
  • Fig. 2 shows a schematic diagram of a neural network model training process according to an embodiment of the present application
  • Fig. 3 shows a schematic diagram of an original image, a mask image, and a mask value image according to an embodiment of the present application
  • FIG. 4 shows a schematic flow chart of using a neural network model to predict a quality score corresponding to a feature point according to an embodiment of the present application
  • Fig. 5 shows a block diagram of a terminal according to an embodiment of the present application.
  • Fig. 6 shows a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present application.
  • the image feature point filtering method of the embodiment of the present application is applied to the fields of motion state estimation, three-dimensional reconstruction, and visual positioning based on sparse point clouds, and is the key to technologies such as Visual SLAM, virtual reality, and deep learning.
  • Visual SLAM (or V-SLAM) acquires environmental images through visual sensors, extracts feature points to construct a point cloud map, and realizes positioning and navigation on this basis.
  • Feature point filtering is a link in V-SLAM positioning and tracking, specifically the process of filtering the extracted feature points on the basis of feature point extraction.
  • At present there are two point cloud generation methods in tracking and positioning: the traditional handcrafted feature point method and the automatic feature point method based on a deep neural network. The handcrafted feature point method has a clear theoretical basis, is easy to understand, and has low computational complexity, but in practical use its feature points are easily affected by texture, illumination, sharpness, and other factors, which leads to problems such as unreasonable feature point extraction and poor tracking.
  • The feature point method based on a deep neural network exploits the good stability and robustness of deep neural networks; the extracted feature points adapt well to the environment and track better, and it is currently a hot research topic.
  • Feature point processing based on a deep neural network extracts the feature points of an image through an LSTM (Long Short-Term Memory) or CNN (Convolutional Neural Network) network and generates feature descriptors, so as to construct a point cloud and realize tracking and relocalization on that basis.
  • In terms of actual performance, the feature point cloud generation method based on a deep neural network adapts well to the environment, resists interference, and does not easily lose tracking, but poor relocalization performance after tracking is lost can still occur.
  • After analysis, the inventor of the present application found that the main cause of this phenomenon is that when feature points are selected (during model training), only the feature correlation between adjacent frames is considered, and the long-term tracking performance of selecting that point to construct the feature point cloud is not considered, which makes it difficult to quickly relocalize after target tracking is lost.
  • In addition, since the number of feature points is not filtered, the point cloud is too dense, which increases the storage space, and an overly dense point cloud also increases the retrieval time.
  • To solve the above technical problems, this application proposes a feature point quality evaluation and filtering scheme based on a deep neural network. Based on this technical solution, the quality of feature points can be quantitatively scored, and in actual application the feature points can be filtered and screened according to the scores. For example, in the V-SLAM point cloud construction stage, this technical solution can be applied to sparsify the point cloud, retain important feature points, reduce the storage space of the point cloud map, and realize compression of the point cloud. Moreover, in the V-SLAM relocalization stage, since the feature evaluation mechanism of the embodiments of the application takes long-term benefit into account, the matching success rate of feature points can also be improved, thereby improving the retrieval speed and positioning accuracy of V-SLAM in the relocalization stage and meeting actual needs.
  • Fig. 1 shows a schematic flowchart of an image feature point filtering method according to an embodiment of the present application.
  • the image feature point filtering method of this embodiment includes the following steps:
  • Step S101: Set quality scores for the feature points extracted from an image, and train a neural network model according to the feature points and their quality scores; wherein, during training, the quality score of a feature point of the current frame image is jointly determined by the feature point's quality score in the current frame and its quality scores in the tracking frames, and the tracking frames include a preset number of consecutive frames after the tracked current frame.
  • In this embodiment, a quality score parameter is first added to each feature point to quantify and evaluate its value.
  • The quality score can have a predetermined initial value, but each time image data is input into the neural network for training, the quality score of a feature point in the current frame image is determined jointly by its quality scores in the current frame and in the tracking frames. This approach takes into account the value of the frame following the current frame, balances the value that the next frame and subsequent consecutive frames contribute to this filtering, and improves the filtering accuracy.
  • Step S102: After a filtering is started, a frame of original image is acquired and the feature points of the original image are extracted.
  • Step S103 Input the original image and the feature points of the original image into the neural network model to obtain the quality scores corresponding to the feature points of the original image and output the quality scores corresponding to the feature points of the original image.
  • Step S104 Filter the feature points of the original image according to the quality score corresponding to the feature points of the original image and the preset filtering rule.
  • The image feature point filtering method of this embodiment sets quality scores for feature points, quantitatively scores feature point quality through the neural network, and filters the feature points according to the quality scores, which improves the efficiency of terminal navigation and positioning. Moreover, when quantifying feature point quality, the scoring method that combines the feature points of the current frame and the tracking frames considers both the current weight of a feature point and its long-term benefit (that is, the performance and value of the feature point in subsequent frames), thereby improving the filtering performance and ensuring positioning accuracy.
  • To improve the prediction performance of the neural network model, this embodiment adopts an unsupervised learning approach, without manually labeling samples, and dynamically updates the neural network model through online learning and training so as to continuously improve its accuracy. That is to say, training the neural network model according to the feature points and their quality scores in this embodiment includes: generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, training the neural network model with the generated samples, and updating the quality score prediction value of the neural network model for the next filtering.
  • The quality score prediction value here is a parameter of the neural network model.
  • During initialization of the neural network model, the quality score prediction value is a default value.
  • In this embodiment, samples are generated and the neural network model is trained to update this default value, which improves the accuracy of the model in the prediction stage and ensures that the neural network model predicts more accurate feature point quality scores after the image data to be filtered is input.
  • Fig. 2 shows a schematic diagram of a neural network model training process according to an embodiment of the present application. The following describes the neural network model training process of an embodiment of the present application with reference to Fig. 2.
  • Referring to Fig. 2, for an image frame I n, step S201 is first performed to extract feature points.
  • ORB-SLAM is a visual SLAM (Simultaneous Localization And Mapping) based on the ORB (Oriented FAST and Rotated BRIEF) descriptor. This feature detection operator was proposed on the basis of the FAST feature detector and the BRIEF feature descriptor; its running time is far better than SIFT and SURF, and it can be applied to real-time feature detection.
  • ORB feature detection has scale and rotation invariance, as well as invariance to noise and perspective transformations.
  • the feature points can be stored in the feature point set P n .
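  • The extraction step can be sketched with OpenCV's ORB implementation. This is an illustrative sketch only, not part of the patent text; the function name, the file path argument, and the `nfeatures` value are assumptions.

```python
# Sketch of step S201: extract ORB feature points from a frame and collect the set P_n.
# Assumes OpenCV (cv2) is available; the path and parameter values are illustrative only.
import cv2

def extract_feature_points(frame_path, nfeatures=500):
    image = cv2.imread(frame_path)                        # original image I_n
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=nfeatures)             # FAST detector + rotated BRIEF descriptor
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    # P_n: integer pixel coordinates of the detected feature points
    feature_set = [(int(kp.pt[0]), int(kp.pt[1])) for kp in keypoints]
    return image, feature_set, descriptors
```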
  • Step S202 generating a mask map and generating a mask value map.
  • After the feature points of the image are extracted, this step obtains the mask image according to the image I n and the feature points of the image I n; the value of the pixel corresponding to a feature point in the mask image is 256 (for example only), and the values of the remaining pixels are 0 (for example only).
  • In addition, according to the image, the feature points of the image, and the quality scores corresponding to the feature points, the mask value image is obtained; the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point.
  • The process of obtaining the mask value image is realized on the basis of setting quality scores for the feature points extracted from the image.
  • Fig. 3 shows a schematic diagram of an original image, a mask image, and a mask value image according to an embodiment of the present application. See Fig. 3.
  • The leftmost image I n is the original image, and the three feature points extracted from the original image are feature point A, feature point B, and feature point C.
  • the width and height of the original image I n are used as the standard to construct the mask image Mask n and the mask value image MaskV n of the same size.
  • The mask image is shown in the middle image of Fig. 3, and the mask value image is shown in the rightmost image of Fig. 3.
  • During initial value setting, the values of the points in the mask map Mask n at the positions corresponding to the feature points in the aforementioned set P n are set to 256 (for example only), and the values of the points in the mask value map MaskV n at the positions corresponding to the feature points are all set to 1 (or set to 1, 3, and 5 respectively).
  • Meanwhile, the values of the other pixels in the mask map Mask n are set to 0, and the values of the other pixels in the mask value map MaskV n are set to -1. That is, by setting quality scores at the pixels corresponding to feature points and setting all other pixels to the same value, the feature-point pixels are distinguished from the remaining pixels, and subsequent processing only needs to focus on the feature-point pixels, which improves the efficiency of the algorithm.
  • It can be seen from the above that the mask image stores the pixel positions and gray values of the feature points of the original image, while the mask value image stores the pixel positions and quality scores of the feature points of the original image. Generating the mask map and the mask value image not only facilitates information processing, but also makes it possible for the neural network model to specifically extract the information hidden in them.
  • the pixel point corresponding to the feature point A in the mask value image MaskV n in FIG. 3 is the point in the first row and the third column, and the quality score of this point is 1.
  • the pixel point corresponding to the feature point B is the point in the third row and the second column, and the quality score of this point is 3.
  • the pixel point corresponding to the feature point C is the point in the fourth row and the fifth column, and the quality score of this point is 5.
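  • The construction of Mask n and MaskV n described above can be illustrated with a short NumPy sketch. The function below is an assumption for illustration only (not from the patent); it uses the example values 256, 1, 0 and -1 mentioned in the text.

```python
# Sketch of step S202: build Mask_n and MaskV_n with the example values from the text.
# Note: the example value 256 does not fit in uint8, so a wider integer type is used here.
import numpy as np

def build_masks(image_shape, feature_points, initial_score=1.0):
    h, w = image_shape[:2]
    mask = np.zeros((h, w), dtype=np.uint16)         # Mask_n: 256 at feature points, 0 elsewhere
    mask_value = np.full((h, w), -1.0, np.float32)   # MaskV_n: quality score at feature points, -1 elsewhere
    for x, y in feature_points:                      # (column, row) pixel coordinates from P_n
        mask[y, x] = 256
        mask_value[y, x] = initial_score
    return mask, mask_value
```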
  • setting the quality score for the feature points extracted from the image includes setting the same initial value or different initial values for the quality score of the feature points extracted from the image.
  • Taking the same initial value of 1 as an example, the reason why the quality scores corresponding to the three feature points shown on the left of Figure 3 end up different is that the benefits of tracking these three feature points after the current frame differ, or, in other words, the importance of the three feature points in the tracking frames after the original image differs; this is also the principle by which this application can quantitatively score and filter feature points.
  • Step S203 feature point matching.
  • Feature point matching can be implemented with existing technology. This embodiment focuses on the result of feature point matching, that is, whether there are successfully matched feature points or not.
  • Step S204: Determine whether there are matching feature points; if yes, step S205 is executed; otherwise, i.e., the number of matched feature points is 0, step S207 is executed.
  • If the number of matching points between two adjacent frames is 0, step S207 is executed directly; that is, when there are no successfully matched feature points between the image and its next frame, the feature points extracted from the image are not tracked, and samples are generated directly from the image, the mask image, and the mask value image.
  • If matching points exist between the two adjacent frames (the current frame and the next frame), the mask value map of the current frame image is modified according to this tracking result.
  • Step S205 Modify the mask value map.
  • In this step, when there are successfully matched feature points between the image and its next frame, the feature points extracted from the image are tracked, and the quality score in each tracking frame is determined according to the product of the feature point's quality score in the frame preceding the tracking frame and a discount factor, where the discount factor is a constant greater than 0 and less than 1.
  • Referring to Fig. 2, tracking is a cyclic process. One round of tracking includes steps S203 to S206: read in the next frame → execute S203 to match the feature points of the next frame with those of the current frame → if there are matching points, execute step S205 to modify the mask values (quality scores) of the matching points in the mask value map of the current frame → if the tracking frame depth has not been reached (that is, the judgment result of step S206 in Fig. 2 is No), return to step S203 to continue tracking, or, when the tracking frame depth has been reached (that is, the judgment result of step S206 in Fig. 2 is Yes), end the tracking.
  • The exit condition of the tracking loop is that the depth of the currently tracked frames has reached the preset tracking frame depth.
  • the so-called tracking frame depth is the number of frames that are continuously tracked after the current frame. For example, if you track k consecutive frames after the current frame, then k is the depth threshold, and k is a natural number greater than zero.
  • Tracking embodies the inventive concept of the “cumulative discounted return” of this application, highlights the value of the next frame of the current frame, balances the value of the next frame and subsequent consecutive image frames for the current image, which is beneficial to improve positioning accuracy.
  • Specifically, in each round of tracking the quality score of the feature point in the current frame and its quality scores in the tracking frames are summed to obtain a sum value, and the quality score corresponding to the feature point in the mask value image of the current frame is replaced with the sum value.
  • For example, the quality score of the feature points of the current frame is updated in each round of tracking by the following update formula:
    V = Σ_{s=0}^{k} λ^s · v
  • where 0 < λ < 1 is the discount factor, v is the quality score in the frame preceding each tracking frame, s is the tracking frame depth, and k is the threshold of the tracking frame depth.
  • For example, k is equal to 4, that is, 4 frames are tracked consecutively after the current frame.
  • The initial value of v is the quality score 1 corresponding to the feature points in the mask value image.
  • s equal to 0 means untracked, that is, the image frame is the current frame, and V denotes the sum value; then for the current frame, the sum value V equals 1*1, i.e., 1, which is consistent with the aforementioned initial quality score of 1 for the feature points in the mask value image.
  • When the tracking frame depth s equals 1, the quality scores of the current frame and the next frame are considered: the quality score of the feature point in the current frame is 1, and its quality score in the next frame equals the quality score of the current frame plus λ.
  • When the tracking frame depth s equals 2, the two frames after the current frame (the next frame and the frame after next) are tracked, and the quality score of the feature point of the current frame is determined jointly from these three frames. As described above, the quality score in the current frame is 1, the quality score in the next frame equals the score of the current frame plus λ (that is, 1 + λ), and the quality score in the frame after next equals the quality score of the current frame plus the quality score of the next frame plus λ squared times v (v = 1); and so on.
  • Step S206: Determine whether the tracking frame depth is less than the threshold; if yes, execute step S207; otherwise, return to step S204.
  • Each time a round of tracking succeeds, the tracking frame depth is incremented by 1, and it is judged whether the accumulated tracking frame depth has reached the tracking frame depth threshold. If the threshold has been reached, tracking stops; otherwise tracking continues in order to update the quality score of the current frame. That is, the sum of the quality scores in the tracking frames and the quality score of the current frame is computed, and at the end of all tracking the final sum value (that is, the sum value V of the most recently tracked frame) replaces the value of the pixel corresponding to the feature point in the mask value image shown in Figure 3, thereby obtaining the quality score of the feature point in the current frame.
  • For example, after the sum-value replacement according to the above tracking process, the quality score of the pixel corresponding to feature point B is 3, and the quality score of the pixel corresponding to feature point C is 5. It can thus be seen that the quality scores of the three feature points A, B, and C in the original image are different.
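  • Reading the update rule as the cumulative discounted return V = Σ_{s=0}^{k} λ^s · v described above, the mask value update during tracking can be sketched as follows. The function names and the way matches are passed in are assumptions for illustration only.

```python
# Sketch of steps S205/S206: accumulate discounted returns into MaskV_n for tracked feature points.
# matches_per_depth[d] is assumed to hold the feature points of the current frame that are still
# matched at tracking depth d+1 (depth 0 is the current frame itself, already initialised to v).
def update_mask_values(mask_value, matches_per_depth, lam=0.9, v=1.0, k=4):
    for depth, matched_points in enumerate(matches_per_depth[:k], start=1):
        for x, y in matched_points:
            # add the discounted contribution lambda**depth * v of this tracking frame
            mask_value[y, x] += (lam ** depth) * v
    return mask_value

# Closed-form sum for a point tracked through all k frames: V = sum_{s=0..k} lam**s * v
def cumulative_discounted_return(lam=0.9, v=1.0, k=4):
    return sum((lam ** s) * v for s in range(k + 1))
```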
  • Step S207 output the mask value map and the mask map.
  • On the basis of the foregoing step S206, if tracking succeeded, the mask value map with modified mask values is output, such as the mask value map shown on the far right in Figure 3; if tracking failed, the original mask value map is output (in which the mask values are the initial values and have not been modified by tracking).
  • In addition, the mask map is also output in this step to be used for generating samples.
  • Step S208: A sample (S, V) is generated. That is, the image, the mask image, and the mask value image are combined to generate a sample, and the sample (S, V) is output to the sample set Φ, where S = {original frame I n, Mask n} and V = MaskV n.
  • Step S209 Generate a training set.
  • The training set, the validation set, and the test set are generated by sampling from the sample set Φ according to a certain ratio.
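  • Step S209 can be sketched as a simple random split of the sample set Φ; the split ratios below are assumptions, since the text only says "according to a certain ratio".

```python
# Sketch of step S209: split the sample set into training, validation and test sets.
import random

def split_samples(samples, train_ratio=0.8, val_ratio=0.1, seed=0):
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_train = int(len(samples) * train_ratio)
    n_val = int(len(samples) * val_ratio)
    return (samples[:n_train],                     # training set
            samples[n_train:n_train + n_val],      # validation set
            samples[n_train + n_val:])             # test set
```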
  • Step S210 neural network model training.
  • In this embodiment, a first input channel for receiving the mask image and a second input channel for receiving the mask value image are added to the neural network model.
  • The image in a sample is input into the neural network model through the input channel of the neural network model, and the mask image and the mask value image are input into the neural network model through the first input channel and the second input channel respectively. Based on the mean square error loss function, the error between the quality scores in the mask value image of the sample and the quality score prediction value of the neural network model is calculated, and a new quality score prediction value of the neural network model is determined according to the error, thereby updating the quality score prediction value of the neural network model.
  • That is, on the basis that the image (a color image) contains the three RGB (red, green, blue) channels, two input channels are added as the mask layer and the mask value layer, and the image, the mask image, and the mask value image are input into the neural network through these input channels.
  • Optionally, the backbone of the neural network model is built on VGG16, a network pre-trained on the ImageNet image library of a large number of real images.
  • Optionally, the loss function of the neural network model is the mean square loss function, as shown below:
    L = (1/n) Σ_{i=1}^{n} (y_i − y_i')²
  • where y_i' is the quality score prediction value of the neural network model, and y_i is V, that is, the value in the mask value image MaskV n of the sample.
  • The quality score prediction value of the neural network model is determined at the end of the neural network model training process; specifically, the prediction value with the smallest difference from the mask values in the samples is used as the quality score prediction value at the end of training.
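  • A possible PyTorch sketch of such a network is shown below: the first VGG16 convolution is replaced so that it accepts the three RGB channels plus the mask and mask-value channels, and the mean square error is computed only at feature-point pixels (the -1 background of MaskV n is ignored). The details beyond what the text states (the 1x1 prediction head, bilinear upsampling, skipping non-feature pixels) are assumptions.

```python
# Sketch of the scoring network: VGG16 backbone with a 5-channel input (RGB + mask + mask value)
# and a per-pixel quality-score head trained with a mean square error loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class FeatureScoreNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = torchvision.models.vgg16(weights=None).features  # load ImageNet weights here if desired
        backbone[0] = nn.Conv2d(5, 64, kernel_size=3, padding=1)    # 3 RGB + mask + mask-value channels
        self.backbone = backbone
        self.head = nn.Conv2d(512, 1, kernel_size=1)                 # per-pixel quality score

    def forward(self, rgb, mask, mask_value):
        x = torch.cat([rgb, mask, mask_value], dim=1)                # N x 5 x H x W
        features = self.backbone(x)
        score = self.head(features)
        return F.interpolate(score, size=rgb.shape[-2:], mode="bilinear", align_corners=False)

def masked_mse_loss(pred, target_mask_value):
    # Only feature-point pixels carry a quality score; background pixels are -1 in MaskV_n.
    valid = (target_mask_value >= 0).float()
    diff = (pred - target_mask_value) ** 2 * valid
    return diff.sum() / valid.sum().clamp(min=1.0)
```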
  • Step S211: Model output. Using the BP (Error Back Propagation) algorithm, the neural network model used to predict feature point quality scores is finally generated and output.
  • the basic idea of the BP algorithm is that the learning process consists of two processes: the forward propagation of the signal and the back propagation of the error.
  • During forward propagation, the input samples are fed in from the input layer, processed layer by layer by the hidden layers, and then passed to the output layer. If the actual output of the output layer does not match the expected output, the process turns to the error back-propagation stage.
  • This process of forward signal propagation and backward error propagation, with the weights of each layer being adjusted, is repeated over and over; the continual adjustment of the weights is exactly the learning and training process of the network. This process continues until the error of the network output is reduced to an acceptable level, or until a preset number of learning iterations is reached.
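  • The forward/backward procedure described above can be sketched as a standard gradient-descent loop. The stand-in use of a generic `model` and `train_loader`, the optimiser settings, and the stopping criteria below are assumptions for illustration, not the patent's exact configuration.

```python
# Sketch of the BP training loop: forward pass, MSE error, error back-propagation, weight update,
# repeated until the error is acceptable or a preset number of iterations is reached.
import torch
import torch.nn as nn

def train(model, train_loader, epochs=50, lr=1e-4, error_tolerance=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for epoch in range(epochs):                       # preset number of learning iterations
        epoch_loss = 0.0
        for inputs, target_mask_value in train_loader:
            optimizer.zero_grad()
            prediction = model(inputs)                # forward propagation
            loss = criterion(prediction, target_mask_value)
            loss.backward()                           # error back-propagation
            optimizer.step()                          # layer-by-layer weight adjustment
            epoch_loss += loss.item()
        if epoch_loss / max(len(train_loader), 1) < error_tolerance:
            break                                     # error reduced to an acceptable level
    return model
```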
  • Fig. 4 shows a schematic flow chart of using a neural network model to predict the quality score corresponding to a feature point according to an embodiment of the present application. See Fig. 4.
  • First, in one filtering pass, a frame of original image is obtained (the image frame I n shown in Fig. 4) and the feature points of the original image are extracted; then, a mask map is generated according to the aforementioned step S202.
  • Next, the mask map and the original image are combined (i.e., combined into S) and input into the neural network model for neural network prediction.
  • That is, the original image and the mask map that records the feature points of the original image are input into the neural network model together; the original image refers to the to-be-filtered image obtained in one filtering pass.
  • Then, the neural network model outputs a mask value map (i.e., the output mask value map V).
  • The mask value map output by the neural network model contains the quality score information corresponding to the feature points of the image; after the mask value map is obtained, the quality scores corresponding to the feature points of the original image can be determined.
  • the feature points of the original image are filtered according to the quality score corresponding to the feature points of the original image and the preset filtering rules.
  • In practical applications, the filtering rules can be preset as needed. For example, filtering the feature points of the original image according to their corresponding quality scores and the preset filtering rule includes: when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold; and when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order according to their quality scores and filtering out a preset number of the lowest-ranked feature points.
  • It follows that in this embodiment the feature points can be sorted according to the mask value map (for example, in reverse order from large to small, a larger value indicating a higher degree of importance), and the filtering rules can then be applied to complete the feature point filtering, for example, filtering out all feature points below a certain threshold. That is, filtering of feature points based on the mask values is realized, as sketched below.
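  • A minimal sketch of the two filtering rules follows; the function names and the (point, score) pairing are assumptions for illustration.

```python
# Sketch of step S104: threshold filtering and ranking filtering of scored feature points.
def threshold_filter(points_with_scores, score_threshold):
    # keep only feature points whose quality score is greater than the preset threshold
    return [(p, s) for p, s in points_with_scores if s > score_threshold]

def rank_filter(points_with_scores, drop_count):
    # sort in descending order of quality score and drop the lowest-ranked `drop_count` points
    ordered = sorted(points_with_scores, key=lambda ps: ps[1], reverse=True)
    keep = max(len(ordered) - drop_count, 0)
    return ordered[:keep]
```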
  • FIG. 5 shows a block diagram of a terminal according to an embodiment of the present application.
  • the terminal 500 includes an image feature point filtering device 501, and the image feature point filtering device 501 includes:
  • the model training module 5011 is used to set quality scores for the feature points extracted from an image and train the neural network model according to the feature points and their quality scores, wherein, during training, the quality score of a feature point of the current frame image is determined jointly by the feature point's quality score in the current frame and its quality scores in the tracking frames, and the tracking frames include a preset number of consecutive frames after the tracked current frame;
  • the feature extraction module 5012 is used to obtain a frame of original image and extract the feature points of the original image after the filtering is started once;
  • the score determination module 5013 is used to input the original image and the feature points of the original image into the neural network model to obtain the quality scores corresponding to the feature points of the original image and output the quality scores corresponding to the feature points of the original image;
  • the filtering module 5014 is configured to filter the feature points of the original image according to the quality score corresponding to the feature points of the original image and preset filtering rules.
  • the model training module 5011 is used to generate samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, and to train the neural network model with the generated samples, updating the quality score prediction value of the neural network model for the next filtering.
  • the model training module 5011 is specifically configured to set the same initial value or different initial values for the quality scores of the feature points extracted from the image.
  • the model training module 5011 is specifically configured to obtain a mask image according to the image and the feature points of the image, wherein the value of the pixel corresponding to a feature point in the mask image is 256 and the values of the remaining pixels are 0; obtain a mask value image according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point; and generate samples based on the image, the mask image, and the mask value image.
  • the model training module 5011 is specifically configured to track the feature points extracted from the image when there are successfully matched feature points between the image and the next frame of the image; and, during tracking, to sum the quality score of a feature point in the current frame and its quality scores in the tracking frames to obtain a sum value and replace the quality score corresponding to the feature point in the mask value image with the sum value, where the quality score in each tracking frame is determined according to the product of the feature point's quality score in the frame preceding the tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
  • the model training module 5011 is specifically used to add to the neural network model a first input channel for receiving the mask image and a second input channel for receiving the mask value image;
  • to input the image in a sample into the neural network model through the input channel of the neural network model, and to input the mask image and the mask value image into the neural network model through the first input channel and the second input channel respectively; and, based on the mean square error loss function, to calculate the error between the quality scores in the mask value image of the sample and the quality score prediction value of the neural network model, and determine a new quality score prediction value of the neural network model according to the error, thereby updating the quality score prediction value of the neural network model.
  • the model training module 5011 is specifically configured to not track the feature points extracted from the image when there is no successfully matched feature point between the image and the next frame of the image, and directly according to the image, The mask image and the mask value image generate samples.
  • the filtering module 5014 is specifically configured to filter out the feature points of the original image whose quality scores are not greater than a preset quality score threshold when the preset filtering rule is a threshold filtering rule; and, when the preset filtering rule is a ranking filtering rule, to sort the feature points of the original image in descending order according to their quality scores and filter out a preset number of the lowest-ranked feature points.
  • Referring to FIG. 6, another embodiment of the present application provides a computer-readable storage medium 600. The computer-readable storage medium stores a computer program, and the computer program causes a computer to execute the image feature point filtering method of any one of the above method embodiments.
  • the computer program causes the computer to execute the following image feature point filtering method:
  • the quality scores for the feature points extracted from the image and train the neural network model according to the feature points and the quality scores of the feature points; wherein, in the training process, the quality scores of the feature points of the current frame image are based on the feature points in the The quality score in the current frame and the quality score in each tracking frame are jointly determined, and the tracking frame includes a preset number of consecutive frames after the tracked current frame;
  • the computer program also causes the computer to execute the following image feature point filtering method: generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, training the neural network model with the generated samples, and updating the quality score prediction value of the neural network model for the next filtering.
  • the computer program also causes the computer to execute the following image feature point filtering method:
  • obtaining a mask image according to the image and the feature points of the image; and obtaining a mask value image according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point;
  • generating samples according to the image, the mask image, and the mask value image.
  • the computer program also causes the computer to execute the following image feature point filtering method:
  • when there are successfully matched feature points between the image and its next frame, tracking the feature points extracted from the image; during tracking, summing the quality score of the feature point in the current frame and its quality scores in the tracking frames to obtain a sum value, and replacing the quality score corresponding to the feature point in the mask value image with the sum value;
  • the quality score in each tracking frame is determined according to the product of the quality score of the feature point in the previous frame of the tracking frame and a discount factor, where the discount factor is a constant greater than 0 and less than 1.
  • the computer program also causes the computer to execute the following image feature point filtering method:
  • when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold;
  • when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order according to their quality scores, and filtering out a preset number of the lowest-ranked feature points of the original image.
  • this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

An image feature point filtering method and a terminal. The image feature point filtering method includes: setting quality scores for feature points extracted from an image, and training a neural network model according to the feature points and their quality scores (S101); after a filtering is started, acquiring a frame of original image and extracting the feature points of the original image (S102); inputting the original image and the feature points of the original image into the neural network model, obtaining the quality scores corresponding to the feature points of the original image, and outputting the quality scores corresponding to the feature points of the original image (S103); and filtering the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule (S104). The image feature point filtering method can improve the matching success rate of feature points in relocalization scenarios, thereby improving positioning efficiency. The terminal can filter feature points and reduce their number, thereby reducing the demand for computing and storage resources.

Description

Image feature point filtering method and terminal
Technical Field
This application relates to the technical field of image processing, and in particular to an image feature point filtering method and a terminal.
Background of the Invention
With the development of artificial intelligence and 5G communication technology, new terminals and applications such as service robots keep entering daily life and provide people with more convenient services. Point-cloud-based motion state estimation, 3D reconstruction, and visual positioning are at the core of technologies such as SLAM (Simultaneous Localization And Mapping); especially in the field of SFM (Structure From Motion), selecting good visual feature points helps improve positioning accuracy and maintain continuous and stable motion tracking. However, in the prior art the number of feature points is large and their quality is poor, so it is difficult to quickly relocalize after tracking is lost, which affects positioning efficiency and accuracy.
Summary of the Invention
In view of the above problems, this application is proposed to provide an image feature point filtering method and a terminal that overcome or at least partially solve the above problems: the feature points are quantitatively scored by a neural network model, and low-quality feature points are effectively filtered out, saving storage space and improving positioning performance.
According to one aspect of this application, an image feature point filtering method is provided, including:
setting quality scores for feature points extracted from an image, and training a neural network model according to the feature points and their quality scores, wherein during training the quality score of a feature point of the current frame image is jointly determined by its quality score in the current frame and its quality scores in the tracking frames, the tracking frames being a preset number of consecutive frames after the tracked current frame;
after a filtering is started, acquiring a frame of original image and extracting the feature points of the original image;
inputting the original image and its feature points into the neural network model, obtaining the quality scores corresponding to the feature points of the original image, and outputting those quality scores;
filtering the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
According to another aspect of this application, a terminal is provided, including an image feature point filtering device, and the image feature point filtering device includes:
a model training module, configured to set quality scores for feature points extracted from an image and train a neural network model according to the feature points and their quality scores, wherein during training the quality score of a feature point of the current frame image is jointly determined by its quality score in the current frame and its quality scores in the tracking frames, the tracking frames being a preset number of consecutive frames after the tracked current frame;
a feature extraction module, configured to acquire a frame of original image and extract the feature points of the original image after a filtering is started;
a score determination module, configured to input the original image and its feature points into the neural network model, obtain the quality scores corresponding to the feature points of the original image, and output those quality scores;
a filtering module, configured to filter the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
According to yet another aspect of the embodiments of this application, a computer-readable storage medium is provided, which stores a computer program that causes a computer to execute the image feature point filtering method of the foregoing method embodiments.
With the image feature point filtering method and terminal of the embodiments of this application, on the one hand, by setting quality scores for feature points and using a neural network to quantitatively score them, low-quality feature points are effectively filtered out and the number of feature points is reduced, thereby reducing the demand for computing and storage resources; on the other hand, the scoring method that combines the current frame and the tracking frames considers not only the current weight of a feature point but also its long-term benefit, thereby improving the filtering performance.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art by reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be regarded as limiting this application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
Fig. 1 shows a schematic flowchart of an image feature point filtering method according to an embodiment of this application;
Fig. 2 shows a schematic flowchart of neural network model training according to an embodiment of this application;
Fig. 3 shows a schematic diagram of an original image, a mask image, and a mask value image according to an embodiment of this application;
Fig. 4 shows a schematic flowchart of using a neural network model to predict the quality scores corresponding to feature points according to an embodiment of this application;
Fig. 5 shows a block diagram of a terminal according to an embodiment of this application.
Fig. 6 shows a schematic structural diagram of a computer-readable storage medium according to an embodiment of this application.
Detailed Description of the Embodiments
Exemplary embodiments of this application will be described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of this application, it should be understood that this application can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that this application can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
The image feature point filtering method of the embodiments of this application is applied to fields such as sparse-point-cloud-based motion state estimation, 3D reconstruction, and visual positioning, and is key to technologies such as Visual SLAM, virtual reality, and deep learning. Visual SLAM (or V-SLAM) acquires environment images through a visual sensor, extracts feature points to build a point cloud map, and realizes positioning and navigation on that basis. Feature point filtering is one link in V-SLAM positioning and tracking; specifically, it is the process of filtering the extracted feature points on the basis of feature point extraction. At present there are two point cloud generation methods in tracking and positioning: the traditional handcrafted feature point method and the automatic feature point method based on a deep neural network. The handcrafted feature point method has a clear theoretical basis, is easy to understand, and has low computational complexity, but in practical use the feature points are easily affected by texture, illumination, sharpness, and other factors, which leads to technical problems such as unreasonable feature point extraction and poor tracking performance. The feature point method based on a deep neural network exploits the good stability and robustness of deep neural networks; the extracted feature points adapt well to the environment and track better, and it is currently a hot research topic.
Feature point processing based on a deep neural network extracts the feature points of an image through an LSTM (Long Short-Term Memory) or CNN (Convolutional Neural Network) network and generates feature descriptors, so as to build a point cloud and realize tracking and relocalization on that basis. In terms of actual performance, the deep-neural-network-based feature point cloud generation method adapts well to the environment, resists interference well, and does not easily lose tracking, but poor relocalization performance after tracking is lost can still occur. After analysis, the inventors of this application found that the main cause of this phenomenon is that when feature points are selected (during model training), only the feature correlation between adjacent frames is considered, and the long-term tracking performance of selecting that point to build the feature point cloud is not considered, which makes it difficult to quickly relocalize after target tracking is lost. In addition, since the number of feature points is not filtered, the point cloud is too dense, which increases the storage space, and an overly dense point cloud also increases the retrieval time.
To solve the above technical problems, this application proposes a feature point quality evaluation and filtering scheme based on a deep neural network. Based on this technical solution, the quality of feature points can be quantitatively scored, and in practical applications the feature points can be filtered and screened according to the scores. For example, in the V-SLAM point cloud construction stage, this technical solution can be applied to sparsify the point cloud, retain important feature points, reduce the storage space of the point cloud map, and realize compression of the point cloud. Moreover, in the V-SLAM relocalization stage, since the feature evaluation mechanism of the embodiments of this application takes long-term benefit into account, the matching success rate of feature points can also be improved, thereby improving the retrieval speed and positioning accuracy of V-SLAM in the relocalization stage and meeting actual needs.
Fig. 1 shows a schematic flowchart of an image feature point filtering method according to an embodiment of this application. Referring to Fig. 1, the image feature point filtering method of this embodiment includes the following steps:
Step S101: Set quality scores for the feature points extracted from an image, and train a neural network model according to the feature points and their quality scores; during training, the quality score of a feature point of the current frame image is jointly determined by its quality score in the current frame and its quality scores in the tracking frames, the tracking frames being a preset number of consecutive frames after the tracked current frame.
In this embodiment, a quality score parameter is first added to each feature point to quantify and evaluate its value. The quality score can have a predetermined initial value, but each time image data is fed into the neural network for training, the quality score of a feature point in the current frame image is jointly determined by its quality scores in the current frame and in the tracking frames. This approach takes into account the value of the frame following the current frame, balances the value that the next frame and subsequent consecutive frames contribute to this filtering, and improves the filtering accuracy.
Step S102: After a filtering is started, acquire a frame of original image and extract the feature points of the original image.
Step S103: Input the original image and its feature points into the neural network model, obtain the quality scores corresponding to the feature points of the original image, and output those quality scores.
Step S104: Filter the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
As shown in Fig. 1, the image feature point filtering method of this embodiment sets quality scores for feature points, quantitatively scores feature point quality through the neural network, and filters the feature points according to the quality scores, which improves the efficiency of terminal navigation and positioning. Moreover, when quantifying feature point quality, the scoring method that combines the feature points of the current frame and the tracking frames considers both the current weight of a feature point and its long-term benefit (that is, the performance and value of the feature point in subsequent frames), thereby improving the filtering performance and ensuring positioning accuracy.
To improve the prediction performance of the neural network model, this embodiment adopts an unsupervised learning approach, without manually labeling samples, and dynamically updates the neural network model through online learning and training so as to continuously improve its accuracy. That is, in this embodiment, training the neural network model according to the feature points and their quality scores includes: generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, training the neural network model with the generated samples, and updating the quality score prediction value of the neural network model for the next filtering.
The quality score prediction value here is a parameter of the neural network model. During initialization of the neural network model, the quality score prediction value is a default value. In this embodiment, by generating samples and training the neural network model, the default value is updated, which improves the accuracy of the model in the prediction stage and ensures that the neural network model predicts more accurate feature point quality scores after the image data to be filtered is input.
Fig. 2 shows a schematic flowchart of neural network model training according to an embodiment of this application. The neural network model training process of this embodiment is described below with reference to Fig. 2.
Referring to Fig. 2, for an image frame I n, step S201 is first performed to extract feature points.
The image frame here is, for example, a frame of original image; "original image" is used relative to the mask image and mask value image generated later. Feature point extraction here can be implemented with existing ORB-SLAM or other feature point extraction algorithms, which are not elaborated here. ORB-SLAM is a visual SLAM (Simultaneous Localization And Mapping) based on the ORB (Oriented FAST and Rotated BRIEF) descriptor. This feature detection operator was proposed on the basis of the FAST feature detector and the BRIEF feature descriptor; its running time is far better than SIFT and SURF, and it can be applied to real-time feature detection. ORB feature detection has scale and rotation invariance, as well as invariance to noise and perspective transformations.
After extraction, the feature points can be stored in the feature point set P n.
Step S202: Generate a mask map and a mask value map.
After the feature points of the image are extracted, this step obtains a mask image according to the image I n and the feature points of the image I n, where the value of the pixel corresponding to a feature point in the mask image is 256 (for example only) and the values of the remaining pixels are 0 (for example only). In addition, according to the image, the feature points of the image, and the quality scores corresponding to the feature points, a mask value image is obtained, where the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point. The process of obtaining the mask value image from the original image, the feature points, and the corresponding quality scores is realized on the basis of setting quality scores for the feature points extracted from the image.
Fig. 3 shows a schematic diagram of an original image, a mask image, and a mask value image according to an embodiment of this application. Referring to Fig. 3, the leftmost image I n is the original image, and the three feature points extracted from the original image are feature point A, feature point B, and feature point C. In this embodiment, the mask image Mask n and the mask value image MaskV n of the same size are constructed using the width and height of the original image I n as the standard; the mask image is the middle image of Fig. 3, and the mask value image is the rightmost image of Fig. 3. During initial value setting, the values of the points in the mask map Mask n at the positions corresponding to the feature points in the aforementioned set P n are set to 256 (for example only), and the values of the points in the mask value map MaskV n at the positions corresponding to the feature points are all set to 1 (or set to 1, 3, and 5 respectively). At the same time, the values of the other pixels in the mask map Mask n are set to 0, and the values of the other pixels in the mask value map MaskV n are set to -1. That is, by setting quality scores at the pixels corresponding to feature points and setting all other pixels to the same value, the feature-point pixels are distinguished from the remaining pixels, and subsequent processing only needs to focus on the feature-point pixels, which improves the efficiency of the algorithm.
It can be seen from the above that the mask map stores the pixel positions and gray values of the feature points of the original image, while the mask value image stores the pixel positions and quality scores of the feature points of the original image.
Generating the mask map and the mask value image not only facilitates information processing, but also makes it possible for the neural network model to specifically extract the information hidden in the mask image and the mask value image.
Referring to Fig. 3, the pixel corresponding to feature point A in the mask value image MaskV n of Fig. 3 is the point in the first row and third column, and its quality score is 1. The pixel corresponding to feature point B is the point in the third row and second column, and its quality score is 3. The pixel corresponding to feature point C is the point in the fourth row and fifth column, and its quality score is 5. It should be noted that in this embodiment, setting quality scores for the feature points extracted from the image includes setting the same initial value or different initial values for the quality scores of the feature points extracted from the image. Taking the same initial value of 1 as an example, the reason why the quality scores corresponding to the three feature points shown on the left of Fig. 3 end up different is that the benefits of tracking these three feature points after the current frame differ, or in other words, the importance of the three feature points in the tracking frames after the original image differs; this is also the principle by which this application can quantitatively score and filter feature points.
Step S203: Feature point matching.
The next frame image is read in, and the first feature points extracted from the current frame I n are matched with the second feature points extracted from the next frame image I n+1. Feature point matching can be implemented with existing technology; this embodiment focuses on the matching result, i.e., whether there are successfully matched feature points or not.
Step S204: Determine whether there are matching feature points. If yes, execute step S205; otherwise, i.e., the number of matched feature points is 0, execute step S207.
In this step it is judged whether the two adjacent frames have matching points (i.e., successfully matched feature points). If the number of matching points equals 0, meaning the two adjacent frames have no matching points, step S207 is executed directly; that is, when there are no successfully matched feature points between the image and its next frame, the feature points extracted from the image are not tracked, and samples are generated directly from the image, the mask image, and the mask value image.
If the two adjacent frames (i.e., the current frame and the next frame) have matching points, for example three matching points, the mask value map of the current frame image is modified according to this tracking result.
Step S205: Modify the mask value map.
In this step, when there are successfully matched feature points between image I n and its next frame I n+1, the feature points extracted from image I n are tracked; during tracking, the quality score of a feature point in the current frame I n and its quality scores in the tracking frames are summed to obtain a sum value, and the quality score corresponding to the feature point in the mask value image is replaced with the sum value. The quality score in each tracking frame is determined by the product of the feature point's quality score in the frame preceding the tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
Referring to Fig. 2, tracking is a cyclic process. One round of tracking includes steps S203 to S206: read in the next frame → execute S203 to match the feature points of the next frame with those of the current frame → if there are matching points, execute step S205 to modify the mask values (quality scores) of the matching points in the mask value map of the current frame → if the tracking frame depth has not been reached (i.e., the judgment result of step S206 in Fig. 2 is No), return to step S203 to continue tracking, or when the tracking frame depth has been reached (i.e., the judgment result of step S206 in Fig. 2 is Yes), end the tracking. The exit condition of the tracking loop is that the depth of the currently tracked frames has reached the preset tracking frame depth. The so-called tracking frame depth is the number of frames continuously tracked after the current frame; for example, if k consecutive frames after the current frame are tracked, then k is the depth threshold, and k is a natural number greater than 0.
Tracking embodies the inventive concept of the "cumulative discounted return" of this application: it highlights the value of the frame following the current frame and balances the value that the next frame and the following consecutive image frames contribute to the estimation of the current image, which helps improve positioning accuracy.
Specifically, in each round of tracking, the quality score of a feature point in the current frame and its quality scores in the tracking frames are summed to obtain a sum value, and the quality score corresponding to the feature point in the mask value image of the current frame is replaced with the sum value.
For example, the quality score of a feature point of the current frame is updated in each round of tracking by the following update formula:
V = Σ_{s=0}^{k} λ^s · v
where 0 < λ < 1 is the discount factor, v is the quality score in the frame preceding each tracking frame, s is the tracking frame depth, and k is the threshold of the tracking frame depth.
For example, k equals 4, i.e., 4 frames are tracked consecutively after the current frame.
According to the foregoing description, the initial value of v is the quality score 1 corresponding to a feature point in the mask value image. s equal to 0 means untracked, i.e., the image frame is the current frame, and V denotes the sum value; then for the current frame, the sum value V equals 1*1, i.e., 1, which is consistent with the aforementioned initial quality score of 1 for the feature points in the mask value image.
When the tracking frame depth s equals 1, the quality scores of the current frame and the next frame are considered: the quality score of the feature point in the current frame is 1, and its quality score in the next frame equals the quality score of the current frame plus λ.
When the tracking frame depth s equals 2, the two frames after the current frame, called the next frame and the frame after next, are tracked consecutively, and the quality score of the feature point of the current frame is determined jointly from the current frame, the next frame, and the frame after next. As described above, the quality score of the feature point in the current frame is 1, the quality score in the next frame equals the score of the current frame plus λ (i.e., 1 + λ), and the quality score in the frame after next equals the quality score of the current frame plus the quality score of the next frame plus λ squared times v (v = 1).
And so on.
Step S206: Determine whether the tracking frame depth is less than the threshold; if yes, execute step S207; otherwise, return to step S204.
Each time a round of tracking succeeds, the tracking frame depth is incremented by 1, and it is judged whether the accumulated tracking frame depth has reached the tracking frame depth threshold. If the threshold has been reached, tracking stops; otherwise tracking continues in order to update the quality score of the current frame. That is, the sum of the quality scores in the tracking frames and the quality score of the current frame is computed, and when all tracking ends, the final sum value (i.e., the sum value V of the most recently tracked frame) replaces the value of the pixel corresponding to the feature point in the mask value image shown in Fig. 3, thereby obtaining the quality score of the feature point in the current frame. For example, after the sum-value replacement according to the above tracking process, the quality score of the pixel corresponding to feature point B is 3, and the quality score of the pixel corresponding to feature point C is 5. Thus the quality scores of the three feature points A, B, and C in the original image are different.
Step S207: Output the mask value map and the mask map.
On the basis of the foregoing step S206, if tracking succeeded, the mask value map with modified mask values is output, such as the mask value map shown on the far right of Fig. 3; if tracking failed, the original mask value map is output (in which the mask values are the initial values and have not been modified by tracking).
In addition, the mask map is also output in this step for use in generating samples.
Step S208: Generate a sample (S, V).
That is, the image, the mask image, and the mask value image are combined to generate a sample, and the sample (S, V) is then output to the sample set Φ, where S = {original frame I n, Mask n} and V = MaskV n.
Step S209: Generate a training set.
A training set, a validation set, and a test set are generated by sampling from the sample set Φ according to a certain ratio.
Step S210: Neural network model training.
In this embodiment, a first input channel for receiving the mask image and a second input channel for receiving the mask value image are added to the neural network model;
the image in a sample is input into the neural network model through the input channel of the neural network model, and the mask image and the mask value image are input into the neural network model through the first input channel and the second input channel respectively. Based on the mean square error loss function, the error between the quality scores in the mask value image of the sample and the quality score prediction value of the neural network model is calculated, and a new quality score prediction value of the neural network model is determined according to the error, thereby updating the quality score prediction value of the neural network model.
That is, on the basis that the image (a color image) contains the three RGB (red, green, blue) channels, two input channels are added as the mask layer and the mask value layer, and the image, the mask image, and the mask value image are input into the neural network through these input channels. Optionally, the backbone of the neural network model is built on VGG16, a network pre-trained on the ImageNet image library of a large number of real images.
Optionally, the loss function of the neural network model is the mean square loss function, as shown below:
L = (1/n) Σ_{i=1}^{n} (y_i − y_i')²
where y_i' is the quality score prediction value of the neural network model and y_i is V, i.e., the value in the mask value image MaskV n of the sample. The quality score prediction value of the neural network model is determined at the end of the neural network model training process; specifically, the prediction value with the smallest difference from the mask values in the samples is taken as the quality score prediction value at the end of the neural network model training.
Step S211: Model output.
Using the BP (Error Back Propagation) algorithm, the neural network model used to predict feature point quality scores is finally generated and output.
The basic idea of the BP algorithm is that the learning process consists of two processes: forward propagation of the signal and back propagation of the error. During forward propagation, the input samples are fed in from the input layer, processed layer by layer by the hidden layers, and then passed to the output layer. If the actual output of the output layer does not match the expected output, the process turns to the error back-propagation stage. This process of forward signal propagation and backward error propagation, with the weights of each layer being adjusted, is repeated over and over; the process of continuously adjusting the weights is exactly the learning and training process of the network. This process continues until the error of the network output is reduced to an acceptable level, or until a preset number of learning iterations is reached.
At this point, model training or updating is complete. Filtering with the model is described next.
Fig. 4 shows a schematic flowchart of using the neural network model to predict the quality scores corresponding to feature points according to an embodiment of this application. Referring to Fig. 4, first, in one filtering pass, a frame of original image (the image frame I n in Fig. 4) is acquired and the feature points of the original image are extracted;
next, a mask map is generated according to the foregoing step S202;
then, the mask map and the original image are combined (i.e., combined into S) and input into the neural network model for neural network prediction. That is, the original image and the mask map recording the feature points of the original image are input into the neural network model together; the original image refers to the to-be-filtered image acquired in one filtering pass.
Then, the neural network model outputs a mask value map (i.e., the output mask value map V). The mask value map output by the neural network model is an image containing the quality score information corresponding to the feature points of the image; once the mask value map is obtained, the quality scores corresponding to the feature points of the original image can be determined.
Finally, the feature points of the original image are filtered according to the quality scores corresponding to the feature points of the original image and the preset filtering rule.
In practical applications, the filtering rule can be preset as needed. For example, filtering the feature points of the original image according to their quality scores and the preset filtering rule includes: when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold; when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order according to their quality scores and filtering out a preset number of the lowest-ranked feature points.
It follows that in this embodiment the feature points can be sorted according to the mask value map (for example, in reverse order from large to small, a larger value indicating a higher degree of importance) and the filtering rule can then be applied to complete the feature point filtering, e.g., filtering out all feature points below a certain threshold. That is, filtering of feature points based on the mask values is realized.
It should be noted that, considering that in practical applications not all application scenarios require feature point filtering, this embodiment judges, before using the neural network model to predict the quality scores of image feature points, whether the feature point filtering procedure needs to be started according to a startup rule (for example, feature point filtering is not started in dense point cloud algorithms). If it is needed, the steps shown in Fig. 4 are executed; if not, the extracted image feature points are processed directly according to the actual requirements. This broadens the application scenarios of the image feature point filtering of this embodiment.
Fig. 5 shows a block diagram of a terminal according to an embodiment of this application. Referring to Fig. 5, the terminal 500 includes an image feature point filtering device 501, and the image feature point filtering device 501 includes:
a model training module 5011, configured to set quality scores for feature points extracted from an image and train a neural network model according to the feature points and their quality scores, wherein during training the quality score of a feature point of the current frame image is jointly determined by its quality score in the current frame and its quality scores in the tracking frames, the tracking frames being a preset number of consecutive frames after the tracked current frame;
a feature extraction module 5012, configured to acquire a frame of original image and extract the feature points of the original image after a filtering is started;
a score determination module 5013, configured to input the original image and its feature points into the neural network model, obtain the quality scores corresponding to the feature points of the original image, and output those quality scores;
a filtering module 5014, configured to filter the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
In an embodiment of this application, the model training module 5011 is configured to generate samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, train the neural network model with the generated samples, and update the quality score prediction value of the neural network model for the next filtering.
In an embodiment of this application, the model training module 5011 is specifically configured to set the same initial value or different initial values for the quality scores of the feature points extracted from the image.
In an embodiment of this application, the model training module 5011 is specifically configured to obtain a mask image according to the image and the feature points of the image, the value of the pixel corresponding to a feature point in the mask image being 256 and the values of the remaining pixels being 0; obtain a mask value image according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, the value of the pixel corresponding to a feature point in the mask value image being the quality score corresponding to that feature point; and generate samples according to the image, the mask image, and the mask value image.
In an embodiment of this application, the model training module 5011 is specifically configured to track the feature points extracted from the image when there are successfully matched feature points between the image and its next frame; and, during tracking, sum the quality score of a feature point in the current frame and its quality scores in the tracking frames to obtain a sum value and replace the quality score corresponding to the feature point in the mask value image with the sum value, where the quality score in each tracking frame is determined by the product of the feature point's quality score in the frame preceding the tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
In an embodiment of this application, the model training module 5011 is specifically configured to add to the neural network model a first input channel for receiving the mask image and a second input channel for receiving the mask value image; input the image in a sample into the neural network model through the input channel of the neural network model, and input the mask image and the mask value image into the neural network model through the first input channel and the second input channel respectively; and, based on the mean square error loss function, calculate the error between the quality scores in the mask value image of the sample and the quality score prediction value of the neural network model, and determine a new quality score prediction value of the neural network model according to the error, thereby updating the quality score prediction value of the neural network model.
In an embodiment of this application, the model training module 5011 is specifically configured to, when there are no successfully matched feature points between the image and its next frame, not track the feature points extracted from the image and generate samples directly from the image, the mask image, and the mask value image.
In an embodiment of this application, the filtering module 5014 is specifically configured to filter out the feature points of the original image whose quality scores are not greater than a preset quality score threshold when the preset filtering rule is a threshold filtering rule; and, when the preset filtering rule is a ranking filtering rule, sort the feature points of the original image in descending order according to their quality scores and filter out a preset number of the lowest-ranked feature points.
It should be noted that the specific implementation of image feature point filtering in the above terminal embodiment can refer to the specific implementation of the corresponding image feature point filtering method embodiment described above and is not repeated here.
Referring to Fig. 6, another embodiment of this application provides a computer-readable storage medium 600. The computer-readable storage medium stores a computer program, and the computer program causes a computer to execute the image feature point filtering method of any one of the above method embodiments.
Specifically, the computer program causes the computer to execute the following image feature point filtering method:
setting quality scores for feature points extracted from an image, and training a neural network model according to the feature points and their quality scores, wherein during training the quality score of a feature point of the current frame image is jointly determined by its quality score in the current frame and its quality scores in the tracking frames, the tracking frames being a preset number of consecutive frames after the tracked current frame;
after a filtering is started, acquiring a frame of original image and extracting the feature points of the original image;
inputting the original image and its feature points into the neural network model, obtaining the quality scores corresponding to the feature points of the original image, and outputting those quality scores;
Further, the computer program also causes the computer to execute the following image feature point filtering method:
generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, training the neural network model with the generated samples, and updating the quality score prediction value of the neural network model for the next filtering.
Further, the computer program also causes the computer to execute the following image feature point filtering method:
obtaining a mask image according to the image and the feature points of the image;
obtaining a mask value image according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point;
generating samples according to the image, the mask image, and the mask value image.
Further, the computer program also causes the computer to execute the following image feature point filtering method:
when there are successfully matched feature points between the image and its next frame, tracking the feature points extracted from the image;
during tracking, summing the quality score of a feature point in the current frame and its quality scores in the tracking frames to obtain a sum value, and replacing the quality score corresponding to the feature point in the mask value image with the sum value;
wherein the quality score in each tracking frame is determined by the product of the feature point's quality score in the frame preceding the tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
Further, the computer program also causes the computer to execute the following image feature point filtering method:
when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold;
when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order according to their quality scores, and filtering out a preset number of the lowest-ranked feature points.
The specific functions implemented by the computer program in the above computer-readable storage medium embodiment can refer to the specific content of the corresponding image feature point filtering method embodiment described above and are not repeated here.
Those skilled in the art should understand that the embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or also includes elements inherent to such a process, method, article, or device.
The above are only specific embodiments of this application. Under the above teaching of this application, those skilled in the art can make other improvements or modifications on the basis of the above embodiments. Those skilled in the art should understand that the above specific description is only for the purpose of better explaining this application, and the scope of protection of this application is subject to the scope of protection of the claims.

Claims (19)

  1. An image feature point filtering method, wherein the image feature point filtering method comprises:
    setting quality scores for feature points extracted from an image, and training a neural network model according to the feature points and their quality scores; wherein, during training, the quality score of a feature point of the current frame image is jointly determined by the feature point's quality score in the current frame and its quality scores in the tracking frames, the tracking frames comprising a preset number of consecutive frames after the tracked current frame;
    after a filtering is started, acquiring a frame of original image and extracting the feature points of the original image;
    inputting the original image and the feature points of the original image into the neural network model, obtaining the quality scores corresponding to the feature points of the original image, and outputting the quality scores corresponding to the feature points of the original image;
    filtering the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
  2. The method according to claim 1, wherein training the neural network model according to the feature points and the quality scores of the feature points comprises:
    generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, training the neural network model with the generated samples, and updating the quality score prediction value of the neural network model for the next filtering.
  3. The method according to claim 2, wherein setting quality scores for the feature points extracted from the image comprises:
    setting the same initial value or different initial values for the quality scores of the feature points extracted from the image.
  4. The method according to claim 2, wherein generating samples according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image comprises:
    obtaining a mask image according to the image and the feature points of the image;
    obtaining a mask value image according to the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the value of the pixel corresponding to a feature point in the mask value image is the quality score corresponding to that feature point;
    generating samples according to the image, the mask image, and the mask value image.
  5. The method according to claim 4, wherein generating a sample from the image, the mask image, and the mask-value image comprises:
    when there are successfully matched feature points between the image and the next frame of the image, tracking the feature points extracted from the image;
    during tracking, summing the quality score of the feature point in the current frame with its quality scores in the tracking frames to obtain a summed value, and replacing the quality score of the feature point in the mask-value image with the summed value;
    wherein the quality score in each tracking frame is determined from the product of the feature point's quality score in the frame preceding that tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
  6. The method according to claim 4, wherein training the neural network model with the generated samples comprises:
    adding to the neural network model a first input channel for receiving the mask image and a second input channel for receiving the mask-value image;
    inputting the image in the sample into the neural network model through the input channel of the neural network model, inputting the mask image and the mask-value image into the neural network model through the first input channel and the second input channel respectively, computing, based on a mean squared error loss function, the error between the quality scores in the mask-value image of the sample and the quality score predictions of the neural network model, and determining new quality score predictions of the neural network model from the error, thereby updating the quality score predictions of the neural network model.
  7. The method according to claim 4, wherein generating a sample from the image, the mask image, and the mask-value image comprises:
    when there are no successfully matched feature points between the image and the next frame of the image, not tracking the feature points extracted from the image, and generating a sample directly from the image, the mask image, and the mask-value image.
  8. The method according to any one of claims 1 to 7, wherein filtering the feature points of the original image according to the quality scores corresponding to the feature points of the original image and the preset filtering rule comprises:
    when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold;
    when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order of the quality scores corresponding to the feature points of the original image, and filtering out a preset number of the lowest-ranked feature points of the original image.
  9. The method according to claim 4, wherein
    in the mask image, the pixels corresponding to the feature points have the value 256, and the remaining pixels have the value 0.
  10. The method according to claim 5, wherein
    the quality score of the feature point in the current frame is updated in each tracking pass according to the following formula:
    $V = \sum_{s=0}^{k} \lambda^{s} v$
    wherein 0 < λ < 1 is the discount factor, v is the quality score in the frame preceding each tracking frame, s is the tracking-frame depth, k is the threshold on the tracking-frame depth, and V denotes the summed value.
  11. A terminal, comprising an image feature point filtering apparatus,
    the image feature point filtering apparatus comprising:
    a model training module, configured to set quality scores for feature points extracted from an image and to train a neural network model from the feature points and the quality scores of the feature points, wherein during training the quality score of a feature point of the current frame image is determined jointly from the quality score of the feature point in the current frame and its quality scores in the tracking frames, the tracking frames comprising a preset number of consecutive frames, following the current frame, in which the feature point is tracked;
    a feature extraction module, configured to, after a filtering pass is started, acquire a frame of original image and extract feature points of the original image;
    a score determination module, configured to input the original image and the feature points of the original image into the neural network model, and to obtain and output the quality scores corresponding to the feature points of the original image;
    a filtering module, configured to filter the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
  12. The terminal according to claim 11, wherein the model training module is configured to generate samples from the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, and to train the neural network model with the generated samples, updating the quality score predictions of the neural network model for use in the next filtering pass.
  13. The terminal according to claim 12, wherein
    the model training module is specifically configured to obtain a mask image from the image and the feature points of the image; obtain a mask-value image from the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the pixels of the mask-value image corresponding to the feature points take the quality scores of those feature points as their values; and generate a sample from the image, the mask image, and the mask-value image.
  14. The terminal according to claim 12, wherein
    the model training module is specifically configured to, when there are successfully matched feature points between the image and the next frame of the image, track the feature points extracted from the image; during tracking, sum the quality score of the feature point in the current frame with its quality scores in the tracking frames to obtain a summed value, and replace the quality score of the feature point in the mask-value image with the summed value; wherein the quality score in each tracking frame is determined from the product of the feature point's quality score in the frame preceding that tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program causing a computer to execute the following image feature point filtering method:
    setting quality scores for feature points extracted from an image, and training a neural network model from the feature points and the quality scores of the feature points, wherein during training the quality score of a feature point of the current frame image is determined jointly from the quality score of the feature point in the current frame and its quality scores in the tracking frames, the tracking frames comprising a preset number of consecutive frames, following the current frame, in which the feature point is tracked;
    after a filtering pass is started, acquiring a frame of original image and extracting feature points of the original image;
    inputting the original image and the feature points of the original image into the neural network model, and obtaining and outputting the quality scores corresponding to the feature points of the original image;
    filtering the feature points of the original image according to the quality scores corresponding to the feature points of the original image and a preset filtering rule.
  16. The computer-readable storage medium according to claim 15, wherein the computer program further causes the computer to execute the following image feature point filtering method:
    generating samples from the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, and training the neural network model with the generated samples, updating the quality score predictions of the neural network model for use in the next filtering pass.
  17. The computer-readable storage medium according to claim 16, wherein the computer program further causes the computer to execute the following image feature point filtering method:
    obtaining a mask image from the image and the feature points of the image;
    obtaining a mask-value image from the image, the feature points of the image, and the quality scores corresponding to the feature points of the image, wherein the pixels of the mask-value image corresponding to the feature points take the quality scores of those feature points as their values;
    generating a sample from the image, the mask image, and the mask-value image.
  18. The computer-readable storage medium according to claim 16, wherein the computer program further causes the computer to execute the following image feature point filtering method:
    when there are successfully matched feature points between the image and the next frame of the image, tracking the feature points extracted from the image;
    during tracking, summing the quality score of the feature point in the current frame with its quality scores in the tracking frames to obtain a summed value, and replacing the quality score of the feature point in the mask-value image with the summed value;
    wherein the quality score in each tracking frame is determined from the product of the feature point's quality score in the frame preceding that tracking frame and a discount factor, the discount factor being a constant greater than 0 and less than 1.
  19. The computer-readable storage medium according to claim 15, wherein the computer program further causes the computer to execute the following image feature point filtering method:
    when the preset filtering rule is a threshold filtering rule, filtering out the feature points of the original image whose quality scores are not greater than a preset quality score threshold;
    when the preset filtering rule is a ranking filtering rule, sorting the feature points of the original image in descending order of the quality scores corresponding to the feature points of the original image, and filtering out a preset number of the lowest-ranked feature points of the original image.
PCT/CN2020/125271 2019-12-26 2020-10-30 Method for filtering image feature points and terminal WO2021129145A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/595,079 US12051233B2 (en) 2019-12-26 2020-10-30 Method for filtering image feature points and terminal

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911368851.0 2019-12-26
CN201911368851.0A CN111144483B (zh) 2019-12-26 2019-12-26 Method for filtering image feature points and terminal

Publications (1)

Publication Number Publication Date
WO2021129145A1 (zh)

Family

ID=70520502

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/125271 WO2021129145A1 (zh) 2019-12-26 2020-10-30 Method for filtering image feature points and terminal

Country Status (3)

Country Link
US (1) US12051233B2 (zh)
CN (1) CN111144483B (zh)
WO (1) WO2021129145A1 (zh)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372111A (zh) * 2016-08-22 2017-02-01 Institute of Computing Technology, Chinese Academy of Sciences Local feature point screening method and system
CN110287873A (zh) * 2019-06-25 2019-09-27 Graduate School at Shenzhen, Tsinghua University Non-cooperative target pose measurement method, system and terminal device based on deep neural network
CN111144483A (zh) * 2019-12-26 2020-05-12 Goertek Inc. Method for filtering image feature points and terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Bing Li; Rong Xiao; Zhiwei Li; Rui Cai; Bao-Liang Lu; Lei Zhang: "Rank-SIFT: Learning to rank repeatable local interest points", 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 20 June 2011, pages 1737-1744, XP032037963, ISBN: 978-1-4577-0394-2, DOI: 10.1109/CVPR.2011.5995461 *
Han Chaoyi; Tao Xiaoming; Duan Yiping; Liu Xijia; Lu Jianhua: "A CNN based framework for stable image feature selection", 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 14 November 2017, pages 1402-1406, XP033327794, DOI: 10.1109/GlobalSIP.2017.8309192 *

Also Published As

Publication number Publication date
CN111144483A (zh) 2020-05-12
US20220254146A1 (en) 2022-08-11
CN111144483B (zh) 2023-10-17
US12051233B2 (en) 2024-07-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20907810; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20907810; Country of ref document: EP; Kind code of ref document: A1)