WO2020056688A1 - Method and apparatus for extracting image key point - Google Patents

Publication number
WO2020056688A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
layer
key points
window size
pyramid
Prior art date
Application number
PCT/CN2018/106778
Other languages
French (fr)
Chinese (zh)
Inventor
左韶军
林天鹏
占云龙
赵强
王林召
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2018/106778 priority Critical patent/WO2020056688A1/en
Priority to CN201880095485.3A priority patent/CN112424787A/en
Publication of WO2020056688A1 publication Critical patent/WO2020056688A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition

Definitions

  • The present application relates to the field of electronic technology, and in particular to a method and device for extracting key points of an image.
  • Key points among image pixels are usually used for image matching.
  • Key points (also known as feature points or points of interest) of an image are prominent and representative points in the image. These points can be used to identify images, perform image matching, or implement three-dimensional (3D) reconstruction.
  • An image pyramid is first constructed by downsampling the image layer by layer. That is, the initial image is used as the first-layer image, and all pixels in the i-th layer image are downsampled at a certain ratio to obtain the (i+1)-th layer image, where the pixel size of the (i+1)-th layer image is smaller than that of the i-th layer image and i is a positive integer greater than or equal to 1.
  • Each layer image is then divided into image blocks. For each image block, the feature score of each pixel is determined based on the FAST (Features from Accelerated Segment Test) algorithm, and the pixels whose feature score is greater than a threshold are determined as key points.
  • Non-maximum suppression can then be performed on the key points of the image block according to a window of a preset size; that is, within a window centered on a key point, the key point with the highest feature score is extracted, which can be understood as a local maximum search.
  • The above scheme for obtaining image key points has at least the following problem:
  • The size of the non-maximum suppression window is fixed. For some images the number of extracted key points may be too large, while for others it may be too small; that is, the number of key points extracted differs greatly between images of different texture complexity.
  • This imbalance in the number of key points may affect subsequent processing. For example, too many key points may increase the amount of computation in image matching, while too few key points may lower the accuracy of image matching.
  • This embodiment provides a method and a device for extracting key points of an image, which can balance the number of key points extracted by each image.
  • the technical solution is as follows:
  • A method for extracting key points of an image includes: an image processing device acquires an image pyramid of an image, the image pyramid including N layer images, N > 1; determines a target window size according to an i-th layer image of the image pyramid, and determines at least one output key point of the i-th layer image according to the target window size, where 1 ≤ i ≤ N; and determines the output key points of each layer image of the image pyramid as the key points of the image.
  • The image processing device can adjust the window size of non-maximum suppression based on each layer image, so that the window size changes with each layer image and the number of key points extracted from each image is balanced.
  • Determining the target window size according to the i-th layer image of the image pyramid, and determining at least one output key point of the i-th layer image according to the target window size, includes: extracting at least one candidate key point of the i-th layer image of the image pyramid; determining the target window size according to the i-th layer image; and performing non-maximum suppression on the at least one candidate key point of the i-th layer image according to the target window size to obtain the at least one output key point of the i-th layer image.
  • The image processing device can perform non-maximum suppression on the candidate key points of each layer image. Because candidate key points are extracted first, the number of points handled during non-maximum suppression is reduced and processing efficiency is improved.
  • When the image is a static image, determining the target window size according to the i-th layer image of the image pyramid includes: if the i-th layer image is the first-layer image, the image processing device determines a preset window size as the target window size; otherwise, the image processing device determines the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, and determines the target window size according to the reference image.
  • Adjacent layers of the same image pyramid are obtained from one another by sampling, so the textures of the different layers can be the same as that of the original image. Therefore, the reference image obtained by the above processing is similar in texture complexity to the image from which key points are being extracted.
  • When the image is a frame image in a video stream, determining the target window size according to the i-th layer image of the image pyramid includes: if the image is any frame image other than the first frame image in the video stream, the image processing device determines the i-th layer image of the image pyramid of the previous frame image as the reference image of the i-th layer image of the image pyramid of the image, and determines the target window size according to the reference image.
  • the images presented by two adjacent frames may have similar pictures. Therefore, the reference image obtained through the above processing is similar in texture complexity to the image from which the key points are extracted.
  • The method further includes: if the image is the first frame image in the video stream and the i-th layer image of its image pyramid is the first-layer image, the image processing device determines a preset window size as the target window size; if the image is the first frame image in the video stream and the i-th layer image of its image pyramid is not the first-layer image, the image processing device determines the (i-1)-th layer image as the reference image of the i-th layer image, and determines the target window size according to the reference image. Because no previous frame exists for the first frame image, the reference image can be determined in the same way as for a static image, which improves the accuracy of determining the target window size.
  • Determining the target window size according to the reference image includes: the image processing device determines the number of first key points of the reference image, and determines the target window size according to the number of first key points and the number of second key points of the reference image, where the number of first key points is a preset number of key points and the number of second key points is the number of output key points of the reference image.
  • Determining the number of first key points of the reference image includes: determining the number of first key points corresponding to the reference image according to the layer number of the reference image in the image pyramid to which it belongs, where there is a preset correspondence between the layer number of the reference image in the image pyramid and the number of first key points of the reference image.
  • The numbers of first key points of two adjacent layer images meet a preset ratio, and the preset ratio is equal to the ratio of the numbers of pixels of the two adjacent layer images in the image pyramid.
  • Each layer of the image pyramid has a different number of first keypoints.
  • the expected number of keypoints in two adjacent layers meets a preset ratio.
  • Determining the target window size according to the number of first key points and the number of second key points of the reference image includes: the image processing device determines the ratio of the number of second key points to the number of first key points; according to a preset correspondence between ratio ranges and window levels, determines the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range; and determines the target window size according to the target window level.
  • the ratio of the number of second keypoints to the number of first keypoints can be used to measure the degree to which the number of output keypoints is close to the preset number of keypoints.
  • Different ratio ranges correspond to different window levels, and the larger the ratio, the larger the corresponding window level can be.
  • Determining the target window size according to the target window level includes: determining the target window group corresponding to the i-th layer image according to a preset correspondence between layer numbers and window groups, where a window group includes the window size corresponding to at least one window level; and determining the window size corresponding to the target window level in the target window group as the target window size.
  • the pixel size of each layer image gradually decreases. If the window size of the same window level is set to decrease layer by layer, the number of key points in each layer image can be balanced.
  • Each layer of the image pyramid can be used to express the image at a different scale, that is, to simulate images with different levels of blur. Equalizing the number of key points in each layer image ensures that each level of blur has a certain number of key points, which improves the accuracy of image matching.
  • Extracting at least one candidate key point of the i-th layer image includes: for each image block of the i-th layer image, determining the feature score of each pixel in the image block according to a preset feature detection algorithm, and determining the pixels whose feature score is greater than a preset threshold as candidate key points of the image block; and determining the candidate key points of each image block of the i-th layer image as the at least one candidate key point of the i-th layer image.
  • an apparatus for extracting key points of an image includes at least one module, and the at least one module is configured to implement the method for extracting key points of an image.
  • an image processing device includes a memory and a processor.
  • the memory is used to store instructions.
  • the processor is used to call the instructions and execute the method for extracting key points of the image.
  • a computer-readable storage medium is provided, and when the computer-readable storage medium is run on an image processing device, the image processing device is caused to perform the above-mentioned method for extracting key points of an image.
  • a computer program product containing instructions, which, when the computer program product runs on an image processing device, causes the image processing device to execute the method for extracting key points of an image described above.
  • The target window size, that is, the window size of non-maximum suppression, can be adjusted based on each layer image in the image pyramid of the image, so that the non-maximum suppression window size changes with each layer image. This balances the number of key points across images and reduces the negative impact on image matching.
  • FIG. 1 is an implementation environment diagram provided by this embodiment
  • FIG. 2 is a schematic structural diagram of an image processing device according to this embodiment.
  • FIG. 3 is a flowchart of a method for extracting key points of an image provided by this embodiment
  • FIG. 4 is a flowchart of a method for obtaining candidate key points provided by this embodiment
  • FIG. 5 is a schematic diagram of a reference image provided by this embodiment.
  • FIG. 6 is a flowchart of a method for extracting key points of an image provided by this embodiment
  • FIG. 7 is a flowchart of a method for extracting key points of an image provided by this embodiment.
  • FIG. 8 is a schematic diagram of a reference image provided by this embodiment.
  • FIG. 9 is a schematic diagram of an apparatus for extracting key points of an image provided by this embodiment.
  • FIG. 1 is a diagram of an implementation environment provided by this embodiment.
  • the implementation environment includes a plurality of terminals 101 and an image processing apparatus 102 for providing services to the plurality of terminals.
  • the plurality of terminals 101 are connected to the image processing apparatus 102 through a wireless or wired network.
  • the image processing device 102 may provide a service for the terminal 101 to extract key points of an image.
  • the image processing device 102 may further have at least one database for storing images from which key points are to be extracted, the key points of those images, and the like.
  • the terminal 101 may send an image from which key points are to be extracted to the image processing device 102.
  • the image processing apparatus 102 may include a processor 210 and a transceiver 220.
  • the transceiver 220 may be connected to the processor 210 as shown in FIG. 2.
  • the transceiver 220 may be used to send and receive messages or data; for example, it may receive an image from which key points are to be extracted, sent by the terminal 101.
  • the processor 210 may be a control center of the image processing apparatus 102, and uses various interfaces and lines to connect various parts of the entire image processing apparatus 102, such as the transceiver 220 and the like.
  • the processor 210 may be an ASIC (Application-Specific Integrated Circuits), which may be used to extract key points of an image.
  • the processor 210 may include one or more processing units.
  • the processor 210 may integrate an application processor and a modem, where the application processor mainly processes an operating system and the modem mainly processes wireless communications.
  • the processor 210 may also be a digital signal processor, a central processing unit, or the like.
  • the image processing apparatus 102 may further include a memory 230.
  • the memory 230 may be configured to store images from which key points are to be extracted, the key points of images, and the like.
  • the image processing device 102 may further include an input / output interface 240, which may provide an interface between the processor 210 and a peripheral interface module.
  • the peripheral interface module may be a button or the like.
  • the present application introduces a reference image to determine the window size of the non-maximum suppression.
  • The reference image of the i-th layer image can be of the following two types. First, when the image is a static image or a frame image in a video stream, the reference image of the i-th layer image can be the (i-1)-th layer image in the same image pyramid. Second, when the image is a frame image in a video stream, the reference image of the i-th layer image can be the i-th layer image in the image pyramid of the previous frame image.
  • A static image may refer to an independent image, for which the key points extracted by the image processing device are not related to other images. For example, a static image may be a captured photo. Correspondingly, a frame image in a video stream is not an independent image.
  • What the two types of reference image have in common is that the texture complexity of the reference image is similar to that of the image from which key points are to be extracted.
  • The reason is as follows. For the first type, adjacent layers of the same image pyramid are obtained from one another by sampling, so the texture of each layer can be the same as that of the original image. For the second type, the time interval between adjacent frames is small (for example, only 40 milliseconds), so two adjacent frames may show similar pictures; that is, their textures are similar.
  • Images with similar texture complexity also have similar image features; that is, when key points are extracted by the same method, the numbers of key points obtained are similar. Therefore, if the key points of the reference image have already been extracted when key points are to be extracted from the image, the number of key points of the reference image can be used to measure whether the corresponding extraction method is appropriate, and hence to decide whether to apply the same method or how to adjust it. Since the non-maximum suppression window filters the key points, the number of key points of the reference image can also be used to measure whether the size of the corresponding non-maximum suppression window is appropriate, and the window size can be adjusted accordingly. This equalizes the number of key points extracted from each image and avoids large differences in the number of key points obtained for images that differ greatly in texture complexity.
  • the reference image of the i-th layer image may be other reference images obtained based on the same concept in addition to the above two types. These reference images can be applied to the method for extracting key points of an image provided in this application, which is not limited in this application.
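  • As an illustration only, the choice of reference image described above could be sketched as follows. The function name `reference_layer`, the 1-based `(frame, layer)` indexing, and the rule of preferring the previous frame whenever one exists are assumptions made for this sketch, not details fixed by the application.

```python
def reference_layer(frame_index, layer_index, is_video):
    """Pick the reference image for one pyramid layer.

    Indices are 1-based. Returns (frame_index, layer_index) of the
    reference image, or None when no reference exists and the preset
    window size has to be used instead.
    """
    # Video frame with a predecessor: same layer of the previous frame.
    if is_video and frame_index > 1:
        return (frame_index - 1, layer_index)
    # Static image, or first video frame: previous layer of the same pyramid.
    if layer_index > 1:
        return (frame_index, layer_index - 1)
    # First layer with no previous frame: fall back to the preset window size.
    return None
```

  • For a static image the frame index can simply be fixed at 1, so only the layer rule applies.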
  • An embodiment of the present application provides a method for extracting key points of a static image or a video image. Taking as an example the case where the reference image of the i-th layer image is the (i-1)-th layer image in the same image pyramid, the processing flow of the method for extracting key points of an image shown in FIG. 3 is described in detail below in combination with specific implementations. The content may be as follows:
  • In step 301, the image processing apparatus acquires an image pyramid of an image.
  • The image pyramid may include N layer images, N > 1.
  • the image processing device has the ability to extract key points of the image. If the image processing device provides a service for extracting key points of an image for other terminals, it can receive still images or video streams sent by the terminals. Alternatively, the image processing device may have a function of acquiring an image (for example, the image processing device may be a monitoring device), and then key points may be extracted from the acquired image.
  • the image processing device may also store images from which key points are to be extracted.
  • the image processing device may extract key points for each frame of images in the video stream in real time, or may extract key points for stored images, which is not limited in this embodiment.
  • the image processing device may construct an image pyramid.
  • the embodiment does not limit the specific method of constructing the image pyramid.
  • the image pyramid may be constructed based on an upsampling or downsampling method.
  • The process by which the image processing device constructs the image pyramid may be as follows: the image processing device uses the image as the first-layer image of the image pyramid and downsamples the images of the image pyramid layer by layer according to a preset ratio to obtain the next-layer image; when a construction stop condition is reached, it stops downsampling and obtains the image pyramid of the image.
  • Downsampling refers to generating a thumbnail of an image, and the preset ratio may refer to the downsampling ratio.
  • The image pyramid formed by downsampling uses the image from which key points are to be extracted as the original image and generates thumbnails at multiple resolutions; that is, the image is expressed at multiple scales.
  • The construction stop condition may be that the constructed image pyramid reaches a preset number of layers, or that the highest-layer image reaches a preset size. For example, for an image with a size of 992 * 744, the image pyramid is constructed with a preset ratio of 1.2 and construction stops when the pyramid reaches the eighth layer; the pixel sizes of the first to eighth layers are 992 * 744, 827 * 620, 689 * 517, 574 * 431, 478 * 359, 399 * 299, 332 * 249, and 277 * 208.
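  • The layer sizes in this example can be reproduced with a short calculation. The sketch below assumes that each layer is scaled directly from the original size by a power of the preset ratio and rounded to the nearest pixel; the application does not state the rounding rule, and the function name `pyramid_sizes` is hypothetical.

```python
def pyramid_sizes(width, height, ratio, num_layers):
    """Return the (width, height) of each pyramid layer, layer 1 first."""
    sizes = []
    for k in range(num_layers):
        scale = ratio ** k  # layer k+1 is shrunk by ratio^k relative to layer 1
        sizes.append((round(width / scale), round(height / scale)))
    return sizes


sizes = pyramid_sizes(992, 744, 1.2, 8)
for layer, (w, h) in enumerate(sizes, start=1):
    print(f"layer {layer}: {w} * {h}")
```

  • Here the stop condition is the preset number of layers; a size-based stop condition would instead end the loop once a layer falls below a preset size.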
  • After the image processing device completes the construction of the image pyramid, it can start with the first-layer image and extract the key points of the images of the image pyramid layer by layer.
  • the image processing device may also acquire an image pyramid constructed by other devices on the image, which is not limited in this embodiment.
  • In step 302, the image processing device extracts at least one candidate key point of the i-th layer image of the image pyramid.
  • The image processing device can detect candidate key points of the images of the image pyramid layer by layer. For example, candidate key points can be detected based on the FAST algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, the SURF (Speeded Up Robust Features) algorithm, or the FREAK (Fast Retina Keypoint) algorithm.
  • The processing of step 302 may be as follows: for each image block of the i-th layer image, the image processing device determines the feature score of each pixel in the image block according to a preset feature detection algorithm, and determines the pixels with a feature score greater than a preset threshold as candidate key points of the image block; the candidate key points of each image block of the i-th layer image are determined as the at least one candidate key point of the i-th layer image.
  • The FAST algorithm can compare each pixel in the image with the pixels within a preset circular range around it and calculate the gradient of the pixel, that is, calculate the feature score of the pixel.
  • The FAST algorithm can also set an initial threshold in advance. If the feature score of a pixel is greater than the initial threshold, the pixel is a corner, and such pixels can be used as candidate key points.
  • For example, the initial threshold can be 20. A low threshold can also be set; for example, the low threshold can be 7.
  • The key points detected based on the low threshold are generally more numerous than those detected based on the initial threshold.
  • If the image processing device does not detect any key point in the image based on the initial threshold, it needs to re-detect the image based on the low threshold. Re-detection increases processing time; in particular, when key points are extracted in hardware, it requires more registers and incurs longer processing delays, which raises cost and lowers processing efficiency. Therefore, this embodiment provides another method for obtaining candidate key points. For each image block, the feature score of each pixel in the image block is determined. If the maximum feature score is greater than a first threshold, the pixels with feature scores greater than the first threshold are determined as candidate key points of the image block; otherwise, the pixels with feature scores greater than a second threshold are determined as candidate key points of the image block. The candidate key points of each image block of the i-th layer image are determined as the at least one candidate key point of the i-th layer image.
  • the method may be as follows:
  • In step 3021, the image processing device divides each layer image of the image pyramid into a plurality of image blocks of a preset size. For example, for each layer image, the image processing device may divide the image into a plurality of image blocks with a pixel size of 31 * 15.
  • In step 3022, for each image block, the image processing device determines the feature score of each pixel in the image block, and determines the pixels with a feature score greater than a second threshold as first candidate key points of the image block.
  • The second threshold may be the above-mentioned low threshold. After determining the feature score of each pixel, the image processing device may obtain a corresponding score map, in which the feature score is recorded at the position of each pixel. Then, the image processing device may detect key points of the image based on the second threshold. If a feature score is greater than the second threshold, it is retained on the score map, that is, the pixel remains a first candidate key point; if a feature score is not greater than the second threshold, it can be set to 0 on the score map. Detecting key points directly based on the low threshold avoids repeated detection.
  • In step 3023, for each image block, the image processing device determines whether the maximum feature score is greater than a first threshold.
  • While determining feature scores, the image processing device may also determine and store the maximum feature score; for example, while calculating the score map, it may use a register to track the maximum feature score of the candidate key points in the image block. Then, after determining the feature score of each pixel, the image processing device can determine whether the maximum feature score of the pixels in the image block is greater than the first threshold.
  • The first threshold may be the above-mentioned initial threshold, that is, the first threshold is greater than the above-mentioned second threshold.
  • In step 3024, if the maximum feature score is greater than the first threshold, the pixels among the first candidate key points whose feature score is greater than the first threshold are determined as second candidate key points of the image block, and the second candidate key points are determined as the candidate key points of the image block.
  • If the maximum feature score is greater than the first threshold, the first candidate key points can be filtered based on the first threshold. That is, if the feature score of a first candidate key point is greater than the first threshold, the feature score is retained on the score map and the point becomes a second candidate key point; if the feature score of a first candidate key point is not greater than the first threshold, the feature score can be set to 0 on the score map.
  • After the second candidate key points are determined, they can be determined as the candidate key points of the image block.
  • In step 3025, if the maximum feature score is not greater than the first threshold, the first candidate key points are determined as the candidate key points of the image block.
  • In this case the first candidate key points are not filtered; that is, the first candidate key points are determined as the candidate key points of the image block.
  • In step 3026, the image processing device determines the candidate key points of each image block of each layer image as the candidate key points of that layer image.
  • When the candidate key points of an image block are determined, the score map corresponding to the image block can be obtained at the same time. In the score map, the position of each candidate key point holds its feature score, and the positions of pixels that are not candidate key points hold 0.
  • The image processing device can collect the candidate key points of each image block as the candidate key points of the layer image and, at the same time, stitch the score maps of the image blocks together according to the position of each image block in the layer image to obtain the score map of the layer image.
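  • The per-block selection of steps 3022 to 3025 can be sketched as below. This is a minimal illustration that assumes the feature scores of one image block are already available as a 2D list; the function name `select_candidates` is hypothetical, and the default thresholds 20 and 7 are the example values given above.

```python
def select_candidates(scores, first_threshold=20, second_threshold=7):
    """Return the score map of one image block, with non-candidates set to 0.

    scores: 2D list of feature scores for the pixels of the block.
    """
    # Step 3022: keep scores above the low (second) threshold and track
    # the maximum score in the same pass, so no second detection is needed.
    max_score = 0
    score_map = []
    for row in scores:
        score_map.append([s if s > second_threshold else 0 for s in row])
        max_score = max(max_score, max(row, default=0))

    # Steps 3023 and 3024: if the block contains a strong response, filter
    # the first candidates again with the higher (first) threshold.
    if max_score > first_threshold:
        score_map = [[s if s > first_threshold else 0 for s in row]
                     for row in score_map]

    # Step 3025: otherwise the first candidates are kept unchanged.
    return score_map


strong_block = select_candidates([[5, 8, 25], [0, 21, 10]])
weak_block = select_candidates([[5, 8, 12], [0, 7, 10]])
```

  • In `strong_block` only the scores above 20 survive, while `weak_block` has no score above 20 and therefore keeps every score above 7.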
  • The i-th layer image of the image pyramid may be the image of any layer. After the candidate key points and the score map of the i-th layer image are obtained, non-maximum suppression can be performed on the candidate key points.
  • To do so, the size of the non-maximum suppression window needs to be determined; the window may be a convolution kernel.
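  • To make the role of the window size concrete, the following sketch applies non-maximum suppression to a score map with a square window. It is an illustration under assumptions: the window is taken to be a square with an odd side length, windows are clipped at the image border, tied scores are all kept, and the function name `non_max_suppression` is hypothetical.

```python
def non_max_suppression(score_map, window):
    """Keep a candidate only if no score in the window centred on it is higher.

    score_map: 2D list of feature scores, 0 for non-candidates.
    window: odd side length of the square window, in pixels.
    Returns the (row, column) positions of the surviving key points.
    """
    half = window // 2
    rows, cols = len(score_map), len(score_map[0])
    keypoints = []
    for r in range(rows):
        for c in range(cols):
            score = score_map[r][c]
            if score == 0:
                continue  # not a candidate key point
            # Collect all scores inside the window, clipped at the borders.
            neighbourhood = [
                score_map[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            if score >= max(neighbourhood):
                keypoints.append((r, c))
    return keypoints


demo_map = [
    [0, 30, 0, 0, 0],
    [0, 0, 25, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 22],
]
small_window = non_max_suppression(demo_map, 3)
large_window = non_max_suppression(demo_map, 9)
```

  • With a window of 3 two key points survive in `demo_map`, and with a window of 9 only one does, which is why adjusting the window size changes the number of output key points.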
  • This embodiment takes the (i-1)-th layer image as the reference image of the i-th layer image as an example, and uses the reference image to determine the size of the non-maximum suppression window of the i-th layer image.
  • In step 303, the image processing apparatus determines whether the i-th layer image is the first-layer image.
  • In step 304, if the i-th layer image is the first-layer image, the image processing device uses the preset window size as the target window size corresponding to the i-th layer image.
  • The preset window size may refer to a default window size.
  • Because the image processing device starts from the first-layer image and extracts key points of the images of the image pyramid layer by layer, it can obtain the layer number of the current image and then judge whether the current image is the first-layer image. If the current image is the first-layer image, no key points have yet been extracted from a similar image, so the image processing device may determine the preset window size as the target window size. That is, non-maximum suppression is performed on the first-layer image based on the preset window size.
  • In step 305, if the i-th layer image is any layer image other than the first-layer image, the image processing device determines the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, determines the number of first key points of the reference image, and determines the target window size corresponding to the i-th layer image according to the number of first key points and the number of second key points of the reference image.
  • the first number of keypoints may be a preset number of keypoints, and the first number of keypoints may refer to a desired number of keypoints extracted from a reference image.
  • the second number of key points may refer to the number of output key points, and the second number of key points of the reference image may be the number of output key points of the reference image.
  • The image of the previous layer can be determined as the reference image, and the size of the non-maximum suppression window of the current image is determined according to the degree to which the key points extracted from the previous layer meet the requirement, that is, the degree to which the number of output key points of the reference image approaches the preset number of key points.
  • Each layer image in the image pyramid has a different number of first key points. Therefore, the processing by which the image processing device determines the number of first key points may be as follows: the image processing device determines the number of first key points corresponding to the reference image according to the layer number of the reference image in the image pyramid to which it belongs.
  • the number of first key points of two adjacent layers of images meets a preset ratio, and the preset ratio is equal to the ratio of the number of pixels of two adjacent layers of images in the image pyramid. That is, it is ensured that the number of the first key points in each layer of the image occupies a certain proportion in the total number of pixels.
  • the correspondence between the number of layers and the number of first key points can be set by the technician according to actual needs.
  • the process of establishing the correspondence between the number of layers and the number of first key points can also be as follows:
  • Layer 1 and its preset number of first key points are stored as one correspondence entry; for the k-th layer, the number of first key points corresponding to the k-th layer is determined according to the number of first key points corresponding to layer k-1 and the preset ratio, and the k-th layer and its number of first key points are stored as a correspondence entry, where k > 1.
  • the number of first key points of the first layer image of the image pyramid can be calculated from the number of image pixels.
  • the number of first key points can be 1% of the number of image pixels.
  • the image processing device may calculate the number of the first key points layer by layer according to a preset ratio of down-sampling when constructing the image pyramid, and ensure that the ratio of the number of the first key points to the total number of pixels in each layer image is constant.
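  • The layer-by-layer calculation above can be sketched as follows. This is a minimal illustration, not the implementation described in the application: the function name `build_first_keypoint_table`, the pixel ratio of 0.25 between adjacent layers (which assumes each layer halves both width and height), and the use of the 1% figure as a default are assumptions made for the example.

```python
def build_first_keypoint_table(num_pixels_layer1, num_layers,
                               pixel_ratio=0.25, fraction=0.01):
    """Map each layer number (1-based) to its expected ("first") key point count.

    pixel_ratio is the assumed ratio of pixel counts between layer k and
    layer k-1; fraction is the assumed share of pixels kept as key points.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    # Layer 1: derived directly from the pixel count of the original image.
    table = {1: max(1, round(num_pixels_layer1 * fraction))}
    # Layer k: derived from layer k-1 and the preset ratio, so the share of
    # key points among the pixels of every layer stays roughly constant.
    for k in range(2, num_layers + 1):
        table[k] = max(1, round(table[k - 1] * pixel_ratio))
    return table
```

  • Under these assumptions, a 640 x 480 image with three layers yields 3072, 768, and 192 expected key points for layers 1 to 3.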
  • The specific processing for determining the target window size corresponding to the i-th layer image can be as follows: the image processing device determines the ratio of the number of second key points to the number of first key points; according to the preset correspondence between ratio ranges and window levels, it determines the target ratio range in which the ratio falls and the target window level corresponding to that range; it then determines the target window size according to the target window level.
  • the ratio of the number of second key points to the number of first key points is used to measure how close the number of output key points is to the preset number of key points.
  • the image processing device may store a correspondence relationship between a ratio range and a window level in advance. For different layers, the correspondence relationship is always established.
  • the corresponding relationship between the ratio range and the window level can be shown in Table 1 below:
  • The window levels are TRENTA, VENTI, GRANDE, and TALL.
  • By analogy with cup sizes, TRENTA is the largest and TALL is the smallest; that is, the window sizes of the window levels are ordered TRENTA > VENTI > GRANDE > TALL.
  • the second key point number of the reference image can be obtained, and then the ratio of the second key point number to the first key point number can be calculated.
  • the image processing device can determine the target ratio range in which the ratio is located in the correspondence between the ratio range and the window level, and then can determine the corresponding target window level. After the image processing device determines the target window level, it can obtain the window size corresponding to the target window level and determine the window size as the target window size.
  • Increasing the window size of non-maximum suppression reduces the number of key points obtained after non-maximum suppression processing. Therefore, if the ratio of the number of second key points to the number of first key points is too large, for example greater than 2, the window size can be increased appropriately to reduce the number of second key points of the current image. Because the reference image and the current image have similar textures, adjusting the window size according to the number of key points actually output for the reference image brings the number of key points extracted from the current image close to the number of first key points, and balances the number of key points across the layers of the image.
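  • The mapping from ratio to window level can be sketched as follows. Table 1 is not reproduced in this text, so the ratio boundaries in `RATIO_UPPER_BOUNDS` are hypothetical; they were chosen only to agree with the two facts the description does state (a ratio above 2 counts as too large, and a ratio of 1.6 corresponds to VENTI). The function name `window_level` is likewise an assumption.

```python
import bisect

# Hypothetical boundaries between ratio ranges (Table 1 is not available).
RATIO_UPPER_BOUNDS = [1.0, 1.5, 2.0]
# Window levels ordered from smallest to largest window.
WINDOW_LEVELS = ["TALL", "GRANDE", "VENTI", "TRENTA"]


def window_level(num_output, num_expected):
    """Pick a window level from the reference image's key point counts.

    num_output is the number of second (output) key points and
    num_expected is the number of first (preset) key points.
    """
    if num_expected <= 0:
        raise ValueError("num_expected must be positive")
    ratio = num_output / num_expected
    # A larger ratio means too many key points survived, so a larger
    # suppression window is selected.
    return WINDOW_LEVELS[bisect.bisect_right(RATIO_UPPER_BOUNDS, ratio)]
```

  • A lookup by range boundaries keeps the correspondence in one place, so the same table can be applied to every layer, as the description requires.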
  • Each of the above window levels may correspond to a fixed window size, that is, for images of different layers, the window sizes determined according to the same window level are the same.
  • Alternatively, for images of different layers, the window sizes determined according to the same window level may be different.
  • In that case, the processing for determining the target window size according to the target window level may be as follows: according to the preset correspondence between layer numbers and window groups, the target window group corresponding to the i-th layer image is determined; the window size corresponding to the target window level in the target window group is then determined as the target window size.
  • the image processing device may store in advance a window group corresponding to each layer of images in the image pyramid, and each window group may include a window size corresponding to at least one window level.
  • the correspondence between the number of layers and the window group can be shown in Table 2 below:
  • The window group of each layer of images includes four window levels, corresponding to the above-mentioned TRENTA, VENTI, GRANDE, and TALL. It can be seen from Table 2 that the window sizes of the same level in different layers may be the same or different. In general, as the layer number increases, the window size of a given level gradually decreases.
  • The image processing device may determine the target window group corresponding to the layer number according to the above-mentioned correspondence between layer numbers and window groups. After the image processing device determines the target window level in the above process, it can take the window size of the target window level in the target window group as the non-maximum suppression window size.
  • For example, suppose the reference image is the first-layer image. If the ratio of the number of second key points to the number of first key points is 1.6, the window level can be determined to be VENTI, and the window size can be 21 * 11.
  • Because each layer of the image pyramid expresses the image at a different scale, that is, simulates the image at a different level of blur, equalizing the number of key points across the layers ensures that each level of blur has a certain number of key points, which improves image matching accuracy.
  • the target window size may also be determined based on other information of the reference image.
  • the target window size corresponding to the i-th layer image may be determined according to the size of the non-maximum suppression window of the reference image. Therefore, the processing after determining the reference image in steps 303-305 may also be: the image processing device determines the target window size according to the reference image.
  • In step 306, the image processing device performs non-maximum suppression processing on at least one candidate key point of the i-th layer image according to the target window size to obtain at least one output key point of the i-th layer image.
  • In the score map of the i-th layer image, the image processing device can take any candidate key point as the center of the window, determine the key point with the largest feature score within the window, and set the feature scores of the other key points in the window to 0; that is, non-maximum suppression is performed.
  • The candidate key points in the entire score map are traversed for non-maximum suppression. When the traversal ends, at least one key point of the i-th layer image is obtained.
  • Each such key point can be output as a key point of the image, that is, an output key point is obtained.
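  • Window-based non-maximum suppression can be sketched in plain Python as follows. This is an illustrative variant rather than the procedure of the application: instead of zeroing scores in place, it compares each candidate against the original scores of the other candidates in its window, and it breaks ties in favour of the smaller position. The function name and the (height, width) window convention are assumptions.

```python
def non_max_suppression(scores, candidates, win_h, win_w):
    """Keep candidates that have the highest score inside their own window.

    scores: 2D list of feature scores; candidates: list of (row, col).
    """
    half_h, half_w = win_h // 2, win_w // 2

    def rank(pos):
        r, c = pos
        # Ties are broken in favour of the smaller (row, col) position.
        return (scores[r][c], -r, -c)

    kept = []
    for r, c in candidates:
        # Candidates that fall inside the window centred on (r, c).
        rivals = [(rr, cc) for rr, cc in candidates
                  if abs(rr - r) <= half_h and abs(cc - c) <= half_w]
        if max(rivals, key=rank) == (r, c):
            kept.append((r, c))
    return kept
```

  • With the same candidates, a larger window leaves fewer output key points, which is the property the window adjustment relies on.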
  • After obtaining the output key points of the i-th layer image, i can be increased by 1; that is, the processing of steps 302-306 is repeated for the (i+1)-th layer image to extract its key points. Once key points have been extracted from the top-layer image, the process continues with step 307.
  • the image processing device may also determine the target window size based on other methods, for example, the target window size may also be determined according to the pixel size of the i-th layer image of the image pyramid, which is not limited in this embodiment. Therefore, the processing of the above steps 302 to 307 may also be: the image processing device determines the target window size according to the i-th layer image of the image pyramid, and determines at least one output key point of the i-th layer image according to the target window size.
  • In step 307, the image processing device determines the output key points of each layer image of the image pyramid as the key points of the image.
  • the image processing device can describe the key points of the image, for example, the position, scale, and direction of the key points can be used to describe the key points. Furthermore, the image processing device can store the key points so that the key points can be used for image matching and other processing in subsequent processes.
  • In this embodiment, the image processing device uses the (i-1)-th layer image as the reference image of the i-th layer image. Since the texture complexity of the i-th layer image and the (i-1)-th layer image is similar, the window size of non-maximum suppression can be adjusted based on the reference image, so that the number of extracted key points is close to the expected number of key points and the number of key points of each image is balanced, which reduces the negative impact on image matching.
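  • The feedback between adjacent layers of a static image can be sketched as a loop. This is only an outline under stated assumptions: `pick_window` and `suppress` are hypothetical callables supplied by the caller (for example, a ratio-to-window lookup and a non-maximum suppression routine), and the function name is invented for the example.

```python
def extract_layer_by_layer(candidates_per_layer, expected_per_layer,
                           preset_window, pick_window, suppress):
    """Extract output key points for every layer of one image pyramid.

    pick_window(num_output, num_expected, layer_number) returns a window;
    suppress(candidates, window) returns the surviving key points.
    """
    outputs, windows = [], []
    for i, candidates in enumerate(candidates_per_layer):
        if i == 0:
            # First layer: no reference image exists, use the preset window.
            window = preset_window
        else:
            # Other layers: the previous layer is the reference image.
            window = pick_window(len(outputs[i - 1]),
                                 expected_per_layer[i - 1], i + 1)
        windows.append(window)
        outputs.append(suppress(candidates, window))
    # The key points of the image are the output key points of all layers.
    keypoints = [(layer + 1, kp)
                 for layer, kps in enumerate(outputs) for kp in kps]
    return keypoints, windows
```

  • Passing the two callables in keeps the loop independent of how the window table and the suppression step are actually realized.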
  • The reference image of the i-th layer image in the above process is the (i-1)-th layer image of the same image pyramid.
  • An embodiment of the present application also provides a method for extracting the key points of each frame of image in a video stream.
  • Taking as an example the case in which the reference image is the i-th layer image in the image pyramid of the previous frame, the process flow of the method for extracting the key points of the image shown in FIG. 6 is described in detail below.
  • The content can be as follows:
  • In step 601, the image processing device acquires an image pyramid of an image.
  • The specific processing of step 601 is the same as that of step 301 described above, and details are not described herein again.
  • In step 602, the image processing device extracts at least one candidate key point of the i-th layer image of the image pyramid.
  • In step 603, the image processing device determines whether the image is the first frame image.
  • There is no required ordering between step 603 and steps 601-602: step 603 can be performed in parallel with steps 601-602 or before them, which is not limited in this embodiment.
  • In step 604, if the image is the first frame image, the image processing device uses the preset window size as the target window size corresponding to each layer of the image.
  • The image processing device can extract key points for each frame of image in the chronological order of the video stream. Therefore, if the image is the first frame image, no similar image has had key points extracted before it, and for the image pyramid of the first frame the target window size corresponding to each layer of images can be a preset window size.
  • The preset window sizes of the layers can differ and can decrease layer by layer.
  • For example, the preset window size of each layer of images can be the window size of the TALL level in Table 2 above.
  • Alternatively, for the first frame image, the reference image can also be determined based on the method provided in the foregoing embodiment.
  • the process flow of the method for extracting key points of the image shown in FIG. 7 can be as follows:
  • In step 6041, the image processing device determines whether the i-th layer image is the first-layer image.
  • In step 6042, if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is the first-layer image, the image processing device determines the preset window size as the target window size.
  • In step 6043, if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is an image other than the first-layer image, the image processing device determines the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of the image pyramid of the image, and determines the target window size according to the reference image.
  • In step 605, if the image is any frame image other than the first frame image in the video stream, the image processing device determines the i-th layer image of the image pyramid of the previous frame image as the reference image of the i-th layer image of the image pyramid of the image, determines the number of first key points of the reference image, and determines the target window size corresponding to the i-th layer image of the image pyramid according to the number of second key points and the number of first key points of the reference image.
  • That is, the image of the corresponding layer in the image pyramid of the previous frame image can be determined as the reference image, so that the size of the non-maximum suppression window of the current image is determined according to how well the key points extracted from the reference image meet the requirement, that is, how closely the number of output key points of the reference image approaches the preset number of key points.
  • the specific processing for determining the target window size according to the reference image is the same as that in the foregoing embodiment, and details are not described herein again.
  • In step 606, the image processing device performs non-maximum suppression processing on at least one candidate key point of the i-th layer image of the image pyramid of the image according to the target window size to obtain at least one output key point of the i-th layer image.
  • In step 607, the image processing device determines the output key points of each layer image of the image pyramid as the key points of the image.
  • In this embodiment, the image processing device uses the corresponding layer image of the previous frame image as the reference image.
  • Because the texture complexity of adjacent frames is similar, and the degree of blur of images at the same layer of their image pyramids is similar, the size of the non-maximum suppression window can be adjusted based on the reference image, so that the number of extracted key points is close to the expected number of key points and the number of key points in each image is equalized, which reduces the negative impact on image matching.
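  • The choice of reference image for a video stream can be summarized in a few lines. The sketch below follows the variant in which the first frame falls back to the previous layer of its own pyramid (steps 6041-6043) and later frames use the same layer of the previous frame (step 605). The function name `select_reference` and the 1-based (frame, layer) tuples are assumptions for illustration.

```python
def select_reference(frame_index, layer_index):
    """Return the (frame, layer) of the reference image, both 1-based.

    Returns None when no reference exists and the preset window size
    should be used instead.
    """
    if frame_index > 1:
        # Later frames: same layer in the pyramid of the previous frame.
        return (frame_index - 1, layer_index)
    if layer_index > 1:
        # First frame: previous layer of the same pyramid.
        return (frame_index, layer_index - 1)
    # First layer of the first frame: nothing to refer to.
    return None
```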
  • this embodiment also provides a device for extracting key points of an image.
  • the device may be the above-mentioned image processing device or configured in the above-mentioned image processing device. As shown in FIG. 9, the device includes:
  • An obtaining module 910, configured to obtain an image pyramid of an image, where the image pyramid includes N layers of images and N > 1; it can specifically implement the obtaining function in the above steps 301 and 601, and other implicit steps;
  • a determining module 920, configured to determine a target window size according to the i-th layer image of the image pyramid, and determine at least one output key point of the i-th layer image according to the target window size, where 1 ≤ i ≤ N;
  • and to determine the output key points of the images of each layer of the image pyramid as the key points of the image; it can specifically implement the determining function in the above steps 302-307 and 602-607, and other implicit steps.
  • the determining module 920 is configured to:
  • extract at least one candidate key point of the i-th layer image of the image pyramid, determine a target window size according to the i-th layer image, and perform non-maximum suppression processing on the at least one candidate key point of the i-th layer image according to the target window size to obtain at least one output key point of the i-th layer image.
  • the image is a static image
  • the determining module 920 is configured to:
  • if the i-th layer image is the first-layer image, determine a preset window size as the target window size;
  • otherwise, determine the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, and determine the target window size according to the reference image.
  • the image is a frame image in a video stream
  • the determining module 920 is configured to:
  • if the image is any frame image other than the first frame image in the video stream, determine the i-th layer image of the image pyramid of the previous frame image of the image as the reference image of the i-th layer image of the image pyramid of the image, and determine the target window size according to the reference image.
  • the determining module 920 is further configured to:
  • if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is the first-layer image, determine the preset window size as the target window size;
  • if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is an image other than the first-layer image, determine the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of the image pyramid of the image, and determine the target window size according to the reference image.
  • the determining module 920 is configured to:
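  • determine the number of first key points of the reference image, and determine the target window size according to the number of first key points and the number of second key points of the reference image, where the number of first key points is a preset number of key points and the number of second key points is the number of output key points of the reference image.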
  • the determining module 920 is configured to:
  • determine the number of first key points corresponding to the reference image according to the layer number of the reference image in the image pyramid to which the reference image belongs, where there is a preset correspondence between that layer number and the number of first key points.
  • the number of first key points of two adjacent layers of images meets a preset ratio
  • the preset ratio is equal to the ratio of the number of pixels of two adjacent layers of images in the image pyramid.
  • the determining module 920 is configured to:
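  • determine the ratio of the number of second key points to the number of first key points; according to the preset correspondence between ratio ranges and window levels, determine the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range; and determine the target window size according to the target window level.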
  • the determining module 920 is configured to:
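  • according to the preset correspondence between layer numbers and window groups, determine the target window group corresponding to the i-th layer image, and determine the window size corresponding to the target window level in the target window group as the target window size.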
  • the determining module 920 is configured to:
  • for each image block of the i-th layer image, determine a feature score of each pixel in the image block according to a preset feature detection algorithm, and determine each pixel whose feature score is greater than a preset threshold as a candidate key point of the image block;
  • determine the candidate key points of each image block of the i-th layer image as the at least one candidate key point of the i-th layer image.
  • the foregoing obtaining module 910 may be implemented by a processor, and the determining module 920 may be implemented by a processor and a memory together.
  • In this embodiment, the target window size can be adjusted based on each layer of images in the image pyramid of the image; that is, the window size of non-maximum suppression is adjusted so that it can change with each layer of the image, and the number of key points in each image is balanced, which reduces the negative impact on image matching.
  • It should be noted that when the device for extracting image key points provided by the foregoing embodiment extracts image key points, the division into the foregoing functional modules is used only as an example. In practical applications, the above functions can be assigned to different functional modules as required; that is, the internal structure of the image processing device is divided into different functional modules to complete all or part of the functions described above.
  • the apparatus for extracting key points of an image provided by the foregoing embodiment belongs to the same concept as the method embodiment for extracting key points of an image. For specific implementation processes, refer to the method embodiments, and details are not described herein again.
  • In the above embodiments, all or part of the functions may be implemented by software, hardware, or a combination thereof.
  • When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or twisted pair) or wirelessly (for example, infrared, radio, or microwave).
  • The computer-readable storage medium may be any medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more such media.
  • the medium may be a magnetic medium (such as a floppy disk, a hard disk, a magnetic tape, etc.), an optical medium (such as an optical disk, etc.), or a semiconductor medium (such as a solid state hard disk, etc.).

Abstract

A method and apparatus for extracting an image key point, which relate to the technical field of electronics. The method comprises: acquiring an image pyramid of an image (301), the image pyramid comprising N layers of images, and N>1; according to an ith layer image of the image pyramid, determining the size of a target window, and determining at least one output key point of the ith layer image according to the size of the target window, wherein 1≤i≤N; and determining an output key point of each layer image of the image pyramid as a key point of the image. By employing the described method, the number of key points extracted from each image may be equalized.

Description

Method and device for extracting key points of an image
Technical field
The present application relates to the field of electronic technology, and in particular, to a method and device for extracting key points of an image.
Background
In image processing, key points among the image pixels are usually used for image matching. The key points of an image (also known as feature points or points of interest) are prominent and representative points in the image. These points can be used to identify images, perform image matching, or implement 3D (three-dimensional) reconstruction.
In the process of extracting the key points of an image, an image pyramid is first constructed by down-sampling the image layer by layer. That is, the initial image is used as the first-layer image, and all pixels in the i-th layer image are down-sampled at a certain sampling rate to obtain the pixels of the (i+1)-th layer image, where the pixel size of the (i+1)-th layer image is smaller than that of the i-th layer image and i is a positive integer greater than or equal to 1. Then, each layer of image is divided into image blocks. For each image block, the feature score of each pixel is determined based on the FAST (Features from Accelerated Segment Test) algorithm, and the pixels whose feature score is greater than a threshold are determined as key points. Finally, non-maximum suppression can be performed on the key points of the image block according to a window of a preset size; that is, within a window centered on a key point, the key point with the highest feature score is extracted, which can be understood as a local maximum search. Performing the above processing on each image block of one layer of the image pyramid yields the key points of that layer. Traversing every layer in this way yields the key points of the entire image.
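The pyramid construction described above can be sketched as follows. This is a simplified illustration under stated assumptions: the image is a 2D list of pixel values, each layer keeps every second pixel in both directions, and no smoothing is applied before sampling. The function name `build_pyramid` and the default step of 2 are hypothetical; the text only says that a certain sampling rate is used.

```python
def build_pyramid(image, num_layers, step=2):
    """Return a list of layers; the first entry is the original image."""
    pyramid = [image]
    for _ in range(num_layers - 1):
        prev = pyramid[-1]
        # Stop early once a layer is too small to shrink further.
        if len(prev) < step or len(prev[0]) < step:
            break
        # Keep every step-th pixel in both directions.
        pyramid.append([row[::step] for row in prev[::step]])
    return pyramid
```

With an 8 x 8 input and three layers, the layers measure 8 x 8, 4 x 4, and 2 x 2 pixels.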
The above scheme for obtaining image key points has at least the following problem: the size of the non-maximum suppression window is fixed. For an image with very rich texture, too many key points may be extracted; for an image with very sparse texture, too few may be extracted. In other words, the number of key points extracted from images of different texture complexity varies widely, and this imbalance may affect subsequent processing. For example, too many key points may increase the amount of computation for image matching, while too few key points may lower the accuracy of image matching.
Summary of the invention
This embodiment provides a method and a device for extracting key points of an image, which can balance the number of key points extracted from each image. The technical solution is as follows:
In one aspect, a method for extracting key points of an image is provided. The method includes: an image processing device acquires an image pyramid of an image, where the image pyramid includes N layers of images and N > 1; determines a target window size according to the i-th layer image of the image pyramid, and determines at least one output key point of the i-th layer image according to the target window size, where 1 ≤ i ≤ N; and determines the output key points of each layer image of the image pyramid as the key points of the image.
Through the above processing, in the process of extracting the key points of the image, the image processing device can adjust the window size of non-maximum suppression based on each layer of image before performing non-maximum suppression, so that the window size can change from layer to layer and the number of key points extracted from each image can be balanced.
In a possible implementation manner, determining the target window size according to the i-th layer image of the image pyramid and determining at least one output key point of the i-th layer image according to the target window size includes: extracting at least one candidate key point of the i-th layer image of the image pyramid, determining the target window size according to the i-th layer image, and performing non-maximum suppression processing on the at least one candidate key point of the i-th layer image according to the target window size to obtain at least one output key point of the i-th layer image. Through the above processing, the image processing device performs non-maximum suppression on the candidate key points of each layer of the image. Because candidate key points are extracted first, fewer points need to be handled during non-maximum suppression, which improves processing efficiency.
In a possible implementation manner, the image is a static image, and determining the target window size according to the i-th layer image of the image pyramid includes: if the i-th layer image is the first-layer image, the image processing device determines the preset window size as the target window size; otherwise, the image processing device determines the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, and determines the target window size according to the reference image. Adjacent layers of the same image pyramid are obtained from one another by sampling, so the texture of every layer follows that of the original image. Therefore, the reference image obtained through the above processing is similar in texture complexity to the image from which key points are being extracted.
In a possible implementation manner, the image is a frame image in a video stream, and determining the target window size according to the i-th layer image of the image pyramid includes: if the image is any frame image other than the first frame image in the video stream, the image processing device determines the i-th layer image of the image pyramid of the previous frame image as the reference image of the i-th layer image of the image pyramid of the image, and determines the target window size according to the reference image. Two adjacent frames are likely to show similar pictures. Therefore, the reference image obtained through the above processing is similar in texture complexity to the image from which key points are being extracted.
In a possible implementation manner, the method further includes: if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is the first-layer image, the image processing device determines the preset window size as the target window size; if the image is the first frame image in the video stream, and the i-th layer image of the image pyramid of the image is an image other than the first-layer image, the image processing device determines the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of the image pyramid of the image, and determines the target window size according to the reference image. Because the first frame image has no previous frame, the reference image can be determined in the same way as for a static image, which improves the accuracy of determining the target window size.
In a possible implementation, determining the target window size according to the reference image includes: the image processing device determines a first key point number of the reference image, and determines the target window size according to the first key point number and a second key point number of the reference image, where the first key point number is a preset number of key points and the second key point number is the number of key points output for the reference image. In this way, the size of the non-maximum suppression window for the current image can be determined from how close the number of output key points of the reference image is to the preset number, so that when key points are extracted using the adjusted target window size, the number of output key points of each layer approaches the preset number. If every image approaches its preset number, the number of key points is balanced across images.
In a possible implementation, determining the first key point number of the reference image includes: determining the first key point number corresponding to the reference image according to the layer at which the reference image is located in the image pyramid to which it belongs, where a preset correspondence exists between that layer and the first key point number. In this way, each layer has a different first key point number, which accommodates the different pixel sizes of the layers.
In a possible implementation, in the preset correspondence, the first key point numbers of two adjacent layers satisfy a preset ratio, and the preset ratio equals the ratio of the numbers of pixels of two adjacent layers in the image pyramid. Each layer of the image pyramid has a different first key point number, and the expected key point numbers of two adjacent layers satisfy the preset ratio. This ensures that the first key point number of each layer is a fixed proportion of that layer's total number of pixels.
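As a non-limiting illustration, a correspondence between layers and first key point numbers that satisfies the pixel-count ratio described above can be sketched as follows. The function name `expected_keypoints` and the parameter `first_layer_count` (the preset number for the first layer) are hypothetical; the application does not specify concrete values. Because the result is rounded to an integer, the ratio holds only approximately.

```python
from typing import List, Tuple


def expected_keypoints(sizes: List[Tuple[int, int]], first_layer_count: int) -> List[int]:
    """Return a first key point number per layer, proportional to pixel count.

    `sizes` lists (width, height) per layer, starting with the first layer.
    """
    base_pixels = sizes[0][0] * sizes[0][1]
    return [
        # Scale the first-layer count by this layer's share of pixels.
        round(first_layer_count * (w * h) / base_pixels)
        for w, h in sizes
    ]
```

With layers of 100*100, 50*50 and 25*25 pixels and a hypothetical first-layer count of 1600, each layer keeps the same proportion of key points to pixels (16%).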
In a possible implementation, determining the target window size according to the first key point number and the second key point number of the reference image includes: the image processing device determines the ratio of the second key point number to the first key point number; determines, according to a preset correspondence between ratio ranges and window levels, the target ratio range in which the ratio falls and the target window level corresponding to that range; and determines the target window size according to the target window level. In this way, the ratio of the second key point number to the first key point number measures how close the number of output key points is to the preset number. Different ratio ranges correspond to different window levels, and a larger ratio can correspond to a larger window level. For a given image, enlarging the non-maximum suppression window reduces the number of key points that remain after non-maximum suppression. Therefore, if the ratio of the second key point number to the first key point number is too large, for example greater than 2, the correspondence between ratio ranges and window levels yields a suitably larger window, which reduces the number of key points output for the current image and balances the number of key points.
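As a non-limiting illustration, the mapping from the ratio to a window level can be sketched as follows. The range boundaries in `RATIO_BOUNDS` and the numbering of levels from 0 are hypothetical; the application only mentions a ratio greater than 2 as an example of a ratio that is too large.

```python
from bisect import bisect_left

# Hypothetical upper bounds of the ratio ranges: (.., 0.5], (0.5, 1], (1, 2], (2, ..).
RATIO_BOUNDS = [0.5, 1.0, 2.0]


def window_level(output_count: int, expected_count: int) -> int:
    """Map the ratio of output to expected key points to a window level (0-based)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    ratio = output_count / expected_count
    # bisect_left returns the index of the range that contains the ratio,
    # so a larger ratio never maps to a smaller level.
    return bisect_left(RATIO_BOUNDS, ratio)
```

Under these assumed ranges, a reference image that outputs three times its preset number of key points maps to the highest level, which would select a larger suppression window for the current image.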
In a possible implementation, determining the target window size according to the target window level includes: determining, according to a preset correspondence between layer numbers and window groups, the target window group corresponding to the i-th layer image, where a window group includes a window size for each of at least one window level; and determining the window size that corresponds to the target window level in the target window group as the target window size. As the layer number increases, the pixel size of each layer decreases; if the window size for a given window level is correspondingly set to decrease from layer to layer, the number of key points of each layer can be balanced. Because the layers of the image pyramid provide a multi-scale representation of the image, that is, they simulate the image at different degrees of blur, balancing the number of key points across layers gives each degree of blur a certain number of key points and improves image matching accuracy.
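As a non-limiting illustration, the lookup of the target window size from the layer number and the target window level can be sketched as follows. All values in `WINDOW_GROUPS` are hypothetical and chosen only so that the window size for a given level does not increase with the layer number; the application does not disclose a concrete table.

```python
# Hypothetical window groups: layer number -> window size per window level (0-based).
WINDOW_GROUPS = {
    1: (5, 7, 9, 11),
    2: (5, 7, 7, 9),
    3: (3, 5, 7, 7),
    4: (3, 5, 5, 7),
}


def target_window_size(layer: int, level: int) -> int:
    """Return the window size for a 1-based layer and a 0-based window level."""
    # Layers beyond the table reuse the last group (an assumption of this sketch).
    group = WINDOW_GROUPS[min(layer, max(WINDOW_GROUPS))]
    # Levels beyond the group reuse its largest window.
    return group[min(level, len(group) - 1)]
```

For example, with this table level 3 selects an 11*11 window on layer 1 but a 7*7 window on layer 4.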
In a possible implementation, extracting at least one candidate key point of the i-th layer image includes: for each image block of the i-th layer image, determining a feature score of each pixel in the image block according to a preset feature detection algorithm, and determining the pixels whose feature score is greater than a preset threshold as candidate key points of the image block; and determining the candidate key points of all image blocks of the i-th layer image as the at least one candidate key point of the i-th layer image. In this way, pixels are screened by the preset threshold, which reduces the number of key points that enter non-maximum suppression.
In one aspect, an apparatus for extracting key points of an image is provided. The apparatus includes at least one module, and the at least one module is configured to implement the foregoing method for extracting key points of an image.
In one aspect, an image processing device is provided. The image processing device includes a memory and a processor, where the memory is configured to store instructions, and the processor is configured to call the instructions and perform the foregoing method for extracting key points of an image.
In one aspect, a computer-readable storage medium is provided. When the computer-readable storage medium is run on an image processing device, the image processing device is caused to perform the foregoing method for extracting key points of an image.
In one aspect, a computer program product containing instructions is provided. When the computer program product runs on an image processing device, the image processing device is caused to perform the foregoing method for extracting key points of an image.
The technical solution provided in this embodiment has the following beneficial effects:
In this embodiment, when extracting key points from an image, the image processing device can adjust the target window size, that is, the non-maximum suppression window size, based on each layer of the image pyramid of the image. The non-maximum suppression window size can thus vary from layer to layer, which balances the number of key points of each image and reduces the negative impact on image matching.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in this embodiment more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a diagram of an implementation environment provided by this embodiment;
FIG. 2 is a schematic structural diagram of an image processing device provided by this embodiment;
FIG. 3 is a flowchart of a method for extracting key points of an image provided by this embodiment;
FIG. 4 is a flowchart of a method for obtaining candidate key points provided by this embodiment;
FIG. 5 is a schematic diagram of a reference image provided by this embodiment;
FIG. 6 is a flowchart of a method for extracting key points of an image provided by this embodiment;
FIG. 7 is a flowchart of a method for extracting key points of an image provided by this embodiment;
FIG. 8 is a schematic diagram of a reference image provided by this embodiment;
FIG. 9 is a schematic diagram of an apparatus for extracting key points of an image provided by this embodiment.
DETAILED DESCRIPTION
This embodiment provides a method for extracting key points of an image, and the method may be implemented by an image processing device. FIG. 1 is a diagram of an implementation environment provided by this embodiment. The implementation environment includes a plurality of terminals 101 and an image processing device 102 that provides services for the plurality of terminals. The terminals 101 are connected to the image processing device 102 through a wireless or wired network. The image processing device 102 may provide the terminals 101 with a service for extracting key points of an image. The image processing device 102 may further have at least one database for storing images from which key points are to be extracted, the key points of those images, and the like. As the requester of the service, a terminal 101 may send an image from which key points are to be extracted to the image processing device 102.
The image processing device 102 may include a processor 210 and a transceiver 220. The transceiver 220 may be connected to the processor 210, as shown in FIG. 2. The transceiver 220 may be used to send and receive messages or data, for example to receive an image from which key points are to be extracted, sent by the terminal 101. The processor 210 may be the control center of the image processing device 102 and connects the parts of the image processing device 102, such as the transceiver 220, through various interfaces and lines. In the present application, the processor 210 may be an ASIC (Application-Specific Integrated Circuit) and may be used to extract key points of an image. The processor 210 may include one or more processing units. The processor 210 may integrate an application processor and a modem, where the application processor mainly handles the operating system and the modem mainly handles wireless communication. The processor 210 may also be a digital signal processor, a central processing unit, or the like. The image processing device 102 may further include a memory 230, which may be used to store images from which key points are to be extracted, the key points of those images, and the like. The image processing device 102 may further include an input/output interface 240, which provides an interface between the processor 210 and peripheral interface modules such as buttons.
In the process of extracting key points of an image, the present application introduces a reference image to determine the window size for non-maximum suppression. For the image pyramid built from the image from which key points are to be extracted, the reference image of the i-th layer image can be of the following two kinds. First, when the image is a static image or one frame of a video stream, the reference image of the i-th layer image may be the (i-1)-th layer image of the same image pyramid. Second, when the image is one frame of a video stream, the reference image of the i-th layer image may be the i-th layer image of the image pyramid of the previous frame. A static image is an independent image whose extracted key points are unrelated to other images; for example, a static image may be a photograph. In contrast, a frame of a video stream is not an independent image: each frame has a temporal relationship with the other frames, and the key points that the image processing device extracts from one frame are related to the adjacent frames.
What the two kinds of reference image have in common is that the texture complexity of the reference image approximates that of the image from which key points are to be extracted. For the first kind, adjacent layers of the same image pyramid are obtained from one another by sampling, so the texture of each layer can be the same as that of the original image. For the second kind, the time interval between adjacent frames is short, for example only 40 milliseconds, so adjacent frames are likely to show similar content and therefore have similar texture.
Images with similar texture complexity also have similar image features; that is, when key points are extracted by the same method, the numbers of key points obtained are similar. Therefore, if key point extraction for the reference image has already been completed when key points are extracted from the image, the number of key points of the reference image indicates whether the corresponding extraction method is suitable, and thus whether to apply the same method or how to adjust it. Because the non-maximum suppression window screens key points, the number of key points of the reference image also indicates whether the corresponding non-maximum suppression window size is suitable. The window can then be adjusted to balance the number of key points extracted from each image, so that images with very different texture complexity do not yield very different numbers of key points.
Of course, in addition to the two kinds above, the reference image of the i-th layer image may be another reference image obtained based on the same concept. All such reference images can be applied to the method for extracting key points of an image provided in the present application, which is not limited in the present application.
An embodiment of the present application provides a method for extracting key points of a static image or a video image. Taking as an example the case in which the reference image of the i-th layer image is the (i-1)-th layer image of the same image pyramid, the processing flow of the method for extracting key points of an image shown in FIG. 3 is described in detail below with reference to specific implementations:
In step 301, the image processing device acquires an image pyramid of the image.
The image pyramid may include N layers of images, where N > 1.
In this embodiment, the image processing device is capable of extracting key points of an image. If the image processing device provides other terminals with a service for extracting key points of an image, it may receive a static image or a video stream sent by a terminal. Alternatively, the image processing device may itself capture images (for example, the image processing device may be a monitoring device) and then extract key points from the captured images.
The image processing device may also store the images from which key points are to be extracted. The image processing device may extract key points from each frame of a video stream in real time, or may extract key points from stored images, which is not limited in this embodiment.
For an image from which key points are to be extracted, the image processing device may construct an image pyramid. This embodiment does not limit the specific way in which the image pyramid is constructed; for example, it may be constructed by upsampling or by downsampling.
In a possible implementation, the image processing device may construct the image pyramid as follows: the image processing device takes the image as the first-layer image of the image pyramid and, according to a preset ratio, downsamples the images of the image pyramid layer by layer to obtain the next layer, until a construction stop condition is reached, at which point downsampling stops and the image pyramid of the image is obtained.
Downsampling means generating a thumbnail of an image, and the preset ratio may be the downsampling ratio. An image pyramid constructed by downsampling takes the image from which key points are to be extracted as the original image and generates thumbnails at multiple resolutions, that is, a multi-scale representation of the image.
The construction stop condition may be that the image pyramid reaches a preset number of layers, or that the highest-layer image reaches a preset size. For example, if an image pyramid is constructed for an image of 992*744 pixels with a preset ratio of 1.2 and construction stops when the pyramid reaches the eighth layer, the pixel sizes of the first to eighth layers are 992*744, 827*620, 689*517, 574*431, 478*359, 399*299, 332*249 and 277*208, respectively.
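As a non-limiting illustration, the layer sizes in the example above can be reproduced with the following sketch. It assumes that the size of each layer is the original size divided by the corresponding power of the preset ratio and rounded to the nearest integer; this rounding rule is an assumption that happens to match the listed sizes and is not stated in the application. The function name `pyramid_sizes` is hypothetical.

```python
from typing import List, Tuple


def pyramid_sizes(width: int, height: int, ratio: float, layers: int) -> List[Tuple[int, int]]:
    """Return (width, height) for each pyramid layer, first layer first."""
    sizes = []
    for k in range(layers):
        scale = ratio ** k  # layer k+1 is the original shrunk by ratio**k
        sizes.append((round(width / scale), round(height / scale)))
    return sizes
```

Calling `pyramid_sizes(992, 744, 1.2, 8)` yields the eight sizes listed in the example, from 992*744 down to 277*208.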
After constructing the image pyramid, the image processing device may extract key points from the images of the image pyramid layer by layer, starting from the first-layer image.
Of course, the image processing device may also obtain an image pyramid that another device has constructed for the image, which is not limited in this embodiment.
In step 302, the image processing device extracts at least one candidate key point of the i-th layer image of the image pyramid.
The image processing device may detect candidate key points in the images of the image pyramid layer by layer. For example, the candidate key points may be detected based on the FAST algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, the SURF (Speeded Up Robust Features) algorithm, the FREAK (Fast Retina Keypoint) algorithm, or the like.
In a possible implementation, step 302 may be performed as follows: for each image block of the i-th layer image, the image processing device determines a feature score of each pixel in the image block according to a preset feature detection algorithm, and determines the pixels whose feature score is greater than a preset threshold as candidate key points of the image block; the candidate key points of all image blocks of the i-th layer image are then determined as the at least one candidate key point of the i-th layer image.
Taking the FAST algorithm as an example, the FAST algorithm can compare each pixel in the image with the pixels in a preset circular range around it to compute the gradient at that pixel, that is, the feature score of the pixel. An initial threshold may also be preset in the FAST algorithm. If the feature score of a pixel is greater than the initial threshold, the pixel is a corner, and such pixels can be used as candidate key points.
Because some images have relatively smooth texture, detection based on the initial threshold may find no key points. The FAST algorithm may therefore be configured with two thresholds, an initial threshold and a low threshold; for example, the initial threshold may be 20 and the low threshold may be 7. For a given image, the low threshold also detects corners with gentler angles, so detection based on the low threshold generally yields more key points than detection based on the initial threshold.
If the image processing device detects no key points based on the initial threshold, it has to detect the image again based on the low threshold. Re-detection increases processing time; in particular, when key point extraction is implemented in hardware, it requires more registers and a longer processing delay, costs more, and is less efficient. This embodiment therefore provides another method for obtaining candidate key points: for each image block, the feature score of each pixel in the image block is determined; if there is a pixel whose feature score is greater than a first threshold, the pixels whose feature score is greater than the first threshold are determined as candidate key points of the image block; otherwise, the pixels whose feature score is greater than a second threshold are determined as candidate key points of the image block; and the candidate key points of all image blocks of the i-th layer image are determined as the at least one candidate key point of the i-th layer image. With this method, candidate key points are extracted in a single detection pass: the image is traversed only once, can be processed in a pipeline, and does not need to be read back. This avoids repeated detection, improves processing efficiency, and reduces the complexity of a hardware implementation.
FIG. 4 is a flowchart of the method for obtaining candidate key points, and the method may be as follows:
In step 3021, the image processing device divides each layer image of the image pyramid into a plurality of image blocks of a preset size. For example, for each layer, the image processing device may divide the image into a plurality of image blocks of 31*15 pixels.
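As a non-limiting illustration, the division of a layer into image blocks in step 3021 can be sketched as follows. The sketch assumes that 31*15 means 31 pixels wide by 15 pixels high and that blocks at the right and bottom edges are clipped to the image boundary; the application specifies neither point. The function name `iter_blocks` is hypothetical.

```python
from typing import Iterator, Tuple


def iter_blocks(width: int, height: int,
                block_w: int = 31, block_h: int = 15) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) for each block, row by row from the top-left corner."""
    for y in range(0, height, block_h):
        for x in range(0, width, block_w):
            # Clip the block at the right and bottom image borders.
            yield x, y, min(block_w, width - x), min(block_h, height - y)
```

For a 70*20 layer, this yields six blocks, the last of which is the clipped 8*5 block at position (62, 15).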
In step 3022, for each image block, the image processing device determines the feature scores of the pixels in the image block, and determines the pixels whose feature score is greater than the second threshold as first candidate key points of the image block.
How the image processing device determines the feature score of a pixel has been described above and is not repeated here. The second threshold may be the low threshold described above. After determining the feature score of each pixel, the image processing device obtains a corresponding score map in which the feature score of each pixel is recorded at that pixel's position. The image processing device may then detect key points based on the second threshold: if a feature score is greater than the second threshold, it is retained in the score map, that is, the pixel is kept as a first candidate key point; if a feature score is not greater than the second threshold, it is set to 0 in the score map. Detecting key points directly with the low threshold avoids repeated detection.
In step 3023, for each image block, the image processing device determines whether the maximum feature score is greater than the first threshold.
While determining the feature score of each pixel, the image processing device may also determine and store the maximum feature score, for example by using a register to track the maximum feature score of the candidate key points in the image block while the score map is being computed. After the feature score of each pixel has been determined, the image processing device can then determine whether the maximum feature score of the pixels in the image block is greater than the first threshold.
The first threshold may be the initial threshold described above; that is, the first threshold is greater than the second threshold.
In step 3024, if the maximum feature score is greater than the first threshold, the first candidate key points whose feature score is greater than the first threshold are determined as second candidate key points of the image block, and the second candidate key points are determined as the candidate key points of the image block.
Key points detected based on the first threshold better satisfy engineering requirements. If the maximum feature score is greater than the first threshold, at least one such key point exists, and the first candidate key points can be screened based on the first threshold. That is, if the feature score of a first candidate key point is greater than the first threshold, the score is retained in the score map and the point is kept as a second candidate key point; if the feature score of a first candidate key point is not greater than the first threshold, the score is set to 0 in the score map.
After the second candidate key points are determined, they can be determined as the candidate key points of the image block.
In step 3025, if the maximum feature score is not greater than the first threshold, the first candidate key points are determined as the candidate key points of the image block.
If the maximum feature score is not greater than the first threshold, no key point that better satisfies engineering requirements exists, and the first candidate key points are not screened; that is, the first candidate key points are determined as the candidate key points of the image block.
In step 3026, the image processing device determines the candidate key points of all image blocks of each layer image as the candidate key points of that layer image.
After determining the candidate key points of an image block, the image processing device also obtains the score map corresponding to the image block. In the score map, the position of each candidate key point holds the feature score of that candidate key point, and every other pixel position holds 0. For a layer image, the image processing device can then collect the candidate key points of its image blocks as the candidate key points of the layer, and can stitch the score maps of the image blocks together according to the positions of the blocks in the layer to obtain the score map of the layer.
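As a non-limiting illustration, the per-block screening of steps 3022 to 3025 can be sketched as follows. The default thresholds 7 and 20 are the example values of the low threshold and the initial threshold given above; the function name `filter_block`, the list-of-lists score map, and the assumption that feature scores are non-negative integers are hypothetical. The feature scores themselves are assumed to have been computed already by a detector such as FAST.

```python
from typing import List


def filter_block(scores: List[List[int]], low: int = 7, high: int = 20) -> List[List[int]]:
    """Return the block's score map with non-candidate positions set to 0."""
    best = 0  # running maximum of the block (scores assumed non-negative)
    kept = []
    for row in scores:
        kept_row = []
        for s in row:
            best = max(best, s)                  # step 3023: track the maximum score
            kept_row.append(s if s > low else 0)  # step 3022: first candidates
        kept.append(kept_row)
    if best > high:
        # Step 3024: a stronger key point exists, so keep only scores above `high`.
        kept = [[s if s > high else 0 for s in row] for row in kept]
    # Step 3025: otherwise the first candidates are returned unchanged.
    return kept
```

In this sketch the image data is read only once; the optional second screening operates on the block's score map rather than on the image, which is in line with the single-pass detection described above.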
Let 1≤i≤N, where i is an integer, so that the i-th layer image of the image pyramid denotes any layer image. For the i-th layer image, after its candidate key points and score map are obtained, non-maximum suppression can be performed on the candidate key points. Before non-maximum suppression is performed, the size of the non-maximum suppression window needs to be determined; the window may be a convolution kernel.
This embodiment takes the (i-1)-th layer image as the reference image of the i-th layer image as an example, and uses the reference image of the i-th layer image to determine the size of the non-maximum suppression window of the i-th layer image.
In step 303, the image processing device determines whether the i-th layer image is the first layer image.
In step 304, if the i-th layer image is the first layer image, the image processing device uses a preset window size as the target window size corresponding to the i-th layer image.
The preset window size may be a default window size.
When extracting key points, the image processing device starts from the first layer image and extracts key points from the images of the image pyramid layer by layer. On this basis, the image processing device can obtain the layer number of the current image and determine whether it is the first layer image. If the current image is the first layer image, no similar image has had its key points extracted before it, so the image processing device can determine the preset window size as the target window size. That is, non-maximum suppression is performed on the first layer image based on the preset window size.
In step 305, if the i-th layer image is any layer image other than the first layer image, the image processing device determines the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, determines the first key point number of the reference image, and determines the target window size corresponding to the i-th layer image according to the first key point number and the second key point number of the reference image.
The first key point number may be a preset number of key points, that is, the number of key points expected to be obtained when key points are extracted from the reference image. The second key point number may be the number of output key points; the second key point number of the reference image may be the number of output key points of the reference image.
As shown in the schematic diagram of reference images in FIG. 5, if the current image is any layer image other than the first layer image, the key points of the previous layer image have already been extracted. The previous layer image can therefore be determined as the reference image, so that the size of the non-maximum suppression window of the current image is determined according to how well the key points extracted from the previous layer image meet the requirement, that is, according to how close the number of output key points of the reference image is to the preset number of key points.
In this embodiment, each layer image in the image pyramid has a different first key point number. The image processing device may therefore determine the first key point number as follows: it determines the first key point number corresponding to the reference image according to the layer at which the reference image is located in its image pyramid.
A preset correspondence exists between the layer at which the reference image is located in its image pyramid and the first key point number. In this correspondence, the first key point numbers of two adjacent layer images satisfy a preset ratio, and the preset ratio is equal to the ratio of the pixel counts of two adjacent layer images in the image pyramid. This keeps the first key point number a constant fraction of the total number of pixels in each layer image.
The correspondence between the layer number and the first key point number may be set by a technician according to actual needs. If the image pyramid has many layers, the correspondence may instead be established as follows: store layer 1 and its first key point number as one correspondence entry; for layer k, where k>1, determine the first key point number of layer k according to the preset ratio and the first key point number of layer k-1, and store layer k and its first key point number as one correspondence entry.
The first key point number of the first layer image of the image pyramid can be calculated from the number of image pixels. For example, the first key point number may be 1% of the number of image pixels, so that when the first layer image is 992*744 pixels, the first key point number may be 7380. The image processing device can then calculate the first key point number layer by layer according to the preset down-sampling ratio used when constructing the image pyramid, which keeps the first key point number a constant fraction of the total number of pixels in each layer image. Using the preset ratio from the pyramid construction to calculate the first key point number of each layer avoids counting the pixels of every layer image, which reduces the amount of processing and improves processing efficiency.
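The layer-by-layer calculation above can be sketched as follows. The function name `first_keypoint_numbers`, the parameter names, truncation to an integer, and the pixel-count ratio of 0.25 used in the usage note are assumptions made for illustration; the application gives only the 1% example for a 992*744 first layer.

```python
def first_keypoint_numbers(width, height, num_layers, pixel_ratio, fraction=0.01):
    """Return the expected (first) key point number for each pyramid layer.

    pixel_ratio: assumed ratio of the pixel count of layer k to that of
    layer k-1, i.e. the preset ratio used when building the pyramid.
    fraction: share of pixels expected to be key points on layer 1.
    """
    counts = [int(width * height * fraction)]  # layer 1, from its pixel count
    for _ in range(1, num_layers):
        # Layer k is derived from layer k-1 and the preset ratio only,
        # so the pixels of layer k never need to be counted.
        counts.append(int(counts[-1] * pixel_ratio))
    return counts
```

For a 992*744 first layer, the first entry is 7380, matching the example in the text. With a hypothetical pixel-count ratio of 0.25 (each side halved), three layers give `[7380, 1845, 461]`.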
After the image processing device determines the first key point number corresponding to the reference image, it may determine the target window size corresponding to the i-th layer image as follows: the image processing device determines the ratio of the second key point number to the first key point number; according to a preset correspondence between ratio ranges and window levels, it determines the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range; and it determines the target window size according to the target window level.
In this embodiment, the ratio of the second key point number to the first key point number measures how close the number of output key points is to the preset number of key points. The image processing device may store the correspondence between ratio ranges and window levels in advance, and this correspondence holds for every layer. The correspondence between ratio ranges and window levels may be as shown in Table 1 below:
Table 1 Correspondence between ratio range and window level
Ratio range     [0, 1]    (1, 1.5]    (1.5, 2]    (2, +∞)
Window level    TALL      GRANDE      VENTI       TRENTA
The window levels are TRENTA, VENTI, GRANDE and TALL, named after beverage cup sizes from largest to smallest: TRENTA has the largest capacity and TALL the smallest. The window sizes of the window levels are therefore ordered as TRENTA>VENTI>GRANDE>TALL.
After determining the reference image, the image processing device can obtain the second key point number of the reference image and calculate the ratio of the second key point number to the first key point number. The image processing device can then find the target ratio range in which the ratio falls in the above correspondence between ratio ranges and window levels, and thereby determine the corresponding target window level. After determining the target window level, the image processing device can obtain the window size corresponding to the target window level and determine it as the target window size.
For the same image, increasing the window size of non-maximum suppression reduces the number of key points obtained after non-maximum suppression. Therefore, if the ratio of the second key point number to the first key point number is too large, for example greater than 2, the window size can be increased appropriately to reduce the second key point number of the current image. Because the reference image and the current image have similar textures, adjusting the window size according to the number of key points actually output for the reference image brings the number of key points extracted from the image as close as possible to the first key point number and balances the number of key points across layer images.
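The lookup in Table 1 can be sketched as below. The function name `window_level` is hypothetical; the range boundaries follow Table 1, with each upper bound inclusive.

```python
def window_level(second_count, first_count):
    """Map the ratio of output to expected key points to a window level.

    second_count: number of key points actually output for the reference image.
    first_count: preset (expected) number of key points for the reference image.
    """
    ratio = second_count / first_count
    if ratio <= 1:
        return "TALL"      # [0, 1]
    if ratio <= 1.5:
        return "GRANDE"    # (1, 1.5]
    if ratio <= 2:
        return "VENTI"     # (1.5, 2]
    return "TRENTA"        # (2, +inf)
```

A reference image that output 1600 key points against an expected 1000 has a ratio of 1.6 and maps to VENTI, so the current image gets a larger window than the TALL window it would get at a ratio of 1 or below.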
Each of the above window levels may correspond to a fixed window size; that is, for images of different layers, the window size determined from the same window level is the same. Optionally, in a possible implementation, the window size determined from the same window level may differ between layers. In that case, the target window size may be determined from the target window level as follows: the image processing device determines the target window group corresponding to the i-th layer image according to a preset correspondence between layer numbers and window groups, and determines the window size corresponding to the target window level in the target window group as the target window size.
The image processing device may store in advance a window group for each layer image in the image pyramid, and each window group may include the window size corresponding to at least one window level. The correspondence between layer numbers and window groups may be as shown in Table 2 below:
Table 2 Correspondence between the number of layers and the window group
Window level    Layer 1    Layer 2    Layer 3    Layer 4    Layer 5    Layer 6    Layer 7    Layer 8
TRENTA          31*11      29*11      27*11      25*11      23*9       23*9       21*7       21*7
VENTI           23*11      21*11      19*11      17*9       15*7       15*7       13*5       13*5
GRANDE          15*11      13*9       11*9       9*7        7*5        7*5        5*3        5*3
TALL            11*11      9*9        9*9        7*7        5*5        5*5        3*3        3*3
The window group of each layer image includes four window levels, corresponding to TRENTA, VENTI, GRANDE and TALL above. As can be seen from Table 2, the window sizes of the same level in different layers may be the same or different; overall, the window size of a given level decreases as the layer number increases.
For the i-th layer image, the image processing device can determine the target window group corresponding to the layer number according to the above correspondence between layer numbers and window groups. After determining the target window level in the above process, the image processing device can obtain the window size of that level from the target window group and use it as the window size for non-maximum suppression. For example, for the second layer image, the reference image is the first layer image; if the calculated ratio of the second key point number to the first key point number is 1.6, the window level is VENTI and the window size is 21*11.
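One way to hold Table 2 and perform the lookup is sketched below. The dictionary layout and the names `WINDOW_GROUPS` and `target_window_size` are assumptions for illustration. The pairs are stored in the order they are written in Table 2; this section does not state which number is the width and which is the height.

```python
# Window sizes per level, indexed by layer 1..8, copied from Table 2.
WINDOW_GROUPS = {
    "TRENTA": [(31, 11), (29, 11), (27, 11), (25, 11), (23, 9), (23, 9), (21, 7), (21, 7)],
    "VENTI":  [(23, 11), (21, 11), (19, 11), (17, 9), (15, 7), (15, 7), (13, 5), (13, 5)],
    "GRANDE": [(15, 11), (13, 9), (11, 9), (9, 7), (7, 5), (7, 5), (5, 3), (5, 3)],
    "TALL":   [(11, 11), (9, 9), (9, 9), (7, 7), (5, 5), (5, 5), (3, 3), (3, 3)],
}


def target_window_size(layer, level):
    """Return the window size of `level` in the window group of `layer` (1-based)."""
    return WINDOW_GROUPS[level][layer - 1]
```

With this table, `target_window_size(2, "VENTI")` returns `(21, 11)`, which matches the worked example in the text.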
As the layer number increases, the pixel size of each layer image decreases. Setting the window size of the same window level to decrease layer by layer accordingly balances the number of key points across layer images. Because the layer images of the image pyramid provide a multi-scale representation of the image, that is, they simulate images with different degrees of blur, balancing the number of key points across layers gives each degree of blur a certain number of key points and improves image matching accuracy.
In this embodiment, the target window size may also be determined from other information about the reference image. For example, the target window size corresponding to the i-th layer image may be determined according to the size of the non-maximum suppression window of the reference image. Therefore, the processing in steps 303-305 after the reference image is determined may also be: the image processing device determines the target window size according to the reference image.
In step 306, the image processing device performs non-maximum suppression on at least one candidate key point of the i-th layer image according to the target window size to obtain at least one output key point of the i-th layer image.
After the target window size is obtained in the above process, the image processing device can take any candidate key point in the score map of the i-th layer image as the window center, determine the key point with the largest feature score within the window, and set the feature scores that are not the largest to 0; that is, it performs non-maximum suppression. The image processing device traverses the candidate key points in the whole score map in this way, and when the traversal ends, at least one key point of the i-th layer image is obtained. This key point can be output as a key point of the image, that is, as an output key point.
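The traversal above can be sketched as follows. This is one simple variant, not necessarily the one used in the application: every window is evaluated against the original score map so that the result does not depend on traversal order, candidates with tied scores are all kept, and the window is assumed to have odd `win_w` columns and `win_h` rows. The function name `non_max_suppress` is hypothetical.

```python
def non_max_suppress(score_map, win_w, win_h):
    """Zero every candidate score that is not the maximum of its window.

    score_map: 2D list of scores, 0 for non-candidate pixels.
    win_w, win_h: assumed odd window width and height in pixels.
    """
    rows = len(score_map)
    cols = len(score_map[0]) if rows else 0
    half_h, half_w = win_h // 2, win_w // 2
    out = [row[:] for row in score_map]
    for r in range(rows):
        for c in range(cols):
            score = score_map[r][c]
            if score <= 0:
                continue  # not a candidate key point
            # Window centred on the candidate, clipped at the image border.
            r0, r1 = max(0, r - half_h), min(rows, r + half_h + 1)
            c0, c1 = max(0, c - half_w), min(cols, c + half_w + 1)
            window_max = max(score_map[y][x]
                             for y in range(r0, r1)
                             for x in range(c0, c1))
            if score < window_max:
                out[r][c] = 0  # suppressed by a stronger neighbour
    return out
```

On the score map `[[0, 5, 0, 0, 0], [0, 0, 9, 0, 0], [0, 0, 0, 0, 4]]`, a 3*3 window keeps the scores 9 and 4, while a 5*5 window keeps only 9. This illustrates the point made earlier that a larger window yields fewer output key points.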
After the output key points of the i-th layer image are obtained, i can be incremented by 1; that is, the processing of steps 302-306 is repeated for the (i+1)-th layer image to extract its key points, until key point extraction is completed for the highest layer image, after which the processing of step 307 is performed.
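The layer-by-layer control flow of steps 303-306 can be sketched as below. To keep the sketch short and self-contained, the window selection and the suppression step are passed in as callables; `pick_window` and `suppress` are hypothetical stand-ins for the Table 1/Table 2 lookup and for non-maximum suppression, and the function name `extract_pyramid_keypoints` is likewise an assumption.

```python
def extract_pyramid_keypoints(layer_candidates, expected_counts,
                              default_window, pick_window, suppress):
    """Run the per-layer loop, using layer i-1 as the reference for layer i.

    layer_candidates: candidates of layers 1..N, in order.
    expected_counts: first key point number of each layer (all positive).
    pick_window(layer, ratio): hypothetical callable returning a window.
    suppress(candidates, window): hypothetical callable returning the
    output key points of one layer.
    """
    outputs, windows = [], []
    for i, candidates in enumerate(layer_candidates, start=1):
        if i == 1:
            window = default_window  # step 304: no reference image exists
        else:
            # Step 305: ratio of output to expected key points on layer i-1.
            ratio = len(outputs[-1]) / expected_counts[i - 2]
            window = pick_window(i, ratio)
        outputs.append(suppress(candidates, window))  # step 306
        windows.append(window)
    return outputs, windows
```

The outputs of all layers together form the key points of the image, as in step 307.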
Besides using a reference image, the image processing device may also determine the target window size by other methods. For example, the target window size may be determined according to the pixel size of the i-th layer image of the image pyramid, which is not limited in this embodiment. Therefore, the processing of the above steps 302-307 may also be: the image processing device determines the target window size according to the i-th layer image of the image pyramid, and determines at least one output key point of the i-th layer image according to the target window size.
In step 307, the image processing device determines the output key points of the layer images of the image pyramid as the key points of the image.
After the output key points of every layer image of the image pyramid are determined, all the output key points can be taken as the key points of the image. The image processing device can describe the key points of the image, for example by their position, scale and orientation. The image processing device can then store the key points so that they can be used in subsequent processing such as image matching.
In this embodiment, for the i-th layer image of the image pyramid, the image processing device uses the (i-1)-th layer image as the reference image. Because the texture complexity of the i-th layer image is close to that of the (i-1)-th layer image, the window size of non-maximum suppression can be adjusted based on the reference image, so that the number of extracted key points approaches the expected number of key points and the number of key points is balanced across images, which reduces the negative impact on image matching.
In the above process, the reference image of the i-th layer image is the (i-1)-th layer image. An embodiment of the present application also provides a method for extracting the key points of each frame image in a video stream. Taking as an example the case where the reference image of the i-th layer image is the i-th layer image in the image pyramid of the previous frame image, the processing flow of the method for extracting image key points shown in FIG. 6 is described in detail below with reference to specific implementations:
In step 601, the image processing device acquires the image pyramid of an image.
The specific processing of step 601 is the same as that of step 301 above and is not repeated here.
In step 602, the image processing device extracts at least one candidate key point of the i-th layer image of the image pyramid.
In step 603, the image processing device determines whether the image is the first frame image.
Step 603 has no required timing relationship with steps 601-602; it may be performed in parallel with steps 601-602 or before them, which is not limited in this embodiment.
In step 604, if the image is the first frame image, the image processing device uses the preset window size as the target window size corresponding to each layer image.
The image processing device can extract key points from each frame image in the time order of the video stream. Therefore, if the image is the first frame image, no similar image has had its key points extracted before it, and for the image pyramid of the first frame image the target window size corresponding to each layer image may be the preset window size. Optionally, the preset window size may differ between layers and may decrease layer by layer; for example, the preset window size of each layer image may be the TALL-level window size in Table 2 above.
Optionally, for the first frame image, the reference image may also be determined based on the method provided in the foregoing embodiment, as in the processing flow of the method for extracting image key points shown in FIG. 7. The specific processing may be as follows:
In step 6041, the image processing device determines whether the i-th layer image is the first layer image.
In step 6042, if the image is the first frame image in the video stream and the i-th layer image of its image pyramid is the first layer image, the image processing device determines the preset window size as the target window size corresponding to the i-th layer image of the image pyramid of the image.
In step 6043, if the image is the first frame image in the video stream and the i-th layer image of its image pyramid is an image other than the first layer image, the image processing device determines the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of that image pyramid, and determines the target window size according to the reference image.
The specific processing for extracting the key points of the first frame image shown in FIG. 7 is the same as in the foregoing embodiment and is not repeated here.
In step 605, if the image is any frame image other than the first frame image in the video stream, the image processing device determines the i-th layer image of the image pyramid of the previous frame image as the reference image of the i-th layer image of the image pyramid of the image, determines the first key point number of the reference image, and determines the target window size corresponding to the i-th layer image of the image pyramid of the image according to the second key point number and the first key point number of the reference image.
As shown in the schematic diagram of reference images in FIG. 8, if the current image is any frame image other than the first frame image, the key points of the previous frame image have already been extracted. The image of the corresponding layer in the image pyramid of the previous frame image can therefore be determined as the reference image, so that the size of the non-maximum suppression window of the current image is determined according to how well the key points extracted from the reference image meet the requirement, that is, according to how close the number of output key points of the reference image is to the preset number of key points. The specific processing for determining the target window size according to the reference image is the same as in the foregoing embodiment and is not repeated here.
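The choice of reference image across steps 604, 6041-6043 and 605 can be summarized in a short sketch. The function name `select_reference`, the 1-based `(frame, layer)` tuples, and the flag `first_frame_uses_lower_layer` (which switches between the step 604 behaviour and the FIG. 7 variant) are assumptions made for illustration.

```python
def select_reference(frame, layer, first_frame_uses_lower_layer=False):
    """Return (frame, layer) of the reference image, or None.

    None means no reference image exists and the preset window size is used.
    frame and layer are 1-based indices.
    """
    if frame > 1:
        # Step 605: same layer in the pyramid of the previous frame.
        return (frame - 1, layer)
    if first_frame_uses_lower_layer and layer > 1:
        # Step 6043: first frame, reference is the layer below in the same pyramid.
        return (1, layer - 1)
    # Step 604 or step 6042: preset window size.
    return None
```

For example, layer 2 of frame 3 refers to layer 2 of frame 2, while layer 2 of frame 1 either has no reference or, in the FIG. 7 variant, refers to layer 1 of frame 1.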
In step 606, the image processing device performs non-maximum suppression on at least one candidate key point of the i-th layer image of the image pyramid of the image according to the target window size to obtain at least one output key point of the i-th layer image.
In step 607, the image processing device determines the output key points of the layer images of the image pyramid as the key points of the image.
Apart from the method of determining the reference image, the remaining processing for extracting image key points is the same as in the foregoing embodiment and is not repeated here.
In this embodiment, when any frame image of the video stream is taken as the image, for the i-th layer image of its image pyramid the image processing device uses the corresponding layer image of the previous frame image as the reference image. Because the texture complexity of the current frame image is close to that of the previous frame image, and images of the same layer in the image pyramids have similar degrees of blur, the window size of non-maximum suppression can be adjusted based on the reference image, so that the number of extracted key points approaches the expected number of key points and the number of key points is balanced across images, which reduces the negative impact on image matching.
Based on the same technical concept, this embodiment also provides an apparatus for extracting image key points. The apparatus may be the above image processing device or may be configured in the above image processing device. As shown in FIG. 9, the apparatus includes:
an obtaining module 910, configured to obtain an image pyramid of an image, where the image pyramid includes N layer images and N>1; the obtaining module may specifically implement the obtaining function in the above steps 301 and 601, as well as other implicit steps;
a determining module 920, configured to determine a target window size according to the i-th layer image of the image pyramid and determine at least one output key point of the i-th layer image according to the target window size, where 1≤i≤N, and to determine the output key points of the layer images of the image pyramid as the key points of the image; the determining module may specifically implement the determining function in the above steps 302-307 and 602-607, as well as other implicit steps.
Optionally, the determining module 920 is configured to:
extract at least one candidate key point of the i-th layer image of the image pyramid, determine a target window size according to the i-th layer image, and perform non-maximum suppression on the at least one candidate key point of the i-th layer image according to the target window size to obtain at least one output key point of the i-th layer image.
Optionally, the image is a static image, and the determining module 920 is configured to:
if the i-th layer image is the first layer image, determine a preset window size as the target window size;
otherwise, determine the (i-1)-th layer image in the image pyramid of the image as the reference image of the i-th layer image, and determine the target window size according to the reference image.
Optionally, the image is a frame image in a video stream, and the determining module 920 is configured to:
if the image is any frame image other than the first frame image in the video stream, determine the i-th layer image of the image pyramid of the previous frame image as the reference image of the i-th layer image of the image pyramid of the image, and determine the target window size according to the reference image.
Optionally, the determining module 920 is further configured to:
if the image is the first frame image in the video stream and the i-th layer image of the image pyramid of the image is the first layer image, determine the preset window size as the target window size;
if the image is the first frame image in the video stream and the i-th layer image of the image pyramid of the image is an image other than the first layer image, determine the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of that image pyramid, and determine the target window size according to the reference image.
可选的,所述确定模块920用于:Optionally, the determining module 920 is configured to:
确定所述参考图像的第一关键点数目,根据所述参考图像的第一关键点数目和第二关键点数目,确定目标窗口大小,其中,所述第一关键点数目为预设的关键点数目,所述第 二关键点数目为所述参考图像的输出关键点的数目。Determining a first number of key points of the reference image, and determining a target window size according to the first number of key points and the second number of key points of the reference image, where the first number of key points is a preset number of key points Therefore, the number of the second key points is the number of output key points of the reference image.
可选的,所述确定模块920用于:Optionally, the determining module 920 is configured to:
根据所述参考图像在所属的图像金字塔中所处的层数确定所述参考图像对应的第一关键点数目,其中,所述参考图像在所属的图像金字塔中所处的层数和第一关键点数目之间存在预设的对应关系。The number of first keypoints corresponding to the reference image is determined according to the number of layers where the reference image is located in the image pyramid to which the reference image belongs, where the number of layers and the first key to which the reference image is located in the image pyramid to which the reference image belongs. There is a preset correspondence relationship between the number of points.
可选的,在所述预设的对应关系中,相邻两层图像的第一关键点数目满足预设比例,所述预设比例等于图像金字塔中相邻两层图像的像素点数目比例。Optionally, in the preset correspondence relationship, the number of first key points of two adjacent layers of images meets a preset ratio, and the preset ratio is equal to the ratio of the number of pixels of two adjacent layers of images in the image pyramid.
Optionally, the determining module 920 is configured to:
Determine the ratio of the second number of key points to the first number of key points;
Determine, according to a preset correspondence between ratio ranges and window levels, the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range;
Determine the target window size according to the target window level.
Optionally, the determining module 920 is configured to:
Determine, according to a preset correspondence between layers and window groups, the target window group corresponding to the i-th layer image, where each window group includes a window size corresponding to at least one window level;
Determine the window size corresponding to the target window level in the target window group as the target window size.
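The two lookups above (ratio range to window level, then layer and level to window size) might be sketched as follows. The boundary values in `RATIO_BOUNDS`, the sizes in `WINDOW_GROUPS`, and the choice that a higher ratio maps to a larger window are all hypothetical; the text only states that such preset correspondences exist.

```python
import bisect

# Hypothetical upper bounds of the ratio ranges; a ratio in range k gets window level k.
RATIO_BOUNDS = [0.8, 1.2, 2.0]  # yields window levels 0..3

# Hypothetical window groups: layer -> window size (in pixels) for each window level.
WINDOW_GROUPS = {
    1: [3, 5, 7, 9],
    2: [3, 5, 5, 7],
    3: [3, 3, 5, 5],
}


def window_level(output_count: int, expected_count: int) -> int:
    """Map the ratio (second count / first count) of the reference image to a level."""
    ratio = output_count / expected_count
    return bisect.bisect_right(RATIO_BOUNDS, ratio)


def target_window_size(layer: int, output_count: int, expected_count: int) -> int:
    """Look up the window size for this layer's window group at the computed level."""
    return WINDOW_GROUPS[layer][window_level(output_count, expected_count)]
```

In this sketch a reference image that produced more output key points than its preset number is assigned a larger window, on the assumption that a larger suppression window removes more candidates and so pulls the count back toward the preset number.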
Optionally, the determining module 920 is configured to:
For each image block of the i-th layer image, determine a feature score of each pixel in the image block according to a preset feature detection algorithm, and determine the pixels whose feature scores are greater than a preset threshold as candidate key points of the image block;
Determine the candidate key points of all image blocks of the i-th layer image as the at least one candidate key point of the i-th layer image.
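The block-wise candidate extraction can be sketched as below. For brevity the sketch assumes the feature scores have already been computed by some detector and are passed in as a 2D list `score_map`; the function name, the `(x, y, score)` tuple layout, and the square block shape are assumptions made for illustration.

```python
def extract_candidates(score_map, block_size, threshold):
    """Collect (x, y, score) for every pixel whose score exceeds the threshold.

    The image is walked block by block; candidates of all blocks are merged
    into one list for the layer.
    """
    height, width = len(score_map), len(score_map[0])
    candidates = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Scan one image block, clipping at the image border.
            for y in range(top, min(top + block_size, height)):
                for x in range(left, min(left + block_size, width)):
                    if score_map[y][x] > threshold:
                        candidates.append((x, y, score_map[y][x]))
    return candidates
```

The merged list is what the later non-maximum suppression step operates on.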
It should be noted that the obtaining module 910 may be implemented by a processor, and the determining module 920 may be implemented by a processor together with a memory.
In this embodiment, when extracting key points from an image, the image processing device can adjust the target window size, that is, the window size used for non-maximum suppression, for each layer image in the image pyramid of the image. Because the non-maximum suppression window size can change from layer to layer, the number of key points in each image is balanced, which reduces the negative impact on image matching.
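To make the effect of the window size concrete, the following sketch applies a simple greedy form of non-maximum suppression to a list of `(x, y, score)` candidates. The greedy strategy and the function name are assumptions; the application does not prescribe a particular suppression procedure.

```python
def non_max_suppress(candidates, window):
    """Keep a candidate only if no stronger kept candidate lies inside its window.

    `window` is the side length of a square window centred on each kept point.
    """
    half = window // 2
    kept = []
    # Visit candidates from strongest to weakest.
    for x, y, score in sorted(candidates, key=lambda c: -c[2]):
        if all(abs(x - kx) > half or abs(y - ky) > half for kx, ky, _ in kept):
            kept.append((x, y, score))
    return kept
```

Running the same candidates through a larger window leaves fewer output key points, which is the lever used to balance the key point count across layers.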
It should be noted that, when the apparatus for extracting image key points provided in the foregoing embodiment extracts image key points, the division into the foregoing functional modules is used only as an example. In practice, the functions may be assigned to different functional modules as required; that is, the internal structure of the image processing device may be divided into different functional modules to complete all or some of the functions described above. In addition, the apparatus for extracting image key points provided in the foregoing embodiment and the method embodiment for extracting image key points belong to the same concept. For the specific implementation process, refer to the method embodiment; details are not repeated here.
The foregoing embodiments may be implemented entirely or partly by software, hardware, or a combination thereof. When implemented by software, they may be implemented entirely or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in this embodiment are generated entirely or partly. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or twisted pair) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any medium accessible to a computer, or a data storage device, such as a server or a data center, that integrates one or more media. The medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as an optical disc), or a semiconductor medium (such as a solid-state drive).

Claims (24)

  1. A method for extracting key points of an image, characterized in that the method comprises:
    obtaining an image pyramid of an image, the image pyramid comprising N layers of images, N > 1;
    determining a target window size according to the i-th layer image of the image pyramid, and determining at least one output key point of the i-th layer image according to the target window size, where 1 ≤ i ≤ N;
    determining the output key points of the layer images of the image pyramid as the key points of the image.
  2. The method according to claim 1, characterized in that the determining a target window size according to the i-th layer image of the image pyramid and determining at least one output key point of the i-th layer image according to the target window size comprises:
    extracting at least one candidate key point of the i-th layer image of the image pyramid, determining the target window size according to the i-th layer image, and performing non-maximum suppression on the at least one candidate key point of the i-th layer image according to the target window size, to obtain the at least one output key point of the i-th layer image.
  3. The method according to claim 1, characterized in that the image is a static image, and the determining a target window size according to the i-th layer image of the image pyramid comprises:
    if the i-th layer image is the first-layer image, determining a preset window size as the target window size;
    otherwise, determining the (i-1)-th layer image in the image pyramid of the image as a reference image of the i-th layer image, and determining the target window size according to the reference image.
  4. The method according to claim 1, characterized in that the image is a frame in a video stream, and the determining a target window size according to the i-th layer image of the image pyramid comprises:
    if the image is any frame in the video stream other than the first frame, determining the i-th layer image of the image pyramid of the previous frame as a reference image of the i-th layer image of the image pyramid of the image, and determining the target window size according to the reference image.
  5. The method according to claim 4, characterized in that the method further comprises:
    if the image is the first frame in the video stream and the i-th layer image of the image pyramid of the image is the first-layer image, determining a preset window size as the target window size;
    if the image is the first frame in the video stream and the i-th layer image of the image pyramid of the image is not the first-layer image, determining the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of the image pyramid of the image, and determining the target window size according to the reference image.
  6. The method according to any one of claims 3 to 5, characterized in that the determining the target window size according to the reference image comprises:
    determining a first number of key points of the reference image, and determining the target window size according to the first number of key points and a second number of key points of the reference image, where the first number of key points is a preset number of key points and the second number of key points is the number of output key points of the reference image.
  7. The method according to claim 6, characterized in that the determining a first number of key points of the reference image comprises:
    determining the first number of key points corresponding to the reference image according to the layer at which the reference image is located in the image pyramid to which it belongs, where a preset correspondence exists between the layer of the reference image in its image pyramid and the first number of key points.
  8. The method according to claim 7, characterized in that, in the preset correspondence, the first numbers of key points of two adjacent layers satisfy a preset ratio, and the preset ratio is equal to the ratio of the numbers of pixels of two adjacent layers in the image pyramid.
  9. The method according to claim 6, characterized in that the determining the target window size according to the first number of key points and the second number of key points of the reference image comprises:
    determining the ratio of the second number of key points to the first number of key points;
    determining, according to a preset correspondence between ratio ranges and window levels, the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range;
    determining the target window size according to the target window level.
  10. The method according to claim 9, characterized in that the determining the target window size according to the target window level comprises:
    determining, according to a preset correspondence between layers and window groups, the target window group corresponding to the i-th layer image, where each window group includes a window size corresponding to at least one window level;
    determining the window size corresponding to the target window level in the target window group as the target window size.
  11. The method according to claim 2, characterized in that the extracting at least one candidate key point of the i-th layer image comprises:
    for each image block of the i-th layer image, determining a feature score of each pixel in the image block according to a preset feature detection algorithm, and determining the pixels whose feature scores are greater than a preset threshold as candidate key points of the image block;
    determining the candidate key points of all image blocks of the i-th layer image as the at least one candidate key point of the i-th layer image.
  12. An apparatus for extracting key points of an image, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain an image pyramid of an image, the image pyramid comprising N layers of images, N > 1;
    a determining module, configured to: determine a target window size according to the i-th layer image of the image pyramid, and determine at least one output key point of the i-th layer image according to the target window size, where 1 ≤ i ≤ N; and determine the output key points of the layer images of the image pyramid as the key points of the image.
  13. The apparatus according to claim 12, characterized in that the determining module is configured to:
    extract at least one candidate key point of the i-th layer image of the image pyramid, determine the target window size according to the i-th layer image, and perform non-maximum suppression on the at least one candidate key point of the i-th layer image according to the target window size, to obtain the at least one output key point of the i-th layer image.
  14. The apparatus according to claim 12, characterized in that the image is a static image, and the determining module is configured to:
    if the i-th layer image is the first-layer image, determine a preset window size as the target window size;
    otherwise, determine the (i-1)-th layer image in the image pyramid of the image as a reference image of the i-th layer image, and determine the target window size according to the reference image.
  15. The apparatus according to claim 12, characterized in that the image is a frame in a video stream, and the determining module is configured to:
    if the image is any frame in the video stream other than the first frame, determine the i-th layer image of the image pyramid of the previous frame as a reference image of the i-th layer image of the image pyramid of the image, and determine the target window size according to the reference image.
  16. The apparatus according to claim 15, characterized in that the determining module is further configured to:
    if the image is the first frame in the video stream and the i-th layer image of the image pyramid of the image is the first-layer image, determine a preset window size as the target window size;
    if the image is the first frame in the video stream and the i-th layer image of the image pyramid of the image is not the first-layer image, determine the (i-1)-th layer image of the image pyramid of the image as the reference image of the i-th layer image of the image pyramid of the image, and determine the target window size according to the reference image.
  17. The apparatus according to any one of claims 14 to 16, characterized in that the determining module is configured to:
    determine a first number of key points of the reference image, and determine the target window size according to the first number of key points and a second number of key points of the reference image, where the first number of key points is a preset number of key points and the second number of key points is the number of output key points of the reference image.
  18. The apparatus according to claim 17, characterized in that the determining module is configured to:
    determine the first number of key points corresponding to the reference image according to the layer at which the reference image is located in the image pyramid to which it belongs, where a preset correspondence exists between the layer of the reference image in its image pyramid and the first number of key points.
  19. The apparatus according to claim 18, characterized in that, in the preset correspondence, the first numbers of key points of two adjacent layers satisfy a preset ratio, and the preset ratio is equal to the ratio of the numbers of pixels of two adjacent layers in the image pyramid.
  20. The apparatus according to claim 17, characterized in that the determining module is configured to:
    determine the ratio of the second number of key points to the first number of key points;
    determine, according to a preset correspondence between ratio ranges and window levels, the target ratio range in which the ratio falls and the target window level corresponding to the target ratio range;
    determine the target window size according to the target window level.
  21. The apparatus according to claim 20, characterized in that the determining module is configured to:
    determine, according to a preset correspondence between layers and window groups, the target window group corresponding to the i-th layer image, where each window group includes a window size corresponding to at least one window level;
    determine the window size corresponding to the target window level in the target window group as the target window size.
  22. The apparatus according to claim 13, characterized in that the determining module is configured to:
    for each image block of the i-th layer image, determine a feature score of each pixel in the image block according to a preset feature detection algorithm, and determine the pixels whose feature scores are greater than a preset threshold as candidate key points of the image block;
    determine the candidate key points of all image blocks of the i-th layer image as the at least one candidate key point of the i-th layer image.
  23. An image processing device, characterized in that the image processing device comprises a memory and a processor, the memory is configured to store instructions, and the processor is configured to invoke the instructions and perform the method according to any one of claims 1 to 11.
  24. A computer-readable storage medium, characterized in that, when the computer-readable storage medium is run on an image processing device, the image processing device is caused to perform the method according to any one of claims 1 to 11.
PCT/CN2018/106778 2018-09-20 2018-09-20 Method and apparatus for extracting image key point WO2020056688A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/106778 WO2020056688A1 (en) 2018-09-20 2018-09-20 Method and apparatus for extracting image key point
CN201880095485.3A CN112424787A (en) 2018-09-20 2018-09-20 Method and device for extracting image key points

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/106778 WO2020056688A1 (en) 2018-09-20 2018-09-20 Method and apparatus for extracting image key point

Publications (1)

Publication Number Publication Date
WO2020056688A1 (en) 2020-03-26

Family

ID=69888165

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106778 WO2020056688A1 (en) 2018-09-20 2018-09-20 Method and apparatus for extracting image key point

Country Status (2)

Country Link
CN (1) CN112424787A (en)
WO (1) WO2020056688A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930245A (en) * 2012-09-24 2013-02-13 深圳市捷顺科技实业股份有限公司 Method and system for tracking vehicles
CN105069477A (en) * 2015-08-13 2015-11-18 天津津航技术物理研究所 Method for AdaBoost cascade classifier to detect image object
CN105512638A (en) * 2015-12-24 2016-04-20 黄江 Fused featured-based face detection and alignment method
US20170011520A1 (en) * 2015-07-09 2017-01-12 Texas Instruments Incorporated Window grouping and tracking for fast object detection
CN106529448A (en) * 2016-10-27 2017-03-22 四川长虹电器股份有限公司 Method for performing multi-visual-angle face detection by means of integral channel features

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650615B (en) * 2016-11-07 2018-03-27 深圳云天励飞技术有限公司 A kind of image processing method and terminal

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378865A (en) * 2021-08-16 2021-09-10 航天宏图信息技术股份有限公司 Image pyramid matching method and device
CN113378865B (en) * 2021-08-16 2021-11-05 航天宏图信息技术股份有限公司 Image pyramid matching method and device
CN117911956A (en) * 2024-03-19 2024-04-19 洋县阿拉丁生物工程有限责任公司 Dynamic monitoring method and system for processing environment of food processing equipment
CN117911956B (en) * 2024-03-19 2024-05-31 洋县阿拉丁生物工程有限责任公司 Dynamic monitoring method and system for processing environment of food processing equipment

Also Published As

Publication number Publication date
CN112424787A (en) 2021-02-26

Similar Documents

Publication Publication Date Title
WO2019153671A1 (en) Image super-resolution method and apparatus, and computer readable storage medium
CN110288547A (en) Method and apparatus for generating image denoising model
US20170109912A1 (en) Creating a composite image from multi-frame raw image data
CN111402170B (en) Image enhancement method, device, terminal and computer readable storage medium
WO2021114868A1 (en) Denoising method, terminal, and storage medium
WO2020108009A1 (en) Method, system, and computer-readable medium for improving quality of low-light images
CN112602088B (en) Method, system and computer readable medium for improving quality of low light images
CN110991287A (en) Real-time video stream face detection tracking method and detection tracking system
CN111985281B (en) Image generation model generation method and device and image generation method and device
WO2017113917A1 (en) Imaging method, imaging apparatus, and terminal
US20220270266A1 (en) Foreground image acquisition method, foreground image acquisition apparatus, and electronic device
WO2022082999A1 (en) Object recognition method and apparatus, and terminal device and storage medium
TW201944291A (en) Face recognition method
WO2021082819A1 (en) Image generation method and apparatus, and electronic device
CN111325798A (en) Camera model correction method and device, AR implementation equipment and readable storage medium
CN111028276A (en) Image alignment method and device, storage medium and electronic equipment
WO2019120025A1 (en) Photograph adjustment method and apparatus, storage medium and electronic device
WO2020259123A1 (en) Method and device for adjusting image quality, and readable storage medium
CN111429371A (en) Image processing method and device and terminal equipment
WO2020056688A1 (en) Method and apparatus for extracting image key point
CN111080683B (en) Image processing method, device, storage medium and electronic equipment
CN112883940A (en) Silent in-vivo detection method, silent in-vivo detection device, computer equipment and storage medium
CN110738625B (en) Image resampling method, device, terminal and computer readable storage medium
JP2015179426A (en) Information processing apparatus, parameter determination method, and program
WO2021000495A1 (en) Image processing method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18933848

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18933848

Country of ref document: EP

Kind code of ref document: A1