CN109117773B - Image feature point detection method, terminal device and storage medium - Google Patents

Image feature point detection method, terminal device and storage medium

Info

Publication number
CN109117773B
CN109117773B CN201810865350.2A
Authority
CN
China
Prior art keywords
initial
image
current category
feature point
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810865350.2A
Other languages
Chinese (zh)
Other versions
CN109117773A (en)
Inventor
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201810865350.2A priority Critical patent/CN109117773B/en
Publication of CN109117773A publication Critical patent/CN109117773A/en
Priority to PCT/CN2019/093685 priority patent/WO2020024744A1/en
Application granted granted Critical
Publication of CN109117773B publication Critical patent/CN109117773B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present application is applicable to the technical field of image recognition and provides an image feature point detection method, a terminal device and a computer-readable storage medium. The method includes the following steps: acquiring initial images of natural scenes of multiple categories; for each category of natural scene, extracting initial feature points from the initial images of that category; according to the correspondence of the initial feature points across the initial images of the current natural scene, taking the initial feature points that meet a preset condition as target feature points of the current category; taking the initial images containing the target feature points as training images of the current category; training the constructed deep neural network with the training images in the training sample set to obtain a trained deep neural network; and detecting an image to be detected based on the trained deep neural network to obtain the feature points in the image to be detected, thereby improving the detection accuracy in scene detection.

Description

Image feature point detection method, terminal device and storage medium
Technical Field
The present application belongs to the field of image recognition technology, and in particular, to an image feature point detection method, a terminal device, and a computer-readable storage medium.
Background
With the continuous development of computer vision and the growing demands of users, many image processing technologies have emerged. In order to obtain a good result when performing various kinds of processing on an image, it is sometimes necessary to recognize the scene of the image.
At present, most image scene detection and recognition methods calculate some response value of the image pixels one by one in the image scale space and obtain local extrema over the pixel position and scale to produce the feature point detection result. However, the detection accuracy of this manner of image feature point detection is low.
Disclosure of Invention
In view of this, embodiments of the present application provide an image feature point detection method, a terminal device, and a computer-readable storage medium, so as to solve the problem that current scene feature point detection methods have low detection accuracy.
A first aspect of an embodiment of the present application provides an image feature point detection method, including:
acquiring initial images of a plurality of categories of natural scenes, wherein each category of natural scenes comprises a plurality of initial images;
for each category of natural scene, respectively extracting initial characteristic points from the initial image of the current category;
acquiring the corresponding relation of the initial characteristic points in each initial image of the current category;
based on the corresponding relation, acquiring initial feature points which accord with preset conditions from the initial feature points of the initial image of the current category as target feature points of the current category;
taking the initial image containing the target feature point of the current category in the initial image of each category as a training image to obtain a training sample set of natural scenes of multiple categories;
training the constructed deep neural network through the training images in the training sample set to obtain a trained deep neural network;
and detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
A second aspect of an embodiment of the present application provides a terminal device, including:
the system comprises an initial image acquisition module, a processing module and a display module, wherein the initial image acquisition module is used for acquiring initial images of natural scenes of multiple categories, and each category of natural scene comprises multiple initial images;
the initial characteristic point acquisition module is used for respectively extracting initial characteristic points from the initial images of the current category for each category of natural scene;
a corresponding relation obtaining module, configured to obtain a corresponding relation of the initial feature point in each initial image of the current category;
a target feature point obtaining module, configured to obtain, based on the correspondence, an initial feature point that meets a preset condition from initial feature points of an initial image of a current category as a target feature point of the current category;
the training image acquisition module is used for taking the initial image containing the target feature point of the current category in the initial image of each category as a training image to obtain a training sample set of natural scenes of multiple categories;
the training module is used for training the constructed deep neural network through the training images in the training sample set to obtain the trained deep neural network;
and the detection module is used for detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
A third aspect of an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method provided in the first aspect of the embodiment of the present application when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by one or more processors, performs the steps of the method provided by the first aspect of embodiments of the present application.
A fifth aspect of embodiments of the present application provides a computer program product comprising a computer program that, when executed by one or more processors, performs the steps of the method provided by the first aspect of embodiments of the present application.
The embodiments of the present application provide a method for detecting feature points when performing scene detection on images. Initial images of natural scenes of multiple categories are first obtained, and initial feature points are extracted for each category of natural scene. The correspondences of the initial feature points across different initial images are then obtained, and target feature points capable of representing the current natural scene are screened from the initial feature points according to these correspondences. The initial images containing the target feature points are used as training images to train the constructed deep neural network model, so that the trained model has the capability of detecting the feature points of an image's scene. Because the training images used to train the deep neural network model are the images containing the target feature points, screened from the initial feature points, that can represent each category of natural scene, the detection accuracy of feature points in scene detection can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
Fig. 1 is a schematic flow chart of an implementation of a method for detecting image feature points according to an embodiment of the present application;
fig. 2 is a schematic flow chart illustrating an implementation of another image feature point detection method provided in the embodiment of the present application;
fig. 3 is a schematic block diagram of a terminal device provided in an embodiment of the present application;
fig. 4 is a schematic block diagram of another terminal device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
To explain the technical solution described in the present application, the application scenario of the embodiments is first introduced. The present application may be applied to scene detection of images; for example, the preset scene categories may be earth, stream, cloud, rain, snow mountain, and the like. Detecting the scene category relies on feature points. Unlike face detection, however, where facial features with obvious characteristics can serve as feature points and detection can be realized by acquiring feature points based on those specific facial features, in scene detection it is difficult to manually calibrate feature points with obvious, distinctive characteristics in the training images. Therefore, it is common to calculate some response value of the image pixels one by one in the image scale space and find local extrema in the three-dimensional space formed by the pixel position and the scale to obtain the feature point detection result. Feature points detected in this way are not accurate enough and may not represent the features of the scene. The present application obtains target feature points capable of representing the features of a scene, calibrates these target feature points in the images, trains a deep neural network with the images in which the target feature points have been calibrated, and detects the scene feature points of an image through the trained deep neural network, thereby obtaining the scene of the image. The following description proceeds by way of specific embodiments.
Fig. 1 is a schematic flow chart of an implementation of a method for detecting image feature points provided in an embodiment of the present application, where as shown in the figure, the method may include the following steps:
step S101, acquiring initial images of multiple categories of natural scenes, wherein each category of natural scenes includes multiple initial images.
In the embodiment of the present application, in order to enable the trained deep neural network to identify multiple categories of natural scenes, the acquired training images need to include images corresponding to multiple categories of natural scenes. In practical applications, the deep neural network may also be trained with images of a single category of natural scene; in that case, the trained network can only detect feature points for images of that single category during scene detection.
If the trained deep neural network is required to detect feature points for scenes of multiple categories, initial images of natural scenes of multiple categories are obtained. For example, if 5 natural scene categories are set, a large number of initial images corresponding to each natural scene need to be collected.
And step S102, extracting initial characteristic points from the initial images of the current category respectively for each category of natural scenes.
In the embodiment of the present application, methods for extracting the initial feature points from the initial image include, but are not limited to, Harris, SUSAN, SIFT, SURF, FAST, MSER, and the like. Taking Harris corners as an example, the image is divided into M×M small blocks, a Harris corner response is computed within each block, and the N points with the largest corner response in each block are extracted as feature points, so that at most M×N feature points are extracted from one image. It is understood that other methods of extracting image feature points may also be used in practical applications. A minimal sketch of this block-wise extraction is given below.
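As a rough illustration of the block-wise extraction just described, the following Python sketch uses OpenCV's Harris response. The grid layout, the parameter names m and n, and the response threshold are illustrative assumptions; the embodiment does not prescribe a specific implementation.

import cv2
import numpy as np

def extract_block_harris(gray, m=8, n=4, min_response=1e-6):
    """Split a grayscale image into an m x m grid of blocks and keep the n
    strongest Harris responses in each block as initial feature points."""
    h, w = gray.shape
    response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    points = []
    bh, bw = h // m, w // m
    for by in range(m):
        for bx in range(m):
            block = response[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
            # indices of the n largest responses inside this block
            for idx in np.argsort(block, axis=None)[::-1][:n]:
                y, x = np.unravel_index(idx, block.shape)
                if block[y, x] > min_response:
                    points.append((bx * bw + x, by * bh + y))  # (x, y) in image coordinates
    return points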
Step S103, acquiring the corresponding relation of the initial characteristic points in each initial image of the current category.
In the embodiment of the present application, for a natural scene of a certain category, there may be both identity and difference among the initial feature points of the initial images in that scene. For example, initial feature points a1, a2, a3 and a4 are extracted from initial image A, and initial feature points b1, b2 and b3 are extracted from initial image B. Because the feature points are extracted from different initial images, initial feature point a1 and initial feature point b2 may in fact be feature points of the same type; that is, initial feature point a1 in initial image A and initial feature point b2 in initial image B are in correspondence, so a1 and b2 can be marked as the same type of initial feature point. When judging the correspondence of initial feature points across different initial images, the judgment can be made according to the feature information of the initial feature points.
And step S104, acquiring initial characteristic points which accord with preset conditions from the initial characteristic points of the initial image of the current category as target characteristic points of the current category based on the corresponding relation.
In the embodiment of the present application, after the correspondence relationship of the initial feature points is determined, how many types of initial feature points are obtained in total from the initial image of the current natural scene can be obtained.
For a more intuitive understanding, take face detection as an example. Assume that initial feature point a1 (left eye), initial feature point a2 (right eye), initial feature point a3 (nose tip) and initial feature point a4 (a face point) are extracted from initial image A of the current natural scene, and initial feature point b1 (nose tip), initial feature point b2 (left eye) and initial feature point b3 (right eye) are extracted from initial image B. The types of initial feature points of the current natural scene are then not the seven points a1, a2, a3, a4, b1, b2 and b3; rather, the types should be: left eye, right eye, nose tip and the face point. This is because initial feature point a1 and initial feature point b2 correspond to each other and both represent the left eye, a2 and b3 correspond and both represent the right eye, a3 and b1 correspond and both represent the nose tip, and a4 represents the face point. Feature points in natural scene images do not have obvious characteristics like those in face images, so if the correspondence is not determined, the same scene feature point will end up being represented by different initial feature points.
After determining how many types of initial feature points are obtained in total from the initial images of the current natural scene, the target feature points meeting the preset condition can be selected from the current initial feature points. For example, the initial feature points that appear with high frequency across different initial images may be selected, or the initial feature points matching a preset feature may be selected. Acquiring target feature points from the initial feature points is in fact acquiring the initial feature points capable of representing the features of the current natural scene. For example, the initial feature points of the current scene whose difference from the initial feature points of other natural scenes is greater than a threshold may also be selected as target feature points of the current scene. Of course, other preset conditions may be set in practical applications to obtain the target feature points.
Step S105, using the initial image including the target feature point of the current category in the initial image of each category as a training image, to obtain a training sample set of natural scenes of multiple categories.
In the embodiment of the present application, the obtained target feature points are the feature points capable of representing the current natural scene, so the target feature points may be marked in the initial images that contain them, and the initial images with the marked target feature points are used as training images. Each category of natural scene goes through this process of selecting target feature points from the initial feature points, so a set of training images corresponding to each natural scene is obtained, yielding a training sample set of natural scenes of multiple categories. A sketch of one possible way to turn the marked points into training labels is given below.
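One possible way to encode the calibrated target feature points as a training label is to render each point as a Gaussian peak in its own heatmap. This encoding is an assumption for illustration only; the embodiment does not prescribe how the marked points are represented.

import numpy as np

def points_to_heatmaps(points, height, width, sigma=4.0):
    """points: list of (x, y) calibrated target feature points.
    Returns a (K, height, width) array with one Gaussian peak per point."""
    ys, xs = np.mgrid[0:height, 0:width]
    heatmaps = np.zeros((len(points), height, width), dtype=np.float32)
    for k, (px, py) in enumerate(points):
        heatmaps[k] = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
    return heatmaps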
And step S106, training the constructed deep neural network through the training images in the training sample set to obtain the trained deep neural network.
In an embodiment of the present application, the deep neural network may be a VGG neural network model. The process of training the deep neural network with the training images in which the target feature points have been calibrated may be: input a training image into the deep neural network to obtain an output; construct a loss function from the difference between the feature points detected in the output and the calibrated target feature points; and update the parameters of each layer of the deep neural network backwards based on the loss function until the feature points detected by the network tend towards the calibrated target feature points, that is, until the network converges, at which point the trained deep neural network is obtained. Of course, other training modes are also possible in practical applications. A minimal training sketch under these assumptions is given below.
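The following PyTorch-style sketch illustrates the training loop described above: a VGG backbone with a hypothetical per-keypoint heatmap head, trained against the heatmap labels built from the calibrated target feature points. The head, the MSE loss and the optimizer settings are illustrative choices, not details given by the embodiment.

import torch
import torch.nn as nn
import torchvision.models as models

class KeypointNet(nn.Module):
    """Hypothetical keypoint detector: VGG features plus a 1x1 heatmap head."""
    def __init__(self, num_keypoints):
        super().__init__()
        self.backbone = models.vgg16(weights=None).features
        self.head = nn.Conv2d(512, num_keypoints, kernel_size=1)

    def forward(self, x):
        return self.head(self.backbone(x))

def train(model, loader, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # loss built from the gap between detected and target points
    model.train()
    for _ in range(epochs):
        # target_heatmaps must be prepared at the head's output resolution
        for images, target_heatmaps in loader:
            opt.zero_grad()
            loss = criterion(model(images), target_heatmaps)
            loss.backward()   # back-propagate to update the parameters of every layer
            opt.step()
    return model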
As another embodiment of the present application, before training the constructed deep neural network through the training images in the training sample set to obtain a trained deep neural network, the method further includes:
and calibrating the natural scene and the target characteristic points of the training images for each training image.
In the embodiment of the present application, not only the target feature points but also the natural scene corresponding to each training image may be calibrated. A classifier can then be added at the end of the deep neural network to classify the natural scene of an image according to the detected feature points, so that when the network with the added classifier is used to detect the feature points of an image, the natural scene of the image to be detected is obtained as well. A sketch of such a classifier head follows.
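As a sketch of the optional classifier mentioned above (the names and the pooling choice are assumptions), a small head can map the detected keypoint heatmaps to scene-category logits:

import torch.nn as nn

class SceneClassifierHead(nn.Module):
    """Hypothetical classifier appended after the keypoint heatmaps: it pools
    each heatmap and predicts the natural-scene category of the image."""
    def __init__(self, num_keypoints, num_scenes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(num_keypoints, num_scenes)

    def forward(self, heatmaps):
        x = self.pool(heatmaps).flatten(1)  # (batch, num_keypoints)
        return self.fc(x)                   # scene-category logits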
And S107, detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
In the embodiment of the present application, the trained deep neural network has the capability of detecting feature points that closely approach the target feature points, so after an image to be detected is input into the trained deep neural network, the feature points capable of representing the scene of that image are obtained. A usage sketch is given below.
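A usage sketch of detection with the trained network, continuing the heatmap assumption above; the peak of each predicted heatmap is read off as a detected feature point.

import torch

def detect_feature_points(model, image_tensor):
    """image_tensor: (C, H, W) image to be detected; returns one peak (x, y) per heatmap."""
    model.eval()
    with torch.no_grad():
        heatmaps = model(image_tensor.unsqueeze(0))[0]   # (K, H', W')
    k, h, w = heatmaps.shape
    flat = heatmaps.view(k, -1).argmax(dim=1)
    return [(int(i % w), int(i // w)) for i in flat]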
In the present application, the training images used to train the deep neural network model are the images containing the target feature points, screened from the initial feature points, that can represent each category of natural scene, so the detection accuracy of feature points in scene detection can be improved.
Fig. 2 is a schematic flow chart of another image feature point detection method provided in the embodiment of the present application, which is a process for describing how to obtain a target feature point on the basis of the embodiment shown in fig. 1, and may include the following steps:
step S201, acquiring initial images of multiple categories of natural scenes, wherein each category of natural scenes includes multiple initial images.
In step S202, for each category of natural scene, initial feature points are extracted from the initial image of the current category.
The contents of steps S201 to S202 are the same as the contents of steps S101 to S102, and the descriptions of steps S101 to S102 may be specifically referred to, and are not repeated herein.
Step S203, acquiring a three-dimensional model of the natural scene of the current category.
In the embodiment of the present application, the three-dimensional model of the natural scene may be pre-established, or may be established according to an initial image of the current natural scene.
As another embodiment of the present application, the acquiring a three-dimensional model of a natural scene of a current category includes:
and based on an image reconstruction algorithm, establishing a three-dimensional model of the natural scene of the current category according to the initial image of the current category.
In the embodiment of the present application, the three-dimensional model of the natural scene of the current category is established from the initial images and may be built from an image sequence composed of a plurality of initial images. First, the initial images are sorted according to the similarity between any two initial images, so that each initial image has the highest similarity with its two neighboring images in the sequence. Then, starting from the head of the image sequence, SIFT features are obtained for the first and second initial images and matched to produce a three-dimensional reconstruction of these two images. This reconstruction is corrected and extended using the SIFT feature matches between the second and third initial images, giving a three-dimensional reconstruction over the first three images; that result is in turn corrected and extended using the SIFT feature matches between the third and fourth initial images, giving a reconstruction over the first four images, and so on, until the three-dimensional reconstruction of all initial images of the current natural scene is obtained. A sketch of the pairwise SIFT matching step follows the note below.
It should be noted that the above process of obtaining a three-dimensional model by reconstructing from a plurality of initial images is only an example; other three-dimensional reconstruction methods may also be used in practical applications.
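A minimal sketch of the pairwise SIFT matching step used to seed and extend the reconstruction, assuming OpenCV's SIFT; triangulation and the correction/extension of the model are omitted, and the ratio-test threshold is an illustrative choice.

import cv2

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def match_adjacent(img_a, img_b, ratio=0.75):
    """SIFT matches between two adjacent initial images of the sorted sequence."""
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]  # Lowe's ratio test
    return kp_a, kp_b, good

# The matched pairs between image i and image i+1 seed the reconstruction for
# the first pair and are used to correct and extend it for every later image.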
Step S204, based on the projection matrix of the initial image of the current category in the three-dimensional model, obtaining the corresponding relation of the initial characteristic point in each initial image of the current category.
In the embodiment of the present application, taking a natural scene as an example, each initial image of the current natural scene can be mapped into the three-dimensional model to obtain the projection matrix of that image; equivalently, an initial image can be understood as an imaging of the three-dimensional model from one viewing angle. After the projection matrix of each initial image in the three-dimensional model is obtained, since the initial feature points are located in the initial images, the correspondence of the initial feature points across the initial images of the current category can be obtained from these projection matrices. As described in the embodiment shown in fig. 1, obtaining this correspondence can also be regarded as a process of matching the initial feature points, and the matching may be performed according to information such as the features and positions of the initial feature points.
As another embodiment of the present application, the obtaining, based on a projection matrix of the initial image of the current category in the three-dimensional model, a corresponding relationship of the initial feature point in each initial image of the current category includes:
obtaining the position of each initial characteristic point in the three-dimensional model based on a projection matrix of the initial image of the current category in the three-dimensional model;
and obtaining the corresponding relation of each initial characteristic point in each initial image of the current category based on the position of each initial characteristic point in the three-dimensional model.
In the embodiment of the present application, matching may be performed based on the positions of the initial feature points. For example, from the positions of the aforementioned feature points in the initial image and the projection matrix of that initial image in the three-dimensional model, the position of each initial feature point in the three-dimensional model can be obtained, and the correspondence of each initial feature point across the initial images of the current category is then obtained from these positions in the three-dimensional model. A grouping sketch is given below.
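A minimal grouping sketch, assuming each initial feature point has already been lifted to a position in the three-dimensional model (for example via the image's projection matrix during reconstruction); points from different initial images whose 3D positions nearly coincide are treated as the same type of feature point. The tolerance value and data layout are assumptions.

import numpy as np

def group_by_3d_position(lifted_points, tol=0.05):
    """lifted_points: iterable of (image_id, point_id, xyz) for one scene category.
    Returns groups of (representative xyz, [(image_id, point_id), ...])."""
    groups = []
    for image_id, point_id, xyz in lifted_points:
        xyz = np.asarray(xyz, dtype=float)
        for centre, members in groups:
            if np.linalg.norm(xyz - centre) < tol:   # same 3D location -> same feature-point type
                members.append((image_id, point_id))
                break
        else:
            groups.append((xyz, [(image_id, point_id)]))
    return groups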
Step S205, based on the corresponding relationship, obtaining the frequency of each initial feature point appearing in the initial image of the current category.
And step S206, taking the initial characteristic points with the frequency meeting the preset conditions as target characteristic points of the current category.
In the embodiment of the present application, using the frequency with which an initial feature point appears in the initial images of the current category as the screening condition can be understood as follows: if initial feature point a1 appears in N initial images, the frequency of initial feature point a1 is recorded as N, the number of initial images in which it appears.
As another embodiment of the present application, the taking the initial feature point with the frequency meeting the preset condition as the target feature point of the current category includes:
taking the initial characteristic points with the frequency greater than the preset frequency in the initial characteristic points of the current category as target characteristic points of the current category;
or, the initial feature points of the current category are sorted according to the frequency, and a preset number of initial feature points are sequentially selected from the high frequency to the low frequency to serve as the target feature points of the current category.
In the embodiment of the present application, the initial feature points whose frequency of occurrence across different initial images is greater than a preset frequency may be taken as the target feature points; alternatively, a number may be preset, and that preset number of initial feature points is selected in order from high frequency to low frequency as the target feature points of the current category. A sketch covering both conditions is given below.
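A minimal sketch of both preset conditions named above, reusing the groups produced by the 3D-position grouping sketch: the frequency of a feature-point type is counted as the number of distinct initial images it appears in, and types are kept either by a frequency threshold or by taking the top-k most frequent. The names and defaults are assumptions.

def select_target_points(groups, min_freq=None, top_k=None):
    """groups: output of group_by_3d_position(); returns the selected feature-point types."""
    # frequency = number of distinct initial images in which the type appears
    freq = [(centre, len({img for img, _ in members})) for centre, members in groups]
    if min_freq is not None:
        return [centre for centre, f in freq if f > min_freq]
    freq.sort(key=lambda item: item[1], reverse=True)
    return [centre for centre, _ in freq[:top_k]]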
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present application, and only a part related to the embodiment of the present application is shown for convenience of description.
The terminal device 3 may be a software unit, a hardware unit, or a combination of software and hardware built into a terminal device such as a mobile phone, a tablet computer or a notebook computer, or it may be integrated into such a terminal device as an independent component.
The terminal device 3 includes:
an initial image obtaining module 31, configured to obtain initial images of multiple categories of natural scenes, where each category of natural scenes includes multiple initial images;
an initial feature point obtaining module 32, configured to, for each category of natural scene, respectively extract initial feature points from an initial image of a current category;
a corresponding relation obtaining module 33, configured to obtain a corresponding relation of the initial feature point in each initial image of the current category;
a target feature point obtaining module 34, configured to obtain, based on the corresponding relationship, an initial feature point that meets a preset condition from initial feature points of an initial image of the current category as a target feature point of the current category;
a training image obtaining module 35, configured to use an initial image that includes a target feature point of a current category in an initial image of each category as a training image, and obtain a training sample set of natural scenes of multiple categories;
a training module 36, configured to train the constructed deep neural network through the training images in the training sample set, so as to obtain a trained deep neural network;
and the detection module 37 is configured to detect the image to be detected based on the trained deep neural network, so as to obtain the feature points in the image to be detected.
Optionally, the corresponding relationship obtaining module 33 includes:
a three-dimensional model obtaining unit 331 configured to obtain a three-dimensional model of a current category of natural scenes;
a corresponding relation obtaining unit 332, configured to obtain, based on a projection matrix of the initial image of the current category in the three-dimensional model, a corresponding relation of the initial feature point in each initial image of the current category.
Optionally, the three-dimensional model obtaining unit 331 is further configured to:
and based on an image reconstruction algorithm, establishing a three-dimensional model of the natural scene of the current category according to the initial image of the current category.
Optionally, the correspondence obtaining unit 332 includes:
an initial characteristic point position obtaining subunit, configured to obtain, based on a projection matrix of a current category of initial images in the three-dimensional model, a position of each initial characteristic point in the three-dimensional model;
and the corresponding relation obtaining subunit is used for obtaining the corresponding relation of each initial characteristic point in each initial image of the current category based on the position of each initial characteristic point in the three-dimensional model.
Optionally, the target feature point obtaining module 34 includes:
an initial feature point frequency obtaining unit 341, configured to obtain, based on the correspondence, a frequency of occurrence of each initial feature point in an initial image of the current category;
a target feature point obtaining unit 342, configured to use the initial feature point with the frequency meeting the preset condition as a target feature point of the current category.
Optionally, the target feature point obtaining unit 342 is further configured to:
taking the initial characteristic points with the frequency greater than the preset frequency in the initial characteristic points of the current category as target characteristic points of the current category;
or, the initial feature points of the current category are sorted according to the frequency, and a preset number of initial feature points are sequentially selected from the high frequency to the low frequency to serve as the target feature points of the current category.
Optionally, the terminal device 3 further includes:
and the calibration module is used for calibrating the natural scene and the target characteristic point of the training image for each training image before training the constructed deep neural network through the training images in the training sample set and obtaining the trained deep neural network.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional units and modules is merely used as an example, and in practical applications, the foregoing function distribution may be performed by different functional units and modules as needed, that is, the internal structure of the terminal device is divided into different functional units or modules to perform all or part of the above-described functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the terminal device may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Fig. 4 is a schematic block diagram of a terminal device according to another embodiment of the present application. As shown in fig. 4, the terminal device 4 of this embodiment includes: one or more processors 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processors 40. The processor 40, when executing the computer program 42, implements the steps in the various image feature point detection method embodiments described above, such as the steps S101 to S107 shown in fig. 1. Alternatively, the processor 40, when executing the computer program 42, implements the functions of the modules/units in the terminal device embodiments described above, such as the functions of the modules 31 to 37 shown in fig. 3.
Illustratively, the computer program 42 may be partitioned into one or more modules/units that are stored in the memory 41 and executed by the processor 40 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into an initial image acquisition module, an initial feature point acquisition module, a correspondence acquisition module, a target feature point acquisition module, a training image acquisition module, a training module, and a detection module.
The system comprises an initial image acquisition module, a processing module and a display module, wherein the initial image acquisition module is used for acquiring initial images of natural scenes of multiple categories, and each category of natural scene comprises multiple initial images;
the initial characteristic point acquisition module is used for respectively extracting initial characteristic points from the initial images of the current category for each category of natural scene;
a corresponding relation obtaining module, configured to obtain a corresponding relation of the initial feature point in each initial image of the current category;
a target feature point obtaining module, configured to obtain, based on the correspondence, an initial feature point that meets a preset condition from initial feature points of an initial image of a current category as a target feature point of the current category;
the training image acquisition module is used for taking the initial image containing the target feature point of the current category in the initial image of each category as a training image to obtain a training sample set of natural scenes of multiple categories;
the training module is used for training the constructed deep neural network through the training images in the training sample set to obtain the trained deep neural network;
and the detection module is used for detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
Other modules or units can refer to the description of the embodiment shown in fig. 3, and are not described again here.
The terminal device includes, but is not limited to, a processor 40 and a memory 41. Those skilled in the art will appreciate that fig. 4 is only one example of the terminal device 4 and does not constitute a limitation of the terminal device 4, which may include more or fewer components than shown, combine some components, or have different components; for example, the terminal device may also include an input device, an output device, a network access device, a bus, and the like.
The Processor 40 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or a memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed terminal device and method may be implemented in other ways. For example, the above-described terminal device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical function division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow in the methods of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, realizes the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (7)

1. An image feature point detection method, comprising:
acquiring initial images of a plurality of categories of natural scenes, wherein each category of natural scenes comprises a plurality of initial images;
for each category of natural scene, respectively extracting initial feature points from the initial image of the current category based on a preset initial feature point extraction method;
acquiring the corresponding relation of the initial feature points in each initial image of the current category, wherein the corresponding relation comprises the following steps: based on an image reconstruction algorithm, establishing a three-dimensional model of the natural scene of the current category according to the initial image of the current category; obtaining the position of each initial characteristic point in the three-dimensional model based on a projection matrix of the initial image of the current category in the three-dimensional model; obtaining the corresponding relation of each initial characteristic point in each initial image of the current category based on the position of each initial characteristic point in the three-dimensional model; the corresponding relation is the corresponding relation between the initial characteristic point and the initial characteristic point with the same type relation in each initial image of the current category;
based on the corresponding relation, acquiring initial feature points which accord with preset conditions from the initial feature points of the initial image of the current category as target feature points of the current category;
taking the initial image containing the target feature point of the current category in the initial image of each category as a training image to obtain a training sample set of natural scenes of multiple categories;
training the constructed deep neural network through the training images in the training sample set to obtain a trained deep neural network;
and detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
2. The image feature point detection method according to claim 1, wherein the acquiring, based on the correspondence, an initial feature point that meets a preset condition from initial feature points of an initial image of a current category as a target feature point of the current category includes:
acquiring the frequency of each initial characteristic point appearing in the initial image of the current category based on the corresponding relation;
and taking the initial characteristic points with the frequency meeting the preset conditions as target characteristic points of the current category.
3. The image feature point detection method according to claim 2, wherein the taking the initial feature point whose frequency meets a preset condition as the target feature point of the current category includes:
taking the initial characteristic points with the frequency greater than the preset frequency in the initial characteristic points of the current category as target characteristic points of the current category;
or, the initial feature points of the current category are sorted according to the frequency, and a preset number of initial feature points are sequentially selected from the high frequency to the low frequency to serve as the target feature points of the current category.
4. The image feature point detection method of claim 1, before training the constructed deep neural network through the training images in the training sample set to obtain the trained deep neural network, further comprising:
and calibrating the natural scene and the target characteristic points of the training images for each training image.
5. A terminal device, comprising:
the system comprises an initial image acquisition module, a processing module and a display module, wherein the initial image acquisition module is used for acquiring initial images of natural scenes of multiple categories, and each category of natural scene comprises multiple initial images;
the initial feature point acquisition module is used for respectively extracting initial feature points from initial images of the current category based on a preset initial feature point extraction method for each category of natural scene;
a corresponding relationship obtaining module, configured to obtain a corresponding relationship between the initial feature points in each initial image of the current category, where the corresponding relationship includes: based on an image reconstruction algorithm, establishing a three-dimensional model of the natural scene of the current category according to the initial image of the current category; obtaining the position of each initial characteristic point in the three-dimensional model based on a projection matrix of the initial image of the current category in the three-dimensional model; obtaining the corresponding relation of each initial characteristic point in each initial image of the current category based on the position of each initial characteristic point in the three-dimensional model; the corresponding relation is the corresponding relation between the initial characteristic point and the initial characteristic point with the same type relation in each initial image of the current category;
a target feature point obtaining module, configured to obtain, based on the correspondence, an initial feature point that meets a preset condition from initial feature points of an initial image of a current category as a target feature point of the current category;
the training image acquisition module is used for taking the initial image containing the target feature point of the current category in the initial image of each category as a training image to obtain a training sample set of natural scenes of multiple categories;
the training module is used for training the constructed deep neural network through the training images in the training sample set to obtain the trained deep neural network;
and the detection module is used for detecting the image to be detected based on the trained deep neural network to obtain the characteristic points in the image to be detected.
6. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 4 when executing the computer program.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by one or more processors, implements the steps of the method according to any one of claims 1 to 4.
CN201810865350.2A 2018-08-01 2018-08-01 Image feature point detection method, terminal device and storage medium Active CN109117773B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810865350.2A CN109117773B (en) 2018-08-01 2018-08-01 Image feature point detection method, terminal device and storage medium
PCT/CN2019/093685 WO2020024744A1 (en) 2018-08-01 2019-06-28 Image feature point detecting method, terminal device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810865350.2A CN109117773B (en) 2018-08-01 2018-08-01 Image feature point detection method, terminal device and storage medium

Publications (2)

Publication Number Publication Date
CN109117773A CN109117773A (en) 2019-01-01
CN109117773B true CN109117773B (en) 2021-11-02

Family

ID=64863925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810865350.2A Active CN109117773B (en) 2018-08-01 2018-08-01 Image feature point detection method, terminal device and storage medium

Country Status (2)

Country Link
CN (1) CN109117773B (en)
WO (1) WO2020024744A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117773B (en) * 2018-08-01 2021-11-02 Oppo广东移动通信有限公司 Image feature point detection method, terminal device and storage medium
CN110322399B (en) * 2019-07-05 2023-05-05 深圳开立生物医疗科技股份有限公司 Ultrasonic image adjustment method, system, equipment and computer storage medium
CN110942063B (en) * 2019-11-21 2023-04-07 望海康信(北京)科技股份公司 Certificate text information acquisition method and device and electronic equipment
CN113470110A (en) * 2020-03-30 2021-10-01 北京四维图新科技股份有限公司 Distance measuring method and device
CN112907726B (en) * 2021-01-25 2022-09-20 重庆金山医疗技术研究院有限公司 Image processing method, device, equipment and computer readable storage medium
CN113807451B (en) * 2021-05-25 2022-11-01 中德(珠海)人工智能研究院有限公司 Panoramic image feature point matching model training method and device and server
CN113361363B (en) * 2021-05-31 2024-02-06 北京百度网讯科技有限公司 Training method, device, equipment and storage medium for face image recognition model
CN114898354A (en) * 2022-03-24 2022-08-12 中德(珠海)人工智能研究院有限公司 Measuring method and device based on three-dimensional model, server and readable storage medium
CN115953567B (en) * 2023-03-14 2023-06-30 广州市玄武无线科技股份有限公司 Method and device for detecting quantity of stacked boxes, terminal equipment and storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010170184A (en) * 2009-01-20 2010-08-05 Seiko Epson Corp Specifying position of characteristic portion of face image
CN101859326B (en) * 2010-06-09 2012-04-18 南京大学 Image searching method
CN103578093B (en) * 2012-07-18 2016-08-17 成都理想境界科技有限公司 Method for registering images, device and augmented reality system
CN103310445A (en) * 2013-06-01 2013-09-18 吉林大学 Parameter estimation method of virtual view point camera for drawing virtual view points
CN103617432B (en) * 2013-11-12 2017-10-03 华为技术有限公司 A kind of scene recognition method and device
CN104008400A (en) * 2014-06-16 2014-08-27 河南科技大学 Object recognition method with combination of SIFT and BP network
KR20170024303A (en) * 2015-08-25 2017-03-07 영남대학교 산학협력단 System and method for detecting feature points of face
CN105184271A (en) * 2015-09-18 2015-12-23 苏州派瑞雷尔智能科技有限公司 Automatic vehicle detection method based on deep learning
CN105488541A (en) * 2015-12-17 2016-04-13 上海电机学院 Natural feature point identification method based on machine learning in augmented reality system
CN106446930B (en) * 2016-06-28 2019-11-22 沈阳工业大学 Robot operative scenario recognition methods based on deep layer convolutional neural networks
CN109117773B (en) * 2018-08-01 2021-11-02 Oppo广东移动通信有限公司 Image feature point detection method, terminal device and storage medium

Also Published As

Publication number Publication date
CN109117773A (en) 2019-01-01
WO2020024744A1 (en) 2020-02-06

Similar Documents

Publication Publication Date Title
CN109117773B (en) Image feature point detection method, terminal device and storage medium
CN109784186B (en) Pedestrian re-identification method and device, electronic equipment and computer-readable storage medium
CN110660066B (en) Training method of network, image processing method, network, terminal equipment and medium
US8582836B2 (en) Face recognition in digital images by applying a selected set of coefficients from a decorrelated local binary pattern matrix
CN110363091B (en) Face recognition method, device and equipment under side face condition and storage medium
CN111079785A (en) Image identification method and device and terminal equipment
US20120027252A1 (en) Hand gesture detection
CN110689043A (en) Vehicle fine granularity identification method and device based on multiple attention mechanism
CN110781770B (en) Living body detection method, device and equipment based on face recognition
CN111191582B (en) Three-dimensional target detection method, detection device, terminal device and computer readable storage medium
CN112348778B (en) Object identification method, device, terminal equipment and storage medium
CN110852311A (en) Three-dimensional human hand key point positioning method and device
CN110570442A (en) Contour detection method under complex background, terminal device and storage medium
CN112633084A (en) Face frame determination method and device, terminal equipment and storage medium
CN110738204B (en) Certificate area positioning method and device
CN109389628B (en) Image registration method, apparatus and storage medium
CN111145196A (en) Image segmentation method and device and server
CN111062927A (en) Method, system and equipment for detecting image quality of unmanned aerial vehicle
CN112200004B (en) Training method and device for image detection model and terminal equipment
CN111161348A (en) Monocular camera-based object pose estimation method, device and equipment
CN111199228B (en) License plate positioning method and device
CN110659631A (en) License plate recognition method and terminal equipment
CN110610178A (en) Image recognition method, device, terminal and computer readable storage medium
CN115439733A (en) Image processing method, image processing device, terminal equipment and computer readable storage medium
CN112214639B (en) Video screening method, video screening device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant