CN113792752A - Image feature extraction method and system based on binocular camera and intelligent terminal - Google Patents
Image feature extraction method and system based on binocular camera and intelligent terminal
- Publication number
- CN113792752A (application number CN202110884275.6A)
- Authority
- CN
- China
- Prior art keywords
- feature
- target
- points
- constraint condition
- descriptor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2113—Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image feature extraction method and system based on a binocular camera, and an intelligent terminal. The method comprises: acquiring an original image of the binocular camera; extracting feature points of the original image based on different algorithms and obtaining feature descriptors corresponding to the feature points; fusing the feature points obtained under the different algorithms and obtaining target feature points through screening; fusing the feature descriptors obtained under the different algorithms and obtaining target feature descriptors through screening; constructing constraint conditions according to the target feature descriptors; and constructing a feature point space within the target feature point range based on the constraint conditions to obtain a scene feature set. This solves the technical problem of poor image feature extraction accuracy for binocular cameras in the prior art and improves the accuracy of image feature extraction.
Description
Technical Field
The invention relates to the technical field of visual algorithms, in particular to an image feature extraction method and system based on a binocular camera and an intelligent terminal.
Background
With the development of autonomous driving technology, ever-higher requirements are placed on the safety and comfort of driver-assistance vehicles. In autonomous driving scenarios, feature extraction serves as the basis for other functional algorithms, and its accuracy directly affects driving safety. Existing feature extraction schemes focus mainly on image information and fail to fully consider actual physical constraints such as temporal continuity and spatial consistency, so their extraction accuracy is poor. Providing an image feature extraction method and system based on a binocular camera, and an intelligent terminal, to improve the accuracy of feature extraction has therefore become a problem to be solved by those skilled in the art.
Disclosure of Invention
Therefore, the embodiment of the invention provides an image feature extraction method and system based on a binocular camera and an intelligent terminal, and aims to solve the technical problem that the image feature extraction accuracy of the binocular camera in the prior art is poor.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
a binocular camera-based image feature extraction method, the method comprising:
acquiring an original image of a binocular camera;
extracting feature points of the original image based on different algorithms, and obtaining feature descriptors corresponding to the feature points;
fusing the feature points obtained under different algorithms, and obtaining target feature points through screening;
fusing the feature descriptors obtained under different algorithms, and obtaining a target feature descriptor through screening;
constructing a constraint condition according to the target feature descriptor;
and constructing a feature point space in the target feature point range based on the constraint condition to obtain a scene feature set.
Further, the extracting feature points of the original image based on different algorithms and obtaining feature descriptors corresponding to the feature points specifically include:
extracting feature points of the original image based on a first feature extraction algorithm to obtain a first batch of feature points and first feature descriptors corresponding to the first batch of feature points;
and extracting the feature points of the original image based on a second feature extraction algorithm to obtain a second batch of feature points and a second feature descriptor corresponding to the second batch of feature points.
Further, the first feature extraction algorithm is a SIFT feature extraction algorithm, and the second feature extraction algorithm is an ORB feature extraction algorithm.
Further, fusing the feature points obtained under different algorithms, and obtaining the target feature point through screening, specifically comprising:
calculating ORB features of the first batch of feature points and SIFT features of the second batch of feature points, and fusing the obtained features to obtain fused feature points;
acquiring a fusion feature descriptor corresponding to the fusion feature point;
and if the fusion feature descriptor is judged to be larger than a preset significance threshold, retaining the fusion feature point as the target feature point and using the feature descriptor corresponding to the target feature point as the target feature descriptor.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
calculating three-dimensional coordinates (x, y, z) of the target feature points based on parallax information of a binocular camera;
the constraint conditions comprise a first constraint condition, the first constraint condition being: the L2-norm ||(x1, y1, z1) - (x2, y2, z2)||_{L2} of the three-dimensional coordinates of two adjacent target feature points is smaller than a preset threshold Dt.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
segmenting the original image, and counting, based on the segmentation result, the number of all feature points in each label area and the number of feature points satisfying the first constraint condition;
The constraint conditions further comprise a second constraint condition, the second constraint condition being: the number of target feature points in the same area that do not satisfy the first constraint condition is smaller than a preset threshold Nt.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
setting, for two adjacent frames of original images within N consecutive frames, a target feature point P_{t-1} at the previous time, a target feature point P_t at the current time, a target feature descriptor F_{t-1} corresponding to the target feature point P_{t-1} at the previous time, and a target feature descriptor F_t corresponding to the target feature point P_t at the current time;
The constraint conditions further comprise a third constraint condition, the third constraint condition being: within the N consecutive frames, the target feature points satisfy that the L2-norm ||F_{t-1} - F_t||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t1}, where N is a positive integer greater than 1.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
obtaining a target feature point P_l of the left-eye view and a target feature point P_r of the right-eye view at the same time;
The constraint conditions further comprise a fourth constraint condition, the fourth constraint condition being: the target feature point P_l of the left-eye view and the target feature point P_r of the right-eye view exist simultaneously and satisfy that the L2-norm ||F_l - F_r||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t2}.
The invention also provides an image feature extraction system based on a binocular camera for implementing the above method, the system comprising:
the image acquisition unit is used for acquiring an original image of the binocular camera;
the feature extraction unit is used for extracting feature points of the original image based on different algorithms and obtaining feature descriptors corresponding to the feature points;
the characteristic fusion unit is used for fusing the characteristic points obtained under different algorithms and obtaining target characteristic points through screening; fusing the feature descriptors obtained under different algorithms, and obtaining a target feature descriptor through screening;
the constraint construction unit is used for constructing constraint conditions according to the target feature descriptors;
and the feature set output unit is used for constructing a feature point space in the target feature point range based on the constraint condition so as to obtain a scene feature set.
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is configured to store one or more program instructions; and the processor is configured to execute the one or more program instructions to perform the method described above.
The image feature extraction method based on a binocular camera provided by the invention acquires an original image of the binocular camera; extracts feature points of the original image based on different algorithms and obtains feature descriptors corresponding to the feature points; fuses the feature points obtained under the different algorithms and obtains target feature points through screening; fuses the feature descriptors obtained under the different algorithms and obtains target feature descriptors through screening; constructs constraint conditions according to the target feature descriptors; and constructs a feature point space within the target feature point range based on the constraint conditions to obtain a scene feature set. This solves the technical problem of poor image feature extraction accuracy for binocular cameras in the prior art and improves the accuracy of image feature extraction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It should be apparent that the drawings in the following description are merely exemplary, and that those of ordinary skill in the art can derive other drawings from them without inventive effort.
The structures, ratios, sizes, and the like shown in this specification are only used to match the content disclosed in the specification so that those skilled in the art can understand and read it; they are not used to limit the conditions under which the present invention can be implemented and thus have no technical significance. Any structural modification, change in ratio relationship, or adjustment of size that does not affect the effects and objectives achievable by the present invention shall still fall within the scope covered by the technical contents disclosed herein.
Fig. 1 is a flowchart of an embodiment of a binocular camera-based image feature extraction method according to the present invention;
fig. 2 is a block diagram of an embodiment of the image feature extraction system based on a binocular camera according to the present invention.
Detailed Description
The present invention is described below in terms of particular embodiments, and other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure. It should be understood that the described embodiments are merely some, not all, of the embodiments of the invention and are not intended to limit it. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
In a specific embodiment, as shown in fig. 1, the image feature extraction method based on a binocular camera provided by the invention comprises the following steps:
S1: acquiring an original image of the binocular camera. The original image may comprise multiple consecutive frames, and it includes an image acquired by the left eye and an image acquired by the right eye.
S2: extracting feature points of the original image based on different algorithms, and obtaining feature descriptors corresponding to the feature points. A feature descriptor expresses a feature as a mathematical vector.
Specifically, feature point extraction is carried out on the original image based on a first feature extraction algorithm to obtain a first batch of feature points and the first feature descriptors corresponding to them; and feature point extraction is carried out on the original image based on a second feature extraction algorithm to obtain a second batch of feature points and the second feature descriptors corresponding to them.
In a specific scenario, the first feature extraction algorithm is the SIFT feature extraction algorithm and the second feature extraction algorithm is the ORB feature extraction algorithm. In this case, extracting feature points of the original image based on different algorithms and obtaining the corresponding feature descriptors specifically comprises: extracting feature points of the image using the SIFT feature extraction algorithm to obtain a first batch of feature points p1 and their feature descriptors f1s; and extracting feature points of the image using the ORB feature extraction algorithm to obtain a second batch of feature points p2 and their feature descriptors f2o.
It is to be understood that SIFT (Scale-Invariant Feature Transform) is a local feature descriptor used in the field of image processing. The description is scale-invariant and can detect key points in an image. SIFT features are based on locally salient points of interest on an object and are independent of image size and rotation; their tolerance to light, noise, and small viewpoint changes is also quite high. Based on these characteristics, SIFT features are highly distinctive and relatively easy to retrieve: objects are easily identified and rarely misidentified even in large feature databases. The detection rate for partially occluded objects is also quite high, and even as few as three SIFT features are enough to compute the position and orientation of an object. With current computer hardware and a small feature database, recognition speed can approach real time. SIFT features carry a large amount of information and are suitable for fast and accurate matching in massive databases.
ORB is short for Oriented FAST and Rotated BRIEF and can be used to quickly create feature vectors for key points in an image, which in turn can be used to identify objects. FAST and BRIEF are the feature detection algorithm and the descriptor creation algorithm, respectively. ORB first looks for special regions in the image, called key points: small areas that stand out, such as corners, where pixel values change sharply from light to dark. ORB then computes a feature vector for each key point. The feature vector created by the ORB algorithm contains only 1s and 0s and is therefore called a binary feature vector; the order of the 1s and 0s varies with the particular key point and the pixel area around it. The vector represents the intensity pattern around the key point, so multiple feature vectors can be used to identify larger regions or even particular objects in the image. ORB is extremely fast and, to some extent, robust to noise and image transformations such as rotation and scaling.
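As a concrete illustration of step S2, the two extraction passes can be sketched with OpenCV's stock SIFT and ORB implementations. This is a minimal sketch under stated assumptions, not the patented implementation: the input path is a placeholder, and the variable names p1, f1s, p2, f2o follow the notation of the scenario above.

```python
# Minimal sketch of step S2 using OpenCV (assumes opencv-python >= 4.4,
# where SIFT ships in the main module). "left.png" is a placeholder path.
import cv2

img = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
p1, f1s = sift.detectAndCompute(img, None)  # first batch: keypoints + SIFT descriptors

orb = cv2.ORB_create()
p2, f2o = orb.detectAndCompute(img, None)   # second batch: keypoints + ORB descriptors
```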
S3: fusing the feature points obtained under different algorithms, and obtaining target feature points through screening;
s4: fusing the feature descriptors obtained under different algorithms, and obtaining a target feature descriptor through screening;
specifically, ORB features of the first batch of feature points and SIFT features of the second batch of feature points are calculated, and the obtained features are fused to obtain fused feature points;
acquiring a fusion feature descriptor corresponding to the fusion feature point;
and if the fusion feature descriptor is judged to be larger than a preset significance threshold, retaining the fusion feature point as the target feature point and using the feature descriptor corresponding to the target feature point as the target feature descriptor.
Still taking the above usage scenario as an example: the ORB feature f1o is calculated for the feature points p1; the SIFT feature f2s is calculated for the feature points p2; and fusion feature descriptors are constructed as f1 = f1s + f1o and f2 = f2s + f2o, where "+" denotes feature fusion rather than addition in the mathematical sense. The fusion feature descriptors (f1 and f2) of all feature points (p1 and p2) are then examined: when a fusion feature descriptor is larger than the preset significance threshold, the feature point and its descriptor are retained; otherwise, the feature point and its descriptor are deleted. The feature points remaining after deletion are denoted point, and the corresponding feature descriptors are denoted feature.
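Continuing the sketch above, the cross-computation and screening of S3/S4 might look as follows. The patent does not specify the fusion operator "+" or the significance test; descriptor concatenation and an L2-norm test are assumptions made purely for illustration, as is the row-alignment simplification noted in the comments.

```python
# Sketch of steps S3/S4 (continues the previous sketch: sift, orb, img,
# p1, f1s, p2, f2o are defined there). Fusion-as-concatenation and the
# L2-norm significance test are illustrative assumptions.
import numpy as np

_, f1o = orb.compute(img, p1)    # ORB descriptors at the SIFT keypoints p1
_, f2s = sift.compute(img, p2)   # SIFT descriptors at the ORB keypoints p2
# Simplification: assume compute() described every keypoint, so rows stay aligned.

f1 = np.hstack([f1s.astype(np.float32), f1o.astype(np.float32)])  # f1 = f1s "+" f1o
f2 = np.hstack([f2s.astype(np.float32), f2o.astype(np.float32)])  # f2 = f2s "+" f2o

SIG_T = 100.0                                 # placeholder significance threshold
keep = np.linalg.norm(f1, axis=1) > SIG_T     # "descriptor larger than threshold", modeled as L2 norm
point = [kp for kp, k in zip(p1, keep) if k]  # retained target feature points
feature = f1[keep]                            # their target feature descriptors
```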
S5: constructing constraint conditions according to the target feature descriptors. A constraint condition is a boundary condition used when constructing connections between point sets based on the candidate feature points.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
calculating three-dimensional coordinates (x, y, z) of the target feature points based on parallax information of a binocular camera;
the constraint conditions comprise a first constraint condition, the first constraint condition being: the L2-norm ||(x1, y1, z1) - (x2, y2, z2)||_{L2} of the three-dimensional coordinates of two adjacent target feature points is smaller than a preset threshold Dt, where Dt takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
segmenting the original image, and counting, based on the segmentation result, the number of all feature points in each label area and the number of feature points satisfying the first constraint condition;
The constraint conditions further comprise a second constraint condition, the second constraint condition being: the number of target feature points in the same area that do not satisfy the first constraint condition is smaller than a preset threshold Nt, where Nt takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
setting, for two adjacent frames of original images within N consecutive frames, a target feature point P_{t-1} at the previous time, a target feature point P_t at the current time, a target feature descriptor F_{t-1} corresponding to P_{t-1}, and a target feature descriptor F_t corresponding to P_t;
The constraint conditions further comprise a third constraint condition, the third constraint condition being: within the N consecutive frames, the target feature points satisfy that the L2-norm ||F_{t-1} - F_t||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t1}, where N is a positive integer greater than 1 and T_{t1} takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
Further, the constructing a constraint condition according to the target feature descriptor specifically includes:
obtaining a target feature point P_l of the left-eye view and a target feature point P_r of the right-eye view at the same time;
The constraint conditions further comprise a fourth constraint condition, the fourth constraint condition being: the target feature point P_l of the left-eye view and the target feature point P_r of the right-eye view exist simultaneously and satisfy that the L2-norm ||F_l - F_r||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t2}, where T_{t2} takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
In the above usage scenario, the process of constructing the constraint condition according to the target feature descriptor includes:
First, a first constraint condition C1 is constructed. Based on the parallax information, the three-dimensional spatial coordinates (x, y, z) of each feature point are calculated. The first constraint is that the three-dimensional spatial information between feature points should be smooth, where smoothness is defined as: the L2-norm ||(x1, y1, z1) - (x2, y2, z2)||_{L2} of the three-dimensional coordinates of two adjacent feature points should be smaller than the preset threshold Dt.
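Under the assumption of a rectified stereo pair, C1 can be sketched by reprojecting the disparity map to per-pixel 3D coordinates and testing the L2 distance between the coordinates of two adjacent feature points; the threshold value and the existence of `disparity` and `Q` are assumptions.

```python
# Illustrative sketch of the first constraint C1 (assumes a disparity map
# `disparity` and a reprojection matrix Q from cv2.stereoRectify exist).
import cv2
import numpy as np

points3d = cv2.reprojectImageTo3D(disparity, Q)  # (H, W, 3) array of (x, y, z)

def xyz(kp):
    """3D coordinates of a keypoint; kp.pt is (x, y), the map is indexed [row, col]."""
    return points3d[int(kp.pt[1]), int(kp.pt[0])]

def satisfies_c1(kp_a, kp_b, Dt=1.0):  # Dt is a placeholder value from [0, 1024]
    """Smoothness test: L2 norm of the 3D-coordinate difference is below Dt."""
    return np.linalg.norm(xyz(kp_a) - xyz(kp_b)) < Dt
```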
Second, a second constraint condition C2 is constructed. The image is segmented, and based on the segmentation result, the number of all feature points in each label area and the number of feature points satisfying the first constraint condition are counted. The second constraint is that as many points as possible in the same area should satisfy the first constraint condition, defined as: the number of feature points that do not satisfy the C1 constraint should be smaller than the preset threshold Nt.
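C2 then reduces to per-region bookkeeping. A sketch, assuming a per-pixel segmentation label map `labels` and a boolean flag per retained point recording whether it passed C1 (both names are assumptions):

```python
# Illustrative sketch of the second constraint C2 (`labels` is a per-pixel
# segmentation map; c1_flags marks which of the retained points passed C1).
from collections import defaultdict

n_total = defaultdict(int)  # all target feature points per label area
n_pass = defaultdict(int)   # points per area satisfying the first constraint

for kp, passed in zip(point, c1_flags):
    area = labels[int(kp.pt[1]), int(kp.pt[0])]
    n_total[area] += 1
    n_pass[area] += int(passed)

Nt = 16  # placeholder threshold from [0, 1024]
c2_ok = {area: (n_total[area] - n_pass[area]) < Nt for area in n_total}
```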
Third, a third constraint condition C3 is constructed. The above operations are repeated for adjacent frames in the time sequence to obtain the feature point P_{t-1} at the previous time and the feature point P_t at the current time, which are matched according to their corresponding feature descriptors F_{t-1} and F_t. The matching scheme is that the feature descriptors of corresponding feature points should be consistent, where consistency is defined as: the L2-norm ||F_{t-1} - F_t||_{L2} of the feature descriptors is smaller than the preset threshold T_{t1}. The third constraint is that a feature point should have a matching relationship within N consecutive frames, where N is a preset threshold; a value of N that is too large or too small may make the constraint too tight or too loose, so N should be tuned empirically for the actual application.
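C3 amounts to a descriptor-distance test repeated over a window of frames. A sketch with N and T_{t1} as placeholder values:

```python
# Illustrative sketch of the third constraint C3: a point's descriptor must
# stay matched across N consecutive frames (N and Tt1 are placeholders).
import numpy as np

def descriptors_match(f_prev, f_curr, Tt1=0.5):
    """Consistency test: L2 distance between the two descriptors is below Tt1."""
    return np.linalg.norm(f_prev - f_curr) < Tt1

def satisfies_c3(descriptor_track, N=5, Tt1=0.5):
    """descriptor_track holds one descriptor per frame for the same point."""
    return len(descriptor_track) >= N and all(
        descriptors_match(a, b, Tt1)
        for a, b in zip(descriptor_track, descriptor_track[1:]))
```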
Finally, a fourth constraint condition C4 is constructed. The above operations are repeated for the left and right images at the same time point to obtain the feature point P_l of the left-eye view and the feature point P_r of the right-eye view, which are matched according to their corresponding feature descriptors F_l and F_r. The matching scheme is that the feature descriptors of corresponding feature points should be consistent, where consistency is defined as: the L2-norm ||F_l - F_r||_{L2} of the feature descriptors is smaller than the preset threshold T_{t2}. The fourth constraint is that a feature point should exist in both the left and right views simultaneously and satisfy the epipolar constraint.
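C4 couples the same descriptor test with the epipolar constraint; for rectified images the epipolar constraint degenerates to matched points lying on (nearly) the same image row. A sketch with placeholder thresholds:

```python
# Illustrative sketch of the fourth constraint C4 (rectified images assumed,
# so the epipolar check becomes a same-row check; thresholds are placeholders).
import numpy as np

def satisfies_c4(kp_l, kp_r, f_l, f_r, Tt2=0.5, row_tol=1.0):
    on_epipolar_line = abs(kp_l.pt[1] - kp_r.pt[1]) <= row_tol
    descriptors_consistent = np.linalg.norm(f_l - f_r) < Tt2
    return on_epipolar_line and descriptors_consistent
```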
S6: constructing a feature point space within the target feature point range based on the constraint conditions to obtain a scene feature set. Based on the four constraint conditions, a feature point space Γ = {P | P ∈ C1, P ∈ C2, P ∈ C3, P ∈ C4} is constructed; that is, each feature point P simultaneously satisfies all four constraint conditions, and the space formed by these feature points is the intersection of the feature spaces defined by the four constraints.
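The intersection in S6 is then a plain filter over the candidate points; a sketch assuming one boolean flag per point and per constraint (the names c1..c4 are assumptions):

```python
# Illustrative sketch of step S6: the scene feature set is the intersection
# of the four constraint sets, i.e. points passing C1, C2, C3, and C4.
scene_feature_set = [
    (kp, desc)
    for kp, desc, ok1, ok2, ok3, ok4 in zip(point, feature, c1, c2, c3, c4)
    if ok1 and ok2 and ok3 and ok4
]
```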
In a specific embodiment, the image feature extraction method based on the binocular camera provided by the invention obtains an original image of the binocular camera, performs feature point extraction on the original image based on different algorithms, obtains feature descriptors corresponding to the feature points, fuses the feature points obtained under different algorithms, obtains target feature points through screening, fuses the feature descriptors obtained under different algorithms, obtains target feature descriptors through screening, constructs constraint conditions according to the target feature descriptors, and constructs a feature point space within the range of the target feature points based on the constraint conditions to obtain a scene feature set. Therefore, the technical problem that the image feature extraction accuracy of the binocular camera in the prior art is poor is solved, and the accuracy of the image feature extraction is improved.
In addition to the above method, the present invention also provides a binocular camera-based image feature extraction system for implementing the above method, as shown in fig. 2, the system comprising:
an image acquisition unit 100 for acquiring an original image of a binocular camera;
and the feature extraction unit 200 is configured to perform feature point extraction on the original image based on different algorithms, and obtain a feature descriptor corresponding to the feature point.
Specifically, the feature extraction unit 200 is configured to perform feature point extraction on the original image based on a first feature extraction algorithm to obtain a first batch of feature points and the first feature descriptors corresponding to them; and to perform feature point extraction on the original image based on a second feature extraction algorithm to obtain a second batch of feature points and the second feature descriptors corresponding to them.
The first feature extraction algorithm is a SIFT feature extraction algorithm, and the second feature extraction algorithm is an ORB feature extraction algorithm.
A feature fusion unit 300, configured to fuse feature points obtained under different algorithms, and obtain a target feature point through screening; and fusing the feature descriptors obtained under different algorithms, and obtaining the target feature descriptor through screening.
The feature fusion unit 300 is specifically configured to:
calculating ORB features of the first batch of feature points and SIFT features of the second batch of feature points, and fusing the obtained features to obtain fused feature points;
acquiring a fusion feature descriptor corresponding to the fusion feature point;
and if the fusion feature descriptor is judged to be larger than a preset significance threshold, retaining the fusion feature point as the target feature point and using the feature descriptor corresponding to the target feature point as the target feature descriptor.
And a constraint constructing unit 400, configured to construct a constraint condition according to the target feature descriptor.
Calculating three-dimensional coordinates (x, y, z) of the target feature points based on parallax information of a binocular camera;
the constraint conditions compriseA first constraint condition, the first constraint condition being: l2-norm (| (x1, y1, z1), (x2, y2, z2) | | of three-dimensional coordinates between two adjacent target feature pointsL2Is smaller than the preset threshold Dt. Wherein, the value range of Dt belongs to the set [0,1024 ]]The influence of the deletion result corresponding to different thresholds is different; the smaller the threshold, the more compact the screening conditions.
The constraint building unit 400 is specifically configured to:
segmenting the original image, and counting, based on the segmentation result, the number of all feature points in each label area and the number of feature points satisfying the first constraint condition;
The constraint conditions further comprise a second constraint condition, the second constraint condition being: the number of target feature points in the same area that do not satisfy the first constraint condition is smaller than a preset threshold Nt, where Nt takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
The constraint building unit 400 is specifically configured to:
setting, for two adjacent frames of original images within N consecutive frames, a target feature point P_{t-1} at the previous time, a target feature point P_t at the current time, a target feature descriptor F_{t-1} corresponding to P_{t-1}, and a target feature descriptor F_t corresponding to P_t; the constraint conditions further comprise a third constraint condition, the third constraint condition being: within the N consecutive frames, the target feature points satisfy that the L2-norm ||F_{t-1} - F_t||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t1}, where N is a positive integer greater than 1 and T_{t1} takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
The constraint building unit 400 is specifically configured to:
obtaining a target feature point P_l of the left-eye view and a target feature point P_r of the right-eye view at the same time;
The constraint conditions further comprise a fourth constraint condition, the fourth constraint condition being: the target feature point P_l of the left-eye view and the target feature point P_r of the right-eye view exist simultaneously and satisfy that the L2-norm ||F_l - F_r||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t2}, where T_{t2} takes a value in the interval [0, 1024]. Different thresholds lead to different deletion results: the smaller the threshold, the stricter the screening condition.
A feature set output unit 500, configured to construct a feature point space within the target feature point range based on the constraint condition, so as to obtain a scene feature set.
In the foregoing specific embodiment, the image feature extraction system based on a binocular camera provided by the invention obtains an original image of the binocular camera, performs feature point extraction on the original image based on different algorithms, obtains feature descriptors corresponding to the feature points, fuses the feature points obtained under different algorithms, obtains target feature points through screening, fuses the feature descriptors obtained under different algorithms, obtains target feature descriptors through screening, constructs constraint conditions according to the target feature descriptors, and constructs a feature point space within the range of the target feature points based on the constraint conditions, so as to obtain a scene feature set. Therefore, the technical problem that the image feature extraction accuracy of the binocular camera in the prior art is poor is solved, and the accuracy of the image feature extraction is improved.
The present invention also provides an intelligent terminal, including: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is configured to store one or more program instructions; and the processor is configured to execute the one or more program instructions to perform the method described above.
In correspondence with the above embodiments, embodiments of the present invention also provide a computer storage medium containing one or more program instructions, wherein the one or more program instructions are used by the binocular camera-based image feature extraction system to execute the method described above.
In an embodiment of the invention, the processor may be an integrated circuit chip having signal processing capability. The Processor may be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in connection with the embodiments of the present invention may be directly executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.
The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or which may include both volatile and nonvolatile memory.
The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory.
The volatile Memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM).
The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.
Those skilled in the art will appreciate that the functionality described in the present invention may be implemented in a combination of hardware and software in one or more of the examples described above. When software is applied, the corresponding functionality may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The above embodiments are only for illustrating the embodiments of the present invention and are not to be construed as limiting the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the embodiments of the present invention shall be included in the scope of the present invention.
Claims (10)
1. An image feature extraction method based on a binocular camera is characterized by comprising the following steps:
acquiring an original image of a binocular camera;
extracting feature points of the original image based on different algorithms, and obtaining feature descriptors corresponding to the feature points;
fusing the feature points obtained under different algorithms, and obtaining target feature points through screening;
fusing the feature descriptors obtained under different algorithms, and obtaining a target feature descriptor through screening;
constructing a constraint condition according to the target feature descriptor;
and constructing a feature point space in the target feature point range based on the constraint condition to obtain a scene feature set.
2. The image feature extraction method according to claim 1, wherein the extracting feature points of the original image based on different algorithms and obtaining feature descriptors corresponding to the feature points specifically comprises:
extracting feature points of the original image based on a first feature extraction algorithm to obtain a first batch of feature points and first feature descriptors corresponding to the first batch of feature points;
and extracting the feature points of the original image based on a second feature extraction algorithm to obtain a second batch of feature points and a second feature descriptor corresponding to the second batch of feature points.
3. The image feature extraction method according to claim 2, wherein the first feature extraction algorithm is a SIFT feature extraction algorithm, and the second feature extraction algorithm is an ORB feature extraction algorithm.
4. The image feature extraction method according to claim 3, wherein feature points obtained under different algorithms are fused, and a target feature point is obtained by screening, and specifically includes:
calculating ORB features of the first batch of feature points and SIFT features of the second batch of feature points, and fusing the obtained features to obtain fused feature points;
acquiring a fusion feature descriptor corresponding to the fusion feature point;
and if the fusion feature descriptor is judged to be larger than a preset significance threshold, retaining the fusion feature point as the target feature point and using the feature descriptor corresponding to the target feature point as the target feature descriptor.
5. The image feature extraction method according to claim 4, wherein the constructing a constraint condition according to the target feature descriptor specifically includes:
calculating three-dimensional coordinates (x, y, z) of the target feature points based on parallax information of a binocular camera;
the constraint conditions comprise a first constraint condition, the first constraint condition being: the L2-norm ||(x1, y1, z1) - (x2, y2, z2)||_{L2} of the three-dimensional coordinates of two adjacent target feature points is smaller than a preset threshold Dt.
6. The image feature extraction method according to claim 5, wherein the constructing a constraint condition according to the target feature descriptor specifically includes:
segmenting the original image, and counting, based on the segmentation result, the number of all feature points in each label area and the number of feature points satisfying the first constraint condition;
the constraint conditions further comprise a second constraint condition, the second constraint condition being: the number of target feature points in the same area that do not satisfy the first constraint condition is smaller than a preset threshold Nt.
7. The image feature extraction method according to claim 6, wherein the constructing a constraint condition according to the target feature descriptor specifically includes:
setting, for two adjacent frames of original images within N consecutive frames, a target feature point P_{t-1} at the previous time, a target feature point P_t at the current time, a target feature descriptor F_{t-1} corresponding to P_{t-1}, and a target feature descriptor F_t corresponding to P_t;
The constraint conditions further comprise a third constraint condition, the third constraint condition being: within the N consecutive frames, the target feature points satisfy that the L2-norm ||F_{t-1} - F_t||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t1}, where N is a positive integer greater than 1.
8. The image feature extraction method according to claim 7, wherein the constructing a constraint condition according to the target feature descriptor specifically includes:
obtaining a target feature point P_l of the left-eye view and a target feature point P_r of the right-eye view at the same time;
The constraint conditions further comprise a fourth constraint condition, the fourth constraint condition being: the target feature point P_l of the left-eye view and the target feature point P_r of the right-eye view exist simultaneously and satisfy that the L2-norm ||F_l - F_r||_{L2} of the target feature descriptors is smaller than a preset threshold T_{t2}.
9. A binocular camera based image feature extraction system for implementing the method of any one of claims 1 to 8, the system comprising:
the image acquisition unit is used for acquiring an original image of the binocular camera;
the feature extraction unit is used for extracting feature points of the original image based on different algorithms and obtaining feature descriptors corresponding to the feature points;
the characteristic fusion unit is used for fusing the characteristic points obtained under different algorithms and obtaining target characteristic points through screening; fusing the feature descriptors obtained under different algorithms, and obtaining a target feature descriptor through screening;
the constraint construction unit is used for constructing constraint conditions according to the target feature descriptors;
and the feature set output unit is used for constructing a feature point space in the target feature point range based on the constraint condition so as to obtain a scene feature set.
10. An intelligent terminal, characterized in that, intelligent terminal includes: the device comprises a data acquisition device, a processor and a memory;
the data acquisition device is used for acquiring data; the memory is configured to store one or more program instructions; and the processor is configured to execute the one or more program instructions to perform the method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110884275.6A (granted as CN113792752B) | 2021-08-03 | 2021-08-03 | Binocular camera-based image feature extraction method and system and intelligent terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110884275.6A (granted as CN113792752B) | 2021-08-03 | 2021-08-03 | Binocular camera-based image feature extraction method and system and intelligent terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113792752A true CN113792752A (en) | 2021-12-14 |
CN113792752B CN113792752B (en) | 2023-12-12 |
Family
ID=79181330
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110884275.6A Active CN113792752B (en) | 2021-08-03 | 2021-08-03 | Binocular camera-based image feature extraction method and system and intelligent terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113792752B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114758162A (en) * | 2022-06-14 | 2022-07-15 | 北京市农林科学院信息技术研究中心 | Commodity anti-counterfeiting identification method and device, electronic equipment and storage medium |
CN117253156A (en) * | 2023-11-17 | 2023-12-19 | 深圳元戎启行科技有限公司 | Feature description extraction method, device, terminal and medium based on image segmentation |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105674993A (en) * | 2016-01-15 | 2016-06-15 | 武汉光庭科技有限公司 | Binocular camera-based high-precision visual sense positioning map generation system and method |
WO2018177379A1 (en) * | 2017-03-31 | 2018-10-04 | 北京市商汤科技开发有限公司 | Gesture recognition, gesture control and neural network training methods and apparatuses, and electronic device |
CN109166149A (en) * | 2018-08-13 | 2019-01-08 | 武汉大学 | A kind of positioning and three-dimensional wire-frame method for reconstructing and system of fusion binocular camera and IMU |
CN110132302A (en) * | 2019-05-20 | 2019-08-16 | 中国科学院自动化研究所 | Merge binocular vision speedometer localization method, the system of IMU information |
CN111815738A (en) * | 2020-06-15 | 2020-10-23 | 北京沃东天骏信息技术有限公司 | Map construction method and device |
CN112001954A (en) * | 2020-08-20 | 2020-11-27 | 大连海事大学 | Polar curve constraint-based underwater PCA-SIFT image matching method |
WO2020259365A1 (en) * | 2019-06-27 | 2020-12-30 | Oppo广东移动通信有限公司 | Image processing method and device, and computer-readable storage medium |
CN113192113A (en) * | 2021-04-30 | 2021-07-30 | 山东产研信息与人工智能融合研究院有限公司 | Binocular visual feature point matching method, system, medium and electronic device |
- 2021-08-03: CN application CN202110884275.6A filed, granted as CN113792752B (status: Active)
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105674993A (en) * | 2016-01-15 | 2016-06-15 | 武汉光庭科技有限公司 | Binocular camera-based high-precision visual sense positioning map generation system and method |
WO2018177379A1 (en) * | 2017-03-31 | 2018-10-04 | 北京市商汤科技开发有限公司 | Gesture recognition, gesture control and neural network training methods and apparatuses, and electronic device |
CN109166149A (en) * | 2018-08-13 | 2019-01-08 | 武汉大学 | A kind of positioning and three-dimensional wire-frame method for reconstructing and system of fusion binocular camera and IMU |
CN110132302A (en) * | 2019-05-20 | 2019-08-16 | 中国科学院自动化研究所 | Merge binocular vision speedometer localization method, the system of IMU information |
WO2020259365A1 (en) * | 2019-06-27 | 2020-12-30 | Oppo广东移动通信有限公司 | Image processing method and device, and computer-readable storage medium |
CN111815738A (en) * | 2020-06-15 | 2020-10-23 | 北京沃东天骏信息技术有限公司 | Map construction method and device |
CN112001954A (en) * | 2020-08-20 | 2020-11-27 | 大连海事大学 | Polar curve constraint-based underwater PCA-SIFT image matching method |
CN113192113A (en) * | 2021-04-30 | 2021-07-30 | 山东产研信息与人工智能融合研究院有限公司 | Binocular visual feature point matching method, system, medium and electronic device |
Non-Patent Citations (1)
Title |
---|
WANG Ke; HUANG Zhi; ZHONG Zhihua: "Road understanding method based on multi-feature fusion", China Journal of Highway and Transport, No. 04 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114758162A (en) * | 2022-06-14 | 2022-07-15 | 北京市农林科学院信息技术研究中心 | Commodity anti-counterfeiting identification method and device, electronic equipment and storage medium |
CN117253156A (en) * | 2023-11-17 | 2023-12-19 | 深圳元戎启行科技有限公司 | Feature description extraction method, device, terminal and medium based on image segmentation |
CN117253156B (en) * | 2023-11-17 | 2024-03-29 | 深圳元戎启行科技有限公司 | Feature description extraction method, device, terminal and medium based on image segmentation |
Also Published As
Publication number | Publication date |
---|---|
CN113792752B (en) | 2023-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wei et al. | Toward automatic building footprint delineation from aerial images using CNN and regularization | |
Chen et al. | Improved saliency detection in RGB-D images using two-phase depth estimation and selective deep fusion | |
Cai et al. | Saliency-based pedestrian detection in far infrared images | |
CN110378837B (en) | Target detection method and device based on fish-eye camera and storage medium | |
CN111160232B (en) | Front face reconstruction method, device and system | |
CN113792752A (en) | Image feature extraction method and system based on binocular camera and intelligent terminal | |
CN113128347B (en) | Obstacle target classification method and system based on RGB-D fusion information and intelligent terminal | |
WO2023185234A1 (en) | Image processing method and apparatus, and electronic device and storage medium | |
CN115082450A (en) | Pavement crack detection method and system based on deep learning network | |
CN112348116A (en) | Target detection method and device using spatial context and computer equipment | |
CN112800978A (en) | Attribute recognition method, and training method and device for part attribute extraction network | |
CN109447023A (en) | Determine method, video scene switching recognition methods and the device of image similarity | |
CN116012432A (en) | Stereoscopic panoramic image generation method and device and computer equipment | |
US9171227B2 (en) | Apparatus and method extracting feature information of a source image | |
CN112036342B (en) | Document snapshot method, device and computer storage medium | |
CN113792583A (en) | Obstacle detection method and system based on drivable area and intelligent terminal | |
Wang et al. | LBP-based edge detection method for depth images with low resolutions | |
CN114972470B (en) | Road surface environment obtaining method and system based on binocular vision | |
CN115205809B (en) | Method and system for detecting roughness of road surface | |
CN111627041B (en) | Multi-frame data processing method and device and electronic equipment | |
KR20160148806A (en) | Object Detecter Generation Method Using Direction Information, Object Detection Method and Apparatus using the same | |
CN114998743A (en) | Method, device, equipment and medium for constructing visual map points | |
CN111144489B (en) | Matching pair filtering method and device, electronic equipment and storage medium | |
CN112907553A (en) | High-definition image target detection method based on Yolov3 | |
CN110648388A (en) | Scene geometric modeling method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||