CN117253156B - Feature description extraction method, device, terminal and medium based on image segmentation - Google Patents

Feature description extraction method, device, terminal and medium based on image segmentation

Info

Publication number
CN117253156B
CN117253156B (application CN202311533281.2A)
Authority
CN
China
Prior art keywords
image
feature
points
areas
image segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311533281.2A
Other languages
Chinese (zh)
Other versions
CN117253156A (en)
Inventor
游德创
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DeepRoute AI Ltd
Original Assignee
DeepRoute AI Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DeepRoute AI Ltd filed Critical DeepRoute AI Ltd
Priority to CN202311533281.2A priority Critical patent/CN117253156B/en
Publication of CN117253156A publication Critical patent/CN117253156A/en
Application granted granted Critical
Publication of CN117253156B publication Critical patent/CN117253156B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a feature description extraction method, device, terminal and medium based on image segmentation. The method comprises the following steps: acquiring a first image, and calculating all object areas on the first image and the category information of those object areas through an image segmentation algorithm; creating a plurality of second images according to the areas of the object areas; inputting the first image and the second images into a feature extraction frame for feature extraction to obtain image features; and performing feature fusion on the image features to obtain feature points and feature descriptions, and applying a category constraint to the feature descriptions through a category loss function and the category information of the object areas. By performing segmentation processing, feature fusion and loss-function constraint on the input image, the invention can obtain a richer feature map and improve the distinctiveness of the feature points.

Description

Feature description extraction method, device, terminal and medium based on image segmentation
Technical Field
The invention relates to the field of computer vision, in particular to a feature description extraction method, device, terminal and medium based on image segmentation.
Background
Local features are a key technology in the field of computer vision. High-quality local feature descriptors require good discrimination and an accurate feature description for every object in the picture so that downstream tasks can be processed.
Traditional local feature description techniques include: feature detection algorithms based on scale space, which accelerate feature extraction by using fast Gaussian filters and integral images; algorithms based on the FAST feature detector and the BRIEF feature descriptor, which can rapidly detect key points in an image and extract rotation-invariant feature descriptors; algorithms based on binary feature descriptors, which can rapidly detect key points in an image and extract feature descriptors invariant to rotation and scale; and binary-descriptor algorithms that greatly reduce feature extraction time while maintaining a high recognition rate. However, these traditional methods extract local features poorly when the image is rotated or translated, noisy, or heavily occluded. At present many learning-based methods exist which, compared with the traditional methods, achieve better reconstruction quality and inference speed, but because their feature points and feature descriptions are not sufficiently distinctive, mismatches often occur.
Accordingly, the prior art is still in need of improvement and development.
Disclosure of Invention
Aiming at the above shortcomings of the prior art, the invention provides a feature description extraction method, device, terminal and medium based on image segmentation, so as to solve the problem of mismatches caused by insufficiently distinctive feature points and feature descriptions during local feature description in the prior art.
The technical scheme adopted for solving the technical problems is as follows:
in a first aspect, the present invention provides a feature description extraction method based on image segmentation, where the method includes:
acquiring a first image, and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
creating a plurality of second images according to the area of the object area;
inputting the first image and the second image into a feature extraction frame to perform feature extraction to obtain image features;
and carrying out feature fusion on the image features to obtain feature points and feature descriptions, and carrying out category constraint on the feature descriptions through a category loss function and category information of the object region.
In one implementation, the acquiring the first image and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm includes:
acquiring a first image, and inputting the first image into an image segmentation algorithm model;
carrying out segmentation calculation through the image segmentation algorithm model to obtain a plurality of segmentation edges;
obtaining the object area according to the coordinate information of each divided edge;
and carrying out reasoning calculation on the object region through the image segmentation algorithm model to obtain category information of the object region.
In one implementation, the creating a number of second images from the area of the object region includes:
calculating the area of each object region, and arranging the areas of the object regions in a sequence from large to small to obtain an object region sequence;
presetting the number of areas as N, and selecting the first N object areas in the object area sequence to obtain N segmented object areas;
constructing N blank images, wherein the size of the blank images is the same as that of the first image;
and according to the coordinate information of the segmentation edges, placing the N segmented object areas one by one on the corresponding coordinates in the blank images to obtain N second images.
In one implementation manner, the inputting the first image and the second image into a feature extraction frame to perform feature extraction, to obtain image features, includes:
constructing a convolutional neural network model as a feature extraction framework, wherein the convolutional neural network model comprises VGGNet;
and inputting the first image and the second image into the feature extraction frame to perform feature extraction to obtain n+1 image features.
In one implementation manner, the feature point and feature description are obtained by feature fusion of the image features, including:
constructing a feature fusion network;
inputting the N+1 image features into the feature fusion network to perform feature fusion to obtain fusion features;
and obtaining feature points and feature descriptions according to the fusion features.
In one implementation, before the category constraint on the feature description by the category loss function and the category information of the object region, the method includes:
randomly sampling the characteristic points to obtain preselected characteristic points;
combining the preselected characteristic points in pairs to obtain a characteristic point combination;
judging whether two points in each group of characteristic point combinations are in the same object area or not, obtaining a judging result, and determining similarity according to the judging result;
and obtaining the category loss function according to the similarity.
In one implementation manner, the determining whether two points in each group of feature point combinations are in the same object area, to obtain a determination result, and determining the similarity according to the determination result, includes:
if the judging result is that the two points in the characteristic point combination are in the same object area, setting the similarity as a first similarity value;
and if the judging result is that the two points in the characteristic point combination are not in the same object area, setting the similarity as a second similarity value, wherein the first similarity value is higher than the second similarity value.
In a second aspect, an embodiment of the present invention further provides a feature description extracting apparatus based on image segmentation, where the apparatus includes:
the image segmentation module is used for acquiring a first image and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
the second image acquisition module is used for creating a plurality of second images according to the area of the object area;
the image feature extraction module is used for inputting the first image and the second image into a feature extraction frame to perform feature extraction to obtain image features;
and the feature fusion module is used for obtaining feature points and feature descriptions through feature fusion of the image features, and carrying out category constraint on the feature descriptions through category loss functions and category information of the object areas.
In a third aspect, an embodiment of the present invention further provides an intelligent terminal, where the intelligent terminal includes a memory, a processor, and an image segmentation-based feature description extraction program stored in the memory and executable on the processor, and when the processor executes the image segmentation-based feature description extraction program, the steps of the image segmentation-based feature description extraction method described in any one of the above are implemented.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium has stored thereon an image segmentation based feature description extraction program, which when executed by a processor, implements the steps of the image segmentation based feature description extraction method as described in any one of the above.
The beneficial effects are that: compared with the prior art, the invention provides a feature description extraction method based on image segmentation. Firstly, a first image is acquired, and all object areas on the first image together with their category information are calculated through an image segmentation algorithm; performing image segmentation with a segmentation model reduces the interference of information around an object on the extracted feature description. Then, a plurality of second images are created according to the areas of the object areas, so that object areas with larger areas receive more attention and feature extraction efficiency is improved. Next, the first image and the second images are input into a feature extraction frame for feature extraction to obtain image features; performing intelligent feature extraction through a neural network framework improves the efficiency and accuracy of image feature extraction. Then, feature fusion is performed on the image features to obtain feature points and feature descriptions; extracting features from the segmented regions separately and then fusing them with the original image enhances the richness of the original features. Finally, a category constraint is applied to the feature descriptions through a category loss function and the category information of the object areas; constraining the loss with category information allows the feature points to be better distinguished, so that better feature descriptions are obtained.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the embodiments or the description of the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art may obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic flow chart of a feature description extraction method based on image segmentation according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a feature extraction and fusion framework provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of a feature extraction network according to an embodiment of the present invention.
Fig. 4 is an example of an original image provided by an embodiment of the present invention.
Fig. 5 is an example of a segmented image provided by an embodiment of the present invention.
Fig. 6 is a schematic block diagram of a feature description extracting apparatus based on image segmentation according to an embodiment of the present invention.
Fig. 7 is a schematic block diagram of an internal structure of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and more specific, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as will be understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the field of computer vision, local feature extraction is a key technology. In order to obtain feature points with strong discrimination and good descriptive performance, mainstream methods mainly fall into the following two categories:
1. corresponding points are obtained through rotation-translation transformations under loss-function constraints, with the expectation that better feature points will eventually be learned through the constraint of the loss function;
2. the intermediate feature results of several stages in the feature extraction process are combined, with the expectation of obtaining richer features.
However, current learning-based methods perform feature extraction on the whole picture, continuously reducing the dimension of the features extracted in the previous stage over several stages, and finally extract feature descriptors from features of small dimension. Moreover, the constrained learning of feature points is performed without considering category information, so the similarity of feature descriptions within an object and the dissimilarity of feature descriptions between objects cannot be captured; the finally extracted feature descriptions are therefore not sufficiently distinctive, which affects the subsequent training process.
Aiming at the above problems, the invention provides a feature description extraction method, device, terminal and medium based on image segmentation. The original image is expanded based on the segmentation model to obtain richer features. Considering that information around an object may interfere with the extracted feature description, features are extracted from the segmented regions separately and then fused with those of the original image, which removes the influence of other surrounding objects and enhances the original features. Meanwhile, the segmented category information is added to constrain the loss, so that better feature descriptions can be obtained.
The embodiment provides a feature description extraction method based on image segmentation. As shown in fig. 1, the method comprises the steps of:
step S100, acquiring a first image, and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
specifically, in the field of computer vision technology, image segmentation is the first step in image analysis and is the basis of computer vision. Image segmentation refers to dividing an image into a plurality of mutually disjoint regions according to features such as gray scale, color, spatial texture, geometry, etc., such that the features exhibit consistency or similarity within the same region and differ significantly between different regions. In a simple manner, the object is separated from the background in one image.
In this embodiment, an original image, i.e. a first image, is first acquired, the object regions are separated from the image background by a segmentation algorithm, and category information of the object regions is added to address the insufficient distinctiveness of the extracted feature points. The segmentation algorithm adopted in this embodiment is an edge-detection-based segmentation method, such as a segment analysis model, which is used to generate edge points and specifically may use the Canny operator, the Sobel operator or the Prewitt operator. The first image is typically converted to a gray-scale image, and edge points are then found by computing gradients or differences between pixel values, i.e. the gradients or differences are thresholded to distinguish edges from noise. Meanwhile, the category information of the object areas is extracted; this category information is used to constrain the loss so that the feature points can be described better.
In one implementation, the step S100 in this embodiment includes the following steps:
s101, acquiring a first image, and inputting the first image into an image segmentation algorithm model;
step S102, carrying out segmentation calculation through the image segmentation algorithm model to obtain a plurality of segmentation edges;
step S103, obtaining the object area according to the coordinate information of each divided edge;
and step S104, carrying out reasoning calculation on the object region through the image segmentation algorithm model to obtain category information of the object region.
Specifically, the image segmentation algorithm is used to generate the edge points of each object in the first image, and these edge points are then connected to obtain the segmentation edges. By collecting and summarizing the coordinates of the edge points, the coordinate information of the segmentation edges in the first image can be obtained; because an object's edge is sometimes partially curved, the coordinate information may be recorded once every n pixels. By combining the object region with semantic information through the segmentation algorithm, the category information of the object region can be obtained. Fig. 4 shows an acquired first image, and performing computation and inference with the segmentation algorithm yields the segmented image in Fig. 5. Each region may be assigned category information, such as tall building, bridge or car. The category information further constrains the properties of the local feature points: the category information of feature points within the same object region should be similar, while that of feature points in different object regions should be dissimilar, which improves the accuracy of feature point classification.
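A minimal sketch of this edge-based segmentation step is given below, using OpenCV's Canny detector and contour extraction as stand-ins for the segmentation algorithm model described above; the thresholds, the sampling interval n and the function names are illustrative assumptions rather than details from the patent, and the category inference step is omitted.

```python
import cv2
import numpy as np

def extract_object_regions(first_image: np.ndarray, n: int = 5,
                           canny_lo: int = 100, canny_hi: int = 200):
    """Return, per candidate object region, its subsampled edge coordinates and a filled mask."""
    gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)       # assumes a 3-channel BGR input
    edges = cv2.Canny(gray, canny_lo, canny_hi)                 # threshold gradients to keep edges, drop noise
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    regions = []
    for contour in contours:
        pts = contour.reshape(-1, 2)[::n]                        # record edge coordinates once every n pixels
        mask = np.zeros(gray.shape, dtype=np.uint8)
        cv2.drawContours(mask, [contour], -1, color=255, thickness=cv2.FILLED)
        regions.append({"edge_coords": pts, "mask": mask})       # category info would be added by the model
    return regions
```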
Step 200, creating a plurality of second images according to the area of the object area;
in one implementation, the step S200 in this embodiment includes the following steps:
step S201, calculating the area of each object region, and arranging the areas of the object regions in a sequence from large to small to obtain an object region sequence;
step S202, presetting the number of the areas as N, and selecting the first N object areas in the object area sequence to obtain N segmented object areas;
step S203, constructing N blank images, wherein the size of the blank images is the same as that of the first image;
and S204, according to the coordinate information of the dividing edge, placing N divided object areas on corresponding coordinates in the blank image one by one to obtain N second images.
Specifically, in this embodiment the N object areas with the largest area are selected, and N blank images with the same size as the first image are created, where the RGB (red, green, blue) values of the blank images are all [255, 255, 255]. According to the coordinate information of the N segmented object areas, the corresponding RGB information is filled into the blank images at the corresponding positions; if there are fewer than N segmented object areas, the remaining images stay blank at the original image size. In this way, N second images are obtained. The second images expand the original data set, and more features can be extracted on the basis of the segmentation.
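The construction of the second images can be sketched as follows, under the assumption that each segmented object area is represented by a binary mask; the helper name and parameters are illustrative, not taken from the patent.

```python
import numpy as np

def build_second_images(first_image: np.ndarray, masks: list, n_regions: int) -> list:
    """masks: list of binary (H, W) masks, one per segmented object region."""
    h, w = first_image.shape[:2]
    # rank object regions by area, largest first, and keep the top N
    ranked = sorted(masks, key=lambda m: int((m > 0).sum()), reverse=True)[:n_regions]
    second_images = []
    for i in range(n_regions):
        canvas = np.full((h, w, 3), 255, dtype=first_image.dtype)   # blank white image, same size as the first image
        if i < len(ranked):
            sel = ranked[i] > 0
            canvas[sel] = first_image[sel]        # copy the region's RGB values to the corresponding coordinates
        second_images.append(canvas)              # regions beyond the available count stay blank
    return second_images
```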
Step S300, inputting the first image and the second image into a feature extraction frame to extract features, and obtaining image features;
in one implementation, the step S300 in this embodiment includes the following steps:
step S301, constructing a convolutional neural network model as a feature extraction framework, wherein the convolutional neural network model comprises VGGNet;
step S302, inputting the first image and the second image into the feature extraction frame to perform feature extraction, so as to obtain n+1 image features.
Specifically, as shown in fig. 2, the input picture is first divided into a plurality of mask areas by the segmentation model; N small objects are obtained by segmentation from the original image, each object area representing one category, and the second images are obtained. The first image and the second images are input into the feature extraction frame to enhance the original feature information, and feature extraction is performed through the feature extraction framework of fig. 3 to obtain N+1 features. This embodiment adopts VGGNet as the feature extraction framework. VGGNet (Visual Geometry Group Network, from the computer vision group of the University of Oxford) is a convolutional neural network. It consists of 5 convolutional stages, 3 fully connected layers and 1 softmax output layer, with max pooling between stages, and the activation units of all hidden layers use the ReLU (Rectified Linear Unit) function. VGGNet has a simple and clear structure with regular model parameters; by stacking small convolution kernels it reduces the number of parameters while keeping a receptive field equivalent to that of a large kernel, and by increasing the depth and repeatedly stacking convolution and ReLU layers it improves the nonlinear learning ability of the network; it also has a larger number of channels, which allows the model to extract more features.
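A simplified VGG-style feature extractor is sketched below in PyTorch; the number of stages, channel widths and output dimensionality are assumptions for illustration, since the patent only specifies a VGGNet-like convolution, max-pooling and ReLU design.

```python
import torch
import torch.nn as nn

class VGGFeatureExtractor(nn.Module):
    """A small VGG-style stack: repeated 3x3 conv + ReLU blocks separated by max pooling."""
    def __init__(self, out_channels: int = 128):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(cout, cout, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )
        self.stages = nn.Sequential(block(3, 64), block(64, 128), block(128, out_channels))

    def forward(self, x):               # x: (B, 3, H, W)
        return self.stages(x)           # (B, out_channels, H/8, W/8)

# Extracting N+1 feature maps: one for the first image and one per second image.
# images = [first_image_tensor] + second_image_tensors   # each (1, 3, H, W)
# extractor = VGGFeatureExtractor()
# image_features = [extractor(img) for img in images]    # N+1 feature maps
```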
And step 400, carrying out feature fusion on the image features to obtain feature points and feature descriptions, and carrying out category constraint on the feature descriptions through a category loss function and category information of the object region.
In one implementation, the step S400 in this embodiment includes the following steps:
step S401, constructing a feature fusion network;
step S402, inputting the N+1 image features into the feature fusion network to perform feature fusion, so as to obtain fusion features;
and step S403, obtaining feature points and feature description according to the fusion features.
Specifically, as shown in fig. 2, the obtained N+1 image features are fused; they may be added directly, or added according to certain weights, to obtain a new fused feature, which is then passed through a learnable convolution layer to obtain the feature points and feature descriptions of the fused feature.
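The fusion step can be sketched as below, assuming a learnable weighted sum of the N+1 feature maps followed by 1x1 convolution heads that output a feature-point score map and per-pixel feature descriptions; the softmax weights, sigmoid scores, L2-normalized descriptors and head sizes are illustrative choices, not details mandated by the patent.

```python
import torch
import torch.nn as nn

class FeatureFusionHead(nn.Module):
    def __init__(self, channels: int = 128, num_features: int = 9, desc_dim: int = 256):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_features))          # one learnable weight per input feature map
        self.point_head = nn.Conv2d(channels, 1, kernel_size=1)        # feature-point score map
        self.desc_head = nn.Conv2d(channels, desc_dim, kernel_size=1)  # feature descriptions

    def forward(self, feature_maps):                 # list of N+1 tensors, each (B, C, H, W)
        stacked = torch.stack(feature_maps, dim=0)                     # (N+1, B, C, H, W)
        w = torch.softmax(self.weights, dim=0).view(-1, 1, 1, 1, 1)
        fused = (w * stacked).sum(dim=0)                               # weighted-sum fusion
        scores = torch.sigmoid(self.point_head(fused))                 # keypoint probability per location
        descriptors = torch.nn.functional.normalize(self.desc_head(fused), dim=1)
        return scores, descriptors
```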
In one implementation, the step S400 described in this embodiment includes the following steps:
m101, randomly sampling the characteristic points to obtain preselected characteristic points;
step M102, combining the preselected feature points pairwise to obtain a feature point combination;
m103, judging whether two points in each group of characteristic point combinations are in the same object area or not, obtaining a judging result, and determining similarity according to the judging result;
and step M104, obtaining the class loss function according to the similarity.
Specifically, in this embodiment feature classification is constrained by a loss function. The principle is to calculate the similarity between the segmented regions, constraining the feature descriptions of mask regions of the same category to be similar and pushing the feature descriptions of mask regions of different categories far apart. By removing wrong pairings, successful pairing of feature points is ensured; the successfully paired feature points are then classified, points belonging to the same object are grouped together, and points belonging to different objects are separated. In this way, feature points of the same kind can be identified; such feature points have a high probability of belonging to the same object, so each feature group can be separated.
Specifically, as shown in fig. 5, in one embodiment, after the features are fused to obtain a richer feature, a category constraint is applied to the output feature descriptions. First, random sampling is carried out and 10000 feature point pairs are selected, specifically as follows: m points are extracted from the several segmented object areas of the original image and combined in pairs, and the first 10000 feature point combinations are taken. During loss calculation, the 10000 feature point combinations are traversed; when the two points of a combination lie in the same mask area, their internal features should be similar, and when they lie in different mask areas, their features should be dissimilar, which gives the similarity. The category loss function obtained from this similarity constrains the features between regions and within regions, which can improve the accuracy of local feature classification in subsequent training.
In one implementation, the step M103 in this embodiment includes the following steps:
m1031, setting the similarity as a first similarity value if the judging result is that two points in the feature point combination are in the same object area;
and M1032, if the judging result is that the two points in the feature point combination are not in the same object area, setting the similarity as a second similarity value, wherein the first similarity value is higher than the second similarity value.
Specifically, a first similarity value and a second similarity value are preset, and the first similarity value needs to be ensured to be higher than the second similarity value, so that whether two points in the feature point combination are in the same region or not can be indicated by the similarity.
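A sketch of such a category loss is given below under the stated assumptions: sampled descriptor pairs from the same object region are pulled toward a high target similarity (the first similarity value) and pairs from different regions toward a low one (the second similarity value). The cosine similarity, the mean-squared-error penalty, the target values 1.0 and 0.0 and the number of sampled pairs are illustrative choices, not prescribed by the patent.

```python
import torch

def category_loss(descriptors, region_ids, num_pairs=10000, sim_same=1.0, sim_diff=0.0):
    """descriptors: (P, D) sampled point descriptions; region_ids: (P,) object-region index per point."""
    p = descriptors.shape[0]
    i = torch.randint(0, p, (num_pairs,))
    j = torch.randint(0, p, (num_pairs,))
    cos = torch.nn.functional.cosine_similarity(descriptors[i], descriptors[j], dim=1)
    target = torch.where(region_ids[i] == region_ids[j],
                         torch.full_like(cos, sim_same),   # same object area  -> first similarity value
                         torch.full_like(cos, sim_diff))   # different areas   -> second similarity value
    return torch.nn.functional.mse_loss(cos, target)
```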
As shown in fig. 6, the present embodiment further provides a feature description extracting apparatus based on image segmentation, the apparatus including:
the image segmentation module 10 is used for acquiring a first image and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
a second image acquisition module 20, configured to create a plurality of second images according to the area of the object region;
an image feature extraction module 30, configured to input the first image and the second image into a feature extraction frame for feature extraction, so as to obtain image features;
and the feature fusion module 40 is configured to obtain feature points and feature descriptions by feature fusion of the image features, and perform category constraint on the feature descriptions by a category loss function and category information of the object region.
In one implementation, the image segmentation module 10 includes:
the first image acquisition unit is used for acquiring a first image and inputting the first image into the image segmentation algorithm model;
the segmentation unit is used for carrying out segmentation calculation through the image segmentation algorithm model to obtain a plurality of segmentation edges;
an object region obtaining unit, configured to obtain the object region according to coordinate information of each of the dividing edges;
a category information acquisition unit for performing inference calculation on the object region through the image segmentation algorithm model to obtain category information of the object region.
In one implementation, the second image acquisition module 20 includes:
the sequencing unit is used for calculating the area of each object region and sequencing the areas of the object regions in order from large to small to obtain an object region sequence;
a segmented object region obtaining unit, configured to preset the number of regions to N, and select the first N object regions in the object region sequence to obtain N segmented object regions;
a blank image construction and acquisition unit, configured to construct N blank images, where the size of the blank image is the same as the size of the first image;
and the image synthesis unit is used for placing the N divided object areas on the corresponding coordinates in the blank image one by one according to the coordinate information of the dividing edge to obtain N second images.
In one implementation, the image feature extraction module 30 includes:
the modeling unit is used for constructing a convolutional neural network model as a feature extraction framework, wherein the convolutional neural network model comprises VGGNet;
and the feature extraction unit is used for inputting the first image and the second image into the feature extraction frame to perform feature extraction to obtain N+1 image features.
In one implementation, the feature fusion module 40 includes:
the feature fusion network construction unit is used for constructing a feature fusion network;
the fusion feature acquisition unit is used for inputting the N+1 image features into the feature fusion network to perform feature fusion to obtain fusion features;
and the characteristic point and characteristic description acquisition unit is used for acquiring characteristic points and characteristic descriptions according to the fusion characteristics.
In one implementation, the apparatus further comprises:
the pre-selection feature point acquisition unit is used for randomly sampling the feature points to obtain pre-selection feature points;
the characteristic point combination obtaining unit is used for combining the preselected characteristic points in pairs to obtain a characteristic point combination;
the similarity acquisition unit is used for judging whether two points in each group of characteristic point combinations are in the same object area or not, obtaining a judging result and determining similarity according to the judging result;
and the category loss function obtaining unit is used for obtaining the category loss function according to the similarity.
In one implementation, the similarity obtaining unit includes:
the first similarity obtaining unit is used for setting the similarity as a first similarity value if the judging result is that two points in the characteristic point combination are in the same object area;
and the second similarity obtaining unit is used for setting the similarity as a second similarity value if the judging result is that the two points in the characteristic point combination are not in the same object area, wherein the first similarity value is higher than the second similarity value.
Based on the above embodiment, the present invention further provides an intelligent terminal, and a functional block diagram thereof may be shown in fig. 7. The intelligent terminal comprises a processor, a memory, a network interface, a display screen and a temperature sensor which are connected through a system bus. The processor of the intelligent terminal is used for providing computing and control capabilities. The memory of the intelligent terminal comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the intelligent terminal is used for communicating with an external terminal through network connection. The computer program is executed by a processor to implement a feature description extraction method based on image segmentation. The display screen of the intelligent terminal can be a liquid crystal display screen or an electronic ink display screen, and a temperature sensor of the intelligent terminal is arranged in the intelligent terminal in advance and used for detecting the running temperature of internal equipment.
It will be appreciated by those skilled in the art that the schematic block diagram shown in fig. 7 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the smart terminal to which the present inventive arrangements are applied, and that a particular smart terminal may include more or fewer components than shown, or may combine certain components, or may have a different arrangement of components.
In one embodiment, an intelligent terminal is provided, the intelligent terminal includes a memory, a processor, and an image segmentation-based feature description extraction program stored in the memory and executable on the processor, and when the processor executes the image segmentation-based feature description extraction program, the processor implements the following operation instructions:
acquiring a first image, and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
creating a plurality of second images according to the area of the object area;
inputting the first image and the second image into a feature extraction frame to perform feature extraction to obtain image features;
and carrying out feature fusion on the image features to obtain feature points and feature descriptions, and carrying out category constraint on the feature descriptions through a category loss function and category information of the object region.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the flows of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
In summary, the invention discloses a feature description extraction method, a device, a terminal and a medium based on image segmentation, wherein the method comprises the following steps: acquiring a first image, and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm; creating a plurality of second images according to the area of the object area; inputting the first image and the second image into a feature extraction frame to extract features, so as to obtain image features; and obtaining feature points and feature descriptions by carrying out feature fusion on the image features, and carrying out category constraint on the feature descriptions by a category loss function and category information of the object region. According to the invention, the segmentation processing, the feature fusion and the loss function constraint are carried out on the input image, so that a richer feature map can be obtained, and the distinguishing property of feature points is improved.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A feature description extraction method based on image segmentation, the method comprising:
acquiring a first image, and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
creating a plurality of second images according to the areas of the object areas, wherein the second images are N images obtained according to N object areas with the largest areas;
inputting the first image and the second image into a feature extraction frame to perform feature extraction to obtain image features;
feature fusion is carried out on the image features to obtain feature points and feature descriptions, and category constraint is carried out on the feature descriptions through category loss functions and category information of the object areas;
the step of acquiring a first image and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm comprises the following steps:
acquiring a first image, and inputting the first image into an image segmentation algorithm model;
carrying out segmentation calculation through the image segmentation algorithm model to obtain a plurality of segmentation edges;
obtaining the object area according to the coordinate information of each divided edge;
carrying out reasoning calculation on the object region through the image segmentation algorithm model to obtain category information of the object region;
before the category constraint is performed on the feature description by the category loss function and the category information of the object area, the method comprises the following steps:
randomly sampling the characteristic points to obtain preselected characteristic points;
combining the preselected characteristic points in pairs to obtain a characteristic point combination;
judging whether two points in each group of characteristic point combinations are in the same object area or not, obtaining a judging result, and determining similarity according to the judging result;
and obtaining the category loss function according to the similarity.
2. The image segmentation based feature description extraction method according to claim 1, wherein creating a number of second images from the area of the object region comprises:
calculating the area of each object region, and arranging the areas of the object regions in a sequence from large to small to obtain an object region sequence;
the number of the preset areas is N, and the first N object areas in the object area sequence are selected to obtain N segmented object areas;
constructing N blank images, wherein the size of the blank images is the same as that of the first image;
and according to the coordinate information of the dividing edge, placing N dividing object areas on corresponding coordinates in the blank image one by one to obtain N second images.
3. The image segmentation-based feature description extraction method according to claim 2, wherein the inputting the first image and the second image into a feature extraction frame for feature extraction, to obtain image features, comprises:
constructing a convolutional neural network model as a feature extraction framework, wherein the convolutional neural network model comprises VGGNet;
and inputting the first image and the second image into the feature extraction frame to perform feature extraction to obtain N+1 image features.
4. The method for extracting feature descriptions based on image segmentation according to claim 3, wherein the feature fusion is performed on the image features to obtain feature points and feature descriptions, and the method comprises the following steps:
constructing a feature fusion network;
inputting the N+1 image features into the feature fusion network to perform feature fusion to obtain fusion features;
and obtaining feature points and feature descriptions according to the fusion features.
5. The method for extracting feature descriptions based on image segmentation according to claim 1, wherein the determining whether two points in each group of feature point combinations are in the same object region, to obtain a determination result, and determining the similarity according to the determination result, comprises:
if the judging result is that the two points in the characteristic point combination are in the same object area, setting the similarity as a first similarity value;
and if the judging result is that the two points in the characteristic point combination are not in the same object area, setting the similarity as a second similarity value, wherein the first similarity value is higher than the second similarity value.
6. A feature description extracting apparatus based on image segmentation, the apparatus comprising:
the image segmentation module is used for acquiring a first image and calculating all object areas and category information of the object areas on the first image through an image segmentation algorithm;
the second image acquisition module is used for creating a plurality of second images according to the areas of the object areas, wherein the second images are N images obtained according to N object areas with the largest areas;
the image feature extraction module is used for inputting the first image and the second image into a feature extraction frame to perform feature extraction to obtain image features;
the feature fusion module is used for carrying out feature fusion on the image features to obtain feature points and feature descriptions, and carrying out category constraint on the feature descriptions through category loss functions and category information of the object areas;
the image segmentation module comprises:
the first image acquisition unit is used for acquiring a first image and inputting the first image into the image segmentation algorithm model;
the segmentation unit is used for carrying out segmentation calculation through the image segmentation algorithm model to obtain a plurality of segmentation edges;
an object region obtaining unit, configured to obtain the object region according to coordinate information of each of the dividing edges;
the category information acquisition unit is used for carrying out reasoning calculation on the object area through the image segmentation algorithm model to obtain category information of the object area;
the apparatus further comprises:
the pre-selection feature point acquisition unit is used for randomly sampling the feature points to obtain pre-selection feature points;
the characteristic point combination obtaining unit is used for combining the preselected characteristic points in pairs to obtain a characteristic point combination;
the similarity acquisition unit is used for judging whether two points in each group of characteristic point combinations are in the same object area or not, obtaining a judging result and determining similarity according to the judging result;
and the category loss function obtaining unit is used for obtaining the category loss function according to the similarity.
7. An intelligent terminal, characterized in that the intelligent terminal comprises a memory, a processor and an image segmentation based feature description extraction program stored in the memory and executable on the processor, the processor implementing the steps of the image segmentation based feature description extraction method according to any one of claims 1-5 when executing the image segmentation based feature description extraction program.
8. A computer-readable storage medium, on which an image segmentation-based feature description extraction program is stored, which, when executed by a processor, implements the steps of the image segmentation-based feature description extraction method according to any one of claims 1-5.
CN202311533281.2A 2023-11-17 2023-11-17 Feature description extraction method, device, terminal and medium based on image segmentation Active CN117253156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311533281.2A CN117253156B (en) 2023-11-17 2023-11-17 Feature description extraction method, device, terminal and medium based on image segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311533281.2A CN117253156B (en) 2023-11-17 2023-11-17 Feature description extraction method, device, terminal and medium based on image segmentation

Publications (2)

Publication Number Publication Date
CN117253156A CN117253156A (en) 2023-12-19
CN117253156B (en) 2024-03-29

Family

ID=89127990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311533281.2A Active CN117253156B (en) 2023-11-17 2023-11-17 Feature description extraction method, device, terminal and medium based on image segmentation

Country Status (1)

Country Link
CN (1) CN117253156B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143323A1 (en) * 2019-01-08 2020-07-16 平安科技(深圳)有限公司 Remote sensing image segmentation method and device, and storage medium and server
CN111402181A (en) * 2020-03-13 2020-07-10 北京奇艺世纪科技有限公司 Image fusion method and device and computer readable storage medium
CN113095371A (en) * 2021-03-22 2021-07-09 北京大学 Feature point matching method and system for three-dimensional reconstruction
CN113159043A (en) * 2021-04-01 2021-07-23 北京大学 Feature point matching method and system based on semantic information
CN113792752A (en) * 2021-08-03 2021-12-14 北京中科慧眼科技有限公司 Image feature extraction method and system based on binocular camera and intelligent terminal
CN116071300A (en) * 2022-12-09 2023-05-05 鹏城实验室 Cell nucleus segmentation method based on context feature fusion and related equipment
CN116310308A (en) * 2022-12-21 2023-06-23 浙江华诺康科技有限公司 Image segmentation method, device, computer equipment and storage medium
CN115860091A (en) * 2023-02-15 2023-03-28 武汉图科智能科技有限公司 Depth feature descriptor learning method based on orthogonal constraint

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Semantic segmentation of 3D point clouds of urban scenes based on multi-feature fusion; 刘贤梅 et al.; Computer Technology and Development; Vol. 33, No. 11; pp. 78-85 *
Self-supervised image registration algorithm based on multi-feature fusion; 韩贵金 et al.; Journal of Computer Applications; pp. 1-10 *
Semantic and instance segmentation of 3D point clouds based on recursive slice networks; 刘苏毅 et al.; Journal of Image and Graphics; Vol. 28, No. 7; pp. 2136-2150 *

Also Published As

Publication number Publication date
CN117253156A (en) 2023-12-19

Similar Documents

Publication Publication Date Title
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
CN110060237B (en) Fault detection method, device, equipment and system
CN112132156B (en) Image saliency target detection method and system based on multi-depth feature fusion
CN111680690B (en) Character recognition method and device
CN110163842B (en) Building crack detection method and device, computer equipment and storage medium
CN108921163A (en) A kind of packaging coding detection method based on deep learning
US20210004648A1 (en) Computer Vision Systems and Methods for Blind Localization of Image Forgery
CN111310826B (en) Method and device for detecting labeling abnormality of sample set and electronic equipment
CN115410059B (en) Remote sensing image part supervision change detection method and device based on contrast loss
CN111815609B (en) Pathological image classification method and system based on context awareness and multi-model fusion
Zhang et al. Residual attentive feature learning network for salient object detection
CN111382638B (en) Image detection method, device, equipment and storage medium
CN114663391A (en) Industrial image defect detection method based on unsupervised deep learning algorithm
CN112200789B (en) Image recognition method and device, electronic equipment and storage medium
CN112418033B (en) Landslide slope surface segmentation recognition method based on mask rcnn neural network
CN111191584B (en) Face recognition method and device
CN117253156B (en) Feature description extraction method, device, terminal and medium based on image segmentation
CN117058554A (en) Power equipment target detection method, model training method and device
CN116798041A (en) Image recognition method and device and electronic equipment
CN111753722B (en) Fingerprint identification method and device based on feature point type
CN114639013A (en) Remote sensing image airplane target detection and identification method based on improved Orient RCNN model
CN113052798A (en) Screen aging detection model training method and screen aging detection method
CN116704513B (en) Text quality detection method, device, computer equipment and storage medium
CN114821128B (en) Scale-adaptive template matching method
WO2024001051A1 (en) Spatial omics single cell data acquisition method and apparatus, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant