CN112085106A - Image identification method and device applied to multi-image fusion, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112085106A
Authority
CN
China
Prior art keywords
image
target
feature
template image
detection frame
Prior art date
2020-09-10
Legal status
Pending
Application number
CN202010947804.8A
Other languages
Chinese (zh)
Inventor
薛峰 (Xue Feng)
张万友 (Zhang Wanyou)
夏炎 (Xia Yan)
Current Assignee
Jiangsu Timi Intelligent Technology Co., Ltd.
Original Assignee
Jiangsu Timi Intelligent Technology Co., Ltd.
Priority date
2020-09-10
Filing date
2020-09-10
Publication date
2020-12-15
Application filed by Jiangsu Timi Intelligent Technology Co., Ltd.

Classifications

    • G06F18/25 Pattern recognition; analysing; fusion techniques
    • G06N20/00 Machine learning
    • G06T3/4038 Geometric image transformations; scaling; image mosaicing, e.g. composing plane images from plane sub-images
    • G06V10/267 Image preprocessing; segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/443 Extraction of image or video features; local feature extraction by matching or filtering
    • G06V10/751 Image or video pattern matching; comparing pixel values or feature values having positional relevance, e.g. template matching
    • G06V2201/07 Indexing scheme; target detection


Abstract

The invention discloses an image identification method and device applied to multi-image fusion, an electronic device, and a storage medium, and belongs to the technical field of image identification. The method avoids the image-fusion step: image features are first extracted and matched; target detection is then performed on each original input image, the positions of the detected target bounding boxes are compared with the feature-matching positions, and the target objects are retained by screening. The original feature information of the images is thereby preserved to the greatest extent, the original image features can be fed into a machine-learning model for feature extraction, the accuracy of image recognition is improved, and the problem that target-extraction algorithms are inaccurate because the training images differ from the actually used image samples is solved.

Description

Image identification method and device applied to multi-image fusion, electronic equipment and storage medium
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to an image recognition method and device applied to multi-image fusion, electronic equipment and a storage medium.
Background
In recent years, deep-learning methods have matured in image target detection and recognition applications, and their accuracy has improved greatly over traditional feature extraction. In remote sensing and aerial photography applications, a camera often cannot capture all objects in one shot, so a large number of photographs with overlapping coverage of the target objects are taken and then stitched and fused, and the stitched image is used as the input to the deep-learning network model. The advantage is that a single image can cover all target objects. However, depending on the quality of the captured photographs and the lens angle, the stitched and fused image is often of poor quality, so it is difficult to guarantee valid detection results when such an image is fed into a network for feature extraction. Second, because deep-learning algorithms are trained on samples, the images used in training are all individually captured original images, whereas stitched and fused images are produced by weighted superposition. This creates a sampling difference between the training images and the images actually used, and the detection results are further affected by the stitching and fusion quality, so extracting targets from fused images carries great uncertainty.
Disclosure of Invention
Problems to be solved
Aiming at the problem in existing image recognition that stitched and fused images serve as training input, so that the difference between the training images and the actually used image samples makes target-extraction algorithms inaccurate, the method avoids the image-fusion step: image features are first extracted and matched; target detection is then performed on each original input image, the positions of the detected target bounding boxes are compared with the feature-matching positions, and the target objects are retained by screening. The original feature information of the images is thereby preserved to the greatest extent, the original image features can be fed into a machine-learning model for extraction, and the accuracy of image identification is improved.
Technical scheme
In order to solve the above problems, the present invention adopts the following technical solutions.
A first aspect of the present invention provides an image identification method applied to multi-image fusion, which comprises the following steps:
S102: acquiring an image to be identified, calibrating the image to be identified into a template image and a target image, and respectively extracting feature points of the template image and the target image through a feature extraction algorithm;
S104: traversing the feature points of the template image, screening the feature point information by distance, and drawing a feature matching area according to the screened feature point information;
S106: detecting the template image and the target image through a target detection algorithm, and acquiring the detection frames of a target object in the template image and the target image;
S108: calculating the relative position of the detection frames and the feature matching area, and cutting the image to be recognized according to the relative position.
In some embodiments, before the step of acquiring the image to be recognized and calibrating the image to be recognized as the template image and the target image, the method further includes:
acquiring two images to be identified, and calculating the width and the height of the two images to be identified;
for stitching in the horizontal direction, taking the maximum value of the heights of the two images to be identified as the splicing height, and adding the widths of the two images to be identified to obtain the spliced width;
and constructing an image drawing board according to the spliced height and width, and randomly calibrating one image as a template image and the other image as a target image.
In some embodiments, the step of extracting the feature points of the template image and the target image respectively by a feature extraction algorithm includes:
and extracting the characteristics of the template image and the target image through an AKAZE algorithm, and respectively storing the extracted characteristic point information into matrix data structures.
In some embodiments, the step of traversing the feature points of the template image and screening the feature point information by distance includes:
traversing the feature points of the template image through a proximity algorithm to obtain an approach distance and a next-approach distance;
calculating the ratio of the approach distance to the next-approach distance, and keeping the feature points whose ratio falls within a predetermined numerical range as matching values in the matrix data structure; the feature points whose ratio does not fall within the predetermined numerical range are deleted from the matrix data structure.
In some embodiments, the step of drawing the feature matching region according to the filtered feature point information includes:
and acquiring pixel coordinates corresponding to the feature points, and drawing a feature matching area according to the splicing direction of the target image and the template image and the pixel coordinates.
In some embodiments, calculating the relative positional relationship between the detection frame and the feature matching area includes:
acquiring the coordinate information of the detection frame and of the feature matching area;
judging whether the detection frame is within the feature matching area; if not, cutting the template image and the target image according to the size of the detection frame;
when the detection frame overlaps the feature matching area, detecting whether the target object is within the overlap area, and cutting the template image and the target image according to the judgment result; wherein the detection frames include the detection frames of the target object in the template image and in the target image.
In some embodiments, when the detection frame overlaps the feature matching area, detecting whether the target object is within the overlap area and cutting the template image and the target image according to the judgment result includes:
if no target object is detected on the feature line corresponding to the overlap area, cutting the template image and the target image according to the size of the detection frame;
if a target object is detected on the feature line in the overlap area, determining whether that target object and the target object detected in step S106 are of the same class;
if not, correspondingly cutting the template image and the target image according to the size of the detection frame;
if so, cutting out and outputting the detection frame with the largest area.
A second aspect of the present invention provides an image recognition apparatus applied to multi-image fusion, comprising:
a feature extraction module, configured to acquire an image to be identified, calibrate the image to be identified into a template image and a target image, and respectively extract feature points of the template image and the target image through a feature extraction algorithm;
the screening module is used for traversing the feature points of the template image, screening the feature point information through distance, and drawing a feature matching area according to the screened feature point information;
the target object detection module is used for detecting the template image through a target detection algorithm and acquiring a detection frame of a target object in the template image; and
and the identification module is used for calculating the relative position of the detection frame and the feature matching area and cutting the template image and the target image according to the result of the relative position.
A third aspect of the present invention provides an electronic device, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected in sequence, the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the above method.
A fourth aspect of the invention provides a readable storage medium, the storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method described above.
Beneficial effects
Compared with the prior art, the invention has the following beneficial effects:
(1) the method avoids the image-fusion step: image features are first extracted and matched; target detection is then performed on each original input image, the positions of the detected target bounding boxes are compared with the feature-matching positions, and the target objects are retained by screening; the original feature information of the images is preserved to the greatest extent, the original image features can be fed into a machine-learning model for extraction, and the accuracy of image recognition is improved;
(2) a mismatch-elimination algorithm is used: the ratio of the approach distance to the next-approach distance is obtained through the KNN algorithm for screening, which guarantees the accuracy of the image feature points to the greatest extent;
(3) the invention abandons the fusion algorithm, whose detection takes longer, so image detection is faster.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps. In the drawings:
FIG. 1 is a flowchart of an image recognition method applied to multi-image fusion according to an embodiment of the present invention;
FIG. 2 is a block diagram of an adaptive image stitching and fusing apparatus according to an embodiment of the present invention;
FIG. 3 is a flowchart of an image adaptive boundary fusion method according to an embodiment of the present invention;
fig. 4 is a block diagram of an electronic device provided by an embodiment of the invention.
Detailed Description
Hereinafter, embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the embodiments described herein.
Exemplary method
As shown in fig. 1 and 3, an image recognition method applied to multi-image fusion includes the following steps:
s102: the method comprises the steps of obtaining an image to be identified, calibrating the image to be identified into a template image and a target image, and respectively extracting feature points of the template image and the target image through a feature extraction algorithm.
Specifically, the image to be recognized in this example may be obtained as an original image of the pointer-type meter to be recognized from a photo or a video; for example, the image is read from a camera, with a frame of the current image stream read directly through the camera-reading API. One image is randomly selected as the template image and the remaining images serve as target images. Features of the template image and the target image are extracted through the AKAZE algorithm, and the extracted feature point information is stored in matrix data structures respectively, the feature point information including that of the template image and that of the target image. Those skilled in the art should understand that the feature extraction algorithm here may also be a SIFT (scale-invariant feature transform) algorithm or the like, which is not limited herein.
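For illustration only, a minimal sketch of this feature extraction step in Python with OpenCV follows (the embodiment prescribes neither a language nor a library, and the file names are placeholders); cv2.AKAZE_create and detectAndCompute are the standard OpenCV calls:

```python
import cv2

# Read the two original inputs; one is calibrated as the template image and
# the other as the target image (the file names here are placeholders).
template = cv2.imread("template.jpg")
target = cv2.imread("target.jpg")

# AKAZE feature extraction on grayscale copies. The keypoints carry pixel
# coordinates; the descriptors come back as one matrix per image (one row
# per feature point), i.e. the "matrix data structures" mentioned above.
akaze = cv2.AKAZE_create()
kp_template, des_template = akaze.detectAndCompute(
    cv2.cvtColor(template, cv2.COLOR_BGR2GRAY), None)
kp_target, des_target = akaze.detectAndCompute(
    cv2.cvtColor(target, cv2.COLOR_BGR2GRAY), None)

print(len(kp_template), "template features,", len(kp_target), "target features")
```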
As a variant, before the steps of acquiring the image to be recognized and calibrating the image to be recognized as a template image and a target image, the method further comprises the following steps:
acquiring two images to be identified, and calculating the width and the height of the two images to be identified;
for stitching in the horizontal direction, taking the maximum value of the heights of the two images to be identified as the splicing height, and adding the widths of the two images to be identified to obtain the spliced width;
and constructing an image drawing board according to the spliced height and width, and randomly calibrating one image as a template image and the other image as a target image.
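As a sketch of the drawing-board construction just described (assuming Python with NumPy; the helper name make_canvas is not from the embodiment):

```python
import numpy as np

def make_canvas(img_a, img_b, horizontal=True):
    # For horizontal stitching, the canvas height is the maximum of the two
    # image heights and the canvas width is the sum of the two widths; for
    # vertical stitching the roles are swapped.
    (h_a, w_a), (h_b, w_b) = img_a.shape[:2], img_b.shape[:2]
    if horizontal:
        return np.zeros((max(h_a, h_b), w_a + w_b, 3), dtype=np.uint8)
    return np.zeros((h_a + h_b, max(w_a, w_b), 3), dtype=np.uint8)
```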
It should be noted that, in the embodiment of the present invention, generally, a template image may be used as a template, and features (referred to as "template features" herein) characterizing the template image or an object of interest therein are obtained from the template image, where the template features may be points and/or edges that may characterize the template image or the object image; extracting features from the target image or template image, which may also be points and/or edges; the present example extracts feature point information, but not limited thereto.
S104: traversing the feature points of the template image, screening the feature point information through distance, and drawing a feature matching area according to the screened feature point information.
Specifically, the feature points of the template image are traversed through a proximity algorithm to obtain an approach distance and a next-approach distance. The feature points here are the large number of feature points obtained by the AKAZE detection algorithm, each described by numerical values (e.g., illuminance, angle, gray value, etc.). The approach distance and the next-approach distance express the similarity used in the KNN screening of feature points: the absolute value of the difference between the feature values of the two images is taken as the distance (commonly called the feature distance, because it is calculated much like the distance formula). The KNN algorithm yields several similarities, which are sorted by this distance; for each feature, the smallest feature distance is taken as the approach distance and the next-smallest as the next-approach distance.
The ratio of the approach distance to the next-approach distance is calculated; the feature points whose ratio falls within a predetermined numerical range are kept as matching values in the matrix data structure, and the feature points whose ratio does not fall within the predetermined numerical range are deleted from the matrix data structure.
In this example, the threshold is denoted thresh, the approach distance d1, and the next-approach distance d2, and the ratio r = d1/d2 is calculated as described above. If r < thresh, the feature point is kept as a matching point in the matrix data structure determined in step S102; otherwise the feature point is deleted from the matrix data structure determined in step S102. The choice of the threshold thresh may be determined according to the actual situation and is not limited herein. The mismatch-elimination algorithm used in this example screens by the ratio of the approach distance to the next-approach distance obtained through the KNN algorithm, which guarantees the accuracy of the image feature points to the greatest extent.
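A minimal sketch of this KNN ratio screening, assuming OpenCV's brute-force matcher and an illustrative threshold of 0.7 (the embodiment leaves the threshold open):

```python
import cv2

def ratio_screen(des_template, des_target, thresh=0.7):
    # Hamming distance suits AKAZE's binary descriptors; thresh = 0.7 is an
    # illustrative value, not one prescribed by the text.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(des_template, des_target, k=2)
    good = []
    for pair in pairs:
        if len(pair) < 2:        # fewer than two neighbours: ratio test impossible
            continue
        d1, d2 = pair            # approach distance and next-approach distance
        if d1.distance < thresh * d2.distance:
            good.append(d1)      # kept as a matching point
    return good
```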
And acquiring pixel coordinates corresponding to the feature points, and drawing a feature matching area according to the splicing direction of the target image and the template image and the pixel coordinates.
It should be noted that the feature points detected in this example are pixel coordinates. When horizontal stitching is required, this series of coordinates is sorted along the x-axis (the horizontal direction of the image); when vertical stitching is required, it is sorted along the y-axis (the vertical direction of the image). The sorting here is a bubble sort used to find the maximum of the y-axis or x-axis coordinates. If the two images are stitched left and right in the horizontal direction, the x-axis (horizontal image pixels) is used to delimit the separation area; similarly, for top-and-bottom vertical stitching, the y-axis (vertical image pixels) is used to delimit the separation area.
In one possible embodiment, the maximum coordinate position (x1, y1) and the minimum coordinate position (x2, y2) of the matched feature points are determined, where x1 and y1 are the abscissa and ordinate of the maximum coordinate position, and x2 and y2 are the abscissa and ordinate of the minimum coordinate position; these two points are taken as two vertices of the feature matching area, and (x1, y2) and (x2, y1) are taken as the other two vertices, giving a rectangular feature matching area.
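A sketch of deriving the rectangular feature matching area from the screened matches (the helper name matching_region and the returned coordinate layout are assumptions; Python's built-in min/max stand in for the bubble sort):

```python
def matching_region(matches, kp_template):
    # Collect the pixel coordinates of the screened matches in the template
    # image; min/max replace the bubble sort mentioned above and yield the
    # same extreme values.
    xs = [kp_template[m.queryIdx].pt[0] for m in matches]
    ys = [kp_template[m.queryIdx].pt[1] for m in matches]
    x1, y1 = max(xs), max(ys)    # maximum coordinate position (x1, y1)
    x2, y2 = min(xs), min(ys)    # minimum coordinate position (x2, y2)
    # (x1, y1), (x2, y2), (x1, y2) and (x2, y1) are the four vertices of the
    # rectangular feature matching area.
    return (x2, y2, x1, y1)      # left, top, right, bottom
```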
S106: and detecting the template image and the target image through a target detection algorithm, and acquiring a detection frame of a target object in the template image and the target image.
Specifically, the target detection algorithm may proceed as follows: a target-image training set is used to train a YOLO target-detection model, yielding a detection model for the target object; the trained model is then used to detect the test samples to be identified, giving the coordinate positions of the rectangular bounding boxes of the target objects in the test set.
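As a hedged sketch of this detection step, assuming the third-party ultralytics package and hypothetical trained weights "best.pt" (the embodiment only specifies a YOLO model, not any particular implementation):

```python
from ultralytics import YOLO

# "best.pt" stands for weights trained on the target-object training set
# described above; both the package and the file name are assumptions.
model = YOLO("best.pt")

def detect_boxes(image):
    # Returns (class_id, (x1, y1, x2, y2)) for every detected target object.
    result = model(image)[0]
    return [
        (int(cls), tuple(float(v) for v in xyxy))
        for cls, xyxy in zip(result.boxes.cls, result.boxes.xyxy)
    ]
```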
S108: and calculating the relative position of the detection frame and the feature matching area, and cutting the image to be recognized according to the result of the relative position.
Specifically, based on the coordinates of the detection frame and of the feature matching area, it is judged whether the detection frame is within the feature matching area; if not, the target image and the template image are cut according to the size of the detection frame; here the detection frames include the detection frames of the target object in the template image and in the target image.
When the detection frame overlaps the feature matching area, whether the target object is within the overlap area is detected, and the target image and the template image are cut according to the judgment result:
if no target object is detected on the feature line corresponding to the overlap area, the target image and the template image are cut according to the size of the detection frame;
if a target object is detected on the feature line in the overlap area, it is determined whether that target object and the target object detected in step S106 are of the same class. "Class" here means the class category in target detection: for example, if the detected categories are both "person", a logical judgment is made as to whether the areas are repeated; if the detected categories are one person and one vase, both the person and the vase need to be kept.
If the classes differ, the target image and the template image are cut according to the size of the detection frame; if they are the same, the detection frame with the largest area is cut out of the original image and output. It should be noted that the largest area here refers to the larger of the detection frame obtained in the target image and the corresponding detection frame in the template image.
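The screening logic described above can be sketched as follows (the helper names, the tie handling, and the data layout, i.e. lists of (class, box) tuples per image, are assumptions):

```python
def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersects(box, region):
    # True if the detection frame overlaps the rectangular feature matching area.
    bx1, by1, bx2, by2 = box
    rx1, ry1, rx2, ry2 = region
    return bx1 < rx2 and bx2 > rx1 and by1 < ry2 and by2 > ry1

def screen_detections(template_dets, target_dets, region):
    # Frames that do not touch the feature matching area are kept directly;
    # for same-class frames from both images that overlap the area, only the
    # frame with the largest area is kept (ties keep both in this sketch).
    kept = []
    for name, dets, other in (("template", template_dets, target_dets),
                              ("target", target_dets, template_dets)):
        for cls, box in dets:
            if not intersects(box, region):
                kept.append((name, cls, box))
                continue
            rivals = [b for c, b in other if c == cls and intersects(b, region)]
            if not rivals or area(box) >= max(area(b) for b in rivals):
                kept.append((name, cls, box))
    return kept
```

Cutting then amounts to indexing the original image with each kept frame, e.g. image[int(y1):int(y2), int(x1):int(x2)] in OpenCV's row-major layout.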
Exemplary devices
As shown in fig. 2, an adaptive image stitching and fusing apparatus includes:
the feature extraction module 20 is configured to acquire an image to be identified, calibrate the image to be identified into a template image and a target image, and extract feature points of the template image and the target image respectively through a feature extraction algorithm.
The module is further used for acquiring two images to be identified and calculating the width and the height of the two images to be identified; in the horizontal direction, taking the maximum value of the heights in the two images to be identified as the splicing height; adding the widths of the two images to be identified in the vertical direction to obtain a spliced width; constructing an image drawing board according to the spliced height and width, and randomly calibrating one image as a template image and the other image as a target image; and extracting the characteristics of the template image and the target image through an AKAZE algorithm, and respectively storing the extracted characteristic point information into matrix data structures.
And the screening module 30 is configured to traverse the feature points of the template image, screen the feature point information by distance, and draw a feature matching area according to the screened feature point information. Specifically, the feature points of the template image are traversed through a proximity algorithm to obtain an approach distance and a next-approach distance; the ratio of the approach distance to the next-approach distance is calculated, the feature points whose ratio falls within a predetermined numerical range are kept as matching values in the matrix data structure, and the feature points whose ratio does not fall within the predetermined numerical range are deleted from the matrix data structure; the pixel coordinates corresponding to the feature points are acquired, and the feature matching area is drawn according to the stitching direction of the target image and the template image and the pixel coordinates.
The target object detection module 40 is configured to detect the template image through a target detection algorithm, and acquire a detection frame of a target object in the template image; and
And the identification module 50 is configured to calculate the relative position of the detection frame and the feature matching area, and cut the image to be identified according to the result of the relative position. Specifically, based on the coordinates of the detection frame and of the feature matching area, it is judged whether the detection frame is within the feature matching area; if not, the template image and the target image are cut according to the size of the detection frame; when the detection frame overlaps the feature matching area, whether the target object is within the overlap area is detected, and the template image and the target image are cut according to the judgment result: if no target object is detected on the feature line corresponding to the overlap area, the template image and the target image are cut according to the size of the detection frame; if a target object is detected on the feature line in the overlap area, it is determined whether that target object and the target object detected in step S106 are of the same class; if not, the template image and the target image are cut according to the size of the detection frame; if so, the detection frame with the largest area is cut out of the original image and output.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present application is described with reference to fig. 4. The electronic device may be the mobile device itself or a stand-alone device separate from it, which may communicate with the mobile device to receive the collected input signals from it and to transmit the combined image information to it.
FIG. 4 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
As shown in fig. 4, the electronic device 10 includes one or more processors 11 and memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 11 to implement the decision-making methods of the various embodiments of the present application described above and/or other desired functionality.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown). For example, the input device 13 may include various devices such as a camera, a video player, and the like. The input device 13 may also include, for example, a keyboard, a mouse, and the like. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 10 relevant to the present application are shown in fig. 4, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the decision-making method according to various embodiments of the present application described in the "exemplary methods" section of this specification above.
The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a decision method according to various embodiments of the present application described in the "exemplary methods" section above of this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses, and systems referred to in this application are only illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including", "comprising", "having", and the like are open-ended words that mean "including, but not limited to", and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the term "and/or", unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. An image recognition method applied to multi-image fusion is characterized by comprising the following steps:
S102: acquiring an image to be identified, calibrating the image to be identified into a template image and a target image, and respectively extracting feature points of the template image and the target image through a feature extraction algorithm;
S104: traversing the feature points of the template image, screening the feature point information by distance, and drawing a feature matching area according to the screened feature point information;
S106: detecting the template image and the target image through a target detection algorithm, and acquiring the detection frames of a target object in the template image and the target image;
S108: calculating the relative position of the detection frames and the feature matching area, and cutting the image to be recognized according to the relative position.
2. The image recognition method applied to multi-image fusion according to claim 1, wherein before the step of obtaining the image to be recognized and calibrating the image to be recognized as the template image and the target image, the method further comprises:
acquiring two images to be identified, and calculating the width and the height of the two images to be identified;
for stitching in the horizontal direction, taking the maximum value of the heights of the two images to be identified as the splicing height, and adding the widths of the two images to be identified to obtain the spliced width;
and constructing an image drawing board according to the spliced height and width, and randomly calibrating one image as a template image and the other image as a target image.
3. The image recognition method applied to multi-image fusion according to claim 2, wherein the step of extracting the feature points of the template image and the target image respectively through a feature extraction algorithm comprises:
and extracting the characteristics of the template image and the target image through an AKAZE algorithm, and respectively storing the extracted characteristic point information into matrix data structures.
4. The image recognition method applied to multi-image fusion according to claim 3, wherein the step of traversing the feature points of the template image and screening the feature point information by distance comprises:
traversing the feature points of the template image through a proximity algorithm to obtain an approach distance and a next-approach distance;
calculating the ratio of the approach distance to the next-approach distance, and keeping the feature points whose ratio falls within a predetermined numerical range as matching values in the matrix data structure; the feature points whose ratio does not fall within the predetermined numerical range are deleted from the matrix data structure.
5. The image recognition method applied to multi-image fusion according to claim 4, wherein the step of drawing the feature matching region according to the filtered feature point information comprises:
and acquiring pixel coordinates corresponding to the feature points, and drawing a feature matching area according to the splicing direction of the target image and the template image and the pixel coordinates.
6. The image recognition method applied to multi-image fusion according to claim 1, wherein calculating the relative positional relationship between the detection frame and the feature matching region comprises:
acquiring the coordinate information of the detection frame and of the feature matching area;
judging whether the detection frame is within the feature matching area; if not, cutting the template image and the target image according to the size of the detection frame;
when the detection frame overlaps the feature matching area, detecting whether the target object is within the overlap area, and cutting the template image and the target image according to the judgment result; wherein the detection frames include the detection frames of the target object in the template image and in the target image.
7. The image recognition method applied to multi-image fusion according to claim 6, wherein, when the detection frame overlaps the feature matching area, detecting whether the target object is within the overlap area and cutting the template image and the target image according to the judgment result comprises:
if no target object is detected on the feature line corresponding to the overlap area, cutting the template image and the target image according to the size of the detection frame;
if a target object is detected on the feature line in the overlap area, determining whether that target object and the target object detected in step S106 are of the same class;
if not, correspondingly cutting the template image and the target image according to the size of the detection frame;
if so, cutting out and outputting the detection frame with the largest area.
8. An image recognition apparatus applied to multi-image fusion, characterized by comprising:
a feature extraction module, configured to acquire an image to be identified, calibrate the image to be identified into a template image and a target image, and respectively extract feature points of the template image and the target image through a feature extraction algorithm;
the screening module is used for traversing the feature points of the template image, screening the feature point information through distance, and drawing a feature matching area according to the screened feature point information;
the target object detection module is used for detecting the template image through a target detection algorithm and acquiring a detection frame of a target object in the template image; and
and the identification module is used for calculating the relative position of the detection frame and the feature matching area and cutting the template image and the target image according to the result of the relative position.
9. An electronic device comprising a processor, an input device, an output device, and a memory, the processor, the input device, the output device, and the memory being connected in sequence, the memory being configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any one of claims 1-7.
10. A readable storage medium, characterized in that the storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.
CN202010947804.8A 2020-09-10 2020-09-10 Image identification method and device applied to multi-image fusion, electronic equipment and storage medium Pending CN112085106A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010947804.8A CN112085106A (en) 2020-09-10 2020-09-10 Image identification method and device applied to multi-image fusion, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010947804.8A CN112085106A (en) 2020-09-10 2020-09-10 Image identification method and device applied to multi-image fusion, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112085106A 2020-12-15

Family

ID=73736349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010947804.8A Pending CN112085106A (en) 2020-09-10 2020-09-10 Image identification method and device applied to multi-image fusion, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112085106A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102821238A (en) * 2012-03-19 2012-12-12 北京泰邦天地科技有限公司 Wide-field ultra-high-resolution imaging system
US9990753B1 (en) * 2017-01-11 2018-06-05 Macau University Of Science And Technology Image stitching
CN110991283A (en) * 2019-11-21 2020-04-10 北京格灵深瞳信息技术有限公司 Re-recognition and training data acquisition method and device, electronic equipment and storage medium
CN111104538A (en) * 2019-12-06 2020-05-05 深圳久凌软件技术有限公司 Fine-grained vehicle image retrieval method and device based on multi-scale constraint
CN111091091A (en) * 2019-12-16 2020-05-01 北京迈格威科技有限公司 Method, device and equipment for extracting target object re-identification features and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHAO YI-LI: "Automatic panorama recognition and stitching method based on graph structure", COMPUTER ENGINEERING AND DESIGN, vol. 34, no. 6, 3 April 2014 (2014-04-03), pages 2067 - 2070 *
YAO Hongyuan; WANG Haipeng; JIAO Li; LIN Xueyuan: "UAV remote sensing image stitching method based on improved SURF algorithm", Journal of Naval Aeronautical and Astronautical University, no. 02, 30 April 2018 (2018-04-30) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926439A (en) * 2021-02-22 2021-06-08 深圳中科飞测科技股份有限公司 Detection method and device, detection equipment and storage medium
CN113220927A (en) * 2021-05-08 2021-08-06 百度在线网络技术(北京)有限公司 Image detection method, device, equipment and storage medium
CN113837270A (en) * 2021-09-18 2021-12-24 广东人工智能与先进计算研究院 Target identification method, device, equipment and storage medium
CN113837270B (en) * 2021-09-18 2022-08-30 广东人工智能与先进计算研究院 Target identification method, device, equipment and storage medium
WO2023066147A1 (en) * 2021-10-19 2023-04-27 中国第一汽车股份有限公司 Image processing method and apparatus, and electronic device and medium

Similar Documents

Publication Publication Date Title
CN112085106A (en) Image identification method and device applied to multi-image fusion, electronic equipment and storage medium
CN108009543B (en) License plate recognition method and device
US8831381B2 (en) Detecting and correcting skew in regions of text in natural images
CN112381075B (en) Method and system for carrying out face recognition under specific scene of machine room
CN109426835B (en) Information processing apparatus, control method of information processing apparatus, and storage medium
US8019164B2 (en) Apparatus, method and program product for matching with a template
CN110390229B (en) Face picture screening method and device, electronic equipment and storage medium
CN102365645A (en) Organizing digital images by correlating faces
CN111310826B (en) Method and device for detecting labeling abnormality of sample set and electronic equipment
CN109389105B (en) Multitask-based iris detection and visual angle classification method
CN107272899B (en) VR (virtual reality) interaction method and device based on dynamic gestures and electronic equipment
CN111986214B (en) Construction method of pedestrian crossing in map and electronic equipment
CN110737785A (en) picture labeling method and device
CN112633084A (en) Face frame determination method and device, terminal equipment and storage medium
Xia et al. An efficient and robust target detection algorithm for identifying minor defects of printed circuit board based on PHFE and FL-RFCN
CN113281780B (en) Method and device for marking image data and electronic equipment
KR102653177B1 (en) Apparatus and method for extracting object information
JP2022009474A (en) System and method for detecting lines in vision system
CN109978829B (en) Detection method and system for object to be detected
EP2887261B1 (en) Information processing device, information processing method, and program
CN111488776A (en) Object detection method, object detection device and electronic equipment
CN113269153B (en) Form identification method and device
CN114140782A (en) Text recognition method and device, electronic equipment and storage medium
CN114627482A (en) Method and system for realizing table digital processing based on image processing and character recognition
KR20220083347A (en) Method, apparatus, and computer program for measuring volume of objects by using image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination