WO2019000653A1 - Image target identification method and apparatus - Google Patents

Image target identification method and apparatus

Info

Publication number
WO2019000653A1
WO2019000653A1 PCT/CN2017/101704
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
area
target
threshold
Prior art date
Application number
PCT/CN2017/101704
Other languages
French (fr)
Chinese (zh)
Inventor
程雪岷
毕洪生
王育琦
王嵘
张临风
Original Assignee
清华大学深圳研究生院
Priority date
Filing date
Publication date
Application filed by 清华大学深圳研究生院 filed Critical 清华大学深圳研究生院
Publication of WO2019000653A1 publication Critical patent/WO2019000653A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507Summing image-intensity values; Histogram projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30242Counting objects in image

Definitions

  • The invention relates to an image object recognition method and device.
  • Target recognition in images is the process of using algorithms to make a machine distinguish specific targets or features in an image, providing a basis for further processing of the distinguished targets.
  • The human eye is often slow at recognizing a specific target, and identifying or distinguishing targets in large amounts of data or large numbers of images by eye requires considerable manpower and material resources. Replacing human recognition with machine recognition, and human mental effort with computer computation, increases speed and reduces energy consumption, which is very beneficial for the field of image recognition.
  • Image target recognition generally follows this pipeline: image preprocessing, image segmentation, feature extraction, and feature recognition or matching.
  • However, the images processed are generally relatively clear; few methods handle images with low contrast, from which it is difficult to segment and extract effective target features.
  • The technical problem to be solved by the present invention is to remedy the deficiencies of the prior art described above by providing an image object recognition method and apparatus that can effectively recognize each target object in an image with low contrast.
  • An image object recognition method includes the following steps: S1, binarize each pixel in an image, dividing the pixels into effective pixels and background points, thereby converting the image into a binarized picture; S2, set a third threshold according to the total number of pixels in the image and the size range of the target to be identified, and compare the number of effective pixels in each connected area of the binarized picture with the third threshold: if the number is smaller than the third threshold, set all pixels in that area as background points, thereby removing the area; S3, determine the circumscribed rectangular frame of each remaining connected area to form a framed area, where the four sides of the circumscribed rectangular frame are respectively parallel to the four sides of the image; S4, treat connected areas whose framed areas overlap as one merged whole area and determine the circumscribed rectangular frame of the whole area, its four sides again parallel to the four sides of the image. In the image, the content of each circumscribed rectangular frame is a recognized target.
  • An image object recognition device includes a binarization processing module, an area removal module, an area framing module, and an area merging module. The binarization processing module binarizes each pixel in the image, dividing the pixels into effective pixels and background points and thereby converting the image into a binarized picture. The area removal module sets a third threshold according to the total number of pixels in the image and the size range of the target to be identified, and compares the number of effective pixels in each connected area of the binarized picture with the third threshold; if the number is smaller than the third threshold, all pixels in that area are set as background points, removing the area. The area framing module determines the circumscribed rectangular frame of each remaining connected area to form a framed area, the four sides of each frame being parallel to the four sides of the image. The area merging module treats connected areas whose framed areas overlap as one merged whole area and determines the circumscribed rectangular frame of the whole area.
  • The image object recognition method and device of the present invention convert the image into a binarized picture, compare the pixel count of each region against a threshold set from the size range of the target to be identified, and thereby effectively discard the background. Finally, the image is segmented and merged by the connected-domain method, effectively identifying the locations of targets in the image and their number.
  • Through the above steps, the present invention improves recognition accuracy for images with low contrast and unclear features.
  • FIG. 1 is a flow chart of an image object recognition method according to an embodiment of the present invention.
  • FIG. 2 is an effect diagram of a whole image converted to a binarized image according to an embodiment of the present invention;
  • FIG. 3 is an effect diagram of FIG. 2 after optimization to remove scatter noise;
  • FIG. 4 is an effect diagram after removing the interference areas in FIG. 3;
  • FIG. 5 is an effect diagram of determining a circumscribed rectangular frame in an image according to an embodiment of the present invention;
  • FIG. 6 is an effect diagram of determining a circumscribed rectangular frame by combining partial regions in an image according to an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of a binary classification of a support vector machine according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a multivariate classification of a support vector machine according to an embodiment of the present invention.
  • FIG. 9 is a flow chart of a first classification process of a specific embodiment of the present invention.
  • FIG. 10 is an original image in a specific embodiment of the present invention;
  • FIG. 11 is an image of the region of interest of FIG. 10;
  • FIG. 12 is an image obtained after the feature point extraction of FIG. 11;
  • FIG. 13 is a schematic diagram showing the distribution in the feature point statistical method in the specific embodiment of the present invention.
  • FIG. 1 is a flowchart of the image object recognition method in a specific embodiment, which includes the following steps:
  • The binarization conversion facilitates subsequent identification of the target's location.
  • A first window is set centered on each pixel.
  • A first threshold is set from the mean and the standard deviation of the pixel values within the first window.
  • The first threshold is compared with the pixel's value: if the pixel value is greater than the first threshold, the pixel is set as an effective pixel; otherwise, it is set as a background point.
  • The first threshold can be obtained according to the following formula: T(x, y) = m(x, y) · [1 + k · (σ(x, y)/R − 1)], where, with the pixel (x, y) as center, T(x, y) denotes the first threshold for pixel (x, y); R denotes the dynamic range of the standard deviation of the pixel values of the entire image; k is a set deviation coefficient, taking a positive value; m(x, y) denotes the mean of the pixel values within the first window; and σ(x, y) denotes the standard deviation of the pixel gray values within the first window.
  • The first threshold can thus be adaptively adjusted according to the standard deviation of the pixel gray values within the first window.
  • The window is slid over the image centered on each pixel, and the threshold is set from the mean and standard deviation of the pixel values within the first window.
  • In high-contrast regions, the standard deviation σ(x, y) approaches R, so the threshold T(x, y) is approximately equal to the mean m(x, y); the central pixel (x, y) is then compared with a threshold close to the local window's average pixel value, and is confirmed as an effective pixel only if it exceeds that average.
  • In low-contrast regions, the standard deviation σ(x, y) is much smaller than R, so the resulting threshold T(x, y) is smaller than the mean m(x, y). The central pixel (x, y) is thus compared with a threshold below the local average rather than always with a fixed mean, so central pixels above this lower threshold are retained as effective, avoiding the loss of potential target pixels in blurred areas.
  • By setting a per-pixel threshold from the local area as above, and adaptively adjusting it with the standard deviation of the pixels in the first window, the threshold adapts to the contrast of the image; each pixel can be divided accurately, avoiding the loss of effective pixels due to image blur.
  • A pixel above its threshold is an effective pixel and can be set as a white point, as shown by the white points in FIG. 2; otherwise it is a background point, like the pixels of the black area in FIG. 2. The entire image is thereby converted into a binarized picture.
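The adaptive binarization above matches a Sauvola-style local threshold. A minimal sketch in Python with NumPy follows; the window size, k, and R are illustrative values only, since the patent does not fix them:

```python
import numpy as np

def sauvola_binarize(img, w=5, k=0.2, R=128.0):
    """Binarize a grayscale image with the local threshold
    T(x, y) = m(x, y) * (1 + k * (sigma(x, y) / R - 1)),
    where m and sigma are the mean and standard deviation of the
    pixel values inside a w x w window centred on (x, y).
    Pixels above their local threshold become effective points (1);
    the rest become background points (0)."""
    img = img.astype(np.float64)
    pad = w // 2
    padded = np.pad(img, pad, mode="reflect")
    h, wd = img.shape
    out = np.zeros((h, wd), dtype=np.uint8)
    for y in range(h):
        for x in range(wd):
            win = padded[y:y + w, x:x + w]  # local first window
            m = win.mean()
            s = win.std()
            T = m * (1 + k * (s / R - 1))
            out[y, x] = 1 if img[y, x] > T else 0
    return out
```

In a uniform dark background the local mean is near zero, so background pixels stay below the threshold, while bright target pixels exceed the locally lowered threshold even in blurred regions.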
  • The method further includes a reconfirmation process on the binarized image: a second window is set centered on each pixel, a second threshold is set according to the number of pixels in the second window, and the number of effective pixels in the second window is compared with the second threshold; if the number is greater than the second threshold, the pixel is set as an effective pixel, otherwise as a background point.
  • the size of the second window may be the same as or different from the size of the first window.
  • The second threshold can be obtained according to the following formula: T2 = floor(√(2z)) − 2,
  • where the floor function denotes rounding down and
  • z denotes the number of pixels in the second window.
  • Taking a square window as an example, if s denotes the side length, then 2z = 2s² is the square of the diagonal, so rounding down its square root approximates the diagonal length in pixels. That is, the second threshold is essentially the number of pixels on the diagonal of the second window.
  • Subtracting 2 removes the center pixel itself and one further possibly effective pixel, making the threshold setting more accurate.
  • Other ways of customizing the threshold are also feasible, as long as most effective pixels can be identified.
  • The above optimization selects a second window centered on each pixel (the window size can be customized) and treats the number of effective points within it as a whole, comparing it with the set threshold. If it is larger than the threshold, the central pixel is set as an effective pixel; otherwise it is noise, set as a background point and removed. Through this comparison of the number of local effective pixels in the second window, central pixels surrounded by many effective pixels are reconfirmed as effective, while those with few effective neighbors are confirmed as background points, effectively removing the scattered points in the image of FIG. 2.
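The scatter-removal step can be sketched as follows. The second threshold floor(√(2z)) − 2 follows the reading above, and the 5×5 window is an assumed size:

```python
import math
import numpy as np

def remove_scatter(binary, w=5):
    """Re-confirm each effective pixel by counting the effective
    pixels in a w x w second window around it.  The second threshold
    is floor(sqrt(2 * z)) - 2 with z = w * w, i.e. roughly the number
    of pixels on the window diagonal minus the centre pixel and one
    more possibly effective pixel."""
    z = w * w
    t2 = math.floor(math.sqrt(2 * z)) - 2
    pad = w // 2
    padded = np.pad(binary, pad, mode="constant")
    h, wd = binary.shape
    out = np.zeros_like(binary)
    for y in range(h):
        for x in range(wd):
            # keep the centre pixel only if enough effective neighbours exist
            if binary[y, x] and padded[y:y + w, x:x + w].sum() > t2:
                out[y, x] = 1
    return out
```

An isolated noise pixel has a window count of 1, below the threshold of 5 for a 5×5 window, and is removed; pixels inside a dense region are kept.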
  • The third threshold is set according to the total number of pixels in the entire image and the size range of the target to be identified.
  • The third threshold may be set according to the formula [(a*b)*c/d]/e, where a*b is the number of all pixels in the entire image, a being the number of pixels in the width direction and b the number in the length direction; c is the minimum size of the target to be identified; d is the maximum size of the target to be identified; and e is the estimated maximum number of the largest targets contained in a picture of a*b pixels.
  • The size of plankton is generally in the range of 20 μm to 5 cm.
  • The total number of pixels in an image acquired by the plankton collection device is 2448*2050.
  • A picture is estimated to contain at most 10 of the largest plankton (this estimate views the whole picture 1:1 against the organism size: the whole picture is 3 cm × 3.5 cm, i.e. 10.5 square centimeters, and a large plankton organism occupies on average about 1 square centimeter, so the rounded estimate is at most 10).
  • The third threshold is then set by [(2448*2050)*20/50000]/10, giving 200.736.
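The third-threshold computation and the removal of small connected areas can be sketched as below in pure Python. The 4-connectivity choice is an assumption, since the patent does not state the connectivity used:

```python
from collections import deque

def third_threshold(a, b, c, d, e):
    """[(a*b)*c/d]/e: a*b pixels in the image, c and d the minimum and
    maximum target size, e the estimated maximum number of the largest
    targets per picture."""
    return (a * b) * c / d / e

def remove_small_regions(binary, threshold):
    """Reset every 4-connected region of effective pixels whose pixel
    count is below `threshold` to background.  `binary` is a list of
    lists of 0/1 values; a cleaned copy is returned."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                comp, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:  # breadth-first flood fill of one connected area
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < threshold:  # too small to be a target
                    for cy, cx in comp:
                        out[cy][cx] = 0
    return out
```

With the patent's numbers, third_threshold(2448, 2050, 20, 50000, 10) reproduces the stated 200.736.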
  • Through step S3 above, the axis-aligned circumscribed rectangular frame of each area is determined, forming a framed area.
  • The circumscribed rectangle is a rectangle whose four sides pass through the topmost, bottommost, leftmost, and rightmost boundary pixels of the area.
  • "In the horizontal direction" means that the four sides of the rectangle are parallel to the four sides of the image.
  • The content inside the rectangle is the framed area. FIG. 5 shows the effect after the circumscribed rectangular frames are determined.
  • Connected areas whose framed areas overlap are treated as one merged whole area, and the circumscribed rectangle of the whole area is determined; its four sides are parallel to the four sides of the image, and the image content inside the circumscribed rectangle is the recognized target.
  • Among the framed areas, some are independent and scattered while others overlap one another; the overlapping connected areas are treated as one merged whole area, and the axis-aligned circumscribed rectangular frame of that whole area is determined.
  • FIG. 6 shows the effect of the circumscribed rectangular frames determined in the image.
  • Some of the areas in FIG. 6 are framed by a rectangle spliced from several circumscribed rectangles.
  • The image content in each circumscribed rectangle is an identified target, thereby screening out the locations of suspected targets and their number.
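Steps S3 and S4 — axis-aligned bounding boxes and the merging of overlapping framed areas — can be sketched as below. Rectangles are (top, left, bottom, right) tuples, and the repeated pairwise merge is one straightforward reading of "treat overlapping framed areas as one whole area":

```python
def bounding_box(points):
    """Axis-aligned circumscribed rectangle of a set of (y, x) pixels."""
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    return (min(ys), min(xs), max(ys), max(xs))

def overlaps(ra, rb):
    """True when two axis-aligned rectangles share any area."""
    return not (ra[2] < rb[0] or rb[2] < ra[0] or
                ra[3] < rb[1] or rb[3] < ra[1])

def merge_boxes(boxes):
    """Repeatedly merge any two overlapping rectangles into their
    common circumscribed rectangle until no overlaps remain."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if overlaps(boxes[i], boxes[j]):
                    a, b = boxes[i], boxes[j]
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes
```

Each surviving rectangle then frames one recognized target, and the number of rectangles gives the target count.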
  • For a blurred image (for example, an image formed in water with high turbidity), each pixel is compared with a local threshold and accurately binarized as an effective point or a background noise point; the binarized image is then denoised again, and the connected domains are framed and merged, so that the image is effectively segmented and the region of interest containing the target is extracted. This improves recognition of images with low contrast and unclear features.
  • This target recognition method is particularly suitable for the recognition of plankton photographed in water.
  • The above method may further be combined with a classification method to identify the category information of the target.
  • Step M: normalize the samples of each category, and hierarchically extract, by category, the features of each sample, such as the boundary gradient, the edge density, and the distribution of feature points obtained by an edge extraction algorithm.
  • Step N: normalize each area to be detected, extract the features of each area, and feed them into the classifier; each area is classified according to the learning of step M and the statistical results, thereby identifying the category information to which the target belongs.
  • The following two classification schemes classify from two aspects respectively: the boundary gradient, and the morphological structural unit features.
  • Other, more suitable classification methods may be selected according to actual conditions.
  • Each extracted region is normalized into an image of 128*128 pixels.
  • First classification scheme: the SVM+HOG method analyzes the boundary gradient for classification. After background denoising of the normalized image, the edge density and boundary gradient of the image are extracted into a histogram, and a support vector machine (SVM) combined with a histogram of oriented gradients (HOG) analyzes the image and identifies which category the target belongs to.
  • SVM is a traditional binary classifier; its principle is shown in FIG. 7, where x1 denotes the denser sample points below the line and x2 the sparser sample points above it.
  • the classification process consists of the following steps:
  • The samples are trained prior to classification (samples are selected beforehand).
  • The training process is as follows: the n classes of samples are split by dichotomy into two groups, classes 1 to n/2 and classes n/2+1 to n, and the edge density and boundary gradient statistics of the two groups are computed; the process then continues to bisect and tally each group until every sample class stands alone, marking the end of training.
  • The schematic is shown in FIG. 8.
  • Each normalized connected-domain image is taken, and the edge density and boundary gradient of the image in each region are extracted.
  • These are compared with the sample statistics obtained in training, and the image is classified.
  • The classification process is repeated: within its n/2-class group the image is classified into one of n/4 classes, and so on until the image is assigned to a single class, yielding the biological category to which it belongs. The flowchart of the classification is shown in FIG. 9.
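The repeated bisection can be sketched as below. The per-node SVM is replaced here by a simple nearest-mean comparison purely to illustrate the O(log2 n) descent; the class names and 1-D features are hypothetical:

```python
def dichotomy_classify(feature, class_means):
    """Descend a binary tree over the class list: at each level the
    candidate classes are split in half, the sample is compared with
    the pooled statistics of each half (a stand-in for the SVM trained
    at that node), and the closer half is kept, until a single class
    remains -- O(log2 n) comparisons instead of n."""
    classes = sorted(class_means)
    while len(classes) > 1:
        half = len(classes) // 2
        left, right = classes[:half], classes[half:]
        lm = sum(class_means[c] for c in left) / len(left)
        rm = sum(class_means[c] for c in right) / len(right)
        classes = left if abs(feature - lm) <= abs(feature - rm) else right
    return classes[0]
```

For n classes this performs about log2(n) comparisons, matching the dichotomy complexity discussed below.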
  • The most common search and sort approaches are bubble sort, dichotomy (binary search), and quicksort.
  • Bubble sort is O(n^2);
  • dichotomy is O(log2 n);
  • quicksort is O(n·log n).
  • Dichotomy is therefore chosen as the search means.
  • Second classification scheme: the feature point distribution algorithm (shape context) analyzes the morphological structural unit features for classification.
  • The feature points are extracted by the fast edge extraction algorithm.
  • The algorithm directly extracts the edges of the figure, so the extracted points can serve as feature points that reveal the edges and feature distribution more effectively.
  • The fast edge extraction algorithm is accurate but time-consuming. Taking the original image shown in FIG. 10 as an example (size 2448*2050), the image of the plankton in the region of interest is as shown in FIG. 11 (size 210*210); extracting the feature points of the suspected plankton region takes 54 seconds, and the image of the resulting feature points (black pixels) is shown in FIG. 12.
  • The process of classifying by the feature point distribution includes the following steps:
  • The samples are trained (selected in advance). The training process is: each sample is processed by the fast edge extraction algorithm to obtain its edges and feature point distribution; the feature point distribution is then computed by the method shown in FIG. 13 and recorded for each sample in a separate text file. Once the feature point distributions of all samples have been computed, training is complete.
  • The method of FIG. 13 takes a feature point as center and divides the full circle into 8 equal sectors (each 45°, 360° in total); it then takes the maximum radius of the circumscribed circle that contains all feature points, divides that radius into 5 equal parts to form 5 concentric rings, and divides each ring into the 8 sectors above, so that all feature points in the figure are divided among 40 regions.
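The 40-region statistic of FIG. 13 (5 radial rings × 8 angular sectors around a feature point) can be sketched as:

```python
import math

def shape_context_histogram(points, center):
    """40-bin descriptor: around `center`, the maximum distance to any
    feature point is split into 5 equal radial rings, and the full
    circle into 8 equal 45-degree sectors; every other feature point
    falls into one of the 5 * 8 = 40 regions."""
    hist = [0] * 40
    cy, cx = center
    dists = [math.hypot(y - cy, x - cx)
             for y, x in points if (y, x) != center]
    if not dists:
        return hist
    rmax = max(dists)  # radius of the circumscribed circle
    for (y, x) in points:
        if (y, x) == center:
            continue
        r = math.hypot(y - cy, x - cx)
        ring = min(int(r / rmax * 5), 4)               # radial bin 0..4
        ang = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(ang / (2 * math.pi) * 8), 7)  # angular bin 0..7
        hist[ring * 8 + sector] += 1
    return hist
```

The resulting 40 counts per sample are what the training step stores and the detection step later compares against.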
  • Each normalized connected-domain image is processed by the fast edge extraction algorithm to obtain its edges and feature point distribution, and the feature point distribution is then tallied by the method shown in FIG. 13.
  • The tallied feature point distribution is compared with the distribution statistics of each sample obtained in training, thereby identifying the category to which the image to be detected belongs.
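The final comparison of distributions can be sketched as a nearest-histogram lookup; the L1 distance and the class names are assumptions, since the patent only says the distributions are "compared":

```python
def classify_by_histogram(hist, sample_hists):
    """Assign the 40-bin feature point histogram `hist` to the sample
    class whose stored histogram is closest in L1 distance."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(sample_hists, key=lambda c: l1(hist, sample_hists[c]))
```

Any histogram distance (chi-square, cosine) could be substituted here without changing the overall flow.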
  • An embodiment of the present invention further provides an image object recognition apparatus including a binarization processing module, an area removal module, an area framing module, and an area merging module. The binarization processing module binarizes each pixel in the image into effective pixels and background points, converting the image into a binarized picture. The area removal module sets a third threshold according to the total number of pixels in the image and the size range of the target to be identified, and compares the number of effective pixels in each connected area of the binarized picture with the third threshold; if the number is smaller, the pixels in that area are set as background points and the area is removed. The area framing module determines the circumscribed rectangular frame of each remaining connected area, forming framed areas whose four sides are parallel to the four sides of the image. The area merging module treats connected areas with overlapping framed areas as one merged whole area.
  • The above image object recognition apparatus may further include a feature extraction module, a trainer learning module, and a classification recognition module.
  • The feature extraction module acquires and tallies the features of the target regions identified in the image, and likewise acquires and tallies the features of the samples of each category, for example the samples of each biological species.
  • The trainer learning module imports the sample features of each category obtained by the feature extraction module into the trainer, which learns from the features of each class of samples.
  • The classification and identification module imports into the classifier the features of the target regions obtained by the feature extraction module; the classifier compares these features with the results of sample training, classifies the targets within the regions, and obtains the category information to which each target belongs. With samples of biological species, the biological category information of a target can be obtained.
  • With the above modules, the image object recognition device can further analyze each identified target and acquire the category to which it belongs, such as its biological category.

Abstract

The invention discloses an image target identification method and apparatus. The image target identification method includes the steps of: S1, conducting binary processing on each pixel point in an image, dividing the pixel points into effective pixel points and background points; S2, setting the magnitude of a third threshold according to the total number of pixel points of the image and the size range of the target to be identified, comparing the number of effective pixel points in each connected area of the binary picture with the third threshold, and setting the pixel points in an area as background points if the number is smaller than the third threshold, so as to remove the area; S3, determining the circumscribed rectangle frame of each remaining connected area to form a framed area; and S4, taking connected areas with overlapping framed areas as one combined integral area and determining the circumscribed rectangle frame of the integral area. In the image, the image content in each circumscribed rectangle frame is identified as a target. The target identification method can effectively identify each target object in images with a low degree of contrast.

Description

Image target recognition method and device

[Technical Field]

The invention relates to an image object recognition method and device.

[Background Art]
Target recognition in images is the process of using algorithms to make a machine distinguish specific targets or features in an image, providing a basis for further processing of the distinguished targets. In today's networked information age it can be applied widely in many fields. The human eye is often slow at recognizing a specific target, and identifying or distinguishing large amounts of data or large numbers of images by eye costs considerable manpower and material resources; replacing human recognition with machine recognition, and human mental effort with computer computation, increases speed and reduces energy consumption, which is very beneficial for the field of image recognition. For example, when identifying a thousand video frames of an intersection to count the passing traffic, machine recognition is clearly far superior to the human eye; similarly, adding an image target recognition system to a robot amounts to giving the robot "eyes", which is also very beneficial for developing AI technology. At present, image recognition technology is applied not only to face recognition and object recognition but also to handwriting recognition and other areas, greatly facilitating people's lives.

Image target recognition generally follows this pipeline: image preprocessing, image segmentation, feature extraction, and feature recognition or matching. However, the images processed are generally relatively clear; few methods handle images with low contrast, from which it is difficult to segment and extract effective target features.
[Summary of the Invention]

The technical problem to be solved by the present invention is to remedy the deficiencies of the prior art described above by providing an image object recognition method and apparatus that can effectively recognize each target object in an image with low contrast.

The technical problem of the present invention is solved by the following technical solutions:
一种图像目标识别方法,包括以下步骤:S1,将图像中各像素点二值化处理,划分为有效像素点和背景点,从而将图像转换为二值化的图片;S2,根据图像的像素点的总个数和待识别的目标的尺寸范围设定第三阈值的大小,将二值化图片中已连通的区域内的有效像素点的个数与第三阈值进行比较,如果小于所述第三阈值,则将该区域内的像素点均设置为背景点,从而去除该区域;S3,对剩余的已连通的各区域确定出其外接矩形框,形成框取区域;其中,外接矩形框的四条边分别与图像的四条边平 行;S4,将框取区域有重叠的已连通区域视为合并的整体区域,确定出整体区域的外接矩形框,外接矩形框的四条边分别与图像的四条边平行;图像中,外接矩形框中的图像内容为识别到的目标。An image object recognition method includes the following steps: S1, binarizing each pixel in an image into an effective pixel point and a background point, thereby converting the image into a binarized image; S2, according to the pixel of the image The total number of points and the size range of the target to be identified are set to a size of a third threshold, and the number of effective pixel points in the connected area in the binarized picture is compared with a third threshold, if less than The third threshold is used to set the pixel points in the area as the background point, thereby removing the area; S3, determining the circumscribed rectangular frame for the remaining connected areas to form a frame-taking area; wherein, the circumscribed rectangular frame The four sides are flat with the four sides of the image Line 4; S4, the connected area with overlapping areas of the frame is regarded as the combined whole area, and the circumscribed rectangular frame of the whole area is determined, and the four sides of the circumscribed rectangular frame are respectively parallel to the four sides of the image; in the image, the circumscribed rectangular frame The image content in the image is the recognized target.
An image target identification apparatus comprises a binarization processing module, a region removal module, a region boxing module, and a region merging module. The binarization processing module is configured to binarize each pixel in the image, classifying it as either an effective pixel or a background pixel, thereby converting the image into a binarized picture. The region removal module is configured to set a third threshold according to the total number of pixels in the image and the size range of the targets to be identified, compare the number of effective pixels in each connected region of the binarized picture with the third threshold, and, if the number is smaller than the third threshold, set all pixels in that region to background pixels, thereby removing the region. The region boxing module is configured to determine the bounding rectangle of each remaining connected region to form a boxed region, where the four sides of the bounding rectangle are respectively parallel to the four sides of the image. The region merging module is configured to treat connected regions whose boxed regions overlap as a single merged region and determine the bounding rectangle of the merged region, its four sides again respectively parallel to the four sides of the image; the image content within each bounding rectangle is an identified target.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The image target identification method and apparatus of the present invention convert the image into a binarized picture through binarization, and effectively discard background regions by comparing the number of effective pixels in each region against a threshold set from the total pixel count of the image and the size range of the targets to be identified. The image is then segmented and merged by the connected-region method, so that both the positions of the targets in the image and their number are effectively identified. Through the above steps, the present invention improves the accuracy of identification for images with low contrast and unclear image features.
[Description of the Drawings]
FIG. 1 is a flowchart of the image target identification method according to an embodiment of the present invention;
FIG. 2 is an effect diagram of an entire image converted into a binarized picture according to an embodiment of the present invention;
FIG. 3 is an effect diagram of FIG. 2 after optimization to remove scatter noise;
FIG. 4 is an effect diagram of FIG. 3 after interference regions have been removed;
FIG. 5 is an effect diagram after bounding rectangles have been determined in an image according to an embodiment of the present invention;
FIG. 6 is an effect diagram after partial regions of the image have been merged and their bounding rectangles determined, according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of binary classification by a support vector machine according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of multi-class classification by a support vector machine according to an embodiment of the present invention;
FIG. 9 is a flowchart of the first classification process according to an embodiment of the present invention;
FIG. 10 is an original image from which edge information is to be extracted in an embodiment of the present invention;
FIG. 11 is an image of the region of interest in FIG. 10;
FIG. 12 is the image obtained after feature point extraction from FIG. 11;
FIG. 13 is a schematic diagram of the distribution used in the feature point statistical method in an embodiment of the present invention.
[Detailed Description]
The present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.
As shown in FIG. 1, the flowchart of the image target identification method of this embodiment comprises the following steps.
S1: binarize each pixel in the image, classifying it as either an effective pixel or a background pixel, thereby converting the image into a binarized picture.
In this step, the binarization facilitates the subsequent identification of target positions. The binarization preferably proceeds as follows: set a first window centered on each pixel, set a first threshold from the mean and standard deviation of the pixel values within the first window, and compare the first threshold with the pixel value of the center pixel. If the pixel value is greater than the first threshold, the pixel is set as an effective pixel; otherwise, it is set as a background pixel.
The first threshold may be set according to the following formula (the original equation image is unavailable here; the expression below is reconstructed from the surrounding description and takes the standard Sauvola local-threshold form, which matches the limiting behavior stated below):

T(x, y) = m(x, y) · [1 + k · (δ(x, y) / R − 1)]

where, taking pixel (x, y) as the window center, T(x, y) denotes the first threshold corresponding to pixel (x, y); R denotes the dynamic range of the standard deviation of the pixel values of the entire image; k is a preset deviation coefficient taking a positive value; m(x, y) denotes the mean of the pixel values within the first window; and δ(x, y) denotes the standard deviation of the pixel gray values within the first window. With this formula, the first threshold adapts to the standard deviation of the gray values of the pixels in the first window.
In this process, a window slides across the image centered on each pixel, and the threshold is set from the mean and standard deviation of the pixel values within the first window. In high-contrast regions of the image, the standard deviation δ(x, y) approaches R, so the resulting threshold T(x, y) is approximately equal to the mean m(x, y); the pixel value of the center pixel (x, y) is then compared with a threshold close to the average pixel value of the local window, and exceeding it means exceeding the average pixel value, confirming the pixel as effective. In regions of very low local contrast, the standard deviation δ(x, y) is much smaller than R, so the resulting threshold T(x, y) is smaller than the mean m(x, y); the pixel value of the center pixel (x, y) is then compared with a threshold below the local average rather than with a fixed mean, so that center pixels above the threshold are retained as effective and potential target pixels in blurred regions are not missed. By setting a per-pixel comparison threshold from the local region in this way, and adaptively adjusting the threshold with the standard deviation of the pixels in the first window, the threshold adapts to the contrast of the image, allowing every pixel to be classified accurately without losing effective pixels to image blur.
The first threshold is compared with the pixel value of each pixel. If the pixel value is greater than the threshold, the pixel is effective and may be rendered as a white point, as shown in FIG. 2; otherwise it is a background pixel, such as the pixels of the black regions in FIG. 2. The entire image is thereby converted into a binarized picture.
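The per-pixel adaptive thresholding of step S1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation itself: it uses the Sauvola-form threshold described above, a square first window of odd side `w`, 8-bit grayscale input with R = 128, and a freely chosen coefficient `k`; the function name and parameter values are illustrative only.

```python
import numpy as np

def binarize(img, w=15, k=0.2, R=128.0):
    """Adaptive binarization (step S1): each pixel is compared with a
    local threshold T = m * (1 + k * (delta / R - 1)) computed over a
    w x w window centered on the pixel."""
    img = img.astype(np.float64)
    pad = w // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.uint8)
    h, wd = img.shape
    for y in range(h):
        for x in range(wd):
            win = padded[y:y + w, x:x + w]
            m = win.mean()                          # local mean
            delta = win.std()                       # local standard deviation
            T = m * (1.0 + k * (delta / R - 1.0))
            out[y, x] = 1 if img[y, x] > T else 0   # 1 = effective pixel
    return out
```

A uniform zero background stays background (its local threshold is 0), while bright connected blobs survive as effective pixels; the scattered false positives that such a local pass inevitably leaves are handled by the subsequent re-confirmation and region-filtering steps.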
Further preferably, the method also includes a re-confirmation pass over the binarized picture: set a second window centered on each pixel and set a second threshold according to the number of pixels in the second window; compare the number of effective pixels in the second window with the second threshold, and if it is greater than the second threshold, set the center pixel as an effective pixel; otherwise set it as a background pixel. In this step, the second window may be the same size as the first window or a different size.
The second threshold may be set according to the following formula (the original equation images are unavailable here; the expressions below are reconstructed from the surrounding description):

T₂ = floor(√(2z)) − 2

where the floor function denotes rounding down and z denotes the number of pixels in the second window. Taking a square window as an example, √z is the side length of the window and 2z is the square of its diagonal, so taking the square root and rounding down, floor(√(2z)) approximates the diagonal length rounded to an integer. In other words, this way of setting the second threshold uses the number of pixels on the diagonal of the second window as the threshold. The purpose of subtracting 2 is to exclude the center pixel itself and one further possibly effective pixel, making the threshold more accurate. Of course, other user-defined ways of setting the threshold are also feasible, as long as the vast majority of effective pixels can be identified.
In this further optimization, on top of the binarization, a second window (of user-definable size) is again selected around each pixel, the number of effective pixels in the whole window is counted, and the count is compared with the set threshold. If it exceeds the threshold, the center pixel is set as an effective pixel; otherwise it is treated as noise, set as a background pixel, and removed. Through this comparison of the local count of effective pixels in the second window, center pixels that indeed have many effective neighbors are re-confirmed as effective, while center pixels with few effective neighbors are confirmed as background, effectively removing the scattered points in the image of FIG. 2. Equally importantly, breakpoints produced by the preceding local processing can be reconnected: some black points may turn white in this pass, joining adjacent white points into connected white regions. This further optimization facilitates accurate region identification later. FIG. 3 shows the effect after this optimization removes the scatter noise.
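The re-confirmation pass can be sketched as below. This is a hedged illustration, not the exact patented code: it assumes a square second window of side `w` (so z = w²), the second threshold floor(√(2z)) − 2 derived above, and that the pass is applied to every pixel, so it can both remove scatter noise and turn isolated background pixels with many effective neighbors into effective pixels, reconnecting breakpoints.

```python
import math
import numpy as np

def reconfirm(binary, w=5):
    """Second-window pass: keep a pixel as effective only if the count of
    effective pixels in the w x w window around it exceeds the second
    threshold floor(sqrt(2 * z)) - 2, where z = w * w."""
    z = w * w
    t2 = math.floor(math.sqrt(2 * z)) - 2   # ~ diagonal pixel count minus 2
    pad = w // 2
    padded = np.pad(binary, pad, mode="constant")
    out = np.zeros_like(binary)
    h, wd = binary.shape
    for y in range(h):
        for x in range(wd):
            if padded[y:y + w, x:x + w].sum() > t2:
                out[y, x] = 1   # re-confirmed effective pixel
    return out
```

With w = 5, the threshold is floor(√50) − 2 = 5, so a lone effective pixel (count 1) is removed while pixels inside a dense region are retained.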
S2: set a third threshold according to the total number of pixels in the image and the size range of the targets to be identified; compare the number of effective pixels in each connected region of the binarized picture with the third threshold, and, if the number is smaller than the third threshold, set all pixels in that region to background pixels, thereby removing the region.
In the binarized picture, some areas contain only scattered effective pixels while others concentrate many effective pixels, forming connected regions. This step screens the connected regions of the whole binarized picture so as to detect the regions where targets lie, while interference regions are removed.
Specifically, the third threshold is set according to the total number of pixels of the entire image and the size range of the targets to be identified, by the formula {(a*b)*c/d}/e, where a*b is the total number of pixels in the image (a pixels in the width direction and b pixels in the length direction), c is the minimum size of the target to be identified, d is the maximum size of the target to be identified, and e is the estimated maximum number of targets a picture of a*b pixels can contain. Taking plankton as the targets to be identified, plankton sizes generally range from 20 μm to 5 cm. A picture acquired by the plankton collection device contains 2448*2050 pixels in total. One picture is estimated to contain at most 10 of the largest plankton (for the estimate, the picture and the organisms are viewed at 1:1 scale; the picture measures 3 cm * 3.5 cm, i.e. 10.5 square centimeters, and one organism occupies about 1 square centimeter on average, so rounding gives at most 10). The third threshold is therefore set by [(2448*2050)*20/50000]/10, giving 200.736.
The number of effective pixels in each connected region is compared with the set third threshold. A count below the third threshold indicates that the region contains too few effective pixels and is an interference region, so all pixels within it are set to background pixels and the region is discarded. FIG. 4 shows the effect after the interference regions in FIG. 3 have been discarded.
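Step S2 can be sketched as follows. This is a hedged illustration under assumptions the text leaves open: 8-connectivity for the connected regions, a breadth-first flood fill for labeling, and the plankton numbers above for the third threshold; the function names are illustrative.

```python
from collections import deque
import numpy as np

def third_threshold(shape, c, d, e):
    """{(a*b)*c/d}/e: a*b total pixels, c/d the min/max target sizes,
    e the estimated max number of targets per picture."""
    a, b = shape
    return (a * b) * c / d / e

def remove_small_regions(binary, min_pixels):
    """Flood-fill each 8-connected region of effective pixels and reset
    to background every region smaller than the third threshold."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    out = binary.copy()
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not seen[sy, sx]:
                seen[sy, sx] = True
                queue, region = deque([(sy, sx)]), [(sy, sx)]
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                queue.append((ny, nx))
                                region.append((ny, nx))
                if len(region) < min_pixels:
                    for y, x in region:   # too few effective pixels:
                        out[y, x] = 0     # interference region, discard
    return out
```

For the plankton example, `third_threshold((2448, 2050), 20, 50000, 10)` gives 200.736, matching the value derived above.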
S3: determine the bounding rectangle of each remaining connected region to form a boxed region, where the four sides of the bounding rectangle are respectively parallel to the four sides of the image.
After step S2, some connected regions have been discarded and some retained. For each retained connected region, step S3 determines its horizontal bounding rectangle, forming a boxed region. The bounding rectangle is a rectangle whose four sides pass through the four boundary pixels of the region (its topmost, bottommost, leftmost, and rightmost pixels). "Horizontal" means that the four sides of the rectangle are respectively parallel to the four sides of the image. Once the bounding rectangle is determined, its contents form the boxed region. FIG. 5 shows the effect after the bounding rectangles have been determined.
S4: treat connected regions whose boxed regions overlap as a single merged region and determine the bounding rectangle of the merged region, its four sides again respectively parallel to the four sides of the image; the image content within each bounding rectangle is an identified target.
Among the boxed regions, some are isolated and scattered while others overlap one another. Where rectangles overlap, the connected regions involved are treated as a single merged region, and the horizontal bounding rectangle of that merged region is determined.
FIG. 6 shows the effect after step S4, with the bounding rectangles determined in the image. Compared with FIG. 5, some regions in FIG. 6 are boxed jointly by one merged bounding rectangle. In FIG. 6, the image content within each bounding rectangle is an identified target, so both the positions of the suspected targets and their number are obtained.
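Steps S3 and S4 can be sketched as follows. This is a hedged, minimal illustration (the function names and the repeat-until-stable merging loop are choices of this sketch, not mandated by the text) of computing the axis-aligned bounding rectangle of a region and merging rectangles that overlap.

```python
def bounding_box(region):
    """Axis-aligned bounding rectangle of a region given as a list of
    (y, x) pixels: its sides pass through the topmost, bottommost,
    leftmost, and rightmost pixels, parallel to the image sides."""
    ys = [p[0] for p in region]
    xs = [p[1] for p in region]
    return (min(ys), min(xs), max(ys), max(xs))  # (top, left, bottom, right)

def overlaps(p, q):
    """True if rectangles p and q share any area."""
    return not (p[2] < q[0] or q[2] < p[0] or p[3] < q[1] or q[3] < p[1])

def merge_boxes(boxes):
    """Step S4: repeatedly replace any two overlapping rectangles with
    their joint bounding rectangle until no overlaps remain."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if overlaps(boxes[i], boxes[j]):
                    p, q = boxes[i], boxes[j]
                    boxes[i] = (min(p[0], q[0]), min(p[1], q[1]),
                                max(p[2], q[2]), max(p[3], q[3]))
                    del boxes[j]
                    changed = True
                    break
            if changed:
                break
    return boxes
```

Each rectangle surviving `merge_boxes` frames one identified target, so the number of surviving rectangles is the number of targets detected.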
In this embodiment, when a blurred image (for example, an image formed in highly turbid water) is processed through the above steps, pixels are precisely binarized into effective points and background noise points by comparison against local thresholds; the connected regions of the binarized picture are then denoised again, boxed, and merged, so that the image is effectively segmented and the regions of interest containing the targets are extracted. This improves the accuracy of identification for images with low contrast and unclear image features. The method is particularly suitable for identifying plankton photographed in water.
After the regions containing targets have been identified, a classification method can further be applied to the image content of each region to determine the category information of the target. Specifically, this may include: step M, normalize the samples of each category and, category by category, extract the features of each category's samples, such as boundary gradient, edge density, and the distribution of feature points obtained by an edge extraction algorithm, and import them into a trainer for learning; step N, normalize the regions to be detected, extract the features of each region, import them into a classifier, classify each region according to what was learned in step M, and tally the results, thereby identifying the category to which each target belongs. In this embodiment, the following two classification schemes classify from two aspects respectively: boundary gradients and morphological structural unit features. Of course, in practice, other more suitable classification methods may also be chosen according to the actual situation.
To facilitate classification, each extracted region is normalized into an image of 128*128 pixels.
The first classification scheme analyzes boundary gradients for classification using SVM+HOG. After simple background denoising of the normalized images, the edge density and boundary gradient of each image are extracted and compiled into histograms, so that a support vector machine (SVM) combined with a histogram of oriented gradients (HOG) analyzes the image under test and determines which category of target it is. The SVM is classically a binary classifier whose principle is shown in FIG. 7, where x1 denotes the sample points of the denser lower cluster and x2 the sample points of the sparser upper cluster. The linear equation ω^T·x + b = 0 defines the hyperplane dividing the two classes of samples, and the values 1 and −1 on the right-hand side of the equation represent the two classes. The margin

2 / ‖ω‖

is the distance between the outermost parallel planes of the two classes (the original equation image is unavailable here; this is the standard SVM margin implied by the surrounding description). Taking plankton as the targets to be identified, there are many plankton species, so a binary classifier alone is insufficient; in this embodiment it is therefore extended to a multi-class classifier.
The classification process comprises the following steps.
Before classification, the samples (selected in advance) are trained. The training process is: divide the n classes of samples dichotomously into two groups, classes 1 to n/2 and classes n/2+1 to n, and compute the edge density and boundary gradient statistics of the samples in each group; repeat this process, continuing to split and tally each group dichotomously, until the samples have been divided down to individual classes, at which point training is complete. The schematic diagram is shown in FIG. 8.
During classification, for the normalized image of each connected region, the edge density and boundary gradient of the image are extracted; based on this edge density and gradient information, the image is compared with the statistics of the trained samples and assigned to one of the two groups of n/2 classes among the n classes; the process is repeated, narrowing the image to n/4 classes within those n/2, and so on until the image is assigned to a single class, giving the biological category to which the image belongs. The classification flowchart is shown in FIG. 9.
When looking up the class, the image to be detected is unknown to the classifier, so lookup time matters most. The most common lookup and sorting approaches are bubble sort, binary search (dichotomy), and quicksort. In terms of time complexity, bubble sort is O(n^2), binary search is O(log2 n), and quicksort is O(n log n); this embodiment therefore finally adopts binary search as the lookup method.
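The dichotomy lookup chosen here is ordinary binary search; the sketch below (illustrative names, sorted class keys assumed, not the patent's own code) shows why it needs only about log2(n) comparisons.

```python
def binary_search(sorted_keys, target):
    """Dichotomy lookup: halve the candidate interval at each step,
    so n candidates need at most about log2(n) comparisons."""
    lo, hi = 0, len(sorted_keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return mid            # index of the matching class key
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                     # not found
```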
The second classification scheme classifies by analyzing morphological structural unit features with a feature point distribution algorithm (shape context). A fast edge extraction algorithm extracts the feature points. This algorithm extracts the edges of a figure directly, so the extracted points can serve as feature points, showing the edges and feature distribution of the figure more effectively; the extraction is accurate and relatively fast. Taking the original image of FIG. 10 as an example, its size is 2448*2050; the plankton image of the region of interest, shown in FIG. 11, is 210*210; extracting the feature points of the suspected plankton region took 54 seconds, and the extracted feature points (black pixels) are shown in FIG. 12.
The process of classifying by analyzing the feature point distribution comprises the following steps.
Before classification, the samples (selected in advance) are trained. The training process is: process each sample with the fast edge extraction algorithm to obtain its edge and feature point distribution, then tally the feature point distribution using the feature point statistical method shown in FIG. 13, recording the feature point distribution of each sample type in its own text file; training is complete once the feature point distributions of all samples have been tallied. The statistical method of FIG. 13 is: around each feature point, divide the plane into 8 equal sectors (each sector spanning 45°, the full 360° split into 8 sectors), and spread outward into 5 rings according to the size of the figure's features; that is, taking the feature point as the center, take the maximum radius of the circumscribed circle that can contain all feature points, divide this maximum radius into five equal parts to form five concentric circles, and divide each circle into the 8 sectors described above, thereby partitioning all feature points in the figure into 40 regions.
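The FIG. 13 statistical method can be sketched as the following histogram computation. A hedged illustration: it assumes (y, x) feature point coordinates, measures angles with `atan2`, and places boundary cases into the nearest bin; note that the classical shape-context descriptor uses log-polar radial bins, whereas the five equal radial divisions below follow this text.

```python
import math

def fig13_histogram(points, center):
    """Partition feature points into 8 angular sectors (45 degrees each)
    times 5 radial rings = 40 regions around `center`.  The rings divide
    the maximum center-to-point radius into five equal parts."""
    cy, cx = center
    rmax = max(math.hypot(y - cy, x - cx) for y, x in points)
    hist = [0] * 40
    for y, x in points:
        r = math.hypot(y - cy, x - cx)
        ang = math.atan2(y - cy, x - cx) % (2 * math.pi)
        a_bin = min(int(ang / (math.pi / 4)), 7)              # 8 sectors
        r_bin = min(int(5 * r / rmax), 4) if rmax > 0 else 0  # 5 rings
        hist[r_bin * 8 + a_bin] += 1
    return hist
```

Comparing such 40-bin histograms of a detected region against the per-sample histograms recorded during training then yields the category, as described below.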
During classification, the normalized image of each connected region is processed with the fast edge extraction algorithm to obtain its edge and feature point distribution; the feature point distribution is then tallied by the method shown in FIG. 13, and the tallied feature point distribution of the image to be detected is compared with the trained feature point distribution statistics of each sample, thereby identifying the category to which the image to be detected belongs.
The multi-class classifiers and multi-class trainers designed above allow targets, for example the world's myriad species, to be classified more effectively.
This embodiment also provides an image target identification apparatus comprising a binarization processing module, a region removal module, a region boxing module, and a region merging module. The binarization processing module is configured to binarize each pixel in the image, classifying it as either an effective pixel or a background pixel, thereby converting the image into a binarized picture. The region removal module is configured to set a third threshold according to the total number of pixels in the image and the size range of the targets to be identified, compare the number of effective pixels in each connected region of the binarized picture with the third threshold, and, if the number is smaller than the third threshold, set all pixels in that region to background pixels, thereby removing the region. The region boxing module is configured to determine the bounding rectangle of each remaining connected region to form a boxed region, where the four sides of the bounding rectangle are respectively parallel to the four sides of the image. The region merging module is configured to treat connected regions whose boxed regions overlap as a single merged region and determine the bounding rectangle of the merged region, its four sides again respectively parallel to the four sides of the image; the image content within each bounding rectangle is an identified target. The target identification apparatus of this embodiment improves the accuracy of identification for images with low contrast and unclear image features.
The above image target identification apparatus may further comprise a feature extraction module, a trainer learning module, and a classification identification module.
The feature extraction module is configured to acquire and tally the features of the target regions identified in the picture, and to acquire and tally the features of the samples of each category, for example the features of samples of each biological species.
The trainer learning module is configured to import the features of the samples of each category, acquired by the feature extraction module, into the trainer, which learns from the features of each category of samples.
The classification identification module is configured to import the features of the target regions of the image, acquired by the feature extraction module, into the classifier; the classifier compares the features of the region containing the target with the results learned from the sample training, classifies the target within the region, and obtains the category information to which the target belongs. With samples of biological species as the reference, the biological category information of a biological target can be obtained.
With these additional modules, the image target recognition apparatus can further analyze a recognized target and obtain the category to which it belongs, for example biological category information.
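The feature-extraction / trainer / classifier modules can be sketched as follows. This is a toy stand-in, assuming a two-component feature vector (foreground density and aspect ratio) and a nearest-centroid rule in place of the richer features and trainer (for example an SVM) an actual implementation would use:

```python
# Illustrative sketch of the feature / train / classify modules.
# The feature vector and nearest-centroid rule are assumptions for
# demonstration, not the features or trainer specified by the patent.

def extract_features(region):
    """Toy feature vector for a binary region: [foreground density, aspect ratio]."""
    h, w = len(region), len(region[0])
    fg = sum(v for row in region for v in row)
    return [fg / (h * w), w / h]

def train(samples_by_class):
    """Trainer learning module: learn one mean feature vector per category."""
    model = {}
    for label, regions in samples_by_class.items():
        feats = [extract_features(r) for r in regions]
        n = len(feats)
        model[label] = [sum(f[i] for f in feats) / n
                        for i in range(len(feats[0]))]
    return model

def classify(model, region):
    """Classification module: assign the category whose learned centroid is
    nearest to the region's feature vector."""
    f = extract_features(region)
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(f, centroid))
    return min(model, key=lambda label: dist(model[label]))
```

A region is then labelled by training on per-category sample regions and calling `classify` on each boxed target region.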
The foregoing is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention shall not be regarded as limited to these descriptions. For a person of ordinary skill in the art to which the invention belongs, several substitutions or obvious variations made without departing from the concept of the invention, with the same performance or use, shall all be regarded as falling within the scope of protection of the invention.

Claims (12)

  1. An image target recognition method, characterized by comprising the following steps: S1, binarizing each pixel in an image, dividing the pixels into valid pixels and background points, thereby converting the image into a binary picture; S2, setting a third threshold according to the total number of pixels in the image and the size range of the target to be recognized, and comparing the number of valid pixels in each connected region of the binary picture with the third threshold; if the number is smaller than the third threshold, setting all pixels in the region to background points, thereby removing the region; S3, determining a bounding rectangle for each remaining connected region to form boxed regions, wherein the four sides of the bounding rectangle are respectively parallel to the four sides of the image; S4, treating connected regions whose boxed regions overlap as one merged whole region and determining the bounding rectangle of the whole region, the four sides of the bounding rectangle being respectively parallel to the four sides of the image; in the image, the image content inside a bounding rectangle is a recognized target.
  2. The image target recognition method according to claim 1, wherein in step S1 each pixel in the image is binarized as follows: a first window is set centered on the pixel, a first threshold is set from the mean and standard deviation of the pixel values within the first window, and the first threshold is compared with the pixel value of the pixel; if the pixel value is greater than the first threshold, the pixel is set as a valid pixel; otherwise, the pixel is set as a background point.
  3. The image target recognition method according to claim 2, wherein the first threshold is set according to the following formula:
    Figure PCTCN2017101704-appb-100001
    wherein, with the pixel (x, y) as the center, T(x, y) denotes the first threshold corresponding to the pixel (x, y); R denotes the dynamic range of the standard deviation of the pixel gray values of the whole image; k is a set deviation coefficient taking a positive value; m(x, y) denotes the mean of the pixel values within the first window; and δ(x, y) denotes the standard deviation of the pixel gray values within the first window.
  4. The image target recognition method according to claim 2, wherein step S1 further comprises a re-confirmation process performed on the basis of the binarization: a second window is set centered on the pixel, and a second threshold is set according to the number of pixels within the second window; the number of valid pixels within the second window is compared with the second threshold; if it is greater than the second threshold, the pixel is set as a valid pixel; otherwise, the pixel is set as a background point.
  5. The image target recognition method according to claim 4, wherein the second threshold is set according to the following formula:
    Figure PCTCN2017101704-appb-100002
    wherein the floor function denotes rounding down and z denotes the number of pixels within the second window.
  6. The image target recognition method according to claim 1, wherein in step S2 the third threshold is set according to the following formula: {(a*b)*c/d}/e, wherein a*b denotes the total number of pixels in the whole image, a being the number of pixels in the width direction and b the number of pixels in the length direction; c denotes the minimum size of the target to be recognized; d denotes the maximum size of the target to be recognized; and e denotes the estimated maximum number of targets to be recognized contained in a picture of size a*b.
  7. The image target recognition method according to claim 1, wherein the target to be recognized is plankton to be recognized.
  8. The image target recognition method according to claim 1, further comprising the following steps: M, normalizing the samples of each category, extracting the features of each category of samples hierarchically by category, and importing them into a trainer for learning; N, normalizing the regions in which recognized targets are located, extracting the features of each region, importing them into a classifier, classifying each region according to the learning of step M, and tallying the results, so as to identify the category to which the target in each region belongs.
  9. The image target recognition method according to claim 1, further comprising step S5 of acquiring the category of a recognized target: S51, sample training: dividing n categories of samples by bisection into two large groups, categories 1 to n/2 and categories n/2+1 to n, and computing edge-density and boundary-gradient statistics for the sample pictures contained in the two groups; repeating this process, continuing to bisect and tally the n/2 categories within each group, until the samples are divided into single categories, and tallying the edge density and boundary gradient of the sample pictures of each individual category; S52, normalizing the regions in which targets are located; S53, classification: for each normalized region, extracting the edge density and boundary gradient of the image in the region and, according to this edge-density and boundary-gradient information, comparing with the statistics of the samples trained in step S51 so as to classify the image into n/2 of the n categories; repeating the classification process to place the image into n/4 of the n/2 categories, and so on until the image is classified into a single category, thereby obtaining the category to which the target in the region belongs.
  10. The image target recognition method according to claim 1, further comprising step S6 of acquiring the category of a recognized target: S61, sample training: processing the n categories of samples with a fast edge extraction algorithm to obtain the distribution of edges and feature points, then tallying the distribution of the feature points with a feature-point statistical method, so as to obtain the feature-point distribution of the samples of each category; S62, normalizing the regions in which targets are located; S63, classification: processing the image of each normalized region with the fast edge extraction algorithm to obtain the distribution of edges and feature points, tallying the feature-point distribution with the feature-point statistical method, and comparing the tallied result with the statistics of the samples of each category trained in step S61, thereby identifying the category to which the target belongs.
  11. An image target recognition apparatus, characterized by comprising a binarization module, a region removal module, a region boxing module, and a region merging module; wherein the binarization module is configured to binarize each pixel in an image, dividing the pixels into valid pixels and background points, thereby converting the image into a binary picture; the region removal module is configured to set a third threshold according to the total number of pixels in the image and the size range of the target to be recognized, and to compare the number of valid pixels in each connected region of the binary picture with the third threshold; if the number is smaller than the third threshold, all pixels in the region are set to background points, thereby removing the region; the region boxing module is configured to determine, for each remaining connected region, its bounding rectangle, forming boxed regions, wherein the four sides of the bounding rectangle are respectively parallel to the four sides of the image; the region merging module is configured to treat connected regions whose boxed regions overlap as one merged whole region and to determine the bounding rectangle of the whole region, the four sides of the bounding rectangle being respectively parallel to the four sides of the image; the image content inside a bounding rectangle is a recognized target.
  12. The image target recognition apparatus according to claim 11, further comprising a feature extraction module, a trainer learning module, and a classification module; the feature extraction module is configured to acquire and tally the features of the regions in which the targets identified by the region merging module are located, and to acquire and tally the features of samples of each category; the trainer learning module is configured to import the features of each category of samples obtained by the feature extraction module into a trainer, the trainer being configured to learn from the features of each category of samples; the classification module is configured to import the features of the regions in which the targets in the image are located, obtained by the feature extraction module, into a classifier, the classifier being configured to compare the features of the regions with the results of the trainer's sample training and learning, and to classify the targets in the regions so as to obtain the category to which each target belongs.
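As an illustration of the threshold arithmetic in claims 4 to 6, the sketch below computes the claim-6 third threshold directly from its stated formula and applies the claim-4 window re-confirmation. The second threshold `t2` is left as a parameter, since its formula (claim 5) is given only in the referenced figure, and the window size `w` is an assumed value.

```python
# Claim 6: third threshold = {(a*b)*c/d}/e.
def third_threshold(a, b, c, d, e):
    """a, b: image width/height in pixels; c, d: min/max target size;
    e: estimated maximum number of targets in an a*b picture."""
    return ((a * b) * c / d) / e

# Claim 4: re-confirmation pass over a binary image. A pixel is kept as a
# valid (foreground) pixel only if its w x w window contains more than t2
# valid pixels; otherwise it becomes a background point.
def reconfirm(bin_img, t2, w=3):
    h, wd = len(bin_img), len(bin_img[0])
    r = w // 2
    out = [[0] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            cnt = sum(bin_img[yy][xx]
                      for yy in range(max(0, y - r), min(h, y + r + 1))
                      for xx in range(max(0, x - r), min(wd, x + r + 1)))
            out[y][x] = 1 if cnt > t2 else 0
    return out
```

For a 100 x 100 image, targets spanning 1 to 10 size units, and at most 100 expected targets, the third threshold evaluates to (10000 * 1/10) / 100 = 10 pixels: connected regions with fewer valid pixels are treated as noise.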
PCT/CN2017/101704 2017-06-30 2017-09-14 Image target identification method and apparatus WO2019000653A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710526661.1 2017-06-30
CN201710526661.1A CN107330465B (en) 2017-06-30 2017-06-30 A kind of images steganalysis method and device

Publications (1)

Publication Number Publication Date
WO2019000653A1 true WO2019000653A1 (en) 2019-01-03

Family

ID=60198065

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/101704 WO2019000653A1 (en) 2017-06-30 2017-09-14 Image target identification method and apparatus

Country Status (2)

Country Link
CN (2) CN107330465B (en)
WO (1) WO2019000653A1 (en)

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977944A (en) * 2019-02-21 2019-07-05 杭州朗阳科技有限公司 A kind of recognition methods of digital water meter reading
CN110070533A (en) * 2019-04-23 2019-07-30 科大讯飞股份有限公司 A kind of evaluating method of object detection results, device, equipment and storage medium
CN110175563A (en) * 2019-05-27 2019-08-27 上海交通大学 The recognition methods of metal cutting tool drawings marked and system
CN110180186A (en) * 2019-05-28 2019-08-30 北京奇思妙想信息技术有限公司 A kind of topographic map conversion method and system
CN110189403A (en) * 2019-05-22 2019-08-30 哈尔滨工程大学 A kind of submarine target three-dimensional rebuilding method based on simple beam Forward-looking Sonar
CN110348442A (en) * 2019-07-17 2019-10-18 大连海事大学 A kind of shipborne radar image sea oil film recognition methods based on support vector machines
CN110443272A (en) * 2019-06-24 2019-11-12 中国地质大学(武汉) A kind of complicated cigarette strain image classification method based on fuzzy selecting rules
CN110490848A (en) * 2019-08-02 2019-11-22 上海海事大学 Infrared target detection method, apparatus and computer storage medium
CN111126252A (en) * 2019-12-20 2020-05-08 浙江大华技术股份有限公司 Stall behavior detection method and related device
CN111191730A (en) * 2020-01-02 2020-05-22 中国航空工业集团公司西安航空计算技术研究所 Method and system for detecting oversized image target facing embedded deep learning
CN111209864A (en) * 2020-01-07 2020-05-29 上海交通大学 Target identification method for power equipment
CN111260629A (en) * 2020-01-16 2020-06-09 成都地铁运营有限公司 Pantograph structure abnormity detection algorithm based on image processing
CN111259980A (en) * 2020-02-10 2020-06-09 北京小马慧行科技有限公司 Method and device for processing labeled data
CN111507995A (en) * 2020-04-30 2020-08-07 柳州智视科技有限公司 Image segmentation method based on color image pyramid and color channel classification
CN111523613A (en) * 2020-05-09 2020-08-11 黄河勘测规划设计研究院有限公司 Image analysis anti-interference method under complex environment of hydraulic engineering
CN111598947A (en) * 2020-04-03 2020-08-28 上海嘉奥信息科技发展有限公司 Method and system for automatically identifying patient orientation by identifying features
CN111626230A (en) * 2020-05-29 2020-09-04 合肥工业大学 Vehicle logo identification method and system based on feature enhancement
CN111724351A (en) * 2020-05-30 2020-09-29 上海健康医学院 Helium bubble electron microscope image statistical analysis method based on machine learning
CN111753794A (en) * 2020-06-30 2020-10-09 创新奇智(成都)科技有限公司 Fruit quality classification method and device, electronic equipment and readable storage medium
CN111833398A (en) * 2019-04-16 2020-10-27 杭州海康威视数字技术股份有限公司 Method and device for marking pixel points in image
CN112053399A (en) * 2020-09-04 2020-12-08 厦门大学 Method for positioning digestive tract organs in capsule endoscope video
CN112232286A (en) * 2020-11-05 2021-01-15 浙江点辰航空科技有限公司 Unmanned aerial vehicle image recognition system and unmanned aerial vehicle are patrolled and examined to road
CN112241466A (en) * 2020-09-22 2021-01-19 天津永兴泰科技股份有限公司 Wild animal protection law recommendation system based on animal identification map
CN112241956A (en) * 2020-11-03 2021-01-19 甘肃省地震局(中国地震局兰州地震研究所) PolSAR image ridge line extraction method based on region growing method and variation function
CN112488118A (en) * 2020-12-18 2021-03-12 哈尔滨工业大学(深圳) Target detection method and related device
CN112508893A (en) * 2020-11-27 2021-03-16 中国铁路南宁局集团有限公司 Machine vision-based method and system for detecting tiny foreign matters between two railway tracks
CN112668441A (en) * 2020-12-24 2021-04-16 中国电子科技集团公司第二十八研究所 Satellite remote sensing image airplane target identification method combined with priori knowledge
CN112750136A (en) * 2020-12-30 2021-05-04 深圳英集芯科技股份有限公司 Image processing method and system
CN113033400A (en) * 2021-03-25 2021-06-25 新东方教育科技集团有限公司 Method and device for identifying mathematical expression, storage medium and electronic equipment
CN113221917A (en) * 2021-05-13 2021-08-06 南京航空航天大学 Monocular vision double-layer quadrilateral structure cooperative target extraction method under insufficient illumination
CN113298702A (en) * 2021-06-23 2021-08-24 重庆科技学院 Reordering and dividing method based on large-size image pixel points
CN113409352A (en) * 2020-11-19 2021-09-17 西安工业大学 Single-frame infrared image weak and small target detection method, device, equipment and storage medium
CN113420668A (en) * 2021-06-21 2021-09-21 西北工业大学 Underwater target identification method based on two-dimensional multi-scale arrangement entropy
CN113469980A (en) * 2021-07-09 2021-10-01 连云港远洋流体装卸设备有限公司 Flange identification method based on image processing
CN113516611A (en) * 2020-04-09 2021-10-19 合肥美亚光电技术股份有限公司 Method and device for determining abnormal material removing area, and material sorting method and equipment
CN113591674A (en) * 2021-07-28 2021-11-02 桂林电子科技大学 Real-time video stream-oriented edge environment behavior recognition system
CN113588663A (en) * 2021-08-03 2021-11-02 上海圭目机器人有限公司 Pipeline defect identification and information extraction method
CN113610830A (en) * 2021-08-18 2021-11-05 常州领创电气科技有限公司 Detection system and method for lightning arrester
CN113689455A (en) * 2021-07-01 2021-11-23 上海交通大学 Thermal fluid image processing method, system, terminal and medium
CN113688829A (en) * 2021-08-05 2021-11-23 南京国电南自电网自动化有限公司 Automatic transformer substation monitoring picture identification method and system
CN113776408A (en) * 2021-09-13 2021-12-10 北京邮电大学 Reading method for gate opening ruler
CN114037650A (en) * 2021-05-17 2022-02-11 西北工业大学 Ground target visible light damage image processing method for change detection and target detection
CN114199262A (en) * 2020-08-28 2022-03-18 阿里巴巴集团控股有限公司 Method for training position recognition model, position recognition method and related equipment
CN114871120A (en) * 2022-05-26 2022-08-09 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN114998887A (en) * 2022-08-08 2022-09-02 山东精惠计量检测有限公司 Intelligent identification method for electric energy meter
CN115690693A (en) * 2022-12-13 2023-02-03 山东鲁旺机械设备有限公司 Intelligent monitoring system and monitoring method for construction hanging basket
CN116311543A (en) * 2023-02-03 2023-06-23 汇金智融(深圳)科技有限公司 Handwriting analysis method and system based on image recognition technology
CN116403094A (en) * 2023-06-08 2023-07-07 成都菁蓉联创科技有限公司 Embedded image recognition method and system
CN116740579A (en) * 2023-08-15 2023-09-12 兰陵县城市规划设计室 Intelligent collection method for territorial space planning data
CN116740070A (en) * 2023-08-15 2023-09-12 青岛宇通管业有限公司 Plastic pipeline appearance defect detection method based on machine vision
CN116740332A (en) * 2023-06-01 2023-09-12 南京航空航天大学 Method for positioning center and measuring angle of space target component on satellite based on region detection
CN116758578A (en) * 2023-08-18 2023-09-15 上海楷领科技有限公司 Mechanical drawing information extraction method, device, system and storage medium
CN112508893B (en) * 2020-11-27 2024-04-26 中国铁路南宁局集团有限公司 Method and system for detecting tiny foreign matters between double rails of railway based on machine vision

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443097A (en) * 2018-05-03 2019-11-12 北京中科晶上超媒体信息技术有限公司 A kind of video object extract real-time optimization method and system
CN109117845A (en) * 2018-08-15 2019-01-01 广州云测信息技术有限公司 Object identifying method and device in a kind of image
CN109190640A (en) * 2018-08-20 2019-01-11 贵州省生物研究所 A kind of the intercept type acquisition method and acquisition system of the planktonic organism based on big data
CN109670518B (en) * 2018-12-25 2022-09-23 浙江大学常州工业技术研究院 Method for measuring boundary of target object in picture
CN110263608B (en) * 2019-01-25 2023-07-07 天津职业技术师范大学(中国职业培训指导教师进修中心) Automatic electronic component identification method based on image feature space variable threshold measurement
CN109815906B (en) * 2019-01-25 2021-04-06 华中科技大学 Traffic sign detection method and system based on step-by-step deep learning
CN110096991A (en) * 2019-04-25 2019-08-06 西安工业大学 A kind of sign Language Recognition Method based on convolutional neural networks
CN110390313B (en) * 2019-07-29 2023-03-28 哈尔滨工业大学 Violent action detection method and system
CN110415237B (en) * 2019-07-31 2022-02-08 Oppo广东移动通信有限公司 Skin flaw detection method, skin flaw detection device, terminal device and readable storage medium
CN110941987B (en) * 2019-10-10 2023-04-07 北京百度网讯科技有限公司 Target object identification method and device, electronic equipment and storage medium
CN112991253A (en) * 2019-12-02 2021-06-18 合肥美亚光电技术股份有限公司 Central area determining method, foreign matter removing device and detecting equipment
CN112890736B (en) * 2019-12-03 2023-06-09 精微视达医疗科技(武汉)有限公司 Method and device for obtaining field mask of endoscopic imaging system
CN113538450B (en) 2020-04-21 2023-07-21 百度在线网络技术(北京)有限公司 Method and device for generating image
CN112102288B (en) * 2020-09-15 2023-11-07 应急管理部大数据中心 Water body identification and water body change detection method, device, equipment and medium
CN113900750B (en) * 2021-09-26 2024-02-23 珠海豹好玩科技有限公司 Method and device for determining window interface boundary, storage medium and electronic equipment
CN114067122B (en) * 2022-01-18 2022-04-08 深圳市绿洲光生物技术有限公司 Two-stage binarization image processing method
CN114821030B (en) * 2022-04-11 2023-04-04 苏州振旺光电有限公司 Planet image processing method, system and device
CN115601385B (en) * 2022-04-12 2023-05-05 北京航空航天大学 Bubble morphology processing method, device and medium
CN116012283B (en) * 2022-09-28 2023-10-13 逸超医疗科技(北京)有限公司 Full-automatic ultrasonic image measurement method, equipment and storage medium
CN116758024B (en) * 2023-06-13 2024-02-23 山东省农业科学院 Peanut seed direction identification method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030108250A1 (en) * 2001-12-10 2003-06-12 Eastman Kodak Company Method and system for selectively applying enhancement to an image
CN101699469A (en) * 2009-11-09 2010-04-28 南京邮电大学 Method for automatically identifying action of writing on blackboard of teacher in class video recording
CN102375982A (en) * 2011-10-18 2012-03-14 华中科技大学 Multi-character characteristic fused license plate positioning method
CN104077777A (en) * 2014-07-04 2014-10-01 中国科学院大学 Sea surface vessel target detection method

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777122B (en) * 2010-03-02 2012-01-04 中国海洋大学 Chaetoceros microscopic image cell target extraction method
CN102663406A (en) * 2012-04-12 2012-09-12 中国海洋大学 Automatic chaetoceros and non-chaetoceros sorting method based on microscopic images
CN103049763B (en) * 2012-12-07 2015-07-01 华中科技大学 Context-constraint-based target identification method
CN104036239B (en) * 2014-05-29 2017-05-10 西安电子科技大学 Fast high-resolution SAR (synthetic aperture radar) image ship detection method based on feature fusion and clustering
KR101601564B1 (en) * 2014-12-30 2016-03-09 가톨릭대학교 산학협력단 Face detection method using circle blocking of face and apparatus thereof
CN105117706B (en) * 2015-08-28 2019-01-18 小米科技有限责任公司 Image processing method and device, character identifying method and device
CN105261049B (en) * 2015-09-15 2017-09-22 重庆飞洲光电技术研究院 A kind of image connectivity region quick determination method
CN106250901A (en) * 2016-03-14 2016-12-21 上海创和亿电子科技发展有限公司 A kind of digit recognition method based on image feature information
CN105868708B (en) * 2016-03-28 2019-09-20 锐捷网络股份有限公司 A kind of images steganalysis method and device
CN106407978B (en) * 2016-09-24 2020-10-30 上海大学 Method for detecting salient object in unconstrained video by combining similarity degree
CN106875404A (en) * 2017-01-18 2017-06-20 宁波摩视光电科技有限公司 The intelligent identification Method of epithelial cell in a kind of leukorrhea micro-image
CN106846339A (en) * 2017-02-13 2017-06-13 广州视源电子科技股份有限公司 A kind of image detecting method and device

Cited By (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977944A (en) * 2019-02-21 2019-07-05 杭州朗阳科技有限公司 A kind of recognition methods of digital water meter reading
CN109977944B (en) * 2019-02-21 2023-08-01 杭州朗阳科技有限公司 Digital water meter reading identification method
CN111833398B (en) * 2019-04-16 2023-09-08 杭州海康威视数字技术股份有限公司 Pixel point marking method and device in image
CN111833398A (en) * 2019-04-16 2020-10-27 杭州海康威视数字技术股份有限公司 Method and device for marking pixel points in image
CN110070533A (en) * 2019-04-23 2019-07-30 科大讯飞股份有限公司 A kind of evaluating method of object detection results, device, equipment and storage medium
CN110189403B (en) * 2019-05-22 2022-11-18 哈尔滨工程大学 Underwater target three-dimensional reconstruction method based on single-beam forward-looking sonar
CN110189403A (en) * 2019-05-22 2019-08-30 哈尔滨工程大学 A kind of submarine target three-dimensional rebuilding method based on simple beam Forward-looking Sonar
CN110175563B (en) * 2019-05-27 2023-03-24 上海交通大学 Metal cutting tool drawing mark identification method and system
CN110175563A (en) * 2019-05-27 2019-08-27 上海交通大学 The recognition methods of metal cutting tool drawings marked and system
CN110180186A (en) * 2019-05-28 2019-08-30 北京奇思妙想信息技术有限公司 A kind of topographic map conversion method and system
CN110443272A (en) * 2019-06-24 2019-11-12 中国地质大学(武汉) A kind of complicated cigarette strain image classification method based on fuzzy selecting rules
CN110443272B (en) * 2019-06-24 2023-01-03 中国地质大学(武汉) Complex tobacco plant image classification method based on fuzzy selection principle
CN110348442A (en) * 2019-07-17 2019-10-18 大连海事大学 A kind of shipborne radar image sea oil film recognition methods based on support vector machines
CN110348442B (en) * 2019-07-17 2022-09-30 大连海事大学 Shipborne radar image offshore oil film identification method based on support vector machine
CN110490848A (en) * 2019-08-02 2019-11-22 上海海事大学 Infrared target detection method, apparatus and computer storage medium
CN111126252A (en) * 2019-12-20 2020-05-08 浙江大华技术股份有限公司 Stall behavior detection method and related device
CN111126252B (en) * 2019-12-20 2023-08-18 浙江大华技术股份有限公司 Swing behavior detection method and related device
CN111191730B (en) * 2020-01-02 2023-05-12 中国航空工业集团公司西安航空计算技术研究所 Method and system for detecting oversized image target oriented to embedded deep learning
CN111191730A (en) * 2020-01-02 2020-05-22 中国航空工业集团公司西安航空计算技术研究所 Method and system for detecting oversized image target facing embedded deep learning
CN111209864B (en) * 2020-01-07 2023-05-26 上海交通大学 Power equipment target identification method
CN111209864A (en) * 2020-01-07 2020-05-29 上海交通大学 Target identification method for power equipment
CN111260629A (en) * 2020-01-16 2020-06-09 成都地铁运营有限公司 Pantograph structure abnormity detection algorithm based on image processing
CN111259980A (en) * 2020-02-10 2020-06-09 北京小马慧行科技有限公司 Method and device for processing labeled data
CN111259980B (en) * 2020-02-10 2023-10-03 北京小马慧行科技有限公司 Method and device for processing annotation data
CN111598947B (en) * 2020-04-03 2024-02-20 上海嘉奥信息科技发展有限公司 Method and system for automatically identifying patient position by identification features
CN111598947A (en) * 2020-04-03 2020-08-28 上海嘉奥信息科技发展有限公司 Method and system for automatically identifying patient orientation by identifying features
CN113516611A (en) * 2020-04-09 2021-10-19 合肥美亚光电技术股份有限公司 Method and device for determining abnormal material removing area, and material sorting method and equipment
CN113516611B (en) * 2020-04-09 2024-01-30 合肥美亚光电技术股份有限公司 Method and device for determining abnormal material removing area, material sorting method and equipment
CN111507995A (en) * 2020-04-30 2020-08-07 柳州智视科技有限公司 Image segmentation method based on color image pyramid and color channel classification
CN111507995B (en) * 2020-04-30 2023-05-23 柳州智视科技有限公司 Image segmentation method based on color image pyramid and color channel classification
CN111523613A (en) * 2020-05-09 2020-08-11 黄河勘测规划设计研究院有限公司 Image analysis anti-interference method under complex environment of hydraulic engineering
CN111523613B (en) * 2020-05-09 2023-03-24 黄河勘测规划设计研究院有限公司 Image analysis anti-interference method under complex environment of hydraulic engineering
CN111626230B (en) * 2020-05-29 2023-04-14 合肥工业大学 Vehicle logo identification method and system based on feature enhancement
CN111626230A (en) * 2020-05-29 2020-09-04 合肥工业大学 Vehicle logo identification method and system based on feature enhancement
CN111724351A (en) * 2020-05-30 2020-09-29 上海健康医学院 Helium bubble electron microscope image statistical analysis method based on machine learning
CN111753794B (en) * 2020-06-30 2024-02-27 创新奇智(成都)科技有限公司 Fruit quality classification method, device, electronic equipment and readable storage medium
CN111753794A (en) * 2020-06-30 2020-10-09 创新奇智(成都)科技有限公司 Fruit quality classification method and device, electronic equipment and readable storage medium
CN114199262A (en) * 2020-08-28 2022-03-18 阿里巴巴集团控股有限公司 Method for training position recognition model, position recognition method and related equipment
CN112053399B (en) * 2020-09-04 2024-02-09 厦门大学 Method for positioning digestive tract organs in capsule endoscope video
CN112053399A (en) * 2020-09-04 2020-12-08 厦门大学 Method for positioning digestive tract organs in capsule endoscope video
CN112241466A (en) * 2020-09-22 2021-01-19 天津永兴泰科技股份有限公司 Wild animal protection law recommendation system based on animal identification map
CN112241956B (en) * 2020-11-03 2023-04-07 甘肃省地震局(中国地震局兰州地震研究所) PolSAR image ridge line extraction method based on region growing method and variation function
CN112241956A (en) * 2020-11-03 2021-01-19 甘肃省地震局(中国地震局兰州地震研究所) PolSAR image ridge line extraction method based on region growing method and variation function
CN112232286A (en) * 2020-11-05 2021-01-15 浙江点辰航空科技有限公司 Unmanned aerial vehicle image recognition system and unmanned aerial vehicle are patrolled and examined to road
CN113409352A (en) * 2020-11-19 2021-09-17 西安工业大学 Single-frame infrared image weak and small target detection method, device, equipment and storage medium
CN113409352B (en) * 2020-11-19 2024-03-15 西安工业大学 Method, device, equipment and storage medium for detecting weak and small target of single-frame infrared image
CN112508893A (en) * 2020-11-27 2021-03-16 中国铁路南宁局集团有限公司 Machine vision-based method and system for detecting tiny foreign matters between two railway tracks
CN112508893B (en) * 2020-11-27 2024-04-26 中国铁路南宁局集团有限公司 Machine-vision-based method and system for detecting tiny foreign matter between railway tracks
CN112488118B (en) * 2020-12-18 2023-08-08 哈尔滨工业大学(深圳) Target detection method and related device
CN112488118A (en) * 2020-12-18 2021-03-12 哈尔滨工业大学(深圳) Target detection method and related device
CN112668441B (en) * 2020-12-24 2022-09-23 中国电子科技集团公司第二十八研究所 Satellite remote sensing image airplane target identification method combined with priori knowledge
CN112668441A (en) * 2020-12-24 2021-04-16 中国电子科技集团公司第二十八研究所 Satellite remote sensing image airplane target identification method combined with priori knowledge
CN112750136B (en) * 2020-12-30 2023-12-05 深圳英集芯科技股份有限公司 Image processing method and system
CN112750136A (en) * 2020-12-30 2021-05-04 深圳英集芯科技股份有限公司 Image processing method and system
CN113033400B (en) * 2021-03-25 2024-01-19 新东方教育科技集团有限公司 Method and device for identifying mathematical formulas, storage medium and electronic equipment
CN113033400A (en) * 2021-03-25 2021-06-25 新东方教育科技集团有限公司 Method and device for identifying mathematical expression, storage medium and electronic equipment
CN113221917A (en) * 2021-05-13 2021-08-06 南京航空航天大学 Monocular vision double-layer quadrilateral structure cooperative target extraction method under insufficient illumination
CN113221917B (en) * 2021-05-13 2024-03-19 南京航空航天大学 Monocular vision double-layer quadrilateral structure cooperative target extraction method under insufficient illumination
CN114037650B (en) * 2021-05-17 2024-03-19 西北工业大学 Ground target visible light damage image processing method for change detection and target detection
CN114037650A (en) * 2021-05-17 2022-02-11 西北工业大学 Ground target visible light damage image processing method for change detection and target detection
CN113420668B (en) * 2021-06-21 2024-01-12 西北工业大学 Underwater target identification method based on two-dimensional multi-scale permutation entropy
CN113420668A (en) * 2021-06-21 2021-09-21 西北工业大学 Underwater target identification method based on two-dimensional multi-scale permutation entropy
CN113298702B (en) * 2021-06-23 2023-08-04 重庆科技学院 Reordering and segmentation method based on large-size image pixel points
CN113298702A (en) * 2021-06-23 2021-08-24 重庆科技学院 Reordering and segmentation method based on large-size image pixels
CN113689455A (en) * 2021-07-01 2021-11-23 上海交通大学 Thermal fluid image processing method, system, terminal and medium
CN113689455B (en) * 2021-07-01 2023-10-20 上海交通大学 Thermal fluid image processing method, system, terminal and medium
CN113469980A (en) * 2021-07-09 2021-10-01 连云港远洋流体装卸设备有限公司 Flange identification method based on image processing
CN113469980B (en) * 2021-07-09 2023-11-21 连云港远洋流体装卸设备有限公司 Flange identification method based on image processing
CN113591674A (en) * 2021-07-28 2021-11-02 桂林电子科技大学 Real-time video stream-oriented edge environment behavior recognition system
CN113591674B (en) * 2021-07-28 2023-09-22 桂林电子科技大学 Edge environment behavior recognition system for real-time video stream
CN113588663B (en) * 2021-08-03 2024-01-23 上海圭目机器人有限公司 Pipeline defect identification and information extraction method
CN113588663A (en) * 2021-08-03 2021-11-02 上海圭目机器人有限公司 Pipeline defect identification and information extraction method
CN113688829B (en) * 2021-08-05 2024-02-20 南京国电南自电网自动化有限公司 Automatic identification method and system for monitoring picture of transformer substation
CN113688829A (en) * 2021-08-05 2021-11-23 南京国电南自电网自动化有限公司 Automatic transformer substation monitoring picture identification method and system
CN113610830A (en) * 2021-08-18 2021-11-05 常州领创电气科技有限公司 Detection system and method for lightning arrester
CN113610830B (en) * 2021-08-18 2023-12-29 常州领创电气科技有限公司 Detection system and method for lightning arrester
CN113776408B (en) * 2021-09-13 2022-09-13 北京邮电大学 Reading method for gate opening ruler
CN113776408A (en) * 2021-09-13 2021-12-10 北京邮电大学 Reading method for gate opening ruler
CN114871120A (en) * 2022-05-26 2022-08-09 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN114871120B (en) * 2022-05-26 2023-11-07 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN115026839B (en) * 2022-07-29 2024-04-26 西南交通大学 Method for positioning swing bolster hole of inclined wedge supporting robot of railway vehicle bogie
CN114998887A (en) * 2022-08-08 2022-09-02 山东精惠计量检测有限公司 Intelligent identification method for electric energy meter
CN114998887B (en) * 2022-08-08 2022-10-11 山东精惠计量检测有限公司 Intelligent identification method for electric energy meter
CN115690693A (en) * 2022-12-13 2023-02-03 山东鲁旺机械设备有限公司 Intelligent monitoring system and monitoring method for construction hanging basket
CN116311543B (en) * 2023-02-03 2024-03-08 汇金智融(深圳)科技有限公司 Handwriting analysis method and system based on image recognition technology
CN116311543A (en) * 2023-02-03 2023-06-23 汇金智融(深圳)科技有限公司 Handwriting analysis method and system based on image recognition technology
CN116740332B (en) * 2023-06-01 2024-04-02 南京航空航天大学 Method for positioning center and measuring angle of space target component on satellite based on region detection
CN116740332A (en) * 2023-06-01 2023-09-12 南京航空航天大学 Method for positioning center and measuring angle of space target component on satellite based on region detection
CN116403094A (en) * 2023-06-08 2023-07-07 成都菁蓉联创科技有限公司 Embedded image recognition method and system
CN116403094B (en) * 2023-06-08 2023-08-22 成都菁蓉联创科技有限公司 Embedded image recognition method and system
CN116740579A (en) * 2023-08-15 2023-09-12 兰陵县城市规划设计室 Intelligent collection method for territorial space planning data
CN116740070B (en) * 2023-08-15 2023-10-24 青岛宇通管业有限公司 Plastic pipeline appearance defect detection method based on machine vision
CN116740070A (en) * 2023-08-15 2023-09-12 青岛宇通管业有限公司 Plastic pipeline appearance defect detection method based on machine vision
CN116740579B (en) * 2023-08-15 2023-10-20 兰陵县城市规划设计室 Intelligent collection method for territorial space planning data
CN116758578A (en) * 2023-08-18 2023-09-15 上海楷领科技有限公司 Mechanical drawing information extraction method, device, system and storage medium
CN116758578B (en) * 2023-08-18 2023-11-07 上海楷领科技有限公司 Mechanical drawing information extraction method, device, system and storage medium

Also Published As

Publication number Publication date
CN110334706A (en) 2019-10-15
CN107330465B (en) 2019-07-30
CN110334706B (en) 2021-06-01
CN107330465A (en) 2017-11-07

Similar Documents

Publication Publication Date Title
WO2019000653A1 (en) Image target identification method and apparatus
CN107316036B (en) Insect pest identification method based on cascade classifier
Savkare et al. Automatic system for classification of erythrocytes infected with malaria and identification of parasite's life stage
CN104036278B (en) Method for extracting standard face images for face algorithms
CN107977682B (en) Lymphocyte classification method and device based on polar coordinate transformation data enhancement
CN111738064B (en) Haze concentration identification method for haze image
CN105447503B (en) Pedestrian detection method based on fusion of sparse-representation LBP and HOG features
WO2020248513A1 (en) Ocr method for comprehensive performance test
CN104392432A (en) Histogram of oriented gradient-based display panel defect detection method
US9558403B2 (en) Chemical structure recognition tool
EP3848472A2 (en) Methods and systems for automated counting and classifying microorganisms
Zhou et al. Leukocyte image segmentation based on adaptive histogram thresholding and contour detection
CN108876795A (en) Method and system for segmenting objects in images
Rizal et al. Comparison of SURF and HOG extraction in classifying the blood image of malaria parasites using SVM
CN109858570A (en) Image classification method and system, computer equipment and medium
CN115294377A (en) System and method for identifying road cracks
CN113076860B (en) Bird detection system under field scene
Pratap et al. Development of ANN-based efficient fruit recognition technique
CN108009480A (en) Human behavior detection method in images based on feature recognition
CN109460768B (en) Text detection and removal method for histopathology microscopic image
Ye et al. A new text detection algorithm in images/video frames
Tian et al. Research on artificial intelligence of accounting information processing based on image processing
Satish et al. Edge assisted fast binarization scheme for improved vehicle license plate recognition
Kumari et al. On the use of Moravec operator for text detection in document images and video frames
Fang et al. A hybrid approach for efficient detection of plastic mulching films in cotton

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17915559

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17915559

Country of ref document: EP

Kind code of ref document: A1
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28/07/2020)