Industrial product surface defect detection method based on sample enhancement
Technical Field
The invention relates to the technical field of industrial product surface defect detection, in particular to a sample enhancement-based industrial product surface defect detection method.
Background
Defect detection is an important part of the production process and ensures the reliability of industrial products. Surface defect detection of industrial products requires precise localization of defect positions on a surface and classification of the localized defects, which is a typical target detection problem. In the past, surface defect detection of industrial products generally relied on traditional machine vision techniques such as grayscale binarization, edge contour extraction and template matching; the drawback of these operations is that they are very sensitive to changes such as illumination and displacement in the pictures, and their robustness is poor. In addition, previous surface defect detection studies were based on solid-color product surfaces; since the surface texture features of specially textured products are similar to the defect texture features, previous methods have difficulty distinguishing between the two.
Target detection in deep learning uses a convolutional neural network as the feature extractor; the extracted feature maps are insensitive to changes such as illumination and displacement and therefore have better robustness. A two-stage target detector is composed of a region proposal network and a classification-regression network: the region proposal network is responsible for generating proposals for regions where targets may be located, and the classification-regression network classifies the proposed regions and finely adjusts the annotation boxes. The loss function of the network is a weighted combination of classification loss and regression loss, and stochastic gradient descent is adopted for back-propagation iterations.
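As a brief illustration of the weighted loss mentioned above, the overall training objective of a two-stage detector can be written in the generic form below; the weighting coefficient λ is an assumption for illustration, its value not being specified here.

```latex
% Generic multi-task loss of a two-stage target detector (illustrative form)
% L_cls: classification loss, L_reg: bounding-box regression loss,
% \lambda: weighting coefficient (assumed, not specified in the text)
L = L_{\mathrm{cls}} + \lambda \, L_{\mathrm{reg}}
```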
The existing two-stage deep learning target detector has high precision and good generality, but in surface defect detection on textured products there remain problems such as defects being hard to distinguish from background textures, normal pictures without defects being unable to participate in model training, and the large pictures of industrial products requiring a large amount of GPU memory.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a sample enhancement-based industrial product surface defect detection method, which can effectively reduce the influence of conditions such as illumination, exposure and displacement on defect detection, improve detection stability, improve the ability of a two-stage target detector to distinguish patterns from backgrounds, and reduce the false detection rate.
In order to achieve the purpose, the technical scheme provided by the invention is as follows: the industrial product surface defect detection method based on sample enhancement comprises the following steps:
1) carrying out a size standardization operation on the picture set of the industrial product surface, wherein each picture containing defects has a corresponding defect annotation file; cutting each picture together with its corresponding defect annotation file, and dividing the cut pictures into a normal picture set and a defect picture set according to whether annotations remain after cutting;
2) performing normalization and online random data enhancement on the defect picture set obtained in step 1), including randomly flipping the defect pictures vertically and horizontally, and dividing the defect picture set into batches;
3) for each defect picture in a batch from step 2), randomly selecting from the normal picture set a normal picture with the same cutting position and the same texture template pattern, splicing the normal picture to the left or right of the defect picture, and modifying the annotation file accordingly;
4) performing iterative training on the pictures and annotations of each batch obtained in step 3) by using the Cascade-RCNN algorithm, a round of training being finished after all batches have been trained;
5) after finishing one round of training, repeating steps 2) to 4) until the set number of iteration rounds is reached, then outputting and storing the parameters of the network to obtain a Cascade-RCNN detection model;
6) using the Cascade-RCNN detection model obtained in step 5), performing sliding-window detection on the surface picture of the industrial product to be detected and on the texture template picture confirmed to be defect-free, splicing the sliding-window detection results, and comparing the two sets of results to finally obtain the defect categories and region annotations of the picture to be detected.
In step 1), the picture set of the industrial product surface comprises a defect-containing picture set X, a defect-free normal picture set Y, and a template picture set Z consisting of an example picture of each texture template pattern; the defect picture set X contains annotations, and each defect annotation is a rectangular annotation box in the format (name, category, x_min, y_min, x_max, y_max), where name represents the picture name, category represents the type of defect, (x_min, y_min) represents the horizontal and vertical coordinates of the upper-left corner of the rectangular annotation box, and (x_max, y_max) represents the horizontal and vertical coordinates of the lower-right corner of the rectangular annotation box; neither picture set Y nor picture set Z has annotation information; size standardization is carried out on the three picture sets so that all pictures are H x W RGB pictures, where H and W are the height and width of the pictures;
the three picture sets are cut evenly in the same way, and the defect annotations on picture set X are cut according to the following rule: the defect rectangular annotation boxes are cut in the same way as the pictures and mapped into the range of the cut small pictures; if an annotation is cut off, the ratio of the area of the cut rectangular annotation box to the area of the original rectangular annotation box is calculated, and if this ratio is larger than a set threshold ε the annotation is kept, otherwise it is discarded;
the cut pictures are divided, according to the cut annotation information, into a defect picture set X_new, a normal picture set Y_new and a template picture set Z_new, and stored according to the positions where they were cut.
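As an illustration of the cutting rule described above, the following minimal Python sketch crops an annotated picture into equal tiles and keeps a clipped annotation only when its remaining area ratio exceeds ε; the tile size, data layout and function name are assumptions for illustration and are not part of the invention.

```python
# Illustrative sketch of step 1): cut an image into equal tiles and remap
# defect boxes, keeping a clipped box only if its remaining area ratio > eps.
from typing import List, Tuple

Box = Tuple[str, int, int, int, int]  # (category, xmin, ymin, xmax, ymax)

def crop_with_boxes(h: int, w: int, boxes: List[Box],
                    tile_h: int, tile_w: int, eps: float = 0.25):
    tiles = []
    for ty in range(0, h, tile_h):
        for tx in range(0, w, tile_w):
            kept = []
            for cat, x1, y1, x2, y2 in boxes:
                # clip the box to the current tile
                cx1, cy1 = max(x1, tx), max(y1, ty)
                cx2, cy2 = min(x2, tx + tile_w), min(y2, ty + tile_h)
                if cx1 >= cx2 or cy1 >= cy2:
                    continue  # box does not intersect this tile
                ratio = ((cx2 - cx1) * (cy2 - cy1)) / ((x2 - x1) * (y2 - y1))
                if ratio > eps:
                    # map coordinates into the tile's local frame
                    kept.append((cat, cx1 - tx, cy1 - ty, cx2 - tx, cy2 - ty))
            tiles.append(((tx, ty), kept))  # empty 'kept' -> normal (defect-free) tile
    return tiles
```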
The step 3) comprises the following steps:
3.1) for a defect picture x_i ∈ X_new that has been randomly flipped in step 2), find the corresponding normal picture set M ∈ (Y_new, Z_new) according to the texture template and the cutting position, such that x_i and all samples in M have the same texture template and cutting position, where X_new is the cut defect picture set, and Y_new and Z_new are the cut normal picture set and template picture set respectively;
3.2) randomly select a normal picture y_i ∈ M, and pad y_i with the value 0 according to the size of x_i so that y_i has the same size as x_i, then normalize y_i;
3.3) apply data enhancement to y_i, namely random vertical and horizontal flipping;
3.4) generate a random number in (0, 1); with 50% probability let x_new = (x_i, y_i), and with the other 50% probability let x_new = (y_i, x_i), i.e. splice on the left or on the right at random, where x_new represents the generated new sample;
3.5) process the annotation information according to the splicing mode: if the defect picture x_i is on the left, the defect annotations do not need to be changed; if the defect picture x_i is on the right, the rectangular annotation boxes need to be corrected accordingly (an illustrative code sketch of this splicing procedure follows this list).
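The splicing operation of steps 3.2) to 3.5) can be sketched as follows in Python; the array shapes and function name are assumptions for illustration only.

```python
import random
import numpy as np

def splice_with_normal(defect_img: np.ndarray, defect_boxes, normal_img: np.ndarray):
    """Randomly place the defect tile on the left or right of a same-template
    normal tile and shift the box coordinates accordingly (steps 3.4-3.5)."""
    h, w = defect_img.shape[:2]
    # pad the normal tile with zeros so both tiles have the same size (step 3.2)
    padded = np.zeros_like(defect_img)
    nh, nw = normal_img.shape[:2]
    padded[:min(nh, h), :min(nw, w)] = normal_img[:min(nh, h), :min(nw, w)]
    if random.random() < 0.5:
        new_img = np.concatenate([defect_img, padded], axis=1)  # defect on the left
        new_boxes = defect_boxes                                 # annotations unchanged
    else:
        new_img = np.concatenate([padded, defect_img], axis=1)  # defect on the right
        new_boxes = [(c, x1 + w, y1, x2 + w, y2)                 # shift x by tile width
                     for c, x1, y1, x2, y2 in defect_boxes]
    return new_img, new_boxes
```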
In step 4), the Cascade-RCNN algorithm comprises a backbone network, a region proposal network and a classification-regression network, which are respectively used for extracting features, generating region proposals, and classifying and fine-tuning candidate boxes; the convolutional neural network ResNeXt-101 together with the feature pyramid network FPN is used as the backbone network, the region proposal network uses the region proposal network (RPN) part of the two-stage target detector Faster-RCNN, and the classification-regression network uses a multi-stage cascade network.
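The invention does not prescribe a particular software framework; as one possible (assumed) instantiation, the described combination of Cascade-RCNN, ResNeXt-101, FPN and a cascaded classification-regression head could be expressed as an MMDetection-style configuration fragment such as the sketch below.

```python
# Illustrative MMDetection-style configuration fragment (the framework choice is
# an assumption; the invention only names the network components themselves).
# Only the components mentioned in the text are shown; a complete configuration
# would also need anchor, bbox-coder, assigner/sampler and loss settings.
model = dict(
    type='CascadeRCNN',
    backbone=dict(
        type='ResNeXt', depth=101, groups=64, base_width=4,  # ResNeXt-101 backbone
        num_stages=4, out_indices=(0, 1, 2, 3),
        norm_cfg=dict(type='BN', requires_grad=True), style='pytorch'),
    neck=dict(
        type='FPN', in_channels=[256, 512, 1024, 2048],      # feature pyramid FPN
        out_channels=256, num_outs=5),
    rpn_head=dict(type='RPNHead', in_channels=256, feat_channels=256),
    roi_head=dict(                                           # cascaded classification-regression heads
        type='CascadeRoIHead', num_stages=3,
        stage_loss_weights=[1, 0.5, 0.25]),
)
```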
In step 6), the following detection process is performed:
6.1) for an industrial product surface picture to be detected, using a preset sliding window size, perform sliding-window detection on the picture with the Cascade-RCNN detection model obtained in step 5), and map the results back to the regions of the original image to obtain each defect in the annotation format (category, x_min, y_min, x_max, y_max, score), where category denotes the defect class, (x_min, y_min) represents the horizontal and vertical coordinates of the upper-left corner of the rectangular annotation box, (x_max, y_max) represents the horizontal and vertical coordinates of the lower-right corner of the rectangular annotation box, and score represents the confidence of the defect judgment, with a value in (0, 1);
6.2) examine the defect annotations close to the sliding-window edges; if adjacent sliding windows have annotations of the same category with similar position and size, perform an annotation merging operation in order of confidence, wherein the merged rectangular annotation box is the minimum bounding rectangle of the participating rectangular annotation boxes, and the new confidence is computed as the mean of their confidences, as follows (an illustrative code sketch of this merging rule is given after step 6.4 below):
score_new = (score_1 + score_2 + ... + score_n) / n
wherein score_new indicates the new confidence, score_i represents the confidence of the i-th rectangular annotation box participating in the merge, and n represents the total number of rectangular annotation boxes participating in the merge;
6.3) using steps 6.1) and 6.2), perform defect detection on the template picture set Z in advance, offline, and store the obtained results;
6.4) using steps 6.1) and 6.2), detect the surface picture of the industrial product to be detected online, and compare the obtained results with the detection results of the corresponding template stored in step 6.3), adopting IoU as the comparison criterion, calculated as follows:
IoU = area(DR ∩ GT) / area(DR ∪ GT)
wherein DR represents a defect rectangular annotation box detected on the picture to be detected, and GT represents a real defect rectangular annotation box; the specific comparison method is as follows: compare defect annotations belonging to the same category, and if IoU is greater than a set threshold τ and the confidence of the defect on the picture to be detected is less than a set threshold γ, the defect annotation on the picture to be detected is regarded as a texture false detection and removed, thereby obtaining the final defect position annotations and corresponding categories of the picture to be detected (an illustrative code sketch of this comparison is also given below).
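The annotation merging rule of step 6.2) can be sketched as follows in Python; the tuple layout and function name are assumptions for illustration, and the grouping of adjacent same-category annotations is assumed to have been done beforehand.

```python
def merge_annotations(boxes):
    """Merge a group of same-category annotations from adjacent sliding windows
    into their minimum bounding rectangle, averaging the confidences (step 6.2).
    Each box is assumed to be (category, xmin, ymin, xmax, ymax, score)."""
    cat = boxes[0][0]
    xmin = min(b[1] for b in boxes)
    ymin = min(b[2] for b in boxes)
    xmax = max(b[3] for b in boxes)
    ymax = max(b[4] for b in boxes)
    score_new = sum(b[5] for b in boxes) / len(boxes)  # mean confidence
    return (cat, xmin, ymin, xmax, ymax, score_new)
```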
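The template comparison of step 6.4) can be sketched as follows in Python; the function names and tuple layout are assumptions for illustration, and the default thresholds use the values given in the embodiment (τ = 0.5, γ = 0.3).

```python
def iou(a, b):
    """IoU between two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def remove_texture_false_detections(dets, template_dets, tau=0.5, gamma=0.3):
    """Drop detections that overlap a same-category template detection with
    IoU > tau and whose own confidence is below gamma (step 6.4)."""
    kept = []
    for cat, x1, y1, x2, y2, score in dets:
        is_false = any(cat == tcat and score < gamma and
                       iou((x1, y1, x2, y2), (tx1, ty1, tx2, ty2)) > tau
                       for tcat, tx1, ty1, tx2, ty2, _ in template_dets)
        if not is_false:
            kept.append((cat, x1, y1, x2, y2, score))
    return kept
```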
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method adopts deep learning target detection as the overall detection framework, which reduces the degradation of algorithm performance caused by ambient illumination, camera exposure and displacement, and improves the stability of surface defect detection for industrial products.
2. The method cuts data whose picture size is large, which increases the amount of training data, reduces the GPU memory requirement during training, allows pictures to keep their original size as input during training, and makes it less likely that the features of tiny flaws are lost. Meanwhile, the sliding-window detection with merging of defect boxes used at detection time ensures the uniformity and completeness of the output defect annotation boxes.
3. In order to improve the detector's ability to distinguish the background patterns of textured textiles from defects, in addition to traditional deep learning data enhancement methods, before each iteration of each picture a mixed splicing of defect pictures with normal pictures of the corresponding texture pattern is used as an online data enhancement method, so that during training the diversity of negative samples is enhanced without a large-scale increase in data volume, and the false detection rate of the detector is reduced.
4. The texture template picture is used for pre-detection in the detection process; since the false detection positions generated by the texture template are relatively fixed, comparing the defect positions and types of the result on the picture to be detected with those of its template result can eliminate small-defect false detections generated by the texture at certain fixed positions on the surface of the industrial product, improving the overall identification accuracy.
Drawings
FIG. 1 is a training flow diagram of the sample enhancement-based industrial product surface defect detection method.
FIG. 2 is a defect detection flow chart of a sample enhancement based industrial product surface defect detection method.
Detailed Description
The present invention will be further described with reference to the following specific examples.
This example uses real collected data of patterned textiles, which include 15 defect classes such as stains, stitch marks and holes; there are 68 kinds of pattern templates, each with a template picture, several normal pictures and annotated defect pictures, and the picture sizes range from 4096 x 1810 to 4096 x 1696.
As shown in fig. 1 and fig. 2, the method for detecting surface defects of an industrial product based on sample enhancement provided by the present embodiment includes the following steps:
1) Carry out a size standardization operation on the patterned textile picture set, where each picture containing defects has a corresponding defect annotation file; cut each defect picture together with its corresponding defect annotations, and divide the cut pictures into a normal picture set and a defect picture set according to whether annotations remain after cutting.
The patterned textile data set comprises a defect picture set X, a normal picture set Y and a pattern template picture set Z, where the defect picture set X contains annotations, and each defect annotation is a rectangular annotation box in the format (name, category, x_min, y_min, x_max, y_max); name represents the picture name, category represents the type of defect, (x_min, y_min) represents the horizontal and vertical coordinates of the upper-left corner of the rectangular annotation box, and (x_max, y_max) represents the horizontal and vertical coordinates of the lower-right corner of the rectangular annotation box; picture sets Y and Z have no annotation information. The three picture sets were normalized in size so that all pictures are 4096 x 1810 RGB pictures.
The three picture sets are cut evenly in the same way, and the defect annotations on picture set X are cut according to the following rule: the defect rectangular annotation boxes are cut in the same way as the pictures and mapped into the range of the cut small pictures; if an annotation is cut off, the ratio of the area of the cut rectangular annotation box to the area of the original rectangular annotation box is calculated, and if this ratio is greater than the set threshold ε = 0.25 the annotation is kept, otherwise it is discarded.
2) Perform normalization and online random data enhancement on the defect picture set obtained in step 1), including random vertical and horizontal flipping, and divide the defect picture set into batches. Specifically: the length and width are padded with the value 0 up to multiples of 32, the pictures are normalized, random flipping is applied with a probability of 50%, and the pictures are then processed in batches to facilitate network training; the training batch size used in this example is 1, i.e. one picture per batch, and the picture size is 1024 x 928.
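The zero padding to multiples of 32 and the normalization mentioned in step 2) can be sketched as follows in Python; the normalization mean and standard deviation are not specified by the invention, so the common ImageNet statistics are assumed here purely for illustration.

```python
import numpy as np

def pad_and_normalize(img: np.ndarray,
                      mean=(123.675, 116.28, 103.53),
                      std=(58.395, 57.12, 57.375)) -> np.ndarray:
    """Zero-pad height and width up to multiples of 32, then normalize.
    `img` is assumed to be an H x W x 3 array; mean/std are the common
    ImageNet statistics (an assumption, not given in the text)."""
    h, w = img.shape[:2]
    ph, pw = -h % 32, -w % 32  # padding needed to reach a multiple of 32
    padded = np.pad(img.astype(np.float32),
                    ((0, ph), (0, pw), (0, 0)), mode='constant')
    return (padded - np.array(mean)) / np.array(std)
```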
3) For each defect picture in a batch from step 2), randomly select from the normal picture set a normal picture with the same cutting position and the same texture template pattern, splice the normal picture to the left or right of the defect picture, and modify the annotation file accordingly.
3.1) For a defect picture x_i ∈ X_new that has been randomly flipped in step 2), find the corresponding normal picture set M ∈ (Y_new, Z_new) according to the texture template and the cutting position, such that x_i and all samples in M have the same texture template and cutting position, where X_new is the cut defect picture set, and Y_new and Z_new are the cut normal picture set and template picture set respectively.
3.2) Randomly select a normal picture y_i ∈ M, and pad y_i with the value 0 so that y_i is also padded to 1024 x 928, then normalize y_i.
3.3) Apply data enhancement to y_i, namely random vertical and horizontal flipping.
3.4) Generate a random number in (0, 1); with 50% probability let x_new = (x_i, y_i), and with the other 50% probability let x_new = (y_i, x_i), i.e. splice on the left or on the right at random, where x_new represents the generated new sample; the size of the spliced new sample is 2048 x 928.
3.5) Process the annotation information according to the splicing mode: if the defect picture x_i is on the left, the defect annotations do not need to be changed; if the defect picture x_i is on the right, 1024 is added to both the minimum and maximum width-direction coordinates in the rectangular annotation boxes.
4) Use the Cascade-RCNN algorithm to iteratively train on the pictures and annotations of each batch obtained in step 3); a round of training is finished after all batches have been trained.
The Cascade-RCNN algorithm comprises a backbone network, a region proposal network and a classification-regression network, which are respectively used for extracting features, generating region proposals, and classifying and fine-tuning candidate boxes. In the invention, the convolutional neural network ResNeXt-101 together with the feature pyramid network FPN is used as the backbone network, the region proposal network part of the two-stage target detector Faster-RCNN is used as the region proposal network, and a multi-stage cascade network is used as the classification-regression network. When all the enhanced pictures have been trained once, one round of training is finished.
5) After finishing one round of training, repeat steps 2) to 4) until the set number of iteration rounds is reached (12 in this example), then output and store the parameters of the network to obtain the Cascade-RCNN detection model.
6) Using the Cascade-RCNN detection model obtained in step 5), perform sliding-window detection on the textile picture to be detected and on the pattern template picture confirmed to be defect-free, splice the sliding-window detection results, and compare the two sets of results to obtain the defect categories and region annotations of the picture to be detected.
6.1) For an industrial product surface picture to be detected, set the sliding window size to 1024 x 905, perform sliding-window detection on the picture with the Cascade-RCNN detection model obtained in step 5), and map the results back to the regions of the original image to obtain each defect in the annotation format (category, x_min, y_min, x_max, y_max, score), where category denotes the defect class, (x_min, y_min) represents the horizontal and vertical coordinates of the upper-left corner of the rectangular annotation box, (x_max, y_max) represents the horizontal and vertical coordinates of the lower-right corner of the rectangular annotation box, and score represents the confidence of the defect judgment, with a value in (0, 1). An illustrative code sketch of this sliding-window procedure is given after step 6.4 below.
6.2) Examine the defect annotations close to the sliding-window edges; if adjacent sliding windows contain annotations of the same category with similar positions and sizes, the adjacency criterion for the annotations is: the distance from the rectangular annotation box to the picture boundary is less than 20 pixels, and the shortest distance to the other annotation box is less than 30 pixels. Then, an annotation merging operation is carried out on the rectangular annotation boxes meeting these conditions in order of their confidence, with at most one annotation from each cutting area allowed to participate in a merge of defect rectangular annotation boxes. The merged rectangular box is the minimum bounding rectangle of the participating rectangular boxes, and the new confidence is taken as their mean, as follows:
score_new = (score_1 + score_2 + ... + score_n) / n
wherein score_i represents the confidence of the i-th rectangular annotation box participating in the merge, and n represents the total number of rectangular boxes participating in the merge.
6.3) Using steps 6.1) and 6.2), perform defect detection on the pattern template picture set Z of the patterned textiles in advance, offline, and store the obtained results.
6.4) Using steps 6.1) and 6.2), perform defect detection online on the patterned textile picture to be detected, and compare the obtained results with the results of the corresponding template stored in step 6.3), adopting the Intersection over Union (IoU) as the comparison criterion, calculated as follows:
IoU = area(DR ∩ GT) / area(DR ∪ GT)
wherein DR represents a defect rectangular annotation box detected on the picture to be detected, and GT represents a real defect rectangular annotation box. The specific comparison method is to compare defect annotations belonging to the same category; if IoU is greater than the set threshold τ = 0.5 and the confidence of the defect on the picture to be detected is less than the set threshold γ = 0.3, the defect annotation on the picture to be detected is regarded as a texture false detection and removed, so as to obtain the final defect position annotations and corresponding categories of the picture to be detected.
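The sliding-window detection of step 6.1) can be sketched as follows in Python; the `detector` callable is an assumed stand-in for the trained Cascade-RCNN model, and its output format is an assumption for illustration.

```python
def sliding_window_detect(image, detector, win_h=905, win_w=1024):
    """Run the detector on each window and map boxes back to full-image
    coordinates (step 6.1). `detector(crop)` is assumed to return a list of
    (category, xmin, ymin, xmax, ymax, score) in window coordinates."""
    h, w = image.shape[:2]
    results = []
    for ty in range(0, h, win_h):
        for tx in range(0, w, win_w):
            crop = image[ty:ty + win_h, tx:tx + win_w]
            for cat, x1, y1, x2, y2, score in detector(crop):
                # shift window coordinates back into the original image frame
                results.append((cat, x1 + tx, y1 + ty, x2 + tx, y2 + ty, score))
    return results
```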
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be regarded as equivalent substitutions, and are included in the scope of the present invention.