CN113850308A - Target classification method for complex scene - Google Patents

Target classification method for complex scene

Info

Publication number
CN113850308A
CN113850308A (application CN202111081734.3A)
Authority
CN
China
Prior art keywords
image
trained
classified
target
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111081734.3A
Other languages
Chinese (zh)
Inventor
张笑钦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Big Data And Information Technology Research Institute Of Wenzhou University
Original Assignee
Big Data And Information Technology Research Institute Of Wenzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Big Data And Information Technology Research Institute Of Wenzhou University filed Critical Big Data And Information Technology Research Institute Of Wenzhou University
Priority to CN202111081734.3A
Publication of CN113850308A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target classification method oriented to complex scenes, which comprises the following steps: collecting images to be trained and an image to be classified; preprocessing the acquired images; making a data set from the preprocessed images to be trained; training a target classification model; and processing the preprocessed image to be classified with the target classification model. In the method, a crawling module crawls image information containing complex scenes from a plurality of websites, and the crawled images are screened before the data set is produced, which guarantees both the quantity and the quality of the training data; meanwhile, by training the target classification model and applying it to the preprocessed image to be classified, the target classification result can be obtained efficiently.

Description

Target classification method for complex scene
Technical Field
The invention relates to the technical field of image processing, in particular to a target classification method for complex scenes.
Background
Information is nowadays carried in many forms, and images are the most common of these carriers. With the help of computers and classification algorithms, people can quickly identify the targets in images, which greatly reduces the cost in manpower and material resources.
There is therefore an urgent need for a new technical solution to the above problems.
Disclosure of Invention
In view of this, the present invention provides a target classification method for complex scenes to solve the above technical problems.
In order to achieve the above object, the invention provides the following technical solution:
a target classification method facing complex scenes comprises the following steps: collecting an image to be trained and an image to be classified; preprocessing the acquired image; making a data set by using the preprocessed image to be trained; training a target classification model; processing the preprocessed image to be classified by using a target classification model; and displaying the target classification result.
In the above scheme, the acquiring the image to be trained includes the following steps: crawling image information containing complex scenes from a plurality of websites through a crawling module; the crawling process is managed through a crawler management module; the crawled image information is stored through a storage module. The crawling module comprises a downloading unit, an analyzing unit, a memory unit and a database unit; the downloading unit is used for downloading a webpage to be crawled; the analyzing unit is connected with the downloading unit and is used for analyzing useful data in the webpage downloaded by the downloading unit and storing the useful data in an object; the memory unit is connected with the analyzing unit and is used for carrying out persistence processing on the object analyzed by the analyzing unit; the database unit is connected with the memory unit and is used for storing the data processed by the memory unit. The crawler management module comprises a state display unit and a crawler control unit, wherein the state display unit is used for displaying the specific state of the crawling process, and the crawler control unit is used for controlling the starting, pausing and crawling interval time of the crawler. The storage module comprises a computer hard disk.
In the above scheme, the acquiring the image to be classified includes acquiring image information to be classified by a camera module and storing the acquired image information to be classified in a storage module, the camera module includes a CMOS sensor, a motorized zoom lens, a zoom driving motor, an LED lamp set, a relay, a brightness sensor, an infrared sensor and a pan/tilt head, the CMOS sensor, the zoom driving motor, the LED lamp set, the relay, the brightness sensor and the infrared sensor are all mounted on the pan/tilt head, the CMOS sensor is used for acquiring image information, the motorized zoom lens is connected to the CMOS sensor, the zoom driving motor is used for driving the motorized zoom lens to zoom, the relay is connected to the LED lamp set, the relay is used for driving the LED lamp set to illuminate, and the brightness sensor is used for acquiring ambient brightness information, the infrared sensor is used for detecting whether a moving target exists in the monitoring area.
In the above scheme, the preprocessing the acquired image includes preprocessing the acquired images to be trained, which comprises the following steps: carrying out normalization processing on the collected images to be trained so as to obtain images to be trained of the same specification; setting a feature point threshold, acquiring the number of feature points of each image to be trained, and deleting the images to be trained whose feature point count is greater than the threshold; carrying out feature binarization on the remaining images to be trained through a fuzzy mean value binarization algorithm to separate the target to be detected from the background; and calculating the proportion of the target to be detected in the whole image, and deleting the images to be trained in which this proportion is smaller than a preset proportion value.
In the above scheme, the preprocessing the acquired image further includes preprocessing the acquired image to be classified, and the preprocessing the acquired image to be classified includes the following steps: carrying out distortion correction processing on the image to be classified through a convolutional neural network algorithm; carrying out noise reduction treatment on the image to be classified after distortion correction treatment through a median filtering algorithm; and carrying out image graying processing on the image to be classified after the noise reduction processing.
In the above solution, the making of the data set by using the preprocessed images to be trained includes the following steps: carrying out a reduction operation on the preprocessed images to be trained through the OpenCV library resize function; dividing the images to be trained after the reduction operation into two parts according to a preset proportion, one part serving as a training data set and the other as a test data set; and packing the training data set into a training TFRecords picture set and the test data set into a test TFRecords picture set.
In the above scheme, the training of the target classification model includes the following steps: training a target detection network through an SSD algorithm and a data set made by using the preprocessed image to be trained; and training the target classification network by a convolutional neural network algorithm and a data set made by using the preprocessed image to be trained.
In the foregoing solution, the processing the preprocessed image to be classified by using the target classification model includes inputting the preprocessed image to be classified into a target detection network for target detection, which comprises the following steps: performing feature extraction on the preprocessed image to be classified through a feature extraction module to obtain a feature map; performing transposed convolution on the lower layer of the feature extraction module, and calculating the resolution of the low-level feature map obtained after the transposed convolution; repeating the transposed-convolution process until the resolution of the low-level feature map of the feature extraction module is consistent with the resolution of the high-level feature map of the feature extraction module; fusing the high-level feature map and the low-level feature map through a fusion convolution module; inputting the feature map output by the fusion convolution module into a convolution predictor for prediction; and selecting the best prediction result through a non-maximum suppression algorithm, wherein the feature extraction module comprises a plurality of pairs of a pooling layer and at least one convolution layer.
In the above scheme, the processing the pre-processed image to be classified by using the target classification model further includes inputting a target obtained through the target detection network into the target classification network for target classification, and the inputting the target obtained through the target detection network into the target classification network for target classification includes the following steps: respectively extracting the characteristics of the target obtained by the target detection network through a common convolution module and a Gabor convolution module; splicing the feature map output by the common convolution module and the feature map output by the Gabor convolution module to obtain a new feature map; performing dimension increasing operation on the new feature map; adding the feature map after the dimension is increased to a target obtained by the target detection network through a summation operation; and summarizing results obtained by the summation operation through a global average pooling layer, and finishing a classification task through a softmax classifier.
In the scheme, the displaying of the target classification result comprises displaying the target classification result through a display module and sending the target classification result to a remote terminal for displaying, the display module comprises an LCD touch display screen, a physical key and a power indicator, the physical key and the power indicator are connected with the LCD touch display screen, the physical key is used for turning on and off the LCD touch display screen, and the power indicator is used for indicating the power connection state of the LCD touch display screen.
In conclusion, the beneficial effects of the invention are as follows: image information containing complex scenes is crawled from a plurality of websites through a crawling module, and a data set is produced after the crawled images are screened, which guarantees both the quantity and the quality of the training data; meanwhile, by training the target classification model and processing the preprocessed image to be classified with it, the target classification result can be obtained efficiently.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention.
FIG. 1 is a step diagram of a complex scene-oriented object classification method according to the present invention.
FIG. 2 is a diagram illustrating the steps of acquiring an image to be trained according to the present invention.
FIG. 3 is a schematic diagram of the composition of the crawling module of the present invention.
FIG. 4 is a diagram illustrating the steps of acquiring an image to be classified according to the present invention.
Fig. 5 is a schematic view of the composition of the camera module of the present invention.
FIG. 6 is a diagram illustrating the steps of pre-processing the acquired image to be trained in the present invention.
Fig. 7 is a step diagram of preprocessing the acquired image to be classified in the present invention.
FIG. 8 is a diagram illustrating the steps of creating a data set using pre-processed images to be trained in the present invention.
FIG. 9 is a diagram of the steps for training a target classification model according to the present invention.
FIG. 10 is a diagram illustrating the steps of target detection in the present invention.
FIG. 11 is a step diagram of object classification in the present invention.
FIG. 12 is a schematic view of the display module according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
As shown in fig. 1, the method for classifying objects oriented to complex scenes of the present invention includes the following steps:
step S1: collecting an image to be trained and an image to be classified;
step S2: preprocessing the acquired image;
step S3: making a data set by using the preprocessed image to be trained;
step S4: training a target classification model;
step S5: processing the preprocessed image to be classified by using a target classification model;
step S6: and displaying the target classification result.
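As a rough orientation before the step-by-step details, the six steps above can be sketched as one pipeline. Everything below is an illustrative stand-in: the function names and placeholder values are hypothetical, not identifiers from the patent.

```python
# Illustrative skeleton of the six-step method; every name and value here
# is a hypothetical stand-in, not an identifier from the patent.
def collect_images():
    # S1: stand-in for the crawler (training images) and camera (query image)
    return ["train_img"], ["query_img"]

def preprocess(images):
    # S2: stand-in for normalisation, screening, correction, denoising, graying
    return [img + "_pre" for img in images]

def make_dataset(train_images):
    # S3: split into training and test parts (80/20 is an example proportion)
    cut = max(1, int(0.8 * len(train_images)))
    return train_images[:cut], train_images[cut:]

def train_model(train_set, test_set):
    # S4: stand-in for training the SSD detector and the CNN classifier
    return {"classes": ["car", "person"]}

def classify(model, images):
    # S5: stand-in inference; always predicts the first known class
    return [model["classes"][0] for _ in images]

def run_pipeline():
    train, query = collect_images()
    train, query = preprocess(train), preprocess(query)
    train_set, test_set = make_dataset(train)
    model = train_model(train_set, test_set)
    return classify(model, query)  # S6 would display these results
```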
As shown in fig. 2, the acquiring of the image to be trained includes the following steps:
step S111: crawling image information containing complex scenes from a plurality of websites through a crawling module;
step S112: the crawling process is managed through a crawler management module;
step S113: and storing the crawled image information through a storage module.
As shown in fig. 3, the crawling module includes a downloading unit, an analyzing unit, a memory unit and a database unit, the downloading unit is used for downloading a web page to be crawled, the analyzing unit is connected to the downloading unit, the analyzing unit is used for analyzing useful data in the web page downloaded by the downloading unit and storing the useful data in an object, the memory unit is connected to the analyzing unit, the memory unit is used for performing persistence processing on the object analyzed by the analyzing unit, the database unit is connected to the memory unit, and the database unit is used for storing the data processed by the memory unit.
Furthermore, the crawler management module comprises a state display unit and a crawler control unit, wherein the state display unit is used for displaying the specific state in the crawler process, and the crawler control unit is used for controlling the starting, pausing and crawling interval time of the crawler; the storage module comprises a computer hard disk.
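The download/parse/persist flow of the crawling module can be sketched as follows. This is a minimal illustration, not the patent's implementation: the fetcher and the store are injected so the sketch needs no network, and the naive regex stands in for the analyzing unit's "useful data" extraction.

```python
import json
import re

def parse_image_urls(html):
    """Analyzing unit: pull candidate image URLs out of a downloaded page.

    A naive regex stands in for real useful-data extraction."""
    return re.findall(r'<img[^>]+src="([^"]+)"', html)

def crawl(urls, fetch, store):
    """Download -> parse -> persist, mirroring the patent's four units.

    `fetch` plays the downloading unit (a real crawler might use urllib)
    and `store` plays the memory/database units; both are injected so the
    sketch stays self-contained."""
    for url in urls:
        html = fetch(url)                                     # downloading unit
        images = parse_image_urls(html)                       # analyzing unit
        record = json.dumps({"page": url, "images": images})  # memory unit
        store(record)                                         # database unit
```

The crawler management module's start/pause/interval controls would wrap this loop; they are omitted here.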
As shown in fig. 4, the acquiring the image to be classified includes the following steps:
step S121: acquiring image information to be classified through a camera module;
step S122: and storing the acquired image information to be classified to a storage module.
As shown in fig. 5, the camera module includes a CMOS sensor, a motorized zoom lens, a zoom driving motor, an LED lamp set, a relay, a brightness sensor, an infrared sensor, and a pan-tilt, where the CMOS sensor, the motorized zoom lens, the zoom driving motor, the LED lamp set, the relay, the brightness sensor, and the infrared sensor are all mounted on the pan-tilt, the CMOS sensor is used to collect image information, the motorized zoom lens is connected to the CMOS sensor, the zoom driving motor is used to drive the motorized zoom lens to zoom, the relay is connected to the LED lamp set, the relay is used to drive the LED lamp set to illuminate, the brightness sensor is used to obtain ambient brightness information, and the infrared sensor is used to detect whether a moving target exists in a monitoring area.
As shown in fig. 6, the preprocessing the acquired image includes preprocessing the acquired image to be trained, and the preprocessing the acquired image to be trained includes the following steps:
step S211: carrying out normalization processing on the collected images to be trained so as to obtain images to be trained of the same specification;
step S212: setting a feature point threshold, acquiring the number of feature points of each image to be trained, and deleting the images to be trained whose feature point count is greater than the threshold;
step S213: carrying out feature binarization on the remaining images to be trained through a fuzzy mean value binarization algorithm to separate the target to be detected from the background;
step S214: calculating the proportion of the target to be detected in the whole image, and deleting the images to be trained in which this proportion is smaller than a preset proportion value.
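The screening in steps S212–S214 amounts to two per-image filters. The sketch below applies them to precomputed statistics; the threshold values and the helper name are illustrative, and the comparison directions follow the patent's wording (images with more feature points than the threshold are deleted).

```python
def screen_training_images(stats, kp_threshold=500, min_fg_ratio=0.05):
    """Apply the two screening rules to precomputed per-image statistics.

    `stats` maps image name -> (feature_point_count, foreground_ratio),
    where the foreground ratio would come from the fuzzy-mean binarisation
    step. Per the patent's wording, images with MORE feature points than
    the threshold are deleted, as are images whose target occupies too
    small a proportion of the whole image. Threshold values are examples.
    """
    kept = []
    for name, (keypoints, fg_ratio) in stats.items():
        if keypoints > kp_threshold:
            continue  # over the feature-point threshold: deleted
        if fg_ratio < min_fg_ratio:
            continue  # target too small relative to the whole image: deleted
        kept.append(name)
    return kept
```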
As shown in fig. 7, the preprocessing the acquired image further includes preprocessing the acquired image to be classified, and the preprocessing the acquired image to be classified includes the following steps:
step S221: carrying out distortion correction processing on the image to be classified through a convolutional neural network algorithm;
step S222: carrying out noise reduction treatment on the image to be classified after distortion correction treatment through a median filtering algorithm;
step S223: and carrying out image graying processing on the image to be classified after the noise reduction processing.
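Steps S222–S223 can be illustrated with plain NumPy (a real system would more likely use OpenCV's `medianBlur` and `cvtColor`); the functions below are hypothetical stand-ins.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter on a single-channel image (edge pixels left
    unchanged), a stand-in for the median-filtering noise-reduction step."""
    out = img.astype(float).copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y, x] = np.median(img[y - 1:y + 2, x - 1:x + 2])
    return out

def to_gray(rgb):
    """Image graying via ITU-R BT.601 luma weights, one common choice."""
    return rgb @ np.array([0.299, 0.587, 0.114])
```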
As shown in fig. 8, the making of the data set using the preprocessed image to be trained includes the following steps:
step S31: carrying out a reduction operation on the preprocessed images to be trained through the OpenCV library resize function;
step S32: dividing the image to be trained after the reduction operation into two parts according to a preset proportion, wherein one part is used as a training data set, and the other part is used as a test data set;
step S33: the training data set is packed into a training TFRecords picture set, and the test data set is packed into a test TFRecords picture set.
In this embodiment, the reduction operation includes the following steps: reading the preprocessed image to be trained through the OpenCV library; setting clipping vertex values for the image read by the OpenCV library; and clipping the image read by the OpenCV library according to the clipping vertex values through the resize function.
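The split in step S32 can be sketched as below; the 80/20 proportion and the fixed seed are examples, not values fixed by the patent, and the TFRecords packing of step S33 is only hinted at in the docstring since it requires TensorFlow.

```python
import random

def split_dataset(images, train_fraction=0.8, seed=0):
    """Shuffle and split images into training and test parts by a preset
    proportion (0.8 here is an example, not a value fixed by the patent).

    The patent then packs each part into a TFRecords picture set; with
    TensorFlow installed that would look roughly like:
        with tf.io.TFRecordWriter("train.tfrecords") as writer:
            writer.write(example.SerializeToString())
    """
    items = list(images)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    cut = int(train_fraction * len(items))
    return items[:cut], items[cut:]
```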
As shown in fig. 9, the training of the target classification model includes the following steps:
step S41: training a target detection network through an SSD algorithm and a data set made by using the preprocessed image to be trained;
step S42: and training the target classification network by a convolutional neural network algorithm and a data set made by using the preprocessed image to be trained.
In this embodiment, the target detection network includes a feature extraction module, a transposed convolution module, a fusion convolution module, and a convolution predictor, where the transposed convolution module is connected to the feature extraction module, the fusion convolution module is connected to the transposed convolution module, and the convolution predictor is connected to the fusion convolution module.
In this embodiment, the target classification network includes a general convolution module, a Gabor convolution module, a concatenation module, a dimension-increasing operation module, a summation operation module, a global average pooling layer, and a softmax classifier, the concatenation module is connected with the general convolution module and the Gabor convolution module, the dimension-increasing operation module is connected with the concatenation module, the summation operation module is connected with the dimension-increasing operation module, the global average pooling layer is connected with the summation operation module, and the softmax classifier is connected with the global average pooling layer.
As shown in fig. 10, the processing the pre-processed image to be classified by using the target classification model includes inputting the pre-processed image to be classified into a target detection network for target detection, and the inputting the pre-processed image to be classified into the target detection network for target detection includes the following steps:
step S511: performing feature extraction on the preprocessed image to be classified through a feature extraction module to obtain a feature map;
step S512: performing transposed convolution on the lower layer of the feature extraction module, and calculating the resolution of the low-level feature map obtained after the transposed convolution;
step S513: repeating the transposed-convolution process until the resolution of the low-level feature map of the feature extraction module is consistent with the resolution of the high-level feature map of the feature extraction module;
step S514: fusing the high-level feature map and the low-level feature map through a fusion convolution module, and inputting the feature map output by the fusion convolution module into a convolution predictor for prediction;
step S515: and selecting the best prediction result through a non-maximum suppression algorithm.
Further, the feature extraction module includes a plurality of pairs of a pooling layer and at least one convolution layer.
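Steps S512–S514 can be illustrated on a single-channel map: one common way to realise a stride-2 transposed convolution is zero insertion followed by an ordinary same-padded convolution, which doubles the resolution of a low-level feature map so it can be fused with the high-level one. The kernel and the element-wise fusion below are illustrative simplifications of the patent's modules, not its actual layers.

```python
import numpy as np

def transposed_conv2x(feat, kernel):
    """Stride-2 transposed convolution on one channel: insert zeros between
    pixels, then run an ordinary same-padded convolution. This doubles the
    resolution, which is how a low-level feature map can be brought up to
    the high-level map's size before fusion."""
    h, w = feat.shape
    up = np.zeros((2 * h, 2 * w))
    up[::2, ::2] = feat                       # zero insertion between pixels
    kh, kw = kernel.shape
    padded = np.pad(up, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(up)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

def fuse(upsampled_low, high):
    """Stand-in for the fusion convolution module: element-wise sum once
    the two maps share a resolution."""
    return upsampled_low + high
```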
As shown in fig. 11, the processing the pre-processed image to be classified by using the target classification model further includes inputting the target acquired through the target detection network into the target classification network for target classification, and the inputting the target acquired through the target detection network into the target classification network for target classification includes the following steps:
step S521: respectively extracting the characteristics of the target obtained by the target detection network through a common convolution module and a Gabor convolution module;
step S522: splicing the feature map output by the common convolution module and the feature map output by the Gabor convolution module to obtain a new feature map;
step S523: performing dimension increasing operation on the new feature map;
step S524: adding the feature map after the dimension is increased to a target obtained by the target detection network through a summation operation;
step S525: and summarizing results obtained by the summation operation through a global average pooling layer, and finishing a classification task through a softmax classifier.
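The Gabor branch of steps S521–S522 rests on Gabor kernels. A minimal sketch, assuming the standard real-valued Gabor parameterisation (the patent does not give its kernel parameters, so the defaults below are illustrative):

```python
import numpy as np

def gabor_kernel(size=7, sigma=2.0, theta=0.0, lam=4.0, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel (same parameterisation as, e.g., OpenCV's
    getGaborKernel). A bank of these at several orientations would feed
    the Gabor convolution branch alongside the ordinary convolution branch."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / lam + psi)
    return envelope * carrier

def concat_branches(ordinary_feat, gabor_feat):
    """Splicing step S522: stack the two branches' feature maps along a new
    channel axis, ready for the dimension-raising operation of S523."""
    return np.stack([ordinary_feat, gabor_feat], axis=0)
```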
As shown in fig. 12, displaying the target classification result includes displaying the result through a display module and sending it to a remote terminal for display. The display module includes an LCD touch display screen, a physical button and a power indicator; the physical button and the power indicator are connected to the LCD touch display screen; the physical button is used to turn the LCD touch display screen on and off, and the power indicator is used to indicate the power connection state of the LCD touch display screen.
In this embodiment, the power indicator is a dual-color LED indicator lamp: it shows green when the LCD touch display screen is connected to power, and red when the power is cut off.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes to the embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (10)

1. A target classification method facing complex scenes is characterized by comprising the following steps:
collecting an image to be trained and an image to be classified;
preprocessing the acquired image;
making a data set by using the preprocessed image to be trained;
training a target classification model;
processing the preprocessed image to be classified by using a target classification model;
and displaying the target classification result.
2. The method for classifying the target facing the complex scene according to claim 1, wherein the step of acquiring the image to be trained comprises the following steps: crawling image information containing complex scenes from a plurality of websites through a crawling module; the crawling process is managed through a crawler management module; the crawled image information is stored through a storage module, the crawling module comprises a downloading unit, an analyzing unit, a memory unit and a database unit, the downloading unit is used for downloading a webpage to be crawled, the analyzing unit is connected with the downloading unit, the analyzing unit is used for analyzing useful data in the webpage downloaded by the downloading unit and storing the useful data in an object, the memory unit is connected with the analyzing unit, the memory unit is used for carrying out persistence processing on the object analyzed by the analyzing unit, the database unit is connected with the memory unit, and the database unit is used for storing the data processed by the memory unit; the crawler management module comprises a state display unit and a crawler control unit, wherein the state display unit is used for displaying the specific state of the crawling process, and the crawler control unit is used for controlling the starting, pausing and crawling interval time of the crawler; the storage module comprises a computer hard disk.
3. The method for classifying complex scene-oriented objects according to claim 1, wherein the acquiring of the images to be classified includes acquiring image information to be classified by a camera module and storing the acquired image information to be classified in a storage module, the camera module includes a CMOS sensor, a motorized zoom lens, a zoom driving motor, an LED lamp set, a relay, a brightness sensor, an infrared sensor and a pan-tilt, the CMOS sensor, the motorized zoom lens, the zoom driving motor, the LED lamp set, the relay, the brightness sensor and the infrared sensor are all mounted on the pan-tilt, the CMOS sensor is used for acquiring image information, the motorized zoom lens is connected with the CMOS sensor, the zoom driving motor is used for driving the motorized zoom lens to zoom, the relay is connected with the LED lamp set, and the relay is used for driving the LED lamp set to illuminate, the brightness sensor is used for acquiring environment brightness information, and the infrared sensor is used for detecting whether a moving target exists in a monitoring area.
4. The complex scene-oriented object classification method according to claim 1, wherein the preprocessing the collected image includes preprocessing the collected images to be trained, and the preprocessing the collected images to be trained includes the following steps: carrying out normalization processing on the collected images to be trained so as to obtain images to be trained of the same specification; setting a feature point threshold, acquiring the number of feature points of each image to be trained, and deleting the images to be trained whose feature point count is greater than the threshold; carrying out feature binarization on the remaining images to be trained through a fuzzy mean value binarization algorithm to separate the target to be detected from the background; and calculating the proportion of the target to be detected in the whole image, and deleting the images to be trained in which this proportion is smaller than a preset proportion value.
5. The complex scene-oriented object classification method according to claim 4, wherein the preprocessing of the collected images further comprises preprocessing the collected images to be classified, which comprises the following steps: performing distortion correction on the images to be classified through a convolutional neural network algorithm; performing noise reduction on the distortion-corrected images through a median filtering algorithm; and performing image graying on the noise-reduced images.
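Of the three steps in claim 5, median filtering and graying are standard operations; a numpy-only sketch of both is below (the CNN-based distortion correction is model-specific and omitted). The kernel size and the BT.601 luminance weights are assumptions, as the claim fixes neither.

```python
import numpy as np

def median_filter(img, k=3):
    """k x k median filter for impulse-noise reduction;
    borders are handled by replicate padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return np.median(windows, axis=(-2, -1))

def to_gray(rgb):
    """Image graying: weighted luminance sum over the RGB channels
    using the common ITU-R BT.601 coefficients."""
    return rgb @ np.array([0.299, 0.587, 0.114])
```

In the claimed pipeline these would run in order: correction, then `median_filter`, then `to_gray`.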
6. The complex scene-oriented object classification method according to claim 1, wherein making a data set from the preprocessed images to be trained comprises the following steps: reducing the preprocessed images to be trained through the OpenCV library's resize function; dividing the reduced images into two parts according to a preset proportion, one part serving as a training data set and the other as a test data set; and packing the training data set into a training TFRecords picture set and the test data set into a test TFRecords picture set.
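The dividing step of claim 6 can be sketched as below. The 80/20 ratio, seed, and file names are assumptions; the subsequent TFRecords packing (conventionally done with TensorFlow's `tf.io.TFRecordWriter`) is omitted to keep the sketch dependency-free.

```python
import random

def split_dataset(paths, train_ratio=0.8, seed=0):
    """Deterministically shuffle the image paths and split them by a
    preset proportion into a training part and a test part."""
    paths = sorted(paths)          # stable order before shuffling
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]
```

Each returned list would then be serialized into its own TFRecords picture set.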
7. The complex scene-oriented object classification method according to claim 1, wherein the training of the object classification model comprises the following steps: training a target detection network through an SSD algorithm on the data set made from the preprocessed images to be trained; and training a target classification network through a convolutional neural network algorithm on the data set made from the preprocessed images to be trained.
8. The complex scene-oriented object classification method according to claim 1, wherein the processing of the preprocessed image to be classified with the object classification model comprises inputting the preprocessed image to be classified into a target detection network for target detection, which comprises the following steps: extracting features from the preprocessed image to be classified through a feature extraction module to obtain a feature map; applying a transposed convolution to a lower layer of the feature extraction module and computing the resolution of the resulting low-level feature map; repeating the transposed convolution until the resolution of the low-level feature map matches the resolution of the high-level feature map of the feature extraction module; fusing the high-level feature map and the low-level feature map through a fusion convolution module; inputting the feature map output by the fusion convolution module into a convolution predictor for prediction; and selecting the best prediction result through a non-maximum suppression algorithm, wherein the feature extraction module comprises a plurality of pairs each consisting of a pooling layer and at least one convolution layer.
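The final selection step of claim 8 is conventionally implemented as greedy non-maximum suppression over the predictor's boxes and scores; a minimal numpy sketch follows, with the IoU threshold value assumed.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    remaining box and drop every box that overlaps it too much."""
    order = np.argsort(scores)[::-1]   # indices, best score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        mask = np.array([iou(boxes[i], boxes[j]) <= thresh for j in rest],
                        dtype=bool)
        order = rest[mask]
    return keep
```

The indices in `keep` select the surviving predictions of the convolution predictor.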
9. The complex scene-oriented object classification method according to claim 8, wherein the processing of the preprocessed image to be classified with the object classification model further comprises inputting the target obtained by the target detection network into a target classification network for target classification, which comprises the following steps: extracting features of the target obtained by the target detection network through an ordinary convolution module and a Gabor convolution module, respectively; concatenating the feature map output by the ordinary convolution module with the feature map output by the Gabor convolution module to obtain a new feature map; performing a dimension-raising operation on the new feature map; adding the dimension-raised feature map to the target obtained by the target detection network through a summation operation; and summarizing the results of the summation operation through a global average pooling layer and completing the classification task through a softmax classifier.
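The Gabor convolution module of claim 9 builds on Gabor filters, whose kernels are a Gaussian envelope modulated by a sinusoidal carrier. A minimal sketch of generating an orientation bank of such kernels is below; the kernel size, sigma, wavelength, and number of orientations are illustrative assumptions, not values from the patent.

```python
import numpy as np

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0, psi=0.0, gamma=0.5):
    """Real part of a Gabor filter: Gaussian envelope times cosine carrier,
    with orientation theta and wavelength lam."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / lam + psi)

# A bank of orientations, as a Gabor convolution module would convolve with:
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

In the claimed network these kernels would serve as (possibly fixed) convolution weights alongside the ordinary convolution module's learned weights.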
10. The complex scene-oriented target classification method according to claim 1, wherein the displaying of the target classification result comprises displaying the target classification result through a display module and sending it to a remote terminal for display; the display module comprises an LCD touch display screen, a physical button and a power indicator, the physical button and the power indicator being connected with the LCD touch display screen; the physical button is used for turning the LCD touch display screen on and off, and the power indicator is used for indicating the power connection state of the LCD touch display screen.
CN202111081734.3A 2021-09-15 2021-09-15 Target classification method for complex scene Pending CN113850308A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111081734.3A CN113850308A (en) 2021-09-15 2021-09-15 Target classification method for complex scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111081734.3A CN113850308A (en) 2021-09-15 2021-09-15 Target classification method for complex scene

Publications (1)

Publication Number Publication Date
CN113850308A true CN113850308A (en) 2021-12-28

Family

ID=78974133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081734.3A Pending CN113850308A (en) 2021-09-15 2021-09-15 Target classification method for complex scene

Country Status (1)

Country Link
CN (1) CN113850308A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115586506A (en) * 2022-12-13 2023-01-10 南京慧尔视智能科技有限公司 Anti-interference target classification method and device

Similar Documents

Publication Publication Date Title
CN109154976B (en) System and method for training object classifier through machine learning
CN109766779B (en) Loitering person identification method and related product
CN110517246B (en) Image processing method and device, electronic equipment and storage medium
CN110084165B (en) Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation
CN106845352B (en) Pedestrian detection method and device
Anderson et al. Combination of anomaly algorithms and image features for explosive hazard detection in forward looking infrared imagery
KR102035592B1 (en) A supporting system and method that assist partial inspections of suspicious objects in cctv video streams by using multi-level object recognition technology to reduce workload of human-eye based inspectors
CN111898581A (en) Animal detection method, device, electronic equipment and readable storage medium
CN110458126B (en) Pantograph state monitoring method and device
JP4999794B2 (en) Still region detection method and apparatus, program and recording medium
CN112084826A (en) Image processing method, image processing apparatus, and monitoring system
CN116091883A (en) Target detection and identification method, system and storage medium based on multi-source information fusion
CN115471487A (en) Insulator defect detection model construction and insulator defect detection method and device
CN111401239B (en) Video analysis method, device, system, equipment and storage medium
CN113850308A (en) Target classification method for complex scene
CN116310922A (en) Petrochemical plant area monitoring video risk identification method, system, electronic equipment and storage medium
CN108932465B (en) Method and device for reducing false detection rate of face detection and electronic equipment
CN114120242A (en) Monitoring video behavior analysis method, system and terminal based on time sequence characteristics
CN113486856A (en) Driver irregular behavior detection method based on semantic segmentation and convolutional neural network
KR102230559B1 (en) Method and Apparatus for Creating Labeling Model with Data Programming
Lazarevic-McManus et al. Performance evaluation in visual surveillance using the F-measure
CN117423157A (en) Mine abnormal video action understanding method combining migration learning and regional invasion
CN116071557A (en) Long tail target detection method, computer readable storage medium and driving device
JPH0620049A (en) Intruder identification system
KR102342495B1 (en) Method and Apparatus for Creating Labeling Model with Data Programming

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination