WO2021203618A1 - Image sample generation method and system, and target detection method - Google Patents

Image sample generation method and system, and target detection method

Info

Publication number
WO2021203618A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
fused
pixel
target
images
Prior art date
Application number
PCT/CN2020/113998
Other languages
English (en)
French (fr)
Inventor
李一清
周凯
Original Assignee
浙江啄云智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江啄云智能科技有限公司
Priority to US17/910,346 (published as US20230162342A1)
Publication of WO2021203618A1


Classifications

    • G06T 7/0004: Industrial image inspection
    • G06T 7/0008: Industrial image inspection checking presence/absence
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 5/70: Denoising; Smoothing
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/454: Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/56: Extraction of image or video features relating to colour
    • G06V 10/751: Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T 2207/10116: X-ray image
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20221: Image fusion; Image merging
    • G06T 2207/30108: Industrial image inspection

Definitions

  • The present disclosure relates to the field of security inspection technology, for example, to an image sample generation method and system, and a target detection method.
  • X-rays are electromagnetic radiation with a wavelength shorter than that of visible light. They penetrate solids and liquids more strongly than visible light and can even pass through steel plates of a certain thickness.
  • When X-rays pass through articles, internal structures of different material composition, density and thickness absorb them to different degrees: the greater the density and thickness, the more radiation is absorbed; the smaller the density and thickness, the less.
  • The pixel value of the generated image reflects the density of the object, so the intensity of the rays transmitted through an article reveals its internal structure.
  • Usually, to make the material composition of the inspected object more intuitive, the system colorizes the transmission security image: organic matter is rendered orange, inorganic matter blue, and mixtures green.
  • The exact color depth depends on how strongly the object absorbs X-rays: the higher the absorption, the darker the color; the lower the absorption, the lighter. The collected X-ray images therefore show not only shape characteristics but also material-dependent colors, and both properties can be exploited for analysis and recognition when identifying articles.
  • Radiation imaging is the mainstream technology in the security inspection systems widely used in many countries. This technology irradiates the inspected object with rays (such as X-rays).
  • A radiographic image of the inspected object is then computed from the signals received by the detectors.
  • A security inspector examines the X-ray image and judges, from the shapes and color bands of common contraband, whether suspicious prohibited items appear in it. This manual interpretation is inefficient, has a high missed-detection rate, and carries high labor costs.
  • Deep learning has made breakthrough progress in classification, recognition, detection, segmentation and tracking in the field of computer vision.
  • Deep convolutional neural networks, trained on big data, learn useful features from large amounts of data and offer high speed, high accuracy and low cost.
  • A large part of why deep learning outperforms traditional methods is that it is built on large amounts of data; in the security inspection field in particular, deep learning requires a great deal of data.
  • The mainstream way to cope with this dependence on data is augmentation, but blindly increasing the amount of data does not by itself improve the model's detection performance. Hard-example samples affected by external factors, such as the placement angle of the target and the background environment, are also needed to reproduce security images of real scenes; only by training the detection network on such samples can the accuracy and recall of contraband detection be improved, which further increases the cost of collecting and annotating data.
  • Annotated sample data is mainly produced by collecting large numbers of real images on site and then labeling them manually.
  • Acquiring many real on-site images is difficult, and manual labeling suffers from low efficiency, high labor cost, strong dependence on human factors and low accuracy, making it hard to generate the large amount of annotated data required to train a model in a short time.
  • The invention patents with application numbers "CN201910228142.6" and "CN201911221349.7" provide methods for simulating realistic samples for hard cases. In practice, these existing methods still suffer from complex algorithms, inflexible application to different scenarios, and sample quality that needs improvement.
  • The present disclosure provides an image sample generation method and system and a target detection method, which solve the problems that deep learning training samples are difficult to collect and annotate and are needed in large quantities; a simple algorithm quickly provides effective training samples for contraband detection and adapts flexibly to target detection tasks in different scenarios.
  • The present disclosure provides an image sample generation method, including:
  • determining images to be fused, where the images to be fused include at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, where N ≥ 2 and N is an integer;
  • fusing the size-normalized images to be fused to form a new sample.
  • The fusion rule is pixel-wise: for each pixel (i, j, k) of the new sample, when the N pixels at that position in the N images to be fused all satisfy a_mean[j][k] ≥ δ, the pixel value of (i, j, k) is given by a first fusion formula; when at least one of the N corresponding pixels does not satisfy a_mean[j][k] ≥ δ, the pixel value of (i, j, k) is given by a second fusion formula (both formulas are reproduced only as images in the original publication). Here δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] denotes the pixel gray value at row j, column k of each size-normalized image to be fused; and a_norm[i][j][k] denotes the normalized pixel gray value of the i-th feature layer at row j, column k of each size-normalized image to be fused.
  • The operations of determining the images to be fused, normalizing their sizes, and fusing the size-normalized images to form a new sample are performed multiple times, until a preset number of new samples is obtained as the sample set for training.
  • The present disclosure also provides an image sample generation system, including: a scene data generation module, a target data generation module, a data preprocessing module, a preprocessing module for images to be fused, an image fusion module, and a sample library generation module, wherein:
  • the scene data generation module is configured to perform scene composition analysis on the items to be inspected at the security inspection site and, according to the analysis, obtain real-shot security inspection images of target scenes in corresponding proportions;
  • the target data generation module is configured to obtain annotated target security inspection images captured by security inspection equipment;
  • the preprocessing module for images to be fused is configured to determine the images to be fused, where the images to be fused include at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, N ≥ 2 and N an integer, and to normalize the sizes of the images to be fused;
  • the image fusion module is configured to fuse the size-normalized images to be fused to form a new sample, with the same pixel-wise rule as above: for each pixel (i, j, k) of the new sample, when the N corresponding pixels all satisfy a_mean[j][k] ≥ δ, the pixel value is given by the first fusion formula; when at least one corresponding pixel does not satisfy a_mean[j][k] ≥ δ, the pixel value is given by the second fusion formula (both reproduced only as images in the original). Here δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] denotes the pixel gray value at row j, column k of each size-normalized image to be fused; and a_norm[i][j][k] denotes the gray value of the i-th feature layer at row j, column k;
  • the sample library generation module is configured to perform the operations of determining the images to be fused, normalizing their sizes, and fusing them to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.
  • The present disclosure also provides a target detection method, including:
  • the preset target detection model is obtained by training on image samples generated with the above image sample generation method;
  • the detection result of the security inspection image is determined, where the detection result includes information such as the type and location of contraband.
  • FIG. 1 is a flowchart of a deep-learning-based image sample generation method according to an embodiment of the present invention;
  • FIG. 2 is an X-ray image obtained by a sample generation method according to an embodiment of the present invention;
  • FIG. 3 is an X-ray image obtained by photographing a real article according to an embodiment of the present invention.
  • Prohibited items (contraband): articles that may not be manufactured, purchased, used, possessed, stored, transported, imported or exported without permission, such as weapons, ammunition and explosives (e.g. dynamite, detonators, fuses).
  • Security inspection images: images acquired by security inspection equipment.
  • The security inspection equipment or machines involved in this disclosure are not limited to X-ray security inspection equipment.
  • Any security inspection equipment and/or machine that performs inspection by imaging, for example terahertz imaging equipment, falls within the scope of this disclosure.
  • The deep-learning-based image sample generation method proposed in the present disclosure includes: S1: obtaining real-shot security inspection images of target scenes to form a scene data set.
  • Target scenes include the items that need security inspection, such as luggage, express parcels, bags and goods, in places such as airports, railway stations, bus stations, government office buildings, embassies, conference centers, convention centers, hotels, shopping malls, large events, post offices, schools, the logistics industry, industrial inspection, and express transfer yards.
  • If the target is contraband (such as guns or explosives), the target scene is the container in which the contraband sits, that is, any place that can be used to hold contraband.
  • In one embodiment, no target is included in the target scene.
  • The type of scene is related to the place: luggage is the main scene in places such as airports and railway stations, while the scene corresponding to an express transfer yard is the express parcel.
  • Scenes also differ between transfer yards in different geographic locations: at an express transfer yard in Haining the scene is typically parcels containing clothing, while at one in Kunshan parcels containing electronic devices are the majority.
  • Different scenes produce different imaging effects. Taking X-ray security inspection equipment as an example: X-rays are electromagnetic radiation with a wavelength shorter than that of visible light, penetrate solids and liquids more strongly than visible light, and can even pass through steel plates of a certain thickness.
  • When X-rays pass through articles, internal structures of different material composition, density and thickness absorb them to different degrees: the greater the density and thickness, the more radiation is absorbed; the smaller the density and thickness, the less.
  • The pixel value of the generated image reflects the density of the object, so the intensity of the rays transmitted through an article reveals its internal structure.
  • Usually, to make the material composition more intuitive, the system colorizes the transmission security image: organic matter is rendered orange, inorganic matter blue, and mixtures green.
  • The exact color depth depends on how strongly the object absorbs X-rays: the higher the absorption, the darker the color; the lower the absorption, the lighter. The collected X-ray images therefore show both shape characteristics and material-dependent colors, which can be exploited for analysis and recognition when identifying articles.
  • It follows that the scene data, indispensable in the samples of this disclosure, can be selected with emphasis according to the place. For example, for a contraband detection network serving a transfer yard whose main cargo is clothing, data with clothing-filled express parcels as the scene are used as samples during training, or such samples are produced with the method of this disclosure. Therefore, when acquiring real-shot security inspection images of target scenes, a scene composition analysis is performed on the items to be inspected at the security inspection site, and target scene images are selected in corresponding proportions.
  • Security inspection images can be acquired with X-ray security inspection equipment or other equipment such as terahertz security inspection equipment. This embodiment does not limit the type or model of the equipment, as long as it can be used for security inspection and can produce security inspection images.
  • The target images are likewise captured by security inspection equipment.
  • No scene is set for the target, and the security inspection image contains only the target.
  • In the security inspection field, contraband is the general term for the targets in the embodiments of the present invention.
  • Annotators label each target so that it becomes an annotated target; the annotation includes the target's bounding rectangle and its type. The more target data, the better.
  • The augmentation method includes geometric transformation operations and/or pixel transformation operations.
  • Geometric transformations include one or more of rotation, scaling, and cropping; annotation information must be transformed synchronously during a geometric transformation. Pixel transformations include one or more of noise addition, blur transformation, perspective transformation, brightness adjustment, and contrast adjustment.
  • Rotation: the image is rotated clockwise or counterclockwise by a certain angle, reducing the probability that an inclined image causes recognition failure.
  • Scaling: when an image sample is produced by matting, a scaling ratio is input; an image of the scaled size is cut from the original image and then compressed back to the original size.
  • Cropping: cropping the matted image samples reduces the probability that missing or occluded image content causes recognition failure.
  • Noise addition: a noise matrix is generated from a mean and a Gaussian covariance and added to the original image matrix, after which the validity of the resulting pixel values is checked.
  • Blur transformation: implemented with OpenCV's blur function, i.e. a blurred block is added to the original image.
  • Perspective: the four corner points of the original image are mapped to four new points according to an input perspective ratio, and every pixel of the original image is then warped using the mapping defined by these four point correspondences.
  • Brightness and contrast: implemented by adjusting the red-green-blue (RGB) value of each pixel (see the sketch below).
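The pixel-transformation operations above map onto a few standard OpenCV/NumPy calls. Below is a minimal sketch, not the patent's implementation: the function names, kernel sizes, and parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

def add_gaussian_noise(img, mean=0.0, sigma=8.0):
    # Noise matrix generated from a mean and Gaussian variance, added to the
    # image matrix; clipping enforces the validity of resulting pixel values.
    noise = np.random.normal(mean, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def blur_block(img, x, y, w, h, ksize=9):
    # OpenCV's blur function applied to one region, i.e. a blurred block
    # is added to the original image.
    out = img.copy()
    out[y:y + h, x:x + w] = cv2.blur(out[y:y + h, x:x + w], (ksize, ksize))
    return out

def perspective(img, ratio=0.05):
    # Map the four corners to four new points and warp the whole image with
    # the homography defined by the four point correspondences.
    h, w = img.shape[:2]
    d = ratio * min(w, h)
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = np.float32([[d, d], [w - d, 0], [w, h - d], [0, h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(img, M, (w, h))

def brightness_contrast(img, alpha=1.1, beta=10):
    # Adjust the RGB value of each pixel: out = alpha * in + beta.
    return cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
```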
  • The data in S1 and S2 are preprocessed; the processing includes, but is not limited to, one or more of pixel gray value processing, denoising, background subtraction, and artifact removal.
  • The pixel gray values of the i-th feature layer in the data of S1 and S2 are respectively processed as follows (the formula is reproduced only as an image in the original publication; from the variable definitions it is normalization by the theoretical maximum; a code sketch follows the channel description below):
  • a_norm[i] = a[i] / MAX_PIXEL_VAL[i], where i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing; a[i] is the pixel gray value of the i-th feature layer before processing; and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer.
  • In one embodiment, a feature layer is a color channel: for example, the first feature layer is the red (R) channel, the second the green (G) channel, and the third the blue (B) channel; the correspondence between feature-layer index and color channel is not limited herein.
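As a sketch of the per-layer gray value processing, assuming the three feature layers are the R, G, and B channels of an 8-bit image, so that each MAX_PIXEL_VAL[i] = 255 (an assumption for illustration):

```python
import numpy as np

# Theoretical maximum gray value per feature layer; 8-bit RGB assumed here.
MAX_PIXEL_VAL = np.array([255.0, 255.0, 255.0])

def normalize_layers(img):
    # a_norm[i] = a[i] / MAX_PIXEL_VAL[i] per feature layer (channel),
    # mapping gray values into [0, 1] so the threshold 0 < delta < 1 applies.
    return img.astype(np.float32) / MAX_PIXEL_VAL
```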
  • S3: Determine the images to be fused, where the images to be fused include at least one real-shot security inspection image of a target scene and at least one target image; the number of images to be fused is denoted N, an integer greater than or equal to 2. In one implementation, one image is selected from the scene data set and one from the target data set, i.e. N = 2.
  • S4: The selected images are normalized in size.
  • The X-ray images taken may be identical or different, and their sizes may be the same or different; all cases fall within the scope of this disclosure.
  • The length and width of the normalized image are set by the size of the minimum bounding rectangle of the images to be fused. Taking two X-ray images with sizes (w_1, h_1) and (w_2, h_2) as an example, w_new = max(w_1, w_2) and h_new = max(h_1, h_2). Each image is size-normalized by filling its newly added area with the background color, so that the target in the original image is unchanged; the background color is related to the device that captured the X-ray image and can be adjusted according to the image.
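A minimal sketch of this size normalization, assuming padding is applied on the right and bottom and the background is near-white (both assumptions; the patent only requires that the newly added area be background-colored so the original targets are unchanged):

```python
import numpy as np

def normalize_sizes(images, background=(255, 255, 255)):
    # w_new = max over widths, h_new = max over heights: the minimum
    # bounding rectangle of all images to be fused.
    h_new = max(img.shape[0] for img in images)
    w_new = max(img.shape[1] for img in images)
    out = []
    for img in images:
        canvas = np.full((h_new, w_new, 3), background, dtype=img.dtype)
        # Keep the original pixels in place so targets and their annotation
        # boxes are unchanged; only the newly added area is background.
        canvas[:img.shape[0], :img.shape[1]] = img
        out.append(canvas)
    return out
```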
  • S5: The images obtained in S4 are fused to form a new sample. The fusion method is as follows:
  • at each pixel where the corresponding pixels of all N images satisfy a_mean[j][k] ≥ δ, the new sample's pixel value is given by the first fusion formula; at the remaining pixels, the new sample's pixel value is given by the second fusion formula (both formulas are reproduced only as images in the original publication). Here δ is the background color threshold, 0 < δ < 1; l denotes the l-th image; ā_l (an image in the original) is the pixel gray value of each size-normalized image to be fused; a_mean[j][k] is the gray value of the pixel at row j, column k; and a_norm[i][j][k] is the gray value of the i-th feature layer at row j, column k.
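The two fusion formulas are reproduced only as images in this publication, so the sketch below substitutes a common choice for composing X-ray transmission images: multiplying the normalized values where any input is non-background, and averaging where all inputs are background. This choice and the value δ = 0.9 are stated assumptions, not the patent's exact formulas; a_mean[j][k] is taken here as the channel-mean gray value.

```python
import numpy as np

def fuse(images_norm, delta=0.9):
    """Pixel-wise fusion of N size-normalized images with values in [0, 1].

    ASSUMPTION: the patent's two fusion formulas are only reproduced as
    images, so this sketch stands in for them with a common X-ray
    composition rule: multiply normalized values where any input is
    non-background, average where all inputs are background.
    """
    stack = np.stack(images_norm)                    # (N, H, W, 3)
    a_mean = stack.mean(axis=3)                      # per-image gray value a_mean[j][k]
    all_background = (a_mean >= delta).all(axis=0)   # condition of the first case
    fused_overlap = stack.prod(axis=0)               # darker where objects overlap
    fused_background = stack.mean(axis=0)            # keep the background clean
    return np.where(all_background[..., None], fused_background, fused_overlap)
```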
  • S6: Iterate S3, S4, and S5 several times until enough samples are obtained as the sample set for training.
  • The composition of the images to be fused can be targeted to the place, consistent with the idea of step S1 of this embodiment.
  • For example, for detection network samples used at an airport, the proportion of real-shot target scene images among the images to be fused is chosen according to the actual daily scene proportions, e.g. 60% large luggage and 30% bags as target scenes.
  • The deep-learning-based image sample generation method of Embodiment 1 obtains real-shot security inspection images of target scenes based on an analysis of the security inspection site, obtains annotated target images, determines the images to be fused, and processes them with the new fusion algorithm to obtain new samples. It requires neither shooting large numbers of target images in real scenes on site nor manually annotating images captured in such complex environments.
  • The algorithm is simple and can flexibly and quickly generate site-specific new sample images with high realism and high annotation accuracy, providing model training with a large amount of usable annotated sample data and solving the sample collection problem in the contraband recognition field for hard-to-obtain items such as pistols and explosives. The new sample obtained by the method of this embodiment (FIG. 2) is almost identical to the actually captured image containing the detection target (FIG. 3) and appears even more realistic as a color image, thereby improving the efficiency and accuracy of target detection tasks in intelligent security inspection performed with deep learning.
  • Embodiment 2: A deep-learning-based image sample generation system, including: a scene data set, a target data set, a preprocessing module, a preprocessing module for images to be fused, an image fusion module, and a generated sample library.
  • The scene data set consists of the real-shot security inspection images of target scenes described in Embodiment 1, and the target data set consists of the annotated target images described in Embodiment 1.
  • The real-shot security inspection images and the target images are X-ray images of articles, which can be captured with X-ray security inspection equipment; the articles include luggage, express parcels, and bulky goods.
  • The data in the scene data set and the target data set are preprocessed; the processing includes, but is not limited to, one or more of pixel gray value processing, denoising, background subtraction, and artifact removal.
  • The pixel gray values of the i-th feature layer in the scene data set and the target data set are respectively processed as follows (the formula is an image in the original; as above, it is normalization by the theoretical maximum): a_norm[i] = a[i] / MAX_PIXEL_VAL[i],
  • where i = 0, 1, 2 (zero-based here); a_norm[i] is the pixel gray value of the i-th feature layer after processing; a[i] is the pixel gray value of the i-th feature layer before processing; and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer.
  • Data augmentation is performed on the images in the scene data set and the target data set, and the augmented images also become components of the respective data sets;
  • the augmentation method includes geometric transformation operations and/or pixel transformation operations.
  • Geometric transformations include one or more of rotation, scaling, and cropping;
  • pixel transformations include one or more of noise addition, blur transformation, perspective transformation, brightness adjustment, and contrast adjustment.
  • Rotation: the image is rotated clockwise or counterclockwise by a certain angle, reducing the probability that an inclined image causes recognition failure.
  • Scaling: when an image sample is produced by matting, a scaling ratio is input; an image of the scaled size is cut from the original image and then compressed back to the original size.
  • Cropping: cropping the matted image samples reduces the probability that missing or occluded image content causes recognition failure.
  • Noise addition: a noise matrix is generated from a mean and a Gaussian covariance and added to the original image matrix, after which the validity of the resulting pixel values is checked.
  • Blur transformation: implemented with OpenCV's blur function, i.e. a blurred block is added to the original image.
  • Perspective: the four corner points of the original image are mapped to four new points according to an input perspective ratio, and every pixel of the original image is then warped using the mapping defined by these four point correspondences.
  • Brightness and contrast: implemented by adjusting the RGB value of each pixel.
  • The preprocessing module for images to be fused is configured to arbitrarily select at least one image from the scene data set and at least one image from the target data set and to normalize their sizes.
  • The size normalization module is configured to take N (N ≥ 2) X-ray images arbitrarily from the original samples each time for size normalization; the images may be identical or different, and their sizes may be the same or different, all within the protection scope of this disclosure.
  • The required sample quantity and quality are reached by repeatedly drawing arbitrary images.
  • The length and width of the normalized image are set by the size of the minimum bounding rectangle of the images to be fused. Taking two X-ray images per draw as an example, w_new = max(w_1, w_2) and h_new = max(h_1, h_2), where the lengths and widths of the two X-ray images are (w_1, h_1) and (w_2, h_2).
  • The background color used to fill newly added areas is related to the device that captured the X-ray image and can be adjusted according to the image.
  • The image fusion module is configured to fuse every pixel position of the images produced by the preprocessing module for images to be fused, as follows:
  • at each pixel where the corresponding pixels of all N images satisfy a_mean[j][k] ≥ δ, the new sample's pixel value is given by the first fusion formula; at the remaining pixels, by the second fusion formula (both reproduced only as images in the original). Here δ is the background color threshold, 0 < δ < 1; l denotes the l-th image; ā_l is the pixel gray value of each size-normalized image to be fused; a_mean[j][k] is the gray value of the pixel at row j, column k; and a_norm[i][j][k] is the gray value of the i-th feature layer at row j, column k.
  • The generated sample library contains the sample images produced by the image fusion module.
  • The number of sample images in the generated sample library is determined by the number of executions of the preprocessing module, the preprocessing module for images to be fused, and the image fusion module.
  • Embodiment 3: Corresponding to the above deep-learning-based image sample generation method, an embodiment of the present invention also provides a target detection method, including:
  • Step 1: Acquire a security inspection image of the article and preprocess it; the preprocessing includes, but is not limited to, one or more of image normalization, denoising, background subtraction, and artifact removal.
  • The image is normalized to a preset size;
  • 500*500 is used as an example.
  • A Gaussian smoothing algorithm is used to denoise the image.
  • The value of each point in the Gaussian-smoothed image is a weighted average of the point itself and the other pixel values in its neighborhood. For example, a template scans every pixel of the image, and the weighted average gray value of the pixels in the neighborhood determined by the template replaces the value of the template's center pixel.
  • After Gaussian smoothing, fine noise in the image is removed; although edge information is weakened to some extent, edges are still preserved relative to the noise.
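The template-based weighted averaging described here is a standard Gaussian filter; in the sketch below, the 5x5 kernel and sigma = 1.5 are illustrative assumptions.

```python
import cv2

def denoise(img):
    # Each output pixel is the Gaussian-weighted average of the pixels in the
    # neighborhood covered by the template (kernel), replacing the center value.
    return cv2.GaussianBlur(img, (5, 5), sigmaX=1.5)
```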
  • Step 2: Extract image features of the preprocessed image with a preset convolutional neural network.
  • Step 3: Obtain the target region of the security inspection image with the preset target detection model; the preset target detection model is trained with the image samples obtained by the method of Embodiment 1 of the present invention.
  • The training process of the preset target detection model mainly includes the following steps:
  • 1. Collect image samples obtained with the method of Embodiment 1 and build a training data set. 2. The preset deep learning network model includes a feature extraction module, a target detection network, and a loss calculation module; the preset feature extraction module and target detection network are both convolutional neural network models. 3. Train the feature extraction module and the target detection network on the training data set to obtain a trained deep learning target detection model.
  • The training process includes: inputting the image samples obtained with the method of Embodiment 1 into the feature extraction module for feature extraction to obtain image features; inputting the image features into the target detection network to obtain candidate predictions for the image; inputting the candidate predictions into the loss calculation module to compute the loss function; and training the preset deep learning target detection model by gradient backpropagation, as sketched below.
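A minimal sketch of this training loop, using PyTorch for illustration (the patent does not name a framework); the tiny backbone, single-box detection head, and loss weighting are placeholder assumptions standing in for the feature extraction module, target detection network, and loss calculation module.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    # Stands in for the feature extraction module (a CNN).
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, x):
        return self.net(x)

class DetectionHead(nn.Module):
    # Stands in for the target detection network; predicts one candidate
    # (contraband type + bounding rectangle) per image for brevity.
    def __init__(self, num_classes=4):
        super().__init__()
        self.cls = nn.Linear(32, num_classes)
        self.box = nn.Linear(32, 4)

    def forward(self, feats):
        return self.cls(feats), self.box(feats)

extractor, head = FeatureExtractor(), DetectionHead()
params = list(extractor.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=1e-3)
cls_loss, box_loss = nn.CrossEntropyLoss(), nn.SmoothL1Loss()

# One illustrative step on generated samples: image features -> candidate
# predictions -> loss calculation -> gradient backpropagation.
images = torch.rand(8, 3, 500, 500)   # fused, normalized samples
labels = torch.randint(0, 4, (8,))    # annotated contraband types
boxes = torch.rand(8, 4)              # annotated bounding rectangles
logits, pred_boxes = head(extractor(images))
loss = cls_loss(logits, labels) + box_loss(pred_boxes, boxes)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```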
  • Step 4: Output the detection result of the security inspection image, including information such as the type and location of contraband.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

An image sample generation method and system for deep learning, and a target detection method. The image sample generation method includes: performing scene composition analysis on items to be inspected at a security inspection site; obtaining, according to the scene composition analysis, real-shot security inspection images of target scenes in corresponding proportions; obtaining annotated target security inspection images, the target security inspection images being captured by security inspection equipment; processing the pixel gray values of the i-th feature layer of the real-shot security inspection images and of the target security inspection images respectively; determining images to be fused; normalizing the sizes of the images to be fused; fusing the size-normalized images to form new samples; and performing the operations of determining images to be fused, normalizing their sizes, and fusing the size-normalized images to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.

Description

Image sample generation method and system, and target detection method
This application claims priority to Chinese patent application No. 202010267813.2, filed with the Chinese Patent Office on April 8, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of security inspection technology, for example, to an image sample generation method and system, and a target detection method.
Background
X-rays are electromagnetic radiation with a wavelength shorter than that of visible light. They penetrate solids and liquids more strongly than visible light and can even pass through steel plates of a certain thickness. When X-rays pass through articles, internal structures of different material composition, density and thickness absorb them to different degrees: the greater the density and thickness, the more radiation is absorbed; the smaller the density and thickness, the less. The pixel value of the generated image reflects the density of the object, so the intensity of the rays transmitted through an article reveals its internal structure. Usually, to make the material composition of the inspected object more intuitive, the system colorizes the transmission security image: organic matter is rendered orange, inorganic matter blue, and mixtures green. The exact color depth depends on how strongly the object absorbs X-rays: the higher the absorption, the darker the color; the lower the absorption, the lighter. The collected X-ray images therefore show not only shape characteristics but also material-dependent colors, and both properties can be exploited for analysis and recognition when identifying articles. Radiation imaging is the mainstream technology in the security inspection systems widely used in many countries: the inspected object is irradiated with rays (such as X-rays), and a radiographic image is computed from the signals received by the detectors. A security inspector then examines the X-ray image and judges, from the shapes and color bands of common contraband, whether suspicious prohibited items appear in it. This manual interpretation is inefficient, has a high missed-detection rate, and carries high labor costs.
With the continuous development of artificial intelligence, deep learning has made breakthrough progress in classification, recognition, detection, segmentation, and tracking in computer vision. Compared with traditional machine vision methods, deep convolutional neural networks trained on big data learn useful features from large amounts of data and offer high speed, high accuracy, and low cost. A large part of why deep learning outperforms traditional methods is that it is built on large amounts of data; in the security inspection field in particular, deep learning requires a great deal of data. The mainstream way to cope with this dependence on data is augmentation, but blindly increasing the amount of data does not by itself improve the model's detection performance: hard-example samples affected by external factors, such as the placement angle of the target and the background environment, are also needed to reproduce security images of real scenes, and only by training the detection network on such samples can the accuracy and recall of contraband detection be improved, which further increases the cost of collecting and annotating data.
Annotated sample data is mainly produced by collecting large numbers of real images on site and then labeling them manually. On the one hand, acquiring many real on-site images is difficult; on the other hand, manual labeling suffers from low efficiency, high labor cost, strong dependence on human factors, and low accuracy, so the large amount of annotated data needed to train a model is hard to generate in a short time. To address these problems, the invention patents with application numbers "CN201910228142.6" and "CN201911221349.7" provide methods for simulating realistic samples for hard cases; in practice, these existing methods still suffer from complex algorithms, inflexible application to different scenarios, and sample quality that needs improvement.
Summary
The present disclosure provides an image sample generation method and system and a target detection method, which solve the problems that deep learning training samples are difficult to collect and annotate and are needed in large quantities; a simple algorithm quickly provides effective training samples for contraband detection and adapts flexibly to target detection tasks in different scenarios.
The present disclosure provides an image sample generation method, including:
performing scene composition analysis on items to be inspected at a security inspection site;
obtaining, according to the scene composition analysis, real-shot security inspection images of target scenes in corresponding proportions;
obtaining annotated target security inspection images, the target security inspection images being captured by security inspection equipment;
processing the pixel gray value of the i-th feature layer of the real-shot security inspection images and the pixel gray value of the i-th feature layer of the target security inspection images respectively as follows:
a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
(the formula is reproduced as image PCTCN2020113998-appb-000001 in the original publication; the division form above follows from the variable definitions)
where i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing; a[i] is the pixel gray value of the i-th feature layer before processing; and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer;
determining images to be fused, where the images to be fused include at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, N ≥ 2 and N an integer;
normalizing the sizes of the images to be fused;
fusing the size-normalized images to be fused to form a new sample, the fusion method being as follows: for each pixel (i, j, k) of the new sample, when the N corresponding pixels in the N images to be fused all satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a first fusion formula (image PCTCN2020113998-appb-000002 in the original); when at least one of the N corresponding pixels in the N images to be fused does not satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a second fusion formula (image PCTCN2020113998-appb-000003 in the original); where δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] (image PCTCN2020113998-appb-000004 in the original) denotes the pixel gray value at row j, column k of each size-normalized image to be fused; a_norm[i][j][k] denotes the pixel gray value of the i-th feature layer at row j, column k of each size-normalized image to be fused; 1 ≤ j ≤ the maximum number of rows of each size-normalized image to be fused; and 1 ≤ k ≤ the maximum number of columns of each size-normalized image to be fused;
performing the operations of determining images to be fused, normalizing the sizes of the images to be fused, and fusing the size-normalized images to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.
The present disclosure also provides an image sample generation system, including: a scene data generation module, a target data generation module, a data preprocessing module, a preprocessing module for images to be fused, an image fusion module, and a sample library generation module, wherein:
the scene data generation module is configured to perform scene composition analysis on items to be inspected at a security inspection site and to obtain, according to the scene composition analysis, real-shot security inspection images of target scenes in corresponding proportions;
the target data generation module is configured to obtain annotated target security inspection images, the target security inspection images being captured by security inspection equipment;
the data preprocessing module is configured to process the pixel gray value of the i-th feature layer of the real-shot security inspection images and the pixel gray value of the i-th feature layer of the target security inspection images respectively as follows:
a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
(image PCTCN2020113998-appb-000005 in the original; the division form follows from the variable definitions)
where i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing, a[i] is the pixel gray value of the i-th feature layer before processing, and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer;
the preprocessing module for images to be fused is configured to determine images to be fused, where the images to be fused include at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, N ≥ 2 and N an integer, and to normalize the sizes of the images to be fused;
the image fusion module is configured to fuse the size-normalized images to be fused to form a new sample, the fusion method being as follows: for each pixel (i, j, k) of the new sample, when the N corresponding pixels in the N images to be fused all satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a first fusion formula (image PCTCN2020113998-appb-000006 in the original); when at least one of the N corresponding pixels in the N images to be fused does not satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a second fusion formula (image PCTCN2020113998-appb-000007 in the original); where δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] (image PCTCN2020113998-appb-000008 in the original) denotes the pixel gray value at row j, column k of each size-normalized image to be fused; a_norm[i][j][k] denotes the pixel gray value of the i-th feature layer at row j, column k of each size-normalized image to be fused; 1 ≤ j ≤ the maximum number of rows of each size-normalized image to be fused; and 1 ≤ k ≤ the maximum number of columns of each size-normalized image to be fused;
the sample library generation module is configured to perform the operations of determining images to be fused, normalizing the sizes of the images to be fused, and fusing the size-normalized images to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.
The present disclosure also provides a target detection method, including:
acquiring a security inspection image of an article and preprocessing the security inspection image;
extracting image features of the preprocessed security inspection image with a preset convolutional neural network;
inputting the image features into a preset target detection model to obtain the target region of the security inspection image, where the preset target detection model is trained with image samples obtained by the above image sample generation method;
determining, from the obtained target region of the security inspection image, the detection result of the security inspection image, where the detection result includes information such as the type and location of contraband.
Brief Description of the Drawings
FIG. 1 is a flowchart of a deep-learning-based image sample generation method according to an embodiment of the present invention;
FIG. 2 is an X-ray image obtained by a sample generation method according to an embodiment of the present invention;
FIG. 3 is an X-ray image obtained by photographing a real article according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments are described below with reference to the drawings of the embodiments; the described embodiments are only some, not all, of the embodiments.
First, terms involved in one or more embodiments of the present disclosure are explained.
Contraband: articles that, under the law, may not be manufactured, purchased, used, possessed, stored, transported, imported or exported without permission, such as weapons, ammunition, and explosives (e.g. dynamite, detonators, fuses).
Security inspection image: an image acquired by security inspection equipment. The security inspection equipment or machines involved in this disclosure are not limited to X-ray security inspection equipment; any security inspection equipment and/or machine that can perform inspection by imaging, for example terahertz imaging equipment, falls within the scope of protection of this disclosure.
Embodiment 1:
As shown in FIG. 1, the deep-learning-based image sample generation method proposed by the present disclosure includes:
S1: Obtain real-shot security inspection images of target scenes to form a scene data set.
Target scenes include the items that need security inspection, such as luggage, express parcels, bags and goods, appearing in places such as airports, railway stations, bus stations, government office buildings, embassies, conference centers, convention centers, hotels, shopping malls, large events, post offices, schools, the logistics industry, industrial inspection, and express transfer yards. If the target is contraband (for example guns or explosives), the target scene is the container in which the contraband sits, that is, any place that can be used to hold contraband. In one embodiment, the target scene contains no target. Usually the type of scene is related to the place: luggage is the main scene in places such as airports and railway stations, while the scene corresponding to an express transfer yard is the express parcel. As a general phenomenon, even among express transfer yards, scenes differ with geographic location: at a transfer yard in Haining the scene is typically parcels containing clothing, whereas at one in Kunshan parcels containing electronic devices are the majority.
Different scenes produce different imaging effects. Taking X-ray security inspection equipment as an example, the principle is as follows: X-rays are electromagnetic radiation with a wavelength shorter than that of visible light, penetrate solids and liquids more strongly than visible light, and can even pass through steel plates of a certain thickness. When X-rays pass through articles, internal structures of different material composition, density and thickness absorb them to different degrees: the greater the density and thickness, the more radiation is absorbed; the smaller the density and thickness, the less. The pixel value of the generated image reflects the density of the object, so the intensity of the rays transmitted through an article reveals its internal structure. Usually, to make the material composition of the inspected object more intuitive, the system colorizes the transmission security image: organic matter is rendered orange, inorganic matter blue, and mixtures green. The exact color depth depends on how strongly the object absorbs X-rays: the higher the absorption, the darker the color; the lower the absorption, the lighter. The collected X-ray images therefore show both shape characteristics and material-dependent colors, which can be exploited for analysis and recognition when identifying articles.
From the above description of target scenes and imaging effects, it follows that the scene data, indispensable in the samples of this disclosure, can be selected with emphasis according to the place. For example, for a contraband detection network serving a transfer yard whose main cargo is clothing, data with clothing-filled express parcels as the scene are used as samples during training, or such samples are produced with the method of this disclosure. Therefore, when acquiring real-shot security inspection images of target scenes, a scene composition analysis is performed on the items to be inspected at the security inspection site, and target scene images are selected in corresponding proportions.
Security inspection images can be acquired with X-ray security inspection equipment or other equipment such as terahertz security inspection equipment; this embodiment does not limit the type or model of the equipment, as long as it can be used for security inspection and can produce security inspection images.
S2: Obtain annotated target images to form a target data set.
There are one or more types of targets, and one or more targets. Target images are also captured by security inspection equipment; no scene is set for the target, and the security inspection image contains only the target. As an example, in the security inspection field, contraband is the general term for the targets in the embodiments of the present invention. Annotators label each target so that it becomes an annotated target; the annotation includes the target's bounding rectangle and its type. The more target data, the better.
Data augmentation may also be applied to the images in S1 and S2 of this embodiment before they are merged into the scene data set and the target data set respectively; the augmentation method includes geometric transformation operations and/or pixel transformation operations. Geometric transformations include one or more of rotation, scaling, and cropping; annotation information must be transformed synchronously during a geometric transformation. Pixel transformations include one or more of noise addition, blur transformation, perspective transformation, brightness adjustment, and contrast adjustment. Rotation: the image is rotated clockwise or counterclockwise by a certain angle, reducing the probability that an inclined image causes recognition failure. Scaling: when an image sample is produced by matting, a scaling ratio is input; an image of the scaled size is cut from the original image and then compressed back to the original size. Cropping: cropping the matted image samples reduces the probability that missing or occluded content causes recognition failure. Noise addition: a noise matrix is generated from a mean and a Gaussian covariance and added to the original image matrix, after which the validity of the resulting pixel values is checked. Blur transformation: implemented with OpenCV's blur function, i.e. a blurred block is added to the original image. Perspective: the four corner points of the original image are mapped to four new points according to an input perspective ratio, and every pixel of the original image is then warped using the mapping defined by these four point correspondences. Brightness and contrast: implemented by adjusting the red-green-blue (RGB) value of each pixel.
As an embodiment of the present invention, the data in S1 and S2 are preprocessed; the processing includes, but is not limited to, one or more of pixel gray value processing, denoising, background subtraction, and artifact removal. As an embodiment of the present invention, the pixel gray values of the i-th feature layer in the data of S1 and S2 are respectively processed as follows:
a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
(reproduced as image PCTCN2020113998-appb-000009 in the original; the division form follows from the variable definitions)
where i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing; a[i] is the pixel gray value of the i-th feature layer before processing; MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer.
In one embodiment, a feature layer is a color channel. For example, the first feature layer is the red (R) channel, the second the green (G) channel, and the third the blue (B) channel; the correspondence between feature-layer index and color channel is not limited herein.
S3: Determine the images to be fused, where the images to be fused include at least one real-shot security inspection image of a target scene and at least one target image; the number of images to be fused is denoted N, an integer greater than or equal to 2. As an implementation of this embodiment, one image is selected from the scene data set and one from the target data set, i.e. N = 2.
S4: Normalize the sizes of the images to be fused.
As an embodiment of the present invention, the selected images are size-normalized; the at least two X-ray images may be identical or different, and their sizes may be the same or different, all within the protection scope of this disclosure.
The length and width of the size-normalized image are set by the size of the minimum bounding rectangle of the images to be fused. Taking two X-ray images as an example, w_new = max(w_1, w_2) and h_new = max(h_1, h_2), where the lengths and widths of the two X-ray images are (w_1, h_1) and (w_2, h_2). Each image is size-normalized by filling its newly added area with the background color, so that the target in the original image is unchanged; the background color is related to the device that captured the X-ray image and can be adjusted according to the image.
S5: Fuse the images obtained in S4 to form a new sample.
The fusion method is as follows:
when, at pixel (i, j, k) of the new sample, the corresponding pixels of all N images satisfy a_mean[j][k] ≥ δ, the pixel value is given by a first fusion formula (image PCTCN2020113998-appb-000010 in the original);
at the remaining pixels, the pixel value of the new sample is given by a second fusion formula (image PCTCN2020113998-appb-000011 in the original);
where δ is the background color threshold, 0 < δ < 1; l is the l-th image; ā_l (image PCTCN2020113998-appb-000012 in the original) is the pixel gray value of each size-normalized image to be fused; a_mean[j][k] is the gray value of the pixel at row j, column k; and a_norm[i][j][k] is the gray value of the pixel of the i-th feature layer at row j, column k.
S6: Iterate S3, S4, and S5 multiple times until enough samples are obtained as the sample set for training.
The composition of the images to be fused can be targeted according to the place, consistent with the idea of step S1 of this embodiment. For example, for detection network samples used at an airport, the proportion of real-shot target scene images among the images to be fused is chosen according to the actual daily scene proportions, e.g. 60% large luggage and 30% bags as target scenes.
The deep-learning-based image sample generation method of Embodiment 1 obtains real-shot security inspection images of target scenes based on an analysis of the security inspection site, obtains annotated target images, determines the images to be fused, and processes them with the new fusion algorithm to obtain new samples. It requires neither shooting large numbers of target images in real scenes on site nor manually annotating images captured in such complex environments. With a simple algorithm it can flexibly and quickly generate site-specific new sample images with high realism and high annotation accuracy, providing model training with a large amount of usable annotated sample data and solving the sample collection problem in the contraband recognition field for hard-to-obtain items such as pistols and explosives. Comparison shows that the new sample obtained with the method of this embodiment (FIG. 2) is almost identical to the actually captured image containing the detection target (FIG. 3) and appears even more realistic as a color image, with high realism and high annotation accuracy; this provides model training with a large amount of usable annotated sample data and thereby improves the efficiency and accuracy of target detection tasks in intelligent security inspection performed with deep learning methods.
Embodiment 2: A deep-learning-based image sample generation system, including: a scene data set, a target data set, a preprocessing module, a preprocessing module for images to be fused, an image fusion module, and a generated sample library.
The scene data set consists of the real-shot security inspection images of target scenes described in Embodiment 1, and the target data set consists of the annotated target images described in Embodiment 1.
The real-shot security inspection images and the target images are X-ray images of articles, which can be captured with X-ray security inspection equipment; the articles include luggage, express parcels, bulky goods, and the like.
As an embodiment of the present invention, the data in the scene data set and the target data set are preprocessed; the processing includes, but is not limited to, one or more of pixel gray value processing, denoising, background subtraction, and artifact removal. As an embodiment of the present invention, the pixel gray values of the i-th feature layer in the scene data set and the target data set are respectively processed as follows:
a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
(image PCTCN2020113998-appb-000013 in the original; the division form follows from the variable definitions)
where i = 0, 1, 2; a_norm[i] is the pixel gray value of the i-th feature layer after processing; a[i] is the pixel gray value of the i-th feature layer before processing; MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer.
In one embodiment, data augmentation is performed on the images in the scene data set and the target data set, and the augmented images are also components of the respective data sets; the augmentation method includes geometric transformation operations and/or pixel transformation operations.
In an exemplary embodiment, the geometric transformations include one or more of rotation, scaling, and cropping; the pixel transformations include one or more of noise addition, blur transformation, perspective transformation, brightness adjustment, and contrast adjustment. Rotation: the image is rotated clockwise or counterclockwise by a certain angle, reducing the probability that an inclined image causes recognition failure. Scaling: when an image sample is produced by matting, a scaling ratio is input; an image of the scaled size is cut from the original image and then compressed back to the original size. Cropping: cropping the matted image samples reduces the probability that missing or occluded content causes recognition failure. Noise addition: a noise matrix is generated from a mean and a Gaussian covariance and added to the original image matrix, after which the validity of the resulting pixel values is checked. Blur transformation: implemented with OpenCV's blur function, i.e. a blurred block is added to the original image. Perspective: the four corner points of the original image are mapped to four new points according to an input perspective ratio, and every pixel of the original image is then warped using the mapping defined by these four point correspondences. Brightness and contrast: implemented by adjusting the RGB value of each pixel.
The preprocessing module for images to be fused is configured to arbitrarily select at least one image from the scene data set and at least one image from the target data set and to normalize their sizes.
The size normalization module is configured to take N (N ≥ 2) X-ray images arbitrarily from the original samples each time for size normalization; the at least two X-ray images may be identical or different, and their sizes may be the same or different, all within the protection scope of this disclosure. In this embodiment, the required sample quantity and quality are reached by repeatedly drawing arbitrary images.
The length and width of the size-normalized image are set by the size of the minimum bounding rectangle of the images to be fused. Taking two X-ray images per draw as an example, w_new = max(w_1, w_2) and h_new = max(h_1, h_2), where the lengths and widths of the two X-ray images are (w_1, h_1) and (w_2, h_2). Each image is size-normalized by filling its newly added area with the background color, so that the target in the original image is unchanged; the background color is related to the device that captured the X-ray image and can be adjusted according to the image.
The image fusion module is configured to fuse every pixel position of the images produced by the preprocessing module for images to be fused, as follows:
when, at pixel (i, j, k) of the new sample, the corresponding pixels of all N images satisfy a_mean[j][k] ≥ δ, the pixel value is given by the first fusion formula (image PCTCN2020113998-appb-000014 in the original);
at the remaining pixels, the pixel value of the new sample is given by the second fusion formula (image PCTCN2020113998-appb-000015 in the original);
where δ is the background color threshold, 0 < δ < 1; l is the l-th image; ā_l (image PCTCN2020113998-appb-000016 in the original) is the pixel gray value of each size-normalized image to be fused; a_mean[j][k] is the gray value of the pixel at row j, column k; and a_norm[i][j][k] is the gray value of the pixel of the i-th feature layer at row j, column k.
The generated sample library includes the sample images produced by the image fusion module.
The number of sample images in the generated sample library is determined by the number of executions of the preprocessing module, the preprocessing module for images to be fused, and the image fusion module.
Embodiment 3: Corresponding to the above deep-learning-based image sample generation method, an embodiment of the present invention also provides a target detection method, including:
Step 1: Acquire a security inspection image of the article and preprocess the image; the preprocessing includes, but is not limited to, one or more of image normalization, denoising, background subtraction, and artifact removal.
The image is normalized to a preset size; 500*500 is used as an example in this embodiment.
A Gaussian smoothing algorithm is used to denoise the image. The value of each point in the Gaussian-smoothed image is a weighted average of the point itself and the other pixel values in its neighborhood. For example, a template scans every pixel of the image, and the weighted average gray value of the pixels in the neighborhood determined by the template replaces the value of the template's center pixel. After Gaussian smoothing, fine noise on the image is removed; although edge information is weakened to some extent, edges are still preserved relative to the noise. The background subtraction algorithm takes the median gray value of the whole image (500*500) as the background gray value and then computes, for each pixel, the absolute difference between its gray value and the background: I_sub = |I_fg - bg|, where bg is the median of the whole image. Since foreign-object points differ from the background gray value more than background points do, the absolute difference I_sub is treated as the likelihood that a pixel belongs to a foreign object: the larger the value, the more likely the corresponding pixel is a foreign-object point.
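A minimal sketch of this background subtraction, assuming a single-channel (grayscale) input; taking the global median as bg follows the text above:

```python
import numpy as np

def foreign_object_likelihood(gray):
    # bg is the median gray value of the whole (e.g. 500*500) image;
    # I_sub = |I_fg - bg| scores how likely each pixel is a foreign object.
    bg = np.median(gray)
    return np.abs(gray.astype(np.float32) - bg)
```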
Step 2: Extract image features of the preprocessed image with a preset convolutional neural network.
Step 3: Obtain the target region of the security inspection image with the preset target detection model; the preset target detection model is trained with the image samples obtained by the method of Embodiment 1 of the present invention.
The training process of the preset target detection model mainly includes the following steps:
1. Collect the image samples obtained with the method of Embodiment 1 of the present invention and build a training data set. 2. The preset deep learning network model includes a feature extraction module, a target detection network, and a loss calculation module; the preset feature extraction module and target detection network are both convolutional neural network models. 3. Train the feature extraction module and the target detection network on the training data set to obtain a trained deep learning target detection model.
The training process includes: inputting the image samples obtained with the method of Embodiment 1 of the present invention into the feature extraction module for feature extraction to obtain image features; inputting the image features into the target detection network model to obtain candidate predictions for the image; inputting the candidate predictions into the loss calculation module to compute the loss function; and training the preset deep learning target detection model with the gradient backpropagation algorithm.
Step 4: Output the detection result of the security inspection image, including information such as the type and location of contraband.
For simplicity of description, the foregoing method embodiments are expressed as a series of action combinations, but this application is not limited by the described order of actions, since according to this application some steps may be performed in other orders or simultaneously. Moreover, the embodiments described herein are all exemplary embodiments, and the actions and modules involved are not necessarily required by this application.
The descriptions of the embodiments each have their own emphasis; for parts not detailed in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Although the above embodiments are all applied to security inspection scenarios, it can be understood from the technical solution of the present disclosure that the solution can also be applied to scenarios other than security inspection in which images are acquired using the X-ray principle, for example lesion detection and analysis in medical CT imaging examinations.

Claims (5)

  1. An image sample generation method, comprising:
    performing scene composition analysis on items to be inspected at a security inspection site;
    obtaining, according to the scene composition analysis, real-shot security inspection images of target scenes in corresponding proportions;
    obtaining annotated target security inspection images, the target security inspection images being captured by security inspection equipment;
    processing the pixel gray value of the i-th feature layer of the real-shot security inspection images and the pixel gray value of the i-th feature layer of the target security inspection images respectively as follows:
    a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
    (formula reproduced as image PCTCN2020113998-appb-100001 in the original; the division form follows from the variable definitions)
    wherein i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing, a[i] is the pixel gray value of the i-th feature layer before processing, and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer;
    determining images to be fused, wherein the images to be fused comprise at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, N ≥ 2 and N an integer;
    normalizing the sizes of the images to be fused;
    fusing the size-normalized images to be fused to form a new sample, the fusion method being as follows: for each pixel (i, j, k) of the new sample, in the case that the N corresponding pixels in the N images to be fused all satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a first fusion formula (image PCTCN2020113998-appb-100002 in the original);
    in the case that at least one of the N corresponding pixels in the N images to be fused does not satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a second fusion formula (image PCTCN2020113998-appb-100003 in the original);
    wherein δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] (image PCTCN2020113998-appb-100004 in the original) denotes the pixel gray value at row j, column k of each size-normalized image to be fused; a_norm[i][j][k] denotes the pixel gray value of the i-th feature layer at row j, column k of each size-normalized image to be fused; 1 ≤ j ≤ the maximum number of rows of each size-normalized image to be fused; and 1 ≤ k ≤ the maximum number of columns of each size-normalized image to be fused;
    performing the operations of determining images to be fused, normalizing the sizes of the images to be fused, and fusing the size-normalized images to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.
  2. The method of claim 1, wherein in the target security inspection images there is at least one type of annotated target, and the number of targets is at least one.
  3. The method of claim 1, further comprising:
    after obtaining the real-shot security inspection images of target scenes in corresponding proportions, performing data augmentation on the real-shot security inspection images;
    and, after obtaining the annotated target security inspection images, performing data augmentation on the target security inspection images;
    wherein the data augmentation method comprises at least one of a geometric transformation operation and a pixel transformation operation.
  4. An image sample generation system, comprising: a scene data generation module, a target data generation module, a data preprocessing module, a preprocessing module for images to be fused, an image fusion module, and a sample library generation module, wherein:
    the scene data generation module is configured to perform scene composition analysis on items to be inspected at a security inspection site and to obtain, according to the scene composition analysis, real-shot security inspection images of target scenes in corresponding proportions;
    the target data generation module is configured to obtain annotated target security inspection images, the target security inspection images being captured by security inspection equipment;
    the data preprocessing module is configured to process the pixel gray value of the i-th feature layer of the real-shot security inspection images and the pixel gray value of the i-th feature layer of the target security inspection images respectively as follows:
    a_norm[i] = a[i] / MAX_PIXEL_VAL[i]
    (formula reproduced as image PCTCN2020113998-appb-100005 in the original; the division form follows from the variable definitions)
    wherein i = 1, 2, 3; a_norm[i] is the pixel gray value of the i-th feature layer after processing, a[i] is the pixel gray value of the i-th feature layer before processing, and MAX_PIXEL_VAL[i] is the theoretical maximum gray value of the i-th feature layer;
    the preprocessing module for images to be fused is configured to determine images to be fused, wherein the images to be fused comprise at least one real-shot security inspection image and at least one target security inspection image, the number of images to be fused being denoted N, N ≥ 2 and N an integer, and to normalize the sizes of the images to be fused;
    the image fusion module is configured to fuse the size-normalized images to be fused to form a new sample, the fusion method being as follows: for each pixel (i, j, k) of the new sample, in the case that the N corresponding pixels in the N images to be fused all satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a first fusion formula (image PCTCN2020113998-appb-100006 in the original);
    in the case that at least one of the N corresponding pixels in the N images to be fused does not satisfy a_mean[j][k] ≥ δ, the pixel value of the pixel (i, j, k) is given by a second fusion formula (image PCTCN2020113998-appb-100007 in the original);
    wherein δ is the background color threshold, 0 < δ < 1; l denotes the l-th image, 1 ≤ l ≤ N; ā_l[j][k] (image PCTCN2020113998-appb-100008 in the original) denotes the pixel gray value at row j, column k of each size-normalized image to be fused; a_norm[i][j][k] denotes the pixel gray value of the i-th feature layer at row j, column k of each size-normalized image to be fused; 1 ≤ j ≤ the maximum number of rows of each size-normalized image to be fused; and 1 ≤ k ≤ the maximum number of columns of each size-normalized image to be fused;
    the sample library generation module is configured to perform the operations of determining images to be fused, normalizing the sizes of the images to be fused, and fusing the size-normalized images to form new samples multiple times, until a preset number of new samples is obtained as the sample set for training.
  5. A target detection method, comprising:
    acquiring a security inspection image of an article and preprocessing the security inspection image;
    extracting image features of the preprocessed security inspection image with a preset convolutional neural network;
    inputting the image features into a preset target detection model to obtain a target region of the security inspection image, wherein the preset target detection model is trained with image samples obtained by the image sample generation method of any one of claims 1-3;
    determining, from the obtained target region of the security inspection image, a detection result of the security inspection image, wherein the detection result comprises type and location information of contraband.
PCT/CN2020/113998 2020-04-08 2020-09-08 Image sample generation method and system, and target detection method WO2021203618A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/910,346 US20230162342A1 (en) 2020-04-08 2020-09-08 Image sample generating method and system, and target detection method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010267813.2 2020-04-08
CN202010267813.2A CN111145177B (zh) 2020-04-08 2020-04-08 图像样本生成方法、特定场景目标检测方法及其系统

Publications (1)

Publication Number Publication Date
WO2021203618A1 true WO2021203618A1 (zh) 2021-10-14

Family

ID=70528817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113998 WO2021203618A1 (zh) Image sample generation method and system, and target detection method

Country Status (3)

Country Link
US (1) US20230162342A1 (zh)
CN (1) CN111145177B (zh)
WO (1) WO2021203618A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495017A (zh) * 2022-04-14 2022-05-13 美宜佳控股有限公司 基于图像处理的地面杂物检测方法、装置、设备及介质
CN114648494A (zh) * 2022-02-28 2022-06-21 扬州市苏灵农药化工有限公司 基于工厂数字化的农药悬浮剂生产控制系统
CN114821194A (zh) * 2022-05-30 2022-07-29 深圳市科荣软件股份有限公司 一种设备运行状态识别方法及装置
CN115019112A (zh) * 2022-08-09 2022-09-06 威海凯思信息科技有限公司 基于图像的目标对象检测方法、装置及电子设备
CN115035352A (zh) * 2022-03-23 2022-09-09 成都智元汇信息技术股份有限公司 基于智能识图盒子性能的验证方法及系统
CN116740220A (zh) * 2023-08-16 2023-09-12 海马云(天津)信息技术有限公司 模型构建的方法和装置、照片生成方法和装置
CN117253144A (zh) * 2023-09-07 2023-12-19 建研防火科技有限公司 一种火灾风险分级管控方法
CN117689980A (zh) * 2024-02-04 2024-03-12 青岛海尔科技有限公司 构建环境识别模型的方法、识别环境的方法及装置、设备

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145177B (zh) * 2020-04-08 2020-07-31 浙江啄云智能科技有限公司 图像样本生成方法、特定场景目标检测方法及其系统
CN111539957B (zh) * 2020-07-07 2023-04-18 浙江啄云智能科技有限公司 一种用于目标检测的图像样本生成方法、系统及检测方法
CN111898659A (zh) * 2020-07-16 2020-11-06 北京灵汐科技有限公司 一种目标检测方法及系统
CN111709948B (zh) * 2020-08-19 2021-03-02 深兰人工智能芯片研究院(江苏)有限公司 容器瑕疵检测方法和装置
CN112001873B (zh) * 2020-08-27 2024-05-24 中广核贝谷科技有限公司 一种基于集装箱x射线图像的数据生成方法
CN112235476A (zh) * 2020-09-15 2021-01-15 南京航空航天大学 一种基于融合变异的测试数据生成方法
CN112488044A (zh) * 2020-12-15 2021-03-12 中国银行股份有限公司 图片处理方法及装置
CN112560698B (zh) * 2020-12-18 2024-01-16 北京百度网讯科技有限公司 图像处理方法、装置、设备和介质
CN115147671A (zh) * 2021-03-18 2022-10-04 杭州海康威视系统技术有限公司 对象识别模型的训练方法、装置及存储介质
CN116994002B (zh) * 2023-09-25 2023-12-19 杭州安脉盛智能技术有限公司 一种图像特征提取方法、装置、设备及存储介质
CN117372275A (zh) * 2023-11-02 2024-01-09 凯多智能科技(上海)有限公司 一种图像数据集扩充方法、装置及电子设备
CN117523341B (zh) * 2023-11-23 2024-06-21 中船(北京)智能装备科技有限公司 一种深度学习训练图像样本的生成方法、装置及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948565A (zh) * 2019-03-26 2019-06-28 浙江啄云智能科技有限公司 一种用于邮政业的违禁品不开箱检测方法
US10482603B1 (en) * 2019-06-25 2019-11-19 Artificial Intelligence, Ltd. Medical image segmentation using an integrated edge guidance module and object segmentation network
US20200090350A1 (en) * 2018-09-18 2020-03-19 Caide Systems, Inc. Medical image generation, localizaton, registration system
CN110910467A (zh) * 2019-12-03 2020-03-24 浙江啄云智能科技有限公司 一种x射线图像样本生成方法、系统及用途
CN111145177A (zh) * 2020-04-08 2020-05-12 浙江啄云智能科技有限公司 图像样本生成方法、特定场景目标检测方法及其系统

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10333912A (ja) * 1997-05-28 1998-12-18 Oki Electric Ind Co Ltd ファジイルール作成方法および装置
CN104463196B (zh) * 2014-11-11 2017-07-25 中国人民解放军理工大学 一种基于视频的天气现象识别方法
CN108932735B (zh) * 2018-07-10 2021-12-28 广州众聚智能科技有限公司 一种生成深度学习样本的方法
CN109948562B (zh) * 2019-03-25 2021-04-30 浙江啄云智能科技有限公司 一种基于x射线图像的安检系统深度学习样本生成方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200090350A1 (en) * 2018-09-18 2020-03-19 Caide Systems, Inc. Medical image generation, localizaton, registration system
CN109948565A (zh) * 2019-03-26 2019-06-28 浙江啄云智能科技有限公司 一种用于邮政业的违禁品不开箱检测方法
US10482603B1 (en) * 2019-06-25 2019-11-19 Artificial Intelligence, Ltd. Medical image segmentation using an integrated edge guidance module and object segmentation network
CN110910467A (zh) * 2019-12-03 2020-03-24 浙江啄云智能科技有限公司 一种x射线图像样本生成方法、系统及用途
CN111145177A (zh) * 2020-04-08 2020-05-12 浙江啄云智能科技有限公司 图像样本生成方法、特定场景目标检测方法及其系统

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114648494A (zh) * 2022-02-28 2022-06-21 扬州市苏灵农药化工有限公司 基于工厂数字化的农药悬浮剂生产控制系统
CN114648494B (zh) * 2022-02-28 2022-12-06 扬州市苏灵农药化工有限公司 基于工厂数字化的农药悬浮剂生产控制系统
CN115035352B (zh) * 2022-03-23 2023-08-04 成都智元汇信息技术股份有限公司 基于智能识图盒子性能的验证方法及系统
CN115035352A (zh) * 2022-03-23 2022-09-09 成都智元汇信息技术股份有限公司 基于智能识图盒子性能的验证方法及系统
CN114495017A (zh) * 2022-04-14 2022-05-13 美宜佳控股有限公司 基于图像处理的地面杂物检测方法、装置、设备及介质
CN114821194A (zh) * 2022-05-30 2022-07-29 深圳市科荣软件股份有限公司 一种设备运行状态识别方法及装置
CN114821194B (zh) * 2022-05-30 2023-07-25 深圳市科荣软件股份有限公司 一种设备运行状态识别方法及装置
CN115019112A (zh) * 2022-08-09 2022-09-06 威海凯思信息科技有限公司 基于图像的目标对象检测方法、装置及电子设备
CN116740220A (zh) * 2023-08-16 2023-09-12 海马云(天津)信息技术有限公司 模型构建的方法和装置、照片生成方法和装置
CN116740220B (zh) * 2023-08-16 2023-10-13 海马云(天津)信息技术有限公司 模型构建的方法和装置、照片生成方法和装置
CN117253144A (zh) * 2023-09-07 2023-12-19 建研防火科技有限公司 一种火灾风险分级管控方法
CN117253144B (zh) * 2023-09-07 2024-04-12 建研防火科技有限公司 一种火灾风险分级管控方法
CN117689980A (zh) * 2024-02-04 2024-03-12 青岛海尔科技有限公司 构建环境识别模型的方法、识别环境的方法及装置、设备
CN117689980B (zh) * 2024-02-04 2024-05-24 青岛海尔科技有限公司 构建环境识别模型的方法、识别环境的方法及装置、设备

Also Published As

Publication number Publication date
US20230162342A1 (en) 2023-05-25
CN111145177B (zh) 2020-07-31
CN111145177A (zh) 2020-05-12

Similar Documents

Publication Publication Date Title
WO2021203618A1 (zh) 图像样本生成方法及系统、目标检测方法
CN109948562B (zh) 一种基于x射线图像的安检系统深度学习样本生成方法
Jain An evaluation of deep learning based object detection strategies for threat object detection in baggage security imagery
CN109948565B (zh) 一种用于邮政业的违禁品不开箱检测方法
CN109284738B (zh) 不规则人脸矫正方法和系统
Gu et al. Automatic and robust object detection in x-ray baggage inspection using deep convolutional neural networks
Rogers et al. A deep learning framework for the automated inspection of complex dual-energy x-ray cargo imagery
CN110910467B (zh) 一种x射线图像样本生成方法、系统及用途
WO2019114145A1 (zh) 监控视频中人数检测方法及装置
CN111539957B (zh) 一种用于目标检测的图像样本生成方法、系统及检测方法
CN109635634A (zh) 一种基于随机线性插值的行人再识别数据增强方法
Xu et al. YOLO-MSFG: toward real-time detection of concealed objects in passive terahertz images
CN110189375A (zh) 一种基于单目视觉测量的图像目标识别方法
CN111539251B (zh) 一种基于深度学习的安检物品识别方法和系统
Chumuang et al. Analysis of X-ray for locating the weapon in the vehicle by using scale-invariant features transform
Zou et al. Dangerous objects detection of X-ray images using convolution neural network
Jaccard et al. Using deep learning on X-ray images to detect threats
CN112069907A (zh) 基于实例分割的x光机图像识别方法、装置及系统
Guo et al. Detection method of photovoltaic panel defect based on improved mask R-CNN
Zhu et al. AMOD-net: Attention-based multi-scale object detection network for X-ray baggage security inspection
CN117218672A (zh) 一种基于深度学习的病案文字识别方法及系统
Huang et al. Anchor-free weapon detection for x-ray baggage security images
CN116721294A (zh) 一种基于层次细粒度分类的图像分类方法
Marnissi et al. GAN-based vision Transformer for high-quality thermal image enhancement
CN110992324A (zh) 一种基于x射线图像的智能危险品检测方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20930140

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20930140

Country of ref document: EP

Kind code of ref document: A1