Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image synthesis method, an apparatus, a computer device, and a storage medium capable of improving the efficiency of sample image acquisition.
An image data synthesis method comprising:
acquiring an article image of a category to be fused;
randomly augmenting the article image to obtain an augmented article image;
acquiring a source image, wherein the source image comprises a parcel image;
segmenting the source image according to the parcel image, and determining a parcel position in the source image;
and performing fusion processing by using the parcel position and the augmented article image to generate a fused image.
In one embodiment, the performing fusion processing by using the parcel position and the augmented article image to generate a fused image includes:
determining a fusion position corresponding to the augmented article image by using the parcel position;
and performing fusion processing according to the fusion position and the augmented article image to generate a fused image.
In one embodiment, the determining the fusion position corresponding to the augmented article image by using the parcel position includes:
acquiring a fusion range within the parcel position, wherein the fusion range is obtained by performing statistics on the annotation information of the source image;
determining the size of the augmented article image according to the category to be fused;
and selecting, within the fusion range, a range corresponding to the size of the augmented article image, and recording the selected range as the fusion position.
In one embodiment, the method further comprises:
acquiring annotation information corresponding to a source image;
extracting article images of multiple categories to be fused according to the annotation information;
acquiring a required number of article images to be fused;
and when the extracted article images meet the required number, stitching the article images together.
In one embodiment, the method further comprises:
acquiring a stitching number corresponding to the fused images;
determining an overlap region between the fused images;
and stitching the corresponding fused images according to the overlap regions and the stitching number.
In one embodiment, the determining the overlap region between the fused images includes:
acquiring a stitching overlap range, wherein the stitching overlap range is obtained by performing statistics on the annotation information of a plurality of source images;
and randomly selecting an overlap region between the plurality of fused images within the stitching overlap range.
An image data synthesis apparatus, the apparatus comprising:
the acquisition module is used for acquiring an article image of a category to be fused;
the augmentation module is used for randomly augmenting the article image to obtain an augmented article image;
the acquisition module is further used for acquiring a source image, wherein the source image comprises a parcel image;
the segmentation module is used for segmenting the source image according to the parcel image and determining a parcel position in the source image;
and the fusion module is used for performing fusion processing on the parcel position and the augmented article image to generate a fused image.
In one embodiment, the fusion module is further configured to determine a fusion position corresponding to the augmented article image by using the parcel position, and to perform fusion processing according to the fusion position and the augmented article image to generate a fused image.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of:
acquiring an article image of a category to be fused;
randomly augmenting the article image to obtain an augmented article image;
acquiring a source image, wherein the source image comprises a parcel image;
segmenting the source image according to the parcel image, and determining a parcel position in the source image;
and performing fusion processing by using the parcel position and the augmented article image to generate a fused image.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an article image of a category to be fused;
randomly augmenting the article image to obtain an augmented article image;
acquiring a source image, wherein the source image comprises a parcel image;
segmenting the source image according to the parcel image, and determining a parcel position in the source image;
and performing fusion processing by using the parcel position and the augmented article image to generate a fused image.
According to the image data synthesis method, the image data synthesis apparatus, the computer device, and the storage medium, article images of the categories to be fused are acquired and randomly augmented to obtain a plurality of augmented article images. If there are many article images to be fused, a large number of randomly augmented article images can be obtained. By segmenting the source image, the parcel position in the source image can be determined; the parcel position, the augmented article images, and the source image are then subjected to fusion processing, so that the many randomly augmented article images can each be fused with the source image to obtain a large number of sample images. Therefore, the massive sample images required for training the AI intelligent judgment model can be acquired quickly and effectively.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image data synthesis method provided in the embodiments of the present invention can be applied to the application environment shown in fig. 1. The first terminal 104 is connected to the security check machine 102 via a network, the first terminal 104 and the server 106 are connected via a network, and the second terminal 108 is connected to the server 106 via a network. The security check machine 102 scans the baggage and generates a color image, which it transmits to the first terminal 104. The first terminal 104 uploads the color image to the server 106. The server 106 stores the color image, the second terminal 108 retrieves the color image from the server, and a source image is obtained by annotating the article images of the various categories in the color image. The source image contains annotation information, which includes an article category, an article name, and an article position (including coordinates). The second terminal 108 acquires the article image of a category to be fused and randomly augments the article image to obtain an augmented article image. The second terminal 108 segments the source image according to the parcel image in the source image, determines the parcel position in the source image, and performs fusion processing on the parcel position, the augmented article image, and the source image to generate a fused image.
In one embodiment, as shown in fig. 2, an image data synthesis method is provided, which is described by taking the second terminal in fig. 1 (for simplicity of description, the second terminal is hereinafter simply referred to as the terminal) as an example, and specifically includes:
Step 202: acquiring an article image of a category to be fused.
Articles may be classified into different categories according to the needs of security inspection; different security inspection scenes call for different categories of articles to be detected. The article category to be detected is the category to be fused in this embodiment. The categories to be fused may be recorded in an article category list; for example, the article category list may include a knife category, a container category, a tool category, and the like.
The terminal obtains the article category list corresponding to the security inspection scene and extracts the categories to be fused from the article category list. The number of article images corresponding to a category to be fused may be one, or two or more (collectively referred to as a plurality). If there are a plurality of article images corresponding to the category to be fused, the shape, size, and so on of the article in each article image may differ.
The terminal acquires a source image according to the category to be fused. The article image of the category to be fused may be generated in various ways. For example, the terminal may segment the article image of the category to be fused according to the annotation information of the source image and extract the corresponding article image; alternatively, the terminal may extract the article image to be fused directly from a web page. According to their annotation information, source images can be further classified into source images containing articles of the category to be fused (also referred to as first source images) and source images not containing articles of the category to be fused (also referred to as second source images). In a source image containing an article image of the category to be fused, segmentation is performed along the contour of that article image using the annotation information of the source image, and the article image of the category to be fused is thereby extracted; a sketch of this contour-based extraction follows.
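As a minimal sketch of the contour-based extraction just described, the following Python snippet cuts an article image out of a first source image along its annotated contour. The annotation schema (a polygon of pixel coordinates) and the use of OpenCV are illustrative assumptions, not details fixed by this specification.

```python
import cv2
import numpy as np

def extract_item_by_contour(source_img, contour_points):
    """Crop an article image out of a source image along its annotated contour.
    contour_points is assumed to be an N x 2 polygon in pixel coordinates
    taken from the annotation information (a hypothetical schema)."""
    pts = np.asarray(contour_points, dtype=np.int32)
    mask = np.zeros(source_img.shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [pts], 255)                      # rasterize the contour
    item = cv2.bitwise_and(source_img, source_img, mask=mask)
    x, y, w, h = cv2.boundingRect(pts)                  # tight bounding box
    return item[y:y + h, x:x + w]
```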
Step 204: randomly augmenting the article image to obtain an augmented article image.
When there is one article image corresponding to the category to be fused, the terminal randomly augments that article image. The article image is subjected to an affine transformation, that is, a linear transformation plus a translation; by varying the transformation parameters of a single article image during augmentation, a plurality of article images of different shapes can be generated. When there are a plurality of article images corresponding to the category to be fused, the terminal randomly augments each article image, so that a large number of augmented article images can be obtained.
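A minimal sketch of such random affine augmentation is given below, using OpenCV. The rotation, scale, and translation ranges are illustrative assumptions; the specification does not fix particular parameter ranges.

```python
import cv2
import numpy as np

def random_affine_augment(item_img, angle_range=(-30, 30), scale_range=(0.8, 1.2)):
    """Apply one random affine transform (rotation + scaling + translation)
    to an article image. All parameter ranges are illustrative assumptions."""
    h, w = item_img.shape[:2]
    angle = np.random.uniform(*angle_range)            # random rotation (degrees)
    scale = np.random.uniform(*scale_range)            # random isotropic scaling
    tx, ty = np.random.uniform(-0.1, 0.1, 2) * (w, h)  # small random translation
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    M[:, 2] += (tx, ty)                                # fold translation into the matrix
    return cv2.warpAffine(item_img, M, (w, h), borderValue=(255, 255, 255))

# Expanding one article image into many differently shaped variants:
# augmented_items = [random_affine_augment(img) for _ in range(100)]
```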
Step 206: acquiring a source image, wherein the source image comprises a parcel image.
The terminal acquires a source image that does not contain the article to be fused, that is, a second source image. A source image may include a plurality of articles, comprising the articles to be detected and other articles. Each article has corresponding annotation information, which includes its category, name, position, and so on; the position may be represented by coordinates or the like. Further, the other articles may include parcels, and a parcel may contain one or more articles. It should be noted that, in order to perform effective image data fusion, that is, to simulate a dangerous article being carried inside a parcel in a real security inspection scene, this embodiment uses the second source image as the base image for fusion. The image of the article to be detected is fused into the range of the parcel position through fusion processing; the parcel in the second source image may contain some articles or none.
Step 208: segmenting the source image according to the parcel image, and determining the parcel position in the source image.
The terminal obtains a source image that does not contain the article to be fused. Image data of the article images of various categories are annotated in the source image; the image data contain the article images and the annotation data corresponding to them. The source image includes a parcel image corresponding to a piece of baggage. Statistics are performed on the image data corresponding to the parcel image and the annotation information of the source image, the position of the parcel image in the source image is determined, and the source image is segmented according to the position of the parcel image.
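A minimal sketch of reading the parcel positions out of the annotation information follows. The annotation schema (a list of dictionaries with a category and a bounding box) is a hypothetical stand-in for whatever format the annotations actually use.

```python
def parcel_positions(annotations):
    """Collect the bounding boxes of all parcels annotated in a source image.
    The schema below (dicts with 'category' and 'bbox') is an assumption."""
    return [a["bbox"] for a in annotations if a["category"] == "parcel"]

# annotations = [{"category": "parcel", "name": "suitcase",
#                 "bbox": (120, 40, 480, 300)}, ...]
# Each bbox (x_min, y_min, x_max, y_max) is the region used to segment the
# source image and, later, to bound the fusion range.
```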
Step 210: performing fusion processing on the parcel position and the augmented article image to generate a fused image.
The terminal acquires the second source image stored in the server; the second source image is annotated with image data of the article images of various categories, and the position of the parcel in the source image is determined by performing statistics on the image data corresponding to the parcel image and the annotation information of the source image. The article image augmented in step 204 is acquired, and the range of the parcel position in the source image is fused with the augmented article image, thereby generating a fused image.
In this embodiment, the article images of the category to be fused are acquired and randomly augmented to obtain a plurality of augmented article images. If there are many article images to be fused, a large number of randomly augmented article images can be obtained. By segmenting the source image, the parcel position in the source image can be determined; the parcel position, the augmented article images, and the source image are then subjected to fusion processing, so that the many randomly augmented article images can each be fused with the source image to obtain a large number of sample images. Therefore, the massive sample images required for training the AI intelligent judgment model can be acquired quickly and effectively.
In one embodiment, the performing fusion processing by using the parcel position and the augmented article image to generate a fused image includes: determining a fusion position corresponding to the augmented article image by using the parcel position; and performing fusion processing according to the fusion position and the augmented article image to generate a fused image.
In this embodiment, statistics are performed on the image data corresponding to the parcel image and the annotation information of the source image, and the position of the parcel in the source image is determined. The image data of the augmented article image are counted, and the size of the augmented article image is calculated. According to that size, the fusion position corresponding to the augmented article image is determined by using the parcel position, and the augmented article image is fused at the fusion position by a fusion algorithm to generate a fused image.
Further, the fusion processing may be carried out in various ways. The augmented article image and/or the article images of each category to be fused in the category list, together with the second source image, may be referred to as images to be fused. For brevity, the augmented article image and/or the article images of each category to be fused in the category list may be referred to as the first image to be fused, and the second source image as the second image to be fused. The terminal obtains a color image (also called an RGB image) corresponding to the first image to be fused and a color image corresponding to the second source image, and converts each into a corresponding high-low energy image through a first deep neural network model. High-low energy images represent the penetration information of the target under X-ray particle beams of different energy levels. If the penetration rate of object a is x1 and that of object b is x2, the penetration rate of objects a and b stacked together can be approximated as x1 × x2. The superimposed color image can then be estimated from the value of x1 × x2 according to a general coloring standard. The terminal acquires first high-low energy data corresponding to the first image to be fused and second high-low energy data corresponding to the second image to be fused, and multiplies the first and second high-low energy data in the high-low energy channels to obtain the high-low energy data corresponding to the fused image. The terminal then converts the resulting high-low energy image back into a corresponding color image through a second deep neural network model. The first image to be fused and the second source image are thereby effectively fused, yielding a positive sample image consistent with an actual security inspection scene.
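The multiplicative core of this dual-energy fusion can be sketched as follows. The RGB-to-high/low-energy converters stand in for the first and second deep neural network models, which are not specified here; transmittance values are assumed to be floats scaled to [0, 1].

```python
import numpy as np

def fuse_high_low_energy(first_hl, second_hl):
    """Fuse two high-low energy transmittance maps by a per-channel product:
    if object a transmits x1 and object b transmits x2, the stacked pair
    transmits approximately x1 * x2. Inputs are floats in [0, 1]."""
    return np.clip(first_hl * second_hl, 0.0, 1.0)

# Hypothetical end-to-end flow (rgb_to_hl and hl_to_rgb stand in for the
# first and second deep neural network models):
# first_hl  = rgb_to_hl(first_rgb)    # first model: RGB -> high-low energy
# second_hl = rgb_to_hl(second_rgb)
# fused_rgb = hl_to_rgb(fuse_high_low_energy(first_hl, second_hl))
```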
The terminal may also convert the first image to be fused from the RGB format to the HSV channels, obtaining a first H channel component, a first S channel component, and a first V channel component, and likewise convert the second image to be fused to obtain a second H channel component, a second S channel component, and a second V channel component. The terminal superposes the first and second H channel components to obtain a superposed H channel component, superposes the first and second S channel components to obtain a superposed S channel component, and superposes the first and second V channel components to obtain a superposed V channel component. The terminal combines the superposed H, S, and V channel components into an HSV color space image and converts the HSV image back into an RGB image. The first image to be fused and the second source image are thereby effectively fused, yielding a positive sample image consistent with an actual security inspection scene.
The terminal may also acquire the first R, G, and B channel components corresponding to the first image to be fused in the RGB channels, and the second R, G, and B channel components corresponding to the second image to be fused. The terminal superposes the first and second R channel components to obtain a superposed R channel component, superposes the first and second G channel components to obtain a superposed G channel component, and superposes the first and second B channel components to obtain a superposed B channel component, and then combines the superposed R, G, and B channel components into an RGB image. The first image to be fused and the second source image are thereby effectively fused, yielding a positive sample image consistent with an actual security inspection scene.
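Both channel-wise variants can be sketched together. The superposition operator is modeled here as a saturating per-channel sum; the specification does not pin down the exact operator, so this choice is an assumption.

```python
import cv2

def superpose_channels(first_rgb, second_rgb, color_space="RGB"):
    """Superpose two uint8 images channel by channel, either directly in RGB
    or after converting to HSV, then return an RGB result. cv2.add is a
    saturating per-channel sum; the exact operator is assumed, not specified.
    Note: for uint8 HSV, OpenCV stores H in [0, 180), so the saturating sum
    on the H channel is a crude stand-in for a proper hue combination."""
    if color_space == "HSV":
        a = cv2.cvtColor(first_rgb, cv2.COLOR_RGB2HSV)
        b = cv2.cvtColor(second_rgb, cv2.COLOR_RGB2HSV)
        fused = cv2.add(a, b)                        # superpose H, S, V components
        return cv2.cvtColor(fused, cv2.COLOR_HSV2RGB)
    return cv2.add(first_rgb, second_rgb)            # superpose R, G, B components
```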
These fusion processing modes closely simulate the superposition of article images and improve the accuracy with which the trained AI intelligent judgment model recognizes dangerous articles.
In one embodiment, determining the fusion position corresponding to the augmented article image by using the parcel position includes: acquiring a fusion range within the parcel position, wherein the fusion range is obtained by performing statistics on the annotation information of the source image; determining the size of the augmented article image according to the article image of the category to be fused; and selecting, within the fusion range, a range corresponding to the size of the augmented article image, and recording the selected range as the fusion position.
As shown in fig. 3, in the source image 302, a parcel image 304 needs to be fused with an augmented article image 306, and a fusion position is randomly selected within the range of the parcel position. The source image includes article images of various categories and the image data corresponding to them. Statistics are performed on the image data corresponding to the parcel image and the annotation information of the source image to obtain the position range of the parcel image in the source image; this position range is taken as the fusion range, and a fusion position is randomly selected within it. The image data of the article image to be fused are counted to obtain its size, and the size of the augmented article image is determined accordingly. A range corresponding to the size of the augmented article image is then randomly selected within the fusion range and recorded as the fusion position. When only one augmented article image needs to be fused within the parcel range, its position within the fusion range may be 306(a), 306(b), 306(c), or 306(d); many other fusion positions are possible, as long as they do not exceed the fusion range.
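The random placement constraint is simple to express in code. The sketch below assumes bounding boxes of the form (x_min, y_min, x_max, y_max); any returned corner keeps the augmented article image entirely inside the fusion range, as with positions 306(a) through 306(d) in fig. 3.

```python
import numpy as np

def pick_fusion_position(parcel_bbox, item_size):
    """Randomly pick a top-left corner for the augmented article image so
    that it lies entirely within the parcel's fusion range.
    parcel_bbox = (x_min, y_min, x_max, y_max); item_size = (width, height)."""
    x_min, y_min, x_max, y_max = parcel_bbox
    w, h = item_size
    if x_max - x_min < w or y_max - y_min < h:
        raise ValueError("augmented article image does not fit the fusion range")
    x = np.random.randint(x_min, x_max - w + 1)   # inclusive upper bound x_max - w
    y = np.random.randint(y_min, y_max - h + 1)
    return x, y
```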
Furthermore, a plurality of augmented article images can be fused within the fusion range, and these may be article images of the same category or of different categories, such as knives and tools. In this way, a large number of diverse sample images can be fused.
In one embodiment, the method further includes: acquiring annotation information corresponding to the source image; extracting article images of multiple categories to be fused according to the annotation information; acquiring a required number of article images to be fused; and when the extracted article images meet the required number, stitching the article images together.
In this embodiment, the terminal acquires a source image containing annotation information in which the image data of the article images of each category are annotated. The contour of each article image is determined according to its image data, and the article image is segmented along that contour. Article images of multiple categories to be fused are selected from the article category list, the required number of article images to be fused is acquired, and when the selected article images meet the required number, the articles to be fused are stitched together.
Furthermore, before the required number of article images to be fused is acquired, that number must be determined, and there are various strategies for doing so (see the sketch below). The terminal may acquire the annotation information from the source images, in which the image data of the article images of each category are annotated; a statistical range for the number of article images is obtained by performing statistics on these image data, and the number of articles to be fused is then randomly selected within this range. Alternatively, the terminal may fix the number of article images to be fused. After the number is determined, when the selected article images of the multiple categories to be fused meet that number, the article images to be fused are stitched together. In this way, sample images containing article images of different categories can be obtained.
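Both counting strategies fit in a few lines. The default range below is purely illustrative; in practice it would come from the statistics over the annotation information described above.

```python
import numpy as np

def pick_fusion_count(stat_range=None, fixed=None):
    """Decide how many article images to fuse: either a fixed count, or a
    random draw from a range derived from annotation statistics. The default
    range (1, 3) is an illustrative assumption."""
    if fixed is not None:
        return fixed                                  # fixed-count strategy
    low, high = stat_range if stat_range is not None else (1, 3)
    return int(np.random.randint(low, high + 1))      # random-count strategy
```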
In one embodiment, the method further includes: acquiring a stitching number corresponding to the fused images; determining an overlap region between the fused images; and stitching the corresponding fused images according to the overlap regions and the stitching number.
In this embodiment, before the fused images are stitched, the overlap region between them must be determined. The terminal acquires a source image containing annotation information in which the image data of the article images of each category are annotated; the image data corresponding to the parcel images and the annotation information of the source image are selected for statistics, yielding the overlap region between the parcel images. The overlap region between the fused images is determined according to the overlap region of the parcel images in the source image. The stitching number corresponding to the fused images is acquired, and the corresponding fused images are stitched according to the overlap regions and the stitching number. The stitching includes horizontal stitching: in a real security inspection scene, after baggage is placed on the conveyor belt of the security check machine, the belt rolls continuously in one direction, and the scanned baggage images acquired by the machine likewise appear continuously in that direction. Horizontal stitching thus simulates parcels sticking together as they pass through security inspection, and sample images stitched in this way faithfully reproduce parcel adhesion in a real security inspection scene.
In one embodiment, determining the overlap region between the fused images includes: acquiring a stitching overlap range, wherein the stitching overlap range is obtained by performing statistics on the annotation information of a plurality of source images; and randomly selecting an overlap region between the plurality of fused images within the stitching overlap range.
In this embodiment, a plurality of source images are acquired, each containing annotation information in which the image data of the article images of each category are annotated. The image data corresponding to the parcel images are selected, and statistics are performed on these image data together with the annotation information of the source images to obtain the overlap region of the parcels. The overlap region of the parcels in the source images is used as the stitching overlap range between fused images; an overlap region between the plurality of fused images is randomly selected within this range, the corresponding stitching number is acquired, and the fused images corresponding to the stitching number are stitched according to the overlap regions, as sketched below. In this way, a large number of diverse stitched images can be obtained, so that sample images simulating parcel adhesion can be produced.
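A minimal sketch of horizontal stitching with random overlaps follows. All images are assumed to share the same height, and the overlapping columns are blended with a per-pixel minimum, which approximates X-ray superposition on bright backgrounds; the exact blending operator is an assumption, not specified here.

```python
import numpy as np

def stitch_horizontally(images, overlaps):
    """Stitch fused images left to right; adjacent pairs share overlaps[i]
    pixel columns, blended with a per-pixel minimum (an assumed operator).
    All images must have the same height."""
    result = images[0]
    for img, ov in zip(images[1:], overlaps):
        if ov > 0:
            blended = np.minimum(result[:, -ov:], img[:, :ov])  # shared columns
            result = np.concatenate([result[:, :-ov], blended, img[:, ov:]], axis=1)
        else:
            result = np.concatenate([result, img], axis=1)
    return result

# Overlaps drawn randomly within the statistically derived stitching overlap range:
# overlaps = [np.random.randint(ov_min, ov_max + 1) for _ in images[1:]]
```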
In one embodiment, as shown in fig. 4, there is provided an image data synthesizing apparatus including: an obtaining module 402, an augmenting module 404, a partitioning module 406, and a fusing module 408, wherein:
An obtaining module 402, configured to obtain an article image of a category to be fused.
An augmentation module 404, configured to randomly augment the article image to obtain an augmented article image.
The obtaining module 402 is further configured to obtain a source image, wherein the source image comprises a parcel image.
A segmentation module 406, configured to segment the source image according to the parcel image and determine the parcel position in the source image.
A fusion module 408, configured to perform fusion processing on the parcel position and the augmented article image to generate a fused image.
In one embodiment, the fusion module 408 is further configured to determine a fusion position corresponding to the augmented article image by using the parcel position, and to perform fusion processing according to the fusion position and the augmented article image to generate a fused image.
In this embodiment, the terminal obtains a source image in which the image data of the article images of each category are annotated. Statistics are performed on the image data corresponding to the parcel image and the annotation information of the source image to obtain the position of the parcel in the source image; the range of the parcel position is selected as the fusion range, and a position matching the size of the augmented article image is randomly selected within the fusion range as the fusion position. Fusion processing is then performed according to the fusion position and the augmented article image to generate a fused image. Thus, by randomly selecting the fusion position within the range of the parcel position, the augmented article image can be fused into different positions to generate a large number of distinct sample images.
In one embodiment, the fusion module 408 is further configured to acquire a fusion range within the parcel position, wherein the fusion range is obtained by performing statistics on the annotation information of the source image; to determine the size of the augmented article image according to the article image of the category to be fused; and to select, within the fusion range, a range corresponding to the size of the augmented article image and record the selected range as the fusion position.
In one embodiment, the obtaining module 402 is further configured to obtain the annotation information corresponding to the source image so as to extract article images of multiple categories to be fused.
In one embodiment, the obtaining module 402 is further configured to obtain a stitching number corresponding to the fused images.
In yet another embodiment, the obtaining module 402 is further configured to obtain a stitching overlap range.
In one embodiment, as shown in fig. 5, the apparatus further comprises: a stitching module 512, wherein:
the stitching module 512 is configured to stitch the article images to be fused, or to stitch the fused images.
In one embodiment, a computer device is provided, whose internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory; the nonvolatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the nonvolatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an image data synthesis method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, a key, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps in the image data synthesis method provided by the various embodiments described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a nonvolatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include nonvolatile and/or volatile memory. Nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is comparatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.