CN113496235A - Image processing method, device and system, storage medium and computing equipment - Google Patents

Image processing method, device and system, storage medium and computing equipment

Info

Publication number
CN113496235A
Authority
CN
China
Prior art keywords
image
target
network
training
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010197303.2A
Other languages
Chinese (zh)
Inventor
范托
高强华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010197303.2A
Publication of CN113496235A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30204 Marker

Abstract

The application discloses an image processing method, apparatus, and system, a storage medium, and a computing device. The method includes the following steps: acquiring a first image containing a target event; and processing the first image with a generative adversarial network (GAN) to obtain a target image, where the GAN is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event. The method and device solve the technical problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs.

Description

Image processing method, device and system, storage medium and computing equipment
Technical Field
The application relates to the field of traffic video monitoring, and in particular to an image processing method, apparatus, and system, a storage medium, and a computing device.
Background
In existing traffic video monitoring scenarios, traffic incident detection mainly targets common incidents, such as vehicle congestion and stopped vehicles; low-frequency incidents (such as tunnel fires, two-vehicle collisions, rear-end collisions, and scrapes) are not detected. However, some applications have specific requirements for low-frequency events or for image annotation (e.g., the depth of field of a video image), such as tunnel fire detection, fine-grained accident classification, and mapping video images to physical space.
These problems can in principle be addressed by training a corresponding algorithm model. However, because low-frequency events and image annotations are scarce and costly to acquire, the model cannot be trained on sufficient data, and its processing accuracy is low.
For the problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs, no effective solution has yet been proposed.
Disclosure of Invention
The embodiments of the application provide an image processing method, apparatus, and system, a storage medium, and a computing device, to at least solve the technical problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs.
According to one aspect of the embodiments of the present application, there is provided an image processing method including: acquiring a first image containing a target event; and processing the first image with a generative adversarial network (GAN) to obtain a target image, where the GAN is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
According to another aspect of the embodiments of the present application, there is also provided an image processing method including: acquiring a first image containing a target event; acquiring a captured second image; and processing the first image and the second image with a generative adversarial network to obtain a target image, where the attribute parameters of the target image are the same as those of the second image and the target image contains the target event.
According to another aspect of the embodiments of the present application, there is also provided an image processing apparatus including: an image acquisition module configured to acquire a first image containing a target event; and a processing module configured to process the first image with a generative adversarial network to obtain a target image, where the network is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
According to another aspect of the embodiments of the present application, there is also provided an image processing apparatus including: a first acquisition module configured to acquire a first image containing a target event; a second acquisition module configured to acquire a captured second image; and a processing module configured to process the first image and the second image with a generative adversarial network to obtain a target image, where the attribute parameters of the target image are the same as those of the second image and the target image contains the target event.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program, where, when the program runs, a device on which the storage medium is located is controlled to execute the image processing method described above.
According to another aspect of the embodiments of the present application, there is also provided a computing device including a processor and a memory, the processor being configured to run a program stored in the memory, where the program, when run, performs the image processing method described above.
According to another aspect of the embodiments of the present application, there is also provided an image processing system including: a processor; and a memory coupled to the processor and configured to provide the processor with instructions for the following processing steps: acquiring a first image containing a target event; and processing the first image with a generative adversarial network to obtain a target image, where the network is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
According to another aspect of the embodiments of the present application, there is also provided an image processing method including: acquiring a first image containing a target event; and processing the first image with a generative adversarial network to obtain a target image. The generative adversarial network includes a generator network and a discriminator network; the generator network is configured to process the first image to obtain the target image, and the discriminator network includes: a sampling layer, several first convolutional layers, several second convolutional layers, several third convolutional layers, and a decision layer, where the first convolutional layers are connected to the sampling layer, the third convolutional layers are connected to the first and second convolutional layers, and the decision layer is connected to the third convolutional layers; the sampling layer is configured to sample the target image based on a first target label and to sample a second image based on a second target label, and the target image and the second image are input to the second convolutional layers.
In the embodiments of the application, after a first image containing a target event is acquired, a corresponding target image can be generated with the generative adversarial network, thereby obtaining an image containing the target event in the style of the actual scene. It is worth noting that image generation by the GAN makes it possible to mass-produce realistically styled images containing target events without additional data annotation, which provides sufficient training samples for the practical algorithm model, guarantees the model's processing accuracy, reduces data acquisition and annotation costs, and thereby solves the technical problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing an image processing method according to an embodiment of the present application;
FIG. 2 is a flow chart of a first image processing method according to an embodiment of the application;
FIG. 3 is a schematic view of an alternative image processing method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an alternative CycleGAN structure according to an embodiment of the present application;
FIG. 5 is a detailed structural diagram of an alternative countermeasure generation network according to an embodiment of the application;
FIG. 6 is a detailed block diagram of an alternative network of discriminators according to embodiments of the present application;
FIG. 7 is a flow chart of a second image processing method according to an embodiment of the application;
FIG. 8 is a schematic diagram of an image processing apparatus according to an embodiment of the present application;
fig. 9 is a schematic diagram of another image processing apparatus according to an embodiment of the present application;
FIG. 10 is a flow chart of a third image processing method according to an embodiment of the present application;
FIG. 11 is a flow chart of a fourth method of image processing according to an embodiment of the present application; and
fig. 12 is a block diagram of a computer terminal according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms that appear in the description of the embodiments of the present application have the following meanings:
Generative adversarial network (GAN): a deep learning network model in which (at least) two modules within one framework, a generative model and a discriminative model, learn through a mutual game to produce high-quality output; a minimal sketch of this game follows these term definitions.
Depth of field: may refer to the range of subject distances, measured from the front edge of the camera lens or other imager, within which a sharp image can be captured.
3D engine: may be a collection of algorithm implementations that abstracts real-world material into representations such as polygons or curves, performs the relevant computations in a computer, and outputs a final image.
Crawler: may be a program or script that automatically captures web information according to certain rules.
Directional crawler: a crawler that can precisely acquire information from designated target sites.
Viewing angle: may be defined as the angle between the line of sight and, for example, the normal direction of a display; when observing an object, it may also be defined as the angle subtended at the optical center of the human eye by rays from the two ends (top and bottom, or left and right) of the object.
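To make the adversarial game in the GAN definition above concrete, the following is a minimal sketch in PyTorch; the toy network sizes, learning rates, and random stand-in data are illustrative assumptions, not details from the application.

```python
# Minimal sketch (not from the patent) of the two-module game: the generative
# model G and discriminative model D are trained against each other.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))  # generative model
D = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminative model
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(8, 64)    # stand-in for real samples (assumption)
noise = torch.randn(8, 16)   # generator input

# Discriminator step: label real samples 1 and generated samples 0.
fake = G(noise).detach()
loss_d = bce(D(real), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to make D label generated samples as real.
loss_g = bce(D(G(noise)), torch.ones(8, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Repeating these two steps is the "mutual game" by which both modules improve.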
Example 1
According to an embodiment of the present application, there is provided an image processing method. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, e.g., as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
The method provided by the embodiments of the application may be executed on a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 shows a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the image processing method. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, processing devices such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. In addition, it may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interfaces), a network interface, a power supply, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in fig. 1, or have a configuration different from that shown in fig. 1.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single stand-alone processing module, or may be incorporated in whole or in part into any of the other elements of the computer terminal 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuitry acts as a kind of processor control (e.g., selection of a variable-resistance termination path connected to an interface).
The memory 104 can be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the image processing method in the embodiment of the present application, and the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, that is, implementing the image processing method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted here that, in some alternative embodiments, the computer device (or mobile device) shown in fig. 1 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both. It should also be noted that fig. 1 is only one particular example, intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
Under the above operating environment, the present application provides an image processing method as shown in fig. 2. Fig. 2 is a flowchart of a first image processing method according to an embodiment of the present application. As shown in fig. 2, the method comprises the steps of:
step S202, acquiring a first image containing a target event;
the target event in the above steps may be an event with a low occurrence frequency in a traffic video monitoring scene, such as a tunnel fire, a two-vehicle collision, a two-vehicle rear-end collision, a scratch, or the like, or an event with a high labeling cost, such as depth of field of a video image, but is not limited thereto, and may also be other events, such as a common event of a vehicle congestion, a vehicle pause, or the like.
Due to the fact that the occurrence frequency of the target events is low or the labeling cost is high, a large number of images containing the target events cannot be obtained directly through traffic video monitoring. In order to solve this problem, the first image may be generated in advance by simulating a target scene corresponding to a target event, or may be extracted from a web page by a web crawler, but is not limited thereto. In the embodiment of the present application, the first image is generated by simulation as an example.
For example, taking a tunnel fire incident as an example, as shown in fig. 3, to obtain a large number of images of a specific tunnel fire after receiving a tunnel fire detection requirement, a computing device such as a server may first simulate a tunnel fire scene with a 3D engine, thereby generating first images containing the tunnel fire event and obtaining images of a fire in the tunnel directly, without relying on traffic video monitoring.
Step S204, processing the first image with a generative adversarial network to obtain a target image, where the network is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
The second image in the above step may be an actual scene image captured in the real world, which can be obtained directly through traffic video monitoring. Because the target event occurs infrequently or is costly to annotate, the second image may be an ordinary captured image that does not contain the target event.
Based on the principle of the generative adversarial network, the real captured data can serve as the migration source, and the network is trained in advance on a large number of training samples including the second image, so that the trained network converts the attribute parameters of an input image into the attribute parameters of the second image. Optionally, the attribute parameters may be parameters of the actual scene, such as style, color, and material, but are not limited thereto; style is used as the example in the embodiments of the present application. Therefore, after the first image is input into the trained network, the real scene style is migrated onto the first image, generating a target image that has the actual style and contains the target event.
For example, still taking the tunnel fire incident as an example, as shown in fig. 3, the computing device may first acquire images of normal scenes directly through traffic video monitoring and use them as training data to train the GAN; then, after generating an image containing the tunnel fire event by simulation, it may input that image into the trained GAN to generate a large number of target images containing the tunnel fire event.
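To make this inference path concrete, the following is a hedged sketch: the stand-in generator, the weight file, and the image file name are illustrative assumptions, not artifacts disclosed by the application.

```python
# Sketch of inference with a trained generator: a simulated first image goes
# in, a real-styled target image comes out. The tiny convolutional generator
# here is a placeholder for the trained CycleGAN-style generator a.
import torch
import torch.nn as nn
from torchvision import transforms
from PIL import Image

generator = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, padding=1))
generator.load_state_dict(torch.load("generator_a.pth"))  # assumed weight file
generator.eval()

img = Image.open("simulated_tunnel_fire.png").convert("RGB")  # assumed file
first_image = transforms.ToTensor()(img).unsqueeze(0)

with torch.no_grad():
    target_image = generator(first_image)  # real-scene style, same target event
```

Once trained, the generator alone performs the style migration; the discriminator is only needed during training.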
Based on the scheme provided by the above embodiment of the application, after the first image containing the target event is acquired, the corresponding target image can be generated with the generative adversarial network, thereby obtaining an image containing the target event in the style of the actual scene. It is worth noting that image generation by the GAN makes it possible to mass-produce realistically styled images containing target events without additional data annotation, which provides sufficient training samples for the practical algorithm model, guarantees the model's processing accuracy, reduces data acquisition and annotation costs, and thereby solves the technical problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs.
In the above embodiments of the present application, the method further includes the following steps: obtaining a plurality of training data, where each training datum includes a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image; and training an initial network with the plurality of training data to obtain the generative adversarial network, where the initial network includes a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
The initial network in the above steps may be an existing CycleGAN adversarial network, but is not limited thereto and may be another adversarial network. For example, as shown in fig. 4, the CycleGAN structure may include two generators a and b, which respectively map data domain X to data domain Y and data domain Y to data domain X, and two discriminators a and b, which respectively judge whether generated data belongs to data domain X and whether it belongs to data domain Y. The network can train generators a and b by unsupervised learning, without annotated image sample pairs; the standard objective is sketched below.
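For reference, a minimal sketch of the standard CycleGAN generator objective follows; the LSGAN-style MSE adversarial term and the cycle weight of 10 are conventional choices assumed for illustration, not values stated in the application, and the symmetric term for discriminator a is omitted for brevity.

```python
# Sketch of the unpaired CycleGAN objective: adversarial loss plus
# cycle-consistency, so no annotated image pairs are required.
import torch
import torch.nn as nn

mse, l1 = nn.MSELoss(), nn.L1Loss()
LAMBDA = 10.0  # weight of the cycle-consistency term (assumed)

def generator_loss(g_ab, g_ba, d_b, x, y):
    fake_y = g_ab(x)                                   # generator a: domain X -> Y
    fake_x = g_ba(y)                                   # generator b: domain Y -> X
    pred = d_b(fake_y)
    adv = mse(pred, torch.ones_like(pred))             # try to fool discriminator b
    cycle = l1(g_ba(fake_y), x) + l1(g_ab(fake_x), y)  # X->Y->X and Y->X->Y
    return adv + LAMBDA * cycle
```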
In the embodiments of the application, the migration of the second image's style onto the first image can be completed using only one generator. As shown in fig. 5, generator network a and discriminator network a are taken as the example. Suppose image domain X is the first image, image domain Y is the second image, and the dashed box is the existing CycleGAN model framework. The embodiments of the application use not only the image information but also, additionally, the detection target-box label corresponding to each image, i.e., the target label, to enhance the migration effect on the targets in the image. A target-label loss can also be added to the discriminator network, so that the adversarial network not only performs full-frame migration but also pays more attention to migrating the style of the target in the image.
For example, still taking the tunnel fire event as an example, as shown in fig. 3, in order to generate target images the GAN first needs to be trained: a large number of simulated tunnel-fire images and a large number of actual scene images can be acquired, together with the target boxes corresponding to the simulated tunnel-fire images and to the actual scene images, to form the training samples with which the GAN is trained.
With this scheme, the adversarial network provided by the embodiments of the application exploits more information and avoids the purely full-image style migration in which the distinction between background and target may be lost. The generated data is more stable in subsequent target-detection model training and generalizes quickly to different traffic scenarios (PTZ dome cameras, bullet cameras, fisheye cameras, etc.).
In the above embodiments of the present application, training the initial network with the plurality of training data to obtain the generative adversarial network includes: training the generator network with the first images of the plurality of training data; and training the discriminator network with the images output by the generator network, the first target labels corresponding to the first images, the second images, and the second target labels corresponding to the second images, to obtain the generative adversarial network. The discriminator network includes: a sampling layer, several first convolutional layers, several second convolutional layers, several third convolutional layers, and a decision layer, where the first convolutional layers are connected to the sampling layer, the third convolutional layers are connected to the first and second convolutional layers, and the decision layer is connected to the third convolutional layers; the sampling layer samples the image output by the generator network based on the first target label and samples the second image based on the second target label, and the image output by the generator network and the second image are input to the second convolutional layers.
As shown in fig. 5, the acquired first image serves as data domain X and the second image as data domain Y. The first image is input into generator network a to obtain a generated image, and the generated image, the second image, and the two target boxes are input into discriminator network a, with the second image serving as the reference image and supervision from the target boxes added, to obtain the probability that the generated image belongs to data domain Y. If the probability is not 0.5, discriminator network a can still accurately distinguish the images generated by generator network a, i.e., the generated images are not yet similar enough to data domain Y, and training continues. If the probability is 0.5, discriminator network a can no longer accurately distinguish the images generated by generator network a, i.e., their similarity to data domain Y is high; at this point the training of the adversarial network is determined to be complete and may be stopped.
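The stopping heuristic above can be sketched as follows; `d_step` and `g_step` are assumed helpers (not APIs from the application), with `g_step` taken to return the discriminator's mean probability on the generated batch.

```python
# Sketch: alternate discriminator/generator updates until the discriminator's
# probability for generated images is close enough to 0.5 to call it confused.
TOL = 0.02  # how close to 0.5 counts as "cannot discriminate" (assumption)

def train_until_confused(loader, d_step, g_step, max_epochs=100):
    for _ in range(max_epochs):
        for first_img, second_img, box_x, box_y in loader:
            d_step(first_img, second_img, box_x, box_y)  # update discriminator a
            p = g_step(first_img, box_x)                 # update generator a
            if abs(p - 0.5) < TOL:                       # maximally uncertain: done
                return True
    return False
```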
As shown in fig. 6, three convolutional layers are taken as an example in the embodiments of the present application, but the number and structure of the convolutional layers are not limited. The dashed box is the existing discriminator network structure, which judges whether a sample is a real image based on the whole-image information and therefore drives the generator to generate samples at the full-image level. The present structure mainly adds a discriminator branch based on the target box: with this weakly supervised target-box branch, the discriminator judges the target region of the picture somewhat differently from the region outside the target, which in turn pushes the generator to focus on generation within the target-box region. The specific processing flow of the network structure is as follows:
First, the image is sampled based on the target box to generate a target-region-enhanced image with the same size and number of channels as the original image; weight information for the target region is then learned and extracted through a series of connected convolutional layers and fused with the feature map of the original image by multiplication. The two branches use the same number of downsampling convolutional layers, which guarantees that the target-region weight map has the same size as the original-image feature map. Finally, the fused feature map passes through several further convolutional layers and the decision layer, which outputs the judgment result.
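A short PyTorch sketch of this flow is given below; the channel widths, the two stride-2 convolutions per branch, and the hard zero mask standing in for the sampling layer are illustrative assumptions rather than the application's exact structure.

```python
# Sketch of the target-box-conditioned discriminator: a box branch produces a
# target-region weight map, an image branch produces the full-image feature
# map, and the two are fused by element-wise multiplication before the
# decision layers.
import torch
import torch.nn as nn

def down(cin, cout):  # one stride-2 block; both branches use the same count
    return nn.Sequential(nn.Conv2d(cin, cout, 4, stride=2, padding=1),
                         nn.LeakyReLU(0.2))

class BoxConditionedDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.box_branch = nn.Sequential(down(3, 16), down(16, 32))  # first conv layers
        self.img_branch = nn.Sequential(down(3, 16), down(16, 32))  # second conv layers
        self.fused = down(32, 64)                                   # third conv layers
        self.decision = nn.Conv2d(64, 1, 3, padding=1)              # decision layer

    def forward(self, image, box):
        # "Sampling layer": zero out everything outside the target box, keeping
        # the same size and channel count as the original image (assumption).
        x1, y1, x2, y2 = box
        mask = torch.zeros_like(image)
        mask[..., y1:y2, x1:x2] = 1.0
        weights = self.box_branch(image * mask)   # target-region weight map
        features = self.img_branch(image)         # full-image feature map
        return self.decision(self.fused(weights * features))  # multiplicative fusion
```

Because the two branches downsample the same number of times, the weight map and the feature map match in size, as required for the multiplication.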
It should be noted that the loss function layer of the discriminator network still adopts the loss function of the existing discriminator network.
For example, still taking the tunnel fire event as an example, as shown in fig. 3, the training process of the GAN is as follows: the simulated tunnel-fire images are input into the generator network to train the generator network; the images output by the generator network, the actual scene images, and the two target labels are input into the discriminator network to train the discriminator network; and during training, whether training is complete is determined from the output of the discriminator network.
It should be noted that, in order to complete training quickly and accurately, the discriminator network may be trained first, to ensure that its discrimination accuracy meets the requirement, and the generator network trained afterwards, to ensure that the generator network can produce images of higher similarity.
In the above embodiments of the present application, obtaining the first image containing the target event includes at least one of: generating the first image with a three-dimensional engine; and crawling web content with a web crawler to obtain the first image.
The three-dimensional engine in the above step may be a game engine deployed on a computing device, such as the GTAV game engine, but is not limited thereto and may be another engine.
In an alternative embodiment, the target scene may be simulated by a 3D engine, which then generates the target scene image, i.e., the first image. In another alternative embodiment, the first image may be obtained by crawling network images by keyword with a directional crawler.
For example, still taking the tunnel fire event as an example, as shown in fig. 3, a 3D engine is deployed on the computing device, and the first image containing the target event can be generated by the 3D engine; the target image with the actual style is then generated by combining it with the actually captured ordinary images.
In the above embodiments of the present application, generating the first image with the three-dimensional engine includes: generating the target event with the three-dimensional engine; determining a view-angle parameter for the target event based on the arrangement parameters of the image capture device; and generating the first image based on the target event and the view-angle parameter.
The image capture device in the above steps may be a camera used to capture video in traffic video monitoring, but is not limited thereto. The arrangement parameter may be the placement position of the camera in the actual scene.
In an alternative embodiment, the target event may be simulated by the 3D engine, which can further simulate the view angle of the target scene according to the camera placement in the actual scene and so generate the first image. Taking the camera placement into account while the 3D engine generates the first image ensures that the view angle of the generated first image matches that of the actually captured second image, so that the final target image is closer to reality.
For example, still taking the tunnel fire event as an example, as shown in fig. 3, a 3D engine is deployed on the computing device. The 3D engine may be configured by selecting a tunnel fire scene and setting the scene entities, e.g., a burning vehicle and other non-burning vehicles, as well as other parameters that affect the fire, such as weather and lighting; once configured, the tunnel fire can be simulated by the 3D engine. Moreover, the view angle of the 3D engine can be calculated and set according to the cameras deployed in the actual tunnel, thereby producing the target-scene generated image (i.e., the first image described above).
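Purely as an illustration of the parameters involved (the application names no concrete engine API, so every identifier below is hypothetical), the scene configuration and camera-derived view angle might be organized like this:

```python
# Hypothetical configuration for the simulated tunnel-fire scene; the engine
# call at the end is a placeholder, not a real 3D-engine API.
scene_config = {
    "scene": "tunnel_fire",
    "entities": [{"type": "vehicle", "on_fire": True},
                 {"type": "vehicle", "on_fire": False}],
    "weather": "clear",        # parameters that affect the fire's appearance
    "lighting": "tunnel_lamps",
}

def view_from_camera(mount_height_m, pitch_deg, focal_mm):
    """Assumed helper: derive engine view-angle parameters from the placement
    of the real tunnel camera so simulated and real views match."""
    return {"height": mount_height_m, "pitch": pitch_deg, "focal": focal_mm}

view = view_from_camera(mount_height_m=5.0, pitch_deg=15.0, focal_mm=8.0)
# first_image = engine.render(scene_config, view)  # hypothetical engine call
```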
In the above embodiments of the present application, after the first image is processed with the generative adversarial network to obtain the target image, the method further includes the following steps: training a target detection model with the target image; and processing an acquired detection image with the trained target detection model to obtain a detection result, where the detection result indicates whether the detection image contains the target event.
The target detection model in the above steps may be a model for detecting whether a target event has actually occurred. For example, still taking the tunnel fire event as an example, the target detection model may be a model for detecting whether a tunnel is on fire, and the detection image may be an image captured by a camera arranged in the tunnel.
In an alternative embodiment, a large number of target images can be generated by the GAN and used as training data for the target detection model, which ensures the model's detection accuracy; thus, after traffic video monitoring captures a detection image, the trained target detection model can determine whether it contains the target event.
For example, still taking the tunnel fire event as an example, as shown in fig. 3, after obtaining tunnel fire images with a realistic style, the computing device may train a tunnel fire detection model on a large number of such images. After training, tunnel images can be captured in real time by the camera to obtain detection images, which are then processed by the trained tunnel fire detection model to detect whether the tunnel is on fire, achieving the goal of detecting low-frequency events.
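As a hedged sketch of this downstream step, an off-the-shelf detector could be fine-tuned on the generated images; torchvision's Faster R-CNN is used here purely as an example detector, and `generated_loader` is an assumed placeholder for a DataLoader over the GAN-generated images and their fire boxes.

```python
# Sketch: fine-tune a detector on GAN-generated tunnel-fire images. The loader
# is assumed to yield (list of image tensors, list of {"boxes", "labels"}).
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(num_classes=2)  # background + "tunnel fire"
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

generated_loader = []  # placeholder; replace with a real DataLoader

model.train()
for images, targets in generated_loader:
    loss_dict = model(images, targets)          # detection losses in train mode
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```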
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
Example 2
According to the embodiment of the application, an image processing method is further provided.
Fig. 7 is a flowchart of a second image processing method according to an embodiment of the present application. As shown in fig. 7, the method includes the steps of:
step S602, acquiring a first image containing a target event;
the target event in the above steps may be an event with a low occurrence frequency in a traffic video monitoring scene, such as a tunnel fire, a two-vehicle collision, a two-vehicle rear-end collision, a scratch, or the like, or an event with a high labeling cost, such as depth of field of a video image, but is not limited thereto, and may also be other events, such as a common event of a vehicle congestion, a vehicle pause, or the like.
Step S604, acquiring a second acquired image;
the second image in the above steps may be a real captured actual scene image, and the image may be directly obtained through traffic video monitoring. Because the occurrence frequency of the target event is low or the labeling cost is high, the second image may be an actually acquired normal image and may not contain the target event.
Step S606, the first image and the second image are processed by using the countermeasure generating network to obtain a target image, wherein the attribute parameters of the target image are the same as those of the second image and include a target event.
Optionally, the attribute parameters may be parameters of an actual scene, such as style, color, material, and the like, but are not limited thereto, and the style is taken as an example in the embodiment of the present application for description.
In this embodiment of the application, the generative adversarial network may be a network trained on historical training data. After the first image and the second image are acquired, the two images may be input into the GAN; the network is trained to transfer the style of the second image onto the first image and finally outputs a target image that has the actual style and contains the target event.
Based on the scheme provided by the above embodiment of the application, after the first image containing the target event and the captured second image are acquired, the first image and the second image can be processed with the generative adversarial network to generate the corresponding target image, thereby obtaining an image containing the target event in the style of the actual scene. It is worth noting that image generation by the GAN makes it possible to mass-produce realistically styled images containing target events without additional data annotation, which provides sufficient training samples for the practical algorithm model, guarantees the model's processing accuracy, reduces data acquisition and annotation costs, and thereby solves the technical problem in the related art that obtaining target images by directly capturing or annotating images incurs high data acquisition and annotation costs.
In the above embodiment of the present application, processing the first image and the second image with the generative adversarial network to obtain the target image includes: inputting the first image into the generator network; inputting the image output by the generator network, the second image, the first target label corresponding to the first image, and the second target label corresponding to the second image into the discriminator network; and determining the image output by the generator network to be the target image when the output of the discriminator network satisfies a preset condition.
The initial network in the above steps may be a CycleGAN adversarial network, but is not limited thereto and may be another adversarial network. The CycleGAN structure may contain two generators a and b and two discriminators a and b.
In the embodiments of the application, the migration of the second image's style onto the first image can be completed using only one generator. In addition, not only is the image information used, but the detection target-box label corresponding to each image, i.e., the target label, is additionally used to enhance the migration effect on the targets in the image. A target-label loss can also be added to the discriminator network, so that the adversarial network not only performs full-frame migration but also pays more attention to migrating the style of the target in the image.
The preset condition in the above step may be a condition for determining that training of the generative adversarial network is complete: the output of the discriminator network is the probability that the generated image belongs to the domain of the second image, and the preset condition may be a preset probability, for example 0.5. If the probability output by the discriminator network is not 0.5, the discriminator network can still accurately distinguish the images generated by the generator network; the similarity between the generated images and the second image is not yet high, and training must continue. If the probability is 0.5, the discriminator network can no longer accurately distinguish the images generated by the generator network, i.e., the similarity between the generated images and the second image is high; at this point the training of the adversarial network is determined to be complete and may be stopped.
In the above embodiments of the present application, obtaining the first image containing the target event includes at least one of: generating the first image with a three-dimensional engine; and crawling web content with a web crawler to obtain the first image.
The three-dimensional engine in the above step may be a game engine deployed on a computing device, such as the GTAV game engine, but is not limited thereto and may be another engine.
In the above embodiments of the present application, generating the first image with the three-dimensional engine includes: generating the target event with the three-dimensional engine; determining a view-angle parameter for the target event based on the arrangement parameters of the image capture device; and generating the first image based on the target event and the view-angle parameter.
The image capture device in the above steps may be a camera used to capture video in traffic video monitoring, but is not limited thereto. The arrangement parameter may be the placement position of the camera in the actual scene.
In the above embodiment of the present application, after the first image and the second image are processed with the generative adversarial network to obtain the target image, the method further includes the following steps: training a target detection model with the target image; and processing an acquired detection image with the trained target detection model to obtain a detection result, where the detection result indicates whether the detection image contains the target event.
The target detection model in the above steps may be a model for detecting whether a target event has actually occurred. For example, still taking the tunnel fire event as an example, the target detection model may be a model for detecting whether a tunnel is on fire, and the detection image may be an image captured by a camera arranged in the tunnel.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 3
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the above-described image processing method, as shown in fig. 8, the apparatus 700 includes: an image acquisition module 702 and a processing module 704.
The image acquisition module 702 is configured to acquire a first image containing a target event; the processing module 704 is configured to process the first image with a generative adversarial network to obtain a target image, where the network is obtained by training on an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
It should be noted here that the image acquisition module 702 and the processing module 704 correspond to steps S202 to S204 in embodiment 1; the examples and application scenarios implemented by the two modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules may run on the computer terminal 10 provided in embodiment 1, as a part of the apparatus.
In the above embodiment of the present application, the apparatus further includes: the device comprises a data acquisition module and a training module.
The data acquisition module is configured to obtain a plurality of training data, where each training datum includes: a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image. The training module is configured to train an initial network with the plurality of training data to obtain the generative adversarial network, where the initial network includes a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
In the above embodiments of the present application, the training module includes: a first training unit and a second training unit.
The first training unit is configured to train the generator network with the first images of the plurality of training data; the second training unit is configured to train the discriminator network with the images output by the generator network, the first target labels corresponding to the first images, the second images, and the second target labels corresponding to the second images, to obtain the generative adversarial network.
In the above embodiments of the present application, the image acquisition module includes at least one of: the device comprises a simulation generation unit and an image capture unit.
The simulation generation unit is used for generating a first image through a three-dimensional engine; the image capturing unit is used for capturing the webpage content by using a web crawler to obtain a first image.
In the above embodiments of the present application, the simulation generating unit includes: a first generation subunit, a determination subunit and a second generation subunit.
The first generation subunit is used for generating a target event through a three-dimensional engine; the determining subunit is used for determining a view angle parameter of the target event based on the arrangement parameter of the image acquisition device; the second generating subunit is configured to generate the first image based on the target event and the viewing angle parameter.
In the above embodiments of the present application, the training module is further configured to train the target detection model with the target image; the processing module is further configured to process the acquired detection image by using the trained target detection model to obtain a detection result, where the detection result is used to represent whether the detection image includes a target event.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 4
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the above-described image processing method, as shown in fig. 9, the apparatus 800 includes: a first obtaining module 802, a second obtaining module 804, and a processing module 806.
The first obtaining module 802 is configured to acquire a first image containing a target event; the second obtaining module 804 is configured to acquire a captured second image; and the processing module 806 is configured to process the first image and the second image with a generative adversarial network to obtain a target image, where the attribute parameters of the target image are the same as those of the second image and the target image contains the target event.
It should be noted here that the first obtaining module 802, the second obtaining module 804, and the processing module 806 correspond to steps S602 to S606 in embodiment 2; the examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules may run on the computer terminal 10 provided in embodiment 1, as a part of the apparatus.
In the above embodiments of the present application, the processing module includes: the device comprises a first input unit, a second input unit and a determination unit.
Wherein the first input unit is used for inputting the first image into the generator network; the second input unit is used for inputting the image output by the generator network, the second image, the first target label corresponding to the first image and the second target label corresponding to the second image into the discriminator network; the determining unit is used for determining the image output by the generator network as the target image under the condition that the output result of the discriminator network meets the preset condition.
In the above embodiments of the present application, the first obtaining module includes at least one of: the device comprises a simulation generation unit and an image capture unit.
The simulation generation unit is used for generating a first image through a three-dimensional engine; the image capturing unit is used for capturing the webpage content by using a web crawler to obtain a first image.
In the above embodiments of the present application, the simulation generating unit includes: a first generation subunit, a determination subunit and a second generation subunit.
The first generation subunit is used for generating a target event through a three-dimensional engine; the determining subunit is used for determining a view angle parameter of the target event based on the arrangement parameter of the image acquisition device; the second generating subunit is configured to generate the first image based on the target event and the viewing angle parameter.
In the above embodiment of the present application, the apparatus further includes: and a training module.
The training module is used for training a target detection model by using a target image; the processing module is further configured to process the acquired detection image by using the trained target detection model to obtain a detection result, where the detection result is used to represent whether the detection image includes a target event.
It should be noted that the preferred implementation described in this example is the same in scheme, application scenario, and implementation process as that provided in Example 1, but is not limited to the scheme provided in Example 1.
Example 5
According to an embodiment of the present application, there is also provided an image processing system including:
a processor; and
a memory, coupled to the processor, for providing the processor with instructions to process the following steps: acquiring a first image containing a target event; and processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with a captured second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
It should be noted that the preferred implementation described in this example is the same in scheme, application scenario, and implementation process as that provided in Example 1, but is not limited to the scheme provided in Example 1.
Example 6
According to the embodiment of the application, an image processing method is further provided.
Fig. 10 is a flowchart of a third image processing method according to an embodiment of the present application. As shown in fig. 10, the method includes the steps of:
step S1002, acquiring a first image containing a target event;
the target event in the above steps may be an event with a low occurrence frequency in a traffic video monitoring scene, such as a tunnel fire, a two-vehicle collision, a two-vehicle rear-end collision, a scratch, or the like, or an event with a high labeling cost, such as depth of field of a video image, but is not limited thereto, and may also be other events, such as a common event of a vehicle congestion, a vehicle pause, or the like.
Step S1004, the first image is processed by using the countermeasure generation network to obtain a target image.
Wherein the countermeasure generation network includes a generator network and a discriminator network. The generator network is used to process the first image to obtain the target image. The discriminator network includes a sampling layer, a plurality of first convolution layers, a plurality of second convolution layers, a plurality of third convolution layers, and a discrimination layer, where the first convolution layers are connected to the sampling layer, the third convolution layers are connected to the first convolution layers and the second convolution layers, and the discrimination layer is connected to the third convolution layers. The sampling layer is used to sample the image output by the generator network based on the first target label and to sample the second image based on the second target label, while the image output by the generator network and the second image are also input to the plurality of second convolution layers.
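To make this wiring concrete, the following PyTorch sketch gives one plausible reading of the discriminator; the layer counts, channel widths, and the 64x64 patch size are illustrative assumptions, since the embodiment fixes only the connection pattern.

```python
# A plausible reading of the discriminator wiring described above; all
# hyperparameters (widths, depths, patch size) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(c_out), nn.LeakyReLU(0.2))

class LabelSampler(nn.Module):
    """Sampling layer: crop the labeled target box out of each image and
    resize it to a fixed patch, so target style is judged separately."""
    def forward(self, images, boxes):  # boxes: one (x1, y1, x2, y2) per image
        patches = []
        for img, (x1, y1, x2, y2) in zip(images, boxes):
            patch = img[:, y1:y2, x1:x2].unsqueeze(0)
            patches.append(F.interpolate(patch, size=(64, 64)))
        return torch.cat(patches, dim=0)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.sampler = LabelSampler()
        # first convolution layers: fed by the sampling layer (target patches)
        self.first = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        # second convolution layers: fed the full input image directly
        self.second = nn.Sequential(conv_block(3, 32), conv_block(32, 64),
                                    conv_block(64, 64))
        # third convolution layers: fuse patch and full-image features
        self.third = conv_block(128, 128)
        # discrimination layer: per-location real/fake score
        self.judge = nn.Conv2d(128, 1, 3, padding=1)

    def forward(self, images, boxes):
        patch_feat = self.first(self.sampler(images, boxes))
        image_feat = self.second(images)
        # resize so the two feature maps can be concatenated channel-wise
        patch_feat = F.interpolate(patch_feat, size=image_feat.shape[-2:])
        return self.judge(self.third(torch.cat([patch_feat, image_feat], 1)))
```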
The second image in the above step may be an actually captured real-scene image, obtained directly from traffic video monitoring. Because the target event occurs infrequently or is expensive to label, the second image may be an ordinary captured image that does not contain the target event.
Optionally, the attribute parameters may be parameters of an actual scene, such as style, color, material, and the like, but are not limited thereto, and the style is taken as an example in the embodiment of the present application for description.
The countermeasure generation network in the above step may be an existing Cycle-GAN countermeasure generation network, but is not limited thereto and may be another countermeasure generation network. In the embodiment of the present application, the style of the second image can be migrated onto the first image using only one generator. In addition, beyond the image information itself, the detection target frame label corresponding to each image (namely the target label) is additionally used, which enhances the migration effect on the target within the image. A target label loss may further be added to the discriminator network part, so that the countermeasure generation network performs full-image migration while paying more attention to migrating the style of the target within the image.
In this embodiment of the present application, the countermeasure generation network may be a network trained with historical training data. After the first image and the second image are acquired, both images may be input into the network, which is trained to transfer the style of the second image onto the first image, finally outputting a target image that has the actual scene style and contains the target event.
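As a sketch only, one training step of this two-player game could look as follows, reusing the Discriminator sketch above; the least-squares GAN loss is an assumption, and the target-label information enters through the sampling layer inside the discriminator.

```python
# Hypothetical single training step: G migrates a simulated first image
# toward the style of the captured second image, and D judges both the
# full image and the label-sampled target region (via its sampling layer).
import torch
import torch.nn.functional as F

def train_step(G, D, opt_g, opt_d,
               first_img, first_box, second_img, second_box):
    # discriminator step: captured second images count as real,
    # generator outputs count as fake
    fake = G(first_img).detach()
    d_real = D(second_img, second_box)
    d_fake = D(fake, first_box)
    loss_d = F.mse_loss(d_real, torch.ones_like(d_real)) + \
             F.mse_loss(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # generator step: fool D on the full image and on the target region
    fake = G(first_img)
    d_out = D(fake, first_box)
    loss_g = F.mse_loss(d_out, torch.ones_like(d_out))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```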
Based on the scheme provided by the above embodiment of the present application, after the first image containing the target event is acquired, the corresponding target image can be generated by using the countermeasure generation network, thereby obtaining an image that contains the target event in the actual scene. It is easy to note that image generation can be realized through the countermeasure generation network: images that have the actual style and contain the target event can be generated in large quantities without additional data annotation, thereby providing sufficient training samples for the actual algorithm model, ensuring its processing accuracy, reducing the acquisition cost of data and annotation, and solving the technical problem in the related art that directly capturing or annotating images to obtain target images is expensive. In addition, the countermeasure generation network utilizes more information than a purely full-image style migration, which could otherwise blur the distinction between the background and the target during migration. The generated data is more stable in later target detection model training and generalizes quickly to different traffic scenes (dome cameras, bullet cameras, fisheye cameras, and the like).
It should be noted that the preferred implementation described in this example is the same in scheme, application scenario, and implementation process as that provided in Example 1, but is not limited to the scheme provided in Example 1.
Example 7
According to the embodiment of the application, an image processing method is further provided.
Fig. 11 is a flowchart of a fourth image processing method according to an embodiment of the present application. As shown in fig. 11, the method includes the steps of:
step S1102, acquiring a first image including a target object;
the target object in the above steps may be an object with a low occurrence probability or a high labeling cost in different application scenarios. For example, in a traffic video monitoring scene, the target object may be a tunnel fire, a two-vehicle collision, a two-vehicle rear-end collision, scraping, and the like. In the translation scenario, the target object may be in a small language such as Uygur. In the field of remote sensing applications, the target object may be a pollutant polluting the environment, and may also be a forest fire, etc., but is not limited thereto.
And step S1104, processing the first image by using a confrontation generating network to obtain a target image, wherein the confrontation generating network is obtained by training the acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image includes a target object.
The second image in the above step may be an actually captured real-scene image, obtained directly from a video monitoring device. Because the target object occurs infrequently or is expensive to label, the second image may be an ordinary captured image that does not contain the target object.
Optionally, the attribute parameters may be parameters of an actual scene, such as style, color, material, and the like, but are not limited thereto, and in the embodiment of the present application, the style is taken as an example for description.
The countermeasure generation network in the above step may be an existing Cycle-GAN countermeasure generation network, but is not limited thereto and may be another countermeasure generation network. In the embodiment of the present application, the style of the second image can be migrated onto the first image using only one generator. In addition, beyond the image information itself, the detection target frame label corresponding to each image (namely the target label) is additionally used, which enhances the migration effect on the target within the image. A target label loss may further be added to the discriminator network part, so that the countermeasure generation network performs full-image migration while paying more attention to migrating the style of the target within the image.
In this embodiment of the present application, the countermeasure generation network may be a network trained with historical training data. After the first image and the second image are acquired, both images may be input into the network, which is trained to transfer the style of the second image onto the first image, finally outputting a target image that has the actual scene style and contains the target object.
Based on the scheme provided by the above embodiment of the present application, after the first image containing the target object is acquired, the corresponding target image can be generated by using the countermeasure generation network, thereby obtaining an image that contains the target object in the actual scene. It is easy to note that image generation can be realized through the countermeasure generation network: images that have the actual style and contain the target object can be generated in large quantities without additional data annotation, thereby providing sufficient training samples for the actual algorithm model, ensuring its processing accuracy, reducing the acquisition cost of data and annotation, and solving the technical problem in the related art that directly capturing or annotating images to obtain target images is expensive. In addition, the countermeasure generation network utilizes more information than a purely full-image style migration, which could otherwise blur the distinction between the background and the target during migration.
It should be noted that the preferred implementation described in this example is the same in scheme, application scenario, and implementation process as that provided in Example 1, but is not limited to the scheme provided in Example 1.
Example 8
An embodiment of the present application provides a computer terminal, which may be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute the program code of the following steps in the image processing method: acquiring a first image containing a target event; and processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with a captured second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
Optionally, fig. 12 is a block diagram of a computer terminal according to an embodiment of the present application. As shown in fig. 12, the computer terminal a may include: one or more processors 902 (only one shown), and memory 904.
The memory may be configured to store software programs and modules, such as the program instructions/modules corresponding to the image processing method and apparatus in the embodiments of the present application; the processor implements the above image processing method by running the software programs and modules stored in the memory, thereby executing various functional applications and data processing. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located from the processor; such remote memory may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application programs stored in the memory through the transmission device to execute the following steps: acquiring a first image containing a target event; and processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with a captured second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
Optionally, the processor may further execute the program code of the following steps: obtaining a plurality of training data, wherein each training data comprises: a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image; and training an initial network by using the plurality of training data to obtain the countermeasure generation network, wherein the initial network comprises a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
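For concreteness, one such training record could be held in a structure like the following; the field names and the box convention are illustrative assumptions, not terms from the application.

```python
# Illustrative container for one training record as enumerated above.
from dataclasses import dataclass
from typing import Tuple
import numpy as np

Box = Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel box

@dataclass
class TrainingRecord:
    first_image: np.ndarray   # simulated/crawled image containing the target event
    first_label: Box          # detection box of the event in first_image
    second_image: np.ndarray  # actually captured scene image
    second_label: Box         # detection box used to sample second_image
```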
Optionally, the processor may further execute the program code of the following steps: training the generator network with the first image in the plurality of training data; and training the discriminator network by using the image output by the generator network, the first target label corresponding to the first image, the second image, and the second target label corresponding to the second image in the plurality of training data, to obtain the countermeasure generation network.
Optionally, the processor may further execute the program code of the following steps: generating a first image by a three-dimensional engine; and/or capturing the webpage content by using a web crawler to obtain a first image.
Optionally, the processor may further execute the program code of the following steps: generating a target event through a three-dimensional engine; determining a view angle parameter of the target event based on the arrangement parameter of the image acquisition device; and generating a first image based on the target event and the view angle parameter.
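The sketch below illustrates, under assumptions, how a view angle parameter might be derived from the arrangement parameters of a real camera (mounting height and horizontal distance to the event); render is a hypothetical stand-in for the three-dimensional engine's interface, not an API from the application.

```python
# Hypothetical derivation of a camera view angle from arrangement
# parameters; `render` stands in for whatever three-dimensional engine
# is actually used.
import math

def view_angle_from_arrangement(mount_height_m: float,
                                ground_distance_m: float) -> dict:
    # pitch: how steeply the camera looks down at an event on the road
    pitch_deg = math.degrees(math.atan2(mount_height_m, ground_distance_m))
    return {"pitch_deg": pitch_deg, "yaw_deg": 0.0, "roll_deg": 0.0}

def generate_first_image(render, event_scene,
                         mount_height_m, ground_distance_m):
    angles = view_angle_from_arrangement(mount_height_m, ground_distance_m)
    # render the simulated target event from the viewpoint a real roadside
    # camera would have, so later style migration has less to correct
    return render(scene=event_scene, camera_angles=angles)
```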
Optionally, the processor may further execute the program code of the following steps: after the first image is processed by using the countermeasure generation network to obtain the target image, training a target detection model with the target image; and processing an acquired detection image by using the trained target detection model to obtain a detection result, wherein the detection result is used to represent whether the detection image contains the target event.
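As an illustration of this downstream step, the sketch below fine-tunes an off-the-shelf torchvision Faster R-CNN on the generated target images and then flags whether a detection image contains the target event; the choice of detector and the 0.5 score threshold are assumptions, since the embodiment does not fix a particular detection model.

```python
# Assumed downstream pipeline: fine-tune a stock detector on generated
# target images, then flag captured detection images for the target event.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(num_classes=2)  # background + target event
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_on_generated(images, targets):
    # images: list of CHW float tensors; targets: list of dicts with
    # 'boxes' (N, 4) and 'labels' (N,) built from the generated images
    model.train()
    loss_dict = model(images, targets)  # torchvision returns a loss dict
    loss = sum(loss_dict.values())
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

@torch.no_grad()
def contains_target_event(detection_image, score_thresh=0.5) -> bool:
    model.eval()
    pred = model([detection_image])[0]
    return bool((pred["scores"] > score_thresh).any())
```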
The processor can call the information and application programs stored in the memory through the transmission device to execute the following steps: acquiring a first image containing a target event; acquiring a captured second image; and processing the first image and the second image by using a countermeasure generation network to obtain a target image, wherein the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
Optionally, the processor may further execute the program code of the following steps: inputting the first image into a generator network; inputting the image output by the generator network, the second image, the first target label corresponding to the first image and the second target label corresponding to the second image into the discriminator network; and under the condition that the output result of the discriminator network meets the preset condition, determining the image output by the generator network as the target image.
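Purely as an assumption, the preset condition on the discriminator output could be a threshold on its mean score, as sketched below; the 0.9 value is illustrative.

```python
# Hypothetical check of the "preset condition": accept the generator
# output as the target image once the discriminator scores it near real.
import torch

@torch.no_grad()
def accept_as_target_image(G, D, first_img, first_box, threshold=0.9):
    candidate = G(first_img)
    score = torch.sigmoid(D(candidate, first_box)).mean()
    return candidate if score >= threshold else None
```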
By adopting the embodiment of the application, an image processing scheme is provided. The image generation is realized through the countermeasure generation network, and the images with actual styles and containing target events can be generated in large quantity without additional data annotation or association, so that sufficient training samples are provided for the actual algorithm model, the processing accuracy of the algorithm model is ensured, the acquisition cost of the data and the annotation is reduced, and the technical problem that the acquisition cost of the data and the annotation is higher when the target image is obtained by directly acquiring the image or annotating the image in the related technology is solved.
It can be understood by those skilled in the art that the structure shown in fig. 12 is only an illustration, and the computer terminal may also be a terminal device such as a smartphone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 12 does not limit the structure of the above electronic device. For example, the computer terminal A may include more or fewer components (e.g., a network interface, a display device, etc.) than shown in fig. 12, or have a different configuration from that shown in fig. 12.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 9
Embodiments of the present application also provide a storage medium. Optionally, in this embodiment, the storage medium may be configured to store the program code for executing the image processing methods provided in the above embodiments.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a first image containing a target event; and processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with a captured second image, the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
Optionally, the storage medium is further configured to store program code for performing the following steps: obtaining a plurality of training data, wherein each training data comprises: a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image; and training an initial network by using the plurality of training data to obtain the countermeasure generation network, wherein the initial network comprises a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
Optionally, the storage medium is further configured to store program code for performing the following steps: training the generator network with the first image in the plurality of training data; and training the discriminator network by using the image output by the generator network, the first target label corresponding to the first image, the second image, and the second target label corresponding to the second image in the plurality of training data, to obtain the countermeasure generation network.
Optionally, the storage medium is further configured to store program codes for performing the following steps: generating a first image by a three-dimensional engine; and/or capturing the webpage content by using a web crawler to obtain a first image.
Optionally, the storage medium is further configured to store program code for performing the following steps: generating a target event through a three-dimensional engine; determining a view angle parameter of the target event based on the arrangement parameter of the image acquisition device; and generating a first image based on the target event and the view angle parameter.
Optionally, the storage medium is further configured to store program code for performing the following steps: after the first image is processed by using the countermeasure generation network to obtain the target image, training a target detection model with the target image; and processing an acquired detection image by using the trained target detection model to obtain a detection result, wherein the detection result is used to represent whether the detection image contains the target event.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a first image containing a target event; acquiring a captured second image; and processing the first image and the second image by using a countermeasure generation network to obtain a target image, wherein the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
Optionally, the storage medium is further configured to store program codes for performing the following steps: inputting the first image into a generator network; inputting the image output by the generator network, the second image, the first target label corresponding to the first image and the second target label corresponding to the second image into the discriminator network; and under the condition that the output result of the discriminator network meets the preset condition, determining the image output by the generator network as the target image.
The serial numbers of the above embodiments of the present application are merely for description and do not imply any ranking of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, or a magnetic or optical disk.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (20)

1. An image processing method comprising:
acquiring a first image containing a target event;
processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image comprises the target event.
2. The method of claim 1, wherein the method further comprises:
obtaining a plurality of training data, wherein each training data comprises: a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image;
training an initial network by using the plurality of training data to obtain the countermeasure generation network, wherein the initial network comprises a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
3. The method of claim 2, wherein training an initial network with the plurality of training data to obtain the countermeasure generation network comprises:
training the generator network with a first image of the plurality of training data;
training the discriminator network by using the image output by the generator network, the first target label corresponding to the first image, the second image, and the second target label corresponding to the second image in the plurality of training data, to obtain the countermeasure generation network;
wherein the discriminator network comprises: the image sampling device comprises a sampling layer, a plurality of first convolution layers, a plurality of second convolution layers, a plurality of third convolution layers and a discrimination layer, wherein the first convolution layer is connected with the sampling layer, the third convolution layer is connected with the first convolution layer and the second convolution layer, the discrimination layer is connected with the third convolution layer, the sampling layer is used for sampling an image output by the generator network based on the first target label and sampling the second image based on the second target label, and the image output by the generator network and the second image are input to the plurality of second convolution layers.
4. The method of claim 1, wherein acquiring the first image containing the target event comprises at least one of:
generating the first image by a three-dimensional engine;
and capturing the webpage content by using a web crawler to obtain the first image.
5. The method of claim 4, wherein generating the first image by a three-dimensional engine comprises:
generating, by the three-dimensional engine, the target event;
determining a view angle parameter of the target event based on an arrangement parameter of an image acquisition device;
generating the first image based on the target event and the view angle parameter.
6. The method of claim 1, wherein after processing the first image with the countermeasure generation network to obtain the target image, the method further comprises:
training a target detection model by using the target image;
and processing the obtained detection image by using the trained target detection model to obtain a detection result, wherein the detection result is used for representing whether the detection image contains the target event or not.
7. An image processing method comprising:
acquiring a first image containing a target event;
acquiring a second acquired image;
and processing the first image and the second image by using a countermeasure generation network to obtain a target image, wherein the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
8. The method of claim 7, wherein processing the first image and the second image with the countermeasure generation network to obtain the target image comprises:
inputting the first image into a generator network;
inputting the image output by the generator network, the second image, the first target label corresponding to the first image, and the second target label corresponding to the second image into a discriminator network;
and under the condition that the output result of the discriminator network meets a preset condition, determining the image output by the generator network as the target image.
9. The method of claim 7, wherein acquiring the first image containing the target event comprises at least one of:
generating the first image by a three-dimensional engine;
and capturing the webpage content by using a web crawler to obtain the first image.
10. The method of claim 9, wherein generating the first image by a three-dimensional engine comprises:
generating, by the three-dimensional engine, the target event;
determining a view angle parameter of the target event based on an arrangement parameter of an image acquisition device;
generating the first image based on the target event and the view angle parameter.
11. An image processing apparatus comprising:
the image acquisition module is used for acquiring a first image containing a target event;
the processing module is used for processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image comprises the target event.
12. The apparatus of claim 11, wherein the apparatus further comprises:
a data acquisition module to acquire a plurality of training data, wherein each training data comprises: a first image, a first target label corresponding to the first image, a second image, and a second target label corresponding to the second image;
a training module, configured to train an initial network with the plurality of training data to obtain the countermeasure generation network, wherein the initial network includes a generator network and a discriminator network, and the image output by the generator network is input into the discriminator network.
13. The apparatus of claim 12, wherein the training module comprises:
a first training unit for training the generator network with a first image of the plurality of training data;
and the second training unit is used for training the discriminator network by using the image output by the generator network, the first target label corresponding to the first image, the second image, and the second target label corresponding to the second image in the plurality of training data, to obtain the countermeasure generation network.
14. An image processing apparatus comprising:
the first acquisition module is used for acquiring a first image containing a target event;
the second acquisition module is used for acquiring a second acquired image;
and the processing module is used for processing the first image and the second image by using a countermeasure generation network to obtain a target image, wherein the attribute parameters of the target image are the same as those of the second image, and the target image contains the target event.
15. The apparatus of claim 14, wherein the processing module comprises:
a first input unit for inputting the first image into a generator network;
a second input unit, configured to input the image output by the generator network, the second image, the first target tag corresponding to the first image, and the second target tag corresponding to the second image into the discriminator network;
and the determining unit is used for determining the image output by the generator network as the target image under the condition that the output result of the discriminator network meets the preset condition.
16. A storage medium comprising a stored program, wherein an apparatus in which the storage medium is located is controlled to perform the image processing method of any one of claims 1 to 10 when the program is run.
17. A computing device comprising a processor and a memory, the processor being configured to execute a program stored in the memory, wherein the program when executed performs the image processing method of any of claims 1 to 10.
18. An image processing system comprising:
a processor; and
a memory, coupled to the processor, for providing the processor with instructions to process the following steps: acquiring a first image containing a target event; and processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image comprises the target event.
19. An image processing method comprising:
acquiring a first image containing a target event;
processing the first image by using a countermeasure generation network to obtain a target image;
wherein the countermeasure generation network comprises: a generator network and a discriminator network, wherein the generator network is configured to process the first image to obtain the target image, and the discriminator network comprises a sampling layer, a plurality of first convolution layers, a plurality of second convolution layers, a plurality of third convolution layers, and a discrimination layer, wherein the first convolution layers are connected with the sampling layer, the third convolution layers are connected with the first convolution layers and the second convolution layers, the discrimination layer is connected with the third convolution layers, the sampling layer is used for sampling the target image based on a first target label and sampling a second image based on a second target label, and the target image and the second image are input to the plurality of second convolution layers.
20. An image processing method comprising:
acquiring a first image containing a target object;
processing the first image by using a countermeasure generation network to obtain a target image, wherein the countermeasure generation network is trained with an acquired second image, the attribute parameters of the target image are the same as those of the second image, and the target image comprises the target object.
CN202010197303.2A 2020-03-19 2020-03-19 Image processing method, device and system, storage medium and computing equipment Pending CN113496235A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010197303.2A CN113496235A (en) 2020-03-19 2020-03-19 Image processing method, device and system, storage medium and computing equipment

Publications (1)

Publication Number Publication Date
CN113496235A true CN113496235A (en) 2021-10-12

Family

ID=77993504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010197303.2A Pending CN113496235A (en) 2020-03-19 2020-03-19 Image processing method, device and system, storage medium and computing equipment

Country Status (1)

Country Link
CN (1) CN113496235A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023174068A1 (en) * 2022-03-18 2023-09-21 上海寒武纪信息科技有限公司 Data acquisition method and apparatus, device, and system
CN114511475A (en) * 2022-04-21 2022-05-17 天津大学 Image generation method based on improved Cycle GAN
CN114511475B (en) * 2022-04-21 2022-08-02 天津大学 Image generation method based on improved Cycle GAN


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination