WO2022222519A1 - Fault image generation method and apparatus - Google Patents

Fault image generation method and apparatus

Info

Publication number
WO2022222519A1
WO2022222519A1 (PCT/CN2021/139429)
Authority
WO
WIPO (PCT)
Prior art keywords
image
faulty
fault
region
training
Prior art date
Application number
PCT/CN2021/139429
Other languages
English (en)
French (fr)
Inventor
杜长德
金鑫
姜华杰
涂丹丹
Original Assignee
华为云计算技术有限公司 (Huawei Cloud Computing Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202110914815.0A external-priority patent/CN115294007A/zh
Application filed by 华为云计算技术有限公司
Priority to EP21937741.3A priority Critical patent/EP4307217A1/en
Publication of WO2022222519A1
Priority to US18/482,906 priority patent/US20240104904A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G06T7/001 Industrial image inspection using an image reference approach
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • the present application relates to the field of computers, and more particularly, to a method and apparatus for generating a fault image.
  • Artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
  • In other words, AI is a branch of computer science that seeks to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a manner similar to human intelligence.
  • AI research studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
  • The detection of surface defects is an important link in industrial quality inspection and a key step in controlling product quality: it prevents defective products from entering the market and avoids the harm that defective products can cause in use.
  • Computer-vision-based fault detection algorithms can use neural network models to detect surface defects and help people quickly eliminate hidden dangers.
  • However, training a neural network model requires a large amount of both normal data and fault data, and in practical applications the amount of fault data is small, making it difficult to meet the training requirements of neural network models.
  • Fault data can be obtained by manually adjusting normal data, but the labor cost is high and the efficiency is low. How to improve the efficiency of obtaining fault data has therefore become an urgent problem to be solved.
  • the present application provides a fault image generation method, which can reduce labor costs and improve processing efficiency.
  • In a first aspect, a method for generating a faulty image is provided, comprising: acquiring a non-faulty image and a first faulty image, wherein the non-faulty image records a first object without faults and the first faulty image records a faulty second object, the first object and the second object being of different types; and transferring the fault pattern of the second object in the first faulty image onto the first object in the non-faulty image to obtain a second faulty image, the second faulty image showing the first object in a faulty state.
  • In this way, a second faulty image can be generated from the non-faulty image and the first faulty image, so that the faulty images available as training data include both the first faulty image and the generated second faulty image, effectively increasing the number of training samples.
  • the first object recorded in the non-faulty image and the second object recorded in the faulty image do not belong to the same type of objects. That is, another object other than the first object can be used to process the non-faulty image recorded with the first object to generate the second faulty image.
  • The non-faulty image and the first faulty image are acquired from different types of objects: the non-faulty image is adjusted according to a first faulty image selected from faulty images collected for other types of objects. That is, the non-faulty image of a certain object can be adjusted using the faults of other types of objects, which broadens the acquisition range of the first faulty image and increases the flexibility of its sources. Therefore, when few faulty images exist for objects of the same type as the first object, first faulty images of other types of objects can be combined with non-faulty images of the first object to generate second faulty images of the first object, improving the flexibility of generating faulty images for such objects.
  • Before the fault pattern of the second object in the first faulty image is transferred to the first object in the non-faulty image to obtain the second faulty image, the method includes: inputting the non-faulty image into a region generation model, and determining a target region in the non-faulty image to which the fault pattern is to be migrated.
  • Using the region generation model to determine the target region of non-faulty images can improve efficiency and reduce labor costs.
  • Before inputting the non-faulty image into the region generation model and determining the target region in the non-faulty image for migration of the fault pattern, the method includes: acquiring multiple training images, in which non-faulty objects of the same type as the first object are recorded; obtaining region indication information, which indicates the regions of the training images in which a fault can occur; and training the region generation model according to the multiple training images and the region indication information.
  • The region generation model is trained so that it is suited to the type of the first object, giving its processing results higher accuracy on images of that type.
  • Transferring the fault pattern of the second object in the first faulty image to the first object in the non-faulty image to obtain the second faulty image includes: transforming the shape of the first faulty image; and migrating the transformed first faulty image to a target region in the non-faulty image to obtain the second faulty image.
  • Adjusting the non-faulty image with a transformed fault pattern makes the adjustment more flexible and can increase the number and variety of the generated second faulty images.
  • The shape transformation includes dimensional stretching, compression, or light-dark (brightness) change.
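As a concrete illustration of this step, the sketch below (hypothetical helper names and toy values; the application does not prescribe any implementation) represents images as 2-D lists of grayscale values, stretches a small fault pattern, applies a light-dark change, and migrates the result into a target region of a non-faulty image:

```python
def stretch_horizontally(pattern, factor):
    """Nearest-neighbour horizontal stretch of a fault pattern."""
    return [[row[int(x / factor)] for x in range(int(len(row) * factor))]
            for row in pattern]

def change_brightness(pattern, scale):
    """Light-dark change: scale every pixel value, clamped to [0, 255]."""
    return [[min(255, max(0, int(p * scale))) for p in row] for row in pattern]

def transfer(non_faulty, pattern, top, left):
    """Paste the (transformed) fault pattern into the target region."""
    out = [row[:] for row in non_faulty]  # copy, leaving the input intact
    for i, row in enumerate(pattern):
        for j, p in enumerate(row):
            out[top + i][left + j] = p
    return out

# A 4x4 non-faulty image (uniform gray) and a 1x2 fault pattern.
clean = [[200] * 4 for _ in range(4)]
fault = change_brightness(stretch_horizontally([[80, 60]], 2.0), 1.2)
second_faulty = transfer(clean, fault, 1, 0)
```

Here the pattern `[[80, 60]]` becomes `[[96, 96, 72, 72]]` after stretching and brightening, and overwrites row 1 of the clean image, yielding a toy "second faulty image".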
  • In a second aspect, a method for generating a faulty image is provided, comprising: acquiring a non-faulty image and a first faulty image, wherein the non-faulty image records a non-faulty first object and the first faulty image records a faulty second object; inputting the non-faulty image into a region generation model and determining, in the non-faulty image, a target region for migration of the fault pattern of the second object in the first faulty image; and migrating the fault pattern to the target region in the non-faulty image to obtain a second faulty image showing the first object in a faulty state.
  • Having the region generation model determine the target region of the non-faulty image, and transferring the fault pattern of the first faulty image to that target region to obtain the second faulty image, means that determining the target region no longer depends on manual labor, which improves efficiency and reduces labor costs.
  • the first object and the second object are of different types.
  • Since the second object is no longer required to be of the same type as the first object, the limitation on the types of objects recorded in the first faulty image is reduced, which improves the flexibility of acquiring the first faulty image and broadens its sources. Therefore, even when few faulty images exist for the object type shown in the non-faulty image, faulty images of that object type can still be generated, improving the applicability of the fault image generation method provided by the embodiments of the present application.
  • Before the non-faulty image is input into the region generation model and the target region in the non-faulty image for migrating the fault pattern of the second object in the first faulty image is determined, the method further includes: acquiring multiple training images and region indication information, wherein the training images record non-faulty objects of the same type as the first object, and the region indication information indicates the regions of the training images in which a fault can occur; and training the region generation model according to the region indication information and the multiple training images.
  • The region generation model is thus trained to be more targeted, suiting the type of the first object in the non-faulty image and improving the accuracy of the target region it determines.
  • Transferring the fault pattern to the target region in the non-faulty image to obtain the second faulty image includes: transforming the fault pattern, and transferring the transformed fault pattern to the target region of the non-faulty image to obtain the second faulty image.
  • Adjusting the non-faulty image with a transformed fault pattern makes the generation of second faulty images more flexible and can increase their number and variety.
  • The shape transformation includes dimensional stretching, compression, or light-dark (brightness) change.
  • In a third aspect, a fault image generating apparatus is provided, including modules for executing the method in any one of the implementations of the first or second aspect.
  • In a fourth aspect, an electronic device is provided, comprising a memory and a processor, the memory being used for storing program instructions; when the program instructions are executed by the processor, the processor executes the method in any one of the implementations of the first or second aspect.
  • The processor in the fourth aspect may be a central processing unit (CPU), or a combination of a CPU and a neural network computing processor, where the neural network computing processor may include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), and so on.
  • The TPU is Google's fully customized application-specific integrated circuit for accelerating machine learning.
  • In a fifth aspect, a computer-readable medium is provided that stores program code for execution by a device, the program code comprising instructions for performing the method in any one of the implementations of the first or second aspect.
  • In a sixth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, it causes the computer to execute the method in any one of the implementations of the first or second aspect.
  • A seventh aspect provides a chip, the chip including a processor and a data interface; the processor reads, through the data interface, instructions stored in a memory and executes the method in any one of the implementations of the first or second aspect.
  • Optionally, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor executes the method in any one of the implementations of the first or second aspect.
  • the above chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a convolutional neural network according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of another convolutional neural network provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a hardware structure of a chip according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a method for generating a fault image.
  • FIG. 7 is a schematic flowchart of a method for generating a fault image provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a neural network training apparatus provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a neural network training apparatus according to an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a computing device cluster according to an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of another computing device cluster according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of another computing device cluster according to an embodiment of the present application.
  • A neural network can be composed of neural units. A neural unit can be an operation unit that takes inputs x_s and an intercept 1, and whose output is h_{W,b}(x) = f(W^T x) = f(Σ_s W_s x_s + b), where W_s is the weight of x_s and b is the bias of the neural unit.
  • f is an activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
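The neural-unit computation above, a weighted sum plus bias passed through a sigmoid activation, can be sketched as follows (illustrative toy values only):

```python
import math

def neuron_output(xs, ws, b):
    """Single neural unit: f(sum_s W_s * x_s + b) with a sigmoid f."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)

out = neuron_output([0.5, 1.5], [0.2, -0.4], 0.1)
```

With zero weights and zero bias the pre-activation is 0 and the sigmoid output is exactly 0.5; positive pre-activations push the output above 0.5 and negative ones below.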
  • a neural network is a network formed by connecting a plurality of the above single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field can be an area composed of several neural units.
  • a deep neural network also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers.
  • The layers inside a DNN can be divided into three categories according to their positions: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers.
  • The layers are fully connected: any neuron in the i-th layer is connected to every neuron in the (i+1)-th layer.
  • Although a DNN looks complicated, the work of each layer is not complicated. In short, each layer computes the linear relationship expression y = σ(Wx + b), where x is the input vector, y is the output vector, b is the offset (bias) vector, W is the weight matrix (also called the coefficients), and σ() is the activation function. Each layer simply applies this operation to its input vector x to obtain the output vector y. Because a DNN has many layers, the number of coefficient matrices W and offset vectors b is also large.
  • These parameters are defined in the DNN as follows, taking the coefficient W as an example: suppose that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_{24}, where the superscript 3 represents the layer of the coefficient, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer.
  • In general, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_{jk}.
  • Note that the input layer has no W parameter.
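The layer-by-layer computation y = σ(Wx + b) with the W^L_{jk} indexing above can be sketched as follows (toy weights chosen for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, x):
    """One DNN layer: y_j = sigmoid(sum_k W[j][k] * x[k] + b[j]).
    W[j][k] is the coefficient from neuron k of the previous layer
    to neuron j of this layer."""
    return [sigmoid(sum(w_jk * x_k for w_jk, x_k in zip(row, x)) + b_j)
            for row, b_j in zip(W, b)]

def dnn_forward(weights, biases, x):
    """Stack layers: each layer's output is the next layer's input."""
    for W, b in zip(weights, biases):
        x = layer_forward(W, b, x)
    return x

# Two inputs -> hidden layer of 3 neurons -> single output.
Ws = [[[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]],   # 3x2 matrix into the hidden layer
      [[0.3, -0.1, 0.2]]]                       # 1x3 matrix into the output layer
bs = [[0.0, 0.1, -0.1], [0.05]]
y = dnn_forward(Ws, bs, [1.0, 2.0])
```

Because every layer ends in a sigmoid, each output component lies strictly between 0 and 1.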
  • more hidden layers allow the network to better capture the complexities of the real world.
  • a model with more parameters is more complex and has a larger "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vectors W of many layers).
  • Convolutional neural network is a deep neural network with a convolutional structure.
  • a convolutional neural network consists of a feature extractor consisting of convolutional layers and subsampling layers, which can be viewed as a filter.
  • the convolutional layer refers to the neuron layer in the convolutional neural network that convolves the input signal.
  • In a convolutional layer of a convolutional neural network, a neuron can be connected to only some of the neurons of adjacent layers.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some neural units arranged in a rectangle. Neural units in the same feature plane share weights, and the shared weights here are convolution kernels. Shared weights can be understood as the way to extract image information is independent of location.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights by learning during the training process of the convolutional neural network.
  • the immediate benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
  • Recurrent neural networks (RNNs) are used to process sequence data.
  • In a traditional neural network model, the layers from the input layer through the hidden layers to the output layer are fully connected, while the nodes within each layer are unconnected.
  • Although this ordinary neural network solves many problems, it is still powerless for many others. For example, to predict the next word of a sentence, the previous words are generally needed, because the words in a sentence are not independent of each other. RNNs are called recurrent neural networks because the current output of a sequence also depends on the previous outputs.
  • RNN can process sequence data of any length.
  • the training of RNN is the same as the training of traditional CNN or DNN.
  • the neural network can use the error back propagation (BP) algorithm to correct the size of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller.
  • BP error back propagation
  • The input signal is passed forward until the output produces an error loss, and the parameters of the initial neural network model are updated by back-propagating the error-loss information, so that the error loss converges.
  • The back-propagation algorithm is thus a backward pass dominated by the error loss, aimed at obtaining the optimal parameters of the neural network model, such as the weight matrices.
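A minimal toy illustration of the idea, error-driven gradient updates that shrink the loss, on a single linear neuron y = w*x + b with squared-error loss (this is not the application's training procedure, just the general mechanism):

```python
# Fit y = 2x + 1 from three samples by gradient descent.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b, lr = 0.0, 0.0, 0.1

def loss(w, b):
    """Mean squared error of the current parameters over the data."""
    return sum((w * x + b - t) ** 2 for x, t in data) / len(data)

initial = loss(w, b)
for _ in range(500):
    # Forward pass, then propagate the error back to get the gradients.
    dw = sum(2 * (w * x + b - t) * x for x, t in data) / len(data)
    db = sum(2 * (w * x + b - t) for x, t in data) / len(data)
    w -= lr * dw  # step against the gradient
    b -= lr * db
final = loss(w, b)
```

After 500 updates the parameters approach w = 2, b = 1 and the loss is far smaller than at initialization.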
  • A generative adversarial network (GAN) is a deep learning model that includes at least two modules: a generative model and a discriminative model; the two modules learn from each other to produce better outputs.
  • The basic principle of a GAN, taking image generation as an example, is as follows: suppose there are two networks, G (the generator) and D (the discriminator). G is a network that generates images; it receives random noise z and generates an image from this noise, denoted G(z). D is a discriminative network used to judge whether an image is "real".
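The standard adversarial objective behind this setup, V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))], can be evaluated for toy, untrained G and D as follows (illustrative functions, not trained networks):

```python
import math

def G(z):
    """Toy 'generator': maps noise to a sample (here, just a shift)."""
    return z + 1.0

def D(x):
    """Toy 'discriminator': probability that x is real (a fixed sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))

def gan_value(real_samples, noise_samples):
    """Empirical estimate of V(D, G) over the given samples."""
    real_term = sum(math.log(D(x)) for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1.0 - D(G(z)))
                    for z in noise_samples) / len(noise_samples)
    return real_term + fake_term

v = gan_value([2.0, 3.0], [0.0, -1.0])
```

In training, D is updated to increase this value while G is updated to decrease it; since both log terms are logs of probabilities, V is always negative here.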
  • an embodiment of the present application provides a system architecture 100 .
  • a data collection device 160 is used to collect training data.
  • the training data may include training images, training audios, training videos, training texts, and the like.
  • the data collection device 160 After collecting the training data, the data collection device 160 stores the training data in the database 130 , and the training device 120 obtains the target model/rule 101 by training based on the training data maintained in the database 130 .
  • The training device 120 processes the input training data and compares the output training information with the labeling information corresponding to the training data, until the difference between the training information output by the training device 120 and the labeling information is less than a certain threshold, at which point the training of the target model/rule 101 is completed.
  • the above target model/rule 101 can be used to implement the data processing method of the embodiment of the present application.
  • the target model/rule 101 in this embodiment of the present application may specifically be a neural network.
  • In practice, the training data maintained in the database 130 does not necessarily all come from the data collection device 160 and may also be received from other devices. Likewise, the training device 120 does not necessarily train the target model/rule 101 entirely on the training data maintained in the database 130, and may also obtain training data from the cloud or elsewhere for model training. The above description should not be taken as a limitation on the embodiments of the present application.
  • The target model/rule 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1, which may be a terminal such as a mobile phone, a tablet, a laptop, an augmented reality (AR)/virtual reality (VR) device, or an in-vehicle terminal, or may be a server or the cloud.
  • The execution device 110 is configured with an input/output (I/O) interface 112 for data interaction with external devices. A user can input data to the I/O interface 112 through the client device 140; in this embodiment of the present application, the input data may include data to be processed input by the client device.
  • the preprocessing module 113 and the preprocessing module 114 are used to perform preprocessing according to the input data (such as data to be processed) received by the I/O interface 112.
  • The preprocessing module 113 and the preprocessing module 114 may also be absent, or there may be only one of them, in which case the calculation module 111 directly processes the input data.
  • When the execution device 110 preprocesses the input data, or when the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call data, code, and the like in the data storage system 150 for corresponding processing, and the data and instructions obtained from that processing may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing results to the client device 140 for provision to the user.
  • It is worth noting that the training device 120 can generate the corresponding target model/rule 101 based on different training data for different goals or tasks, and the corresponding target model/rule 101 can then be used to achieve those goals or complete those tasks, thereby providing the user with the desired result.
  • the user can manually specify the input data, which can be operated through the interface provided by the I/O interface 112 .
  • the client device 140 can automatically send the input data to the I/O interface 112 . If the user's authorization is required to request the client device 140 to automatically send the input data, the user can set the corresponding permission in the client device 140 .
  • the user can view the result output by the execution device 110 on the client device 140, and the specific presentation form can be a specific manner such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal to collect the input data of the input I/O interface 112 and the output result of the output I/O interface 112 as new sample data as shown in the figure, and store them in the database 130.
  • Alternatively, the I/O interface 112 may directly store the input data input into the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
  • FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship among the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • For example, in FIG. 1, the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed in the execution device 110.
  • The target model/rule 101 is obtained through training by the training device 120. In this embodiment of the present application, the target model/rule 101 may be the neural network of the present application, for example a convolutional neural network (CNN), a deep convolutional neural network (DCNN), or a recurrent neural network (RNN).
  • A CNN is a very common neural network: a deep neural network with a convolutional structure and a deep learning architecture. A deep learning architecture refers to performing learning at multiple levels of abstraction through machine-learning algorithms.
  • A CNN is a feed-forward artificial neural network in which the individual neurons can respond to the data input into it. The following description takes image data as the input example.
  • a convolutional neural network (CNN) 200 may include an input layer 210 , a convolutional/pooling layer 220 (where the pooling layer is optional), and a neural network layer 230 .
  • the input layer 210 can obtain the data to be processed, and pass the obtained data to be processed by the convolutional layer/pooling layer 220 and the subsequent neural network layer 230 for processing, and the processing result of the data can be obtained.
  • the following takes image processing as an example to introduce the internal layer structure of the CNN 200 in Figure 2 in detail.
  • The convolutional layer/pooling layer 220 may include layers 221 to 226. For example, in one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer; in another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 may include many convolution operators.
  • The convolution operator, also called a kernel, functions in image processing like a filter that extracts specific information from the input image. The convolution operator is essentially a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is usually slid along the horizontal direction of the input image one pixel at a time (or two pixels at a time; it depends on the value of the stride), thereby extracting specific features from the image.
  • The size of the weight matrix is related to the size of the image. Note that the depth dimension of the weight matrix is the same as the depth dimension of the input image.
  • When multiple weight matrices are used, their outputs are stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the number of weight matrices described above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract image edge information, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to extract unwanted noise in the image.
  • the multiple weight matrices have the same size (row ⁇ column), and the size of the convolution feature maps extracted from the multiple weight matrices with the same size is also the same, and then the multiple extracted convolution feature maps with the same size are combined to form The output of the convolution operation.
  • the weight values in these weight matrices need to be obtained through extensive training in practical applications, and each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 200 can make correct predictions.
  • when the convolutional neural network 200 has multiple convolutional layers, the initial convolutional layer (e.g., 221) often extracts more general features; as the depth of the network increases, the features extracted by the later convolutional layers (e.g., 226) become more and more complex, such as high-level semantic features, and features with higher-level semantics are more suitable for the problem to be solved.
  • a single pooling layer may follow a single convolutional layer, or one or more pooling layers may follow multiple consecutive convolutional layers.
  • the pooling layer may include an average pooling operator and/or a max pooling operator for sampling the input image to obtain a smaller size image.
  • the average pooling operator can calculate the pixel values in the image within a certain range to produce an average value as the result of average pooling.
  • the max pooling operator can take the pixel with the largest value within a specific range as the result of max pooling. Also, just as the size of the weight matrix used in the convolutional layer should be related to the size of the image, the operators in the pooling layer should also be related to the size of the image.
  • the size of the output image after processing by the pooling layer can be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
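The pooling operation described above can likewise be sketched in Python; this is illustrative only, and the input values are hypothetical:

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Downsample by taking the max or average of each size x size sub-region."""
    H, W = image.shape
    out = np.zeros((H // size, W // size))
    for i in range(H // size):
        for j in range(W // size):
            patch = image[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = patch.max() if mode == "max" else patch.mean()
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(img, 2, "max"))  # each output pixel is the max of a 2x2 sub-region
print(pool2d(img, 2, "avg"))  # or the average of that sub-region
```

Each output pixel represents the maximum or average of the corresponding sub-region, so the output is smaller than the input, as stated above.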
  • after being processed by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet able to output the required output information, because, as mentioned before, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. In order to generate the final output information (the required class information or other relevant information), the convolutional neural network 200 needs to use the neural network layer 230 to generate one output or a set of outputs whose number equals the number of required classes. Therefore, the neural network layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240. The parameters contained in the multiple hidden layers may be pre-trained based on the relevant training data of a specific task type; for example, the task type can include image recognition, image classification, image super-resolution reconstruction, and so on.
  • after the multiple hidden layers in the neural network layer 230, the last layer of the entire convolutional neural network 200 is the output layer 240. The output layer 240 has a loss function similar to categorical cross-entropy and is specifically used to calculate the prediction error;
  • once the forward propagation of the entire convolutional neural network 200 (propagation from 210 to 240 in FIG. 2) is completed, back propagation (propagation from 240 to 210 in FIG. 2) starts to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 200 and the error between the result output by the convolutional neural network 200 through the output layer and the ideal result.
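The forward/backward cycle above can be illustrated with a deliberately simplified sketch: a single linear layer trained with a mean-squared-error loss stands in for the convolutional network and its cross-entropy-like loss, and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))            # training inputs
y = x @ np.array([1.0, -2.0, 0.5])     # "ideal results" the output should match
w = np.zeros(3)                        # weight values to be trained

def loss(w):
    pred = x @ w                                # forward propagation: input -> output
    return float(np.mean((pred - y) ** 2))      # prediction error

for _ in range(1000):
    pred = x @ w
    grad = 2 * x.T @ (pred - y) / len(y)        # back propagation: gradient of the error
    w -= 0.1 * grad                             # update the weight values to reduce the loss
print(f"final loss: {loss(w):.2e}")
```

Each iteration performs one forward pass, computes the error against the ideal result, and updates the weights in the direction that reduces the loss, mirroring the propagation directions described above.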
  • a convolutional neural network (CNN) 200 may include an input layer 110 , a convolutional/pooling layer 120 (where the pooling layer is optional), and a neural network layer 130 .
  • compared with FIG. 2, the multiple convolutional layers/pooling layers in the convolutional layer/pooling layer 120 in FIG. 3 are parallel, and the separately extracted features are all input to the neural network layer 130 for processing.
  • the convolutional neural networks shown in FIG. 2 and FIG. 3 are only examples of two possible convolutional neural networks of a data processing method according to an embodiment of the present application.
  • the convolutional neural network used in the data processing method of the embodiments of the present application can also exist in the form of other network models.
  • FIG. 4 is a hardware structure of a chip provided by an embodiment of the application, and the chip includes a neural network processor 50 .
  • the chip can be set in the execution device 110 as shown in FIG. 1 to complete the calculation work of the calculation module 111 .
  • the chip can also be set in the training device 120 as shown in FIG. 1 to complete the training work of the training device 120 and output the target model/rule 101 .
  • the algorithms of each layer in the convolutional neural network shown in Figures 2 and 3 can be implemented in the chip shown in Figure 4.
  • the neural network processor NPU 50 is mounted on the main central processing unit (CPU) (host CPU) as a coprocessor, and tasks are allocated by the main CPU.
  • the core part of the NPU is the operation circuit 503, and the controller 504 controls the operation circuit 503 to extract the data in the memory (weight memory or input memory) and perform operations.
  • the arithmetic circuit 503 includes multiple processing units (processing engines, PEs). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory 502 and buffers it on each PE in the operation circuit.
  • the arithmetic circuit fetches the data of matrix A and matrix B from the input memory 501 to perform matrix operation, and stores the partial result or final result of the matrix in the accumulator 508 .
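A loose software analogy of this behavior is a tiled matrix multiplication in which partial tile products are summed into an accumulator; this sketches only the data flow (buffered weights, accumulated partial results), not the circuit itself, and all names are hypothetical:

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    """Multiply A @ B by accumulating partial tile products."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    acc = np.zeros((M, N))            # plays the role of the accumulator
    for k0 in range(0, K, tile):
        a = A[:, k0:k0 + tile]        # slice of matrix A fetched from input memory
        b = B[k0:k0 + tile, :]        # slice of matrix B buffered from weight memory
        acc += a @ b                  # partial result added into the accumulator
    return acc

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
result = tiled_matmul(A, B)
```

The final contents of the accumulator equal the full matrix product, just as the accumulator 508 holds the partial and then final results of the matrix operation.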
  • the vector calculation unit 507 can further process the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
  • the vector computing unit 507 can be used for network computation of non-convolutional/non-FC layers in the neural network, such as pooling, batch normalization, local response normalization, etc.
  • vector computation unit 507 can store the processed output vectors to unified buffer 506 .
  • the vector calculation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate activation values.
  • vector computation unit 507 generates normalized values, merged values, or both.
  • the vector of processed outputs can be used as activation input to the arithmetic circuit 503, eg, for use in subsequent layers in a neural network.
  • Unified memory 506 is used to store input data and output data.
  • the direct memory access controller (DMAC) 505 transfers the input data in the external memory to the input memory 501 and/or the unified memory 506, stores the weight data in the external memory into the weight memory 502, and stores the data in the unified memory 506 into the external memory.
  • a bus interface unit (BIU) 510 is used to realize the interaction between the main CPU, the DMAC and the instruction fetch memory 509 through the bus.
  • the instruction fetch memory (instruction fetch buffer) 509 connected with the controller 504 is used to store the instructions used by the controller 504;
  • the controller 504 is used for invoking the instructions cached in the memory 509 to control the working process of the operation accelerator.
  • the unified memory 506, the input memory 501, the weight memory 502 and the instruction fetch memory 509 are all on-chip memories, and the external memory is the memory outside the NPU. The external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (HBM), or other readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 2 and FIG. 3 may be performed by the operation circuit 503 or the vector calculation unit 507 .
  • the execution device 110 in FIG. 1 described above can execute each step of the data processing method of the embodiment of the present application.
  • the CNN models shown in FIG. 2 and FIG. 3 and the chip shown in FIG. 4 can also be used to execute the various steps of the data processing method of the embodiments of the present application. The method for training a neural network according to the embodiments of the present application and the data processing method according to the embodiments of the present application will be described in detail below with reference to the accompanying drawings.
  • an embodiment of the present application provides a system architecture 300 .
  • the system architecture includes a local device 301, a local device 302, an execution device 210 and a data storage system 250, wherein the local device 301 and the local device 302 are connected with the execution device 210 through a communication network.
  • the execution device 210 may be implemented by one or more servers.
  • the execution device 210 may be used in conjunction with other computing devices, such as data storage, routers, load balancers and other devices.
  • the execution device 210 may be arranged on one physical site, or distributed across multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250 or call the program code in the data storage system 250 to implement the data processing method in this embodiment of the present application.
  • the execution device 210 may execute each step of the data processing method provided by the embodiments of the present application.
  • a user may operate respective user devices (eg, local device 301 and local device 302 ) to interact with execution device 210 .
  • each local device may represent any computing device, such as a personal computer, computer workstation, smartphone, tablet, smart camera, smart car, or other types of cellular phones, media consumption devices, wearable devices, set-top boxes, game consoles, etc.
  • Each user's local device can interact with the execution device 210 through any communication mechanism/standard communication network, which can be a wide area network, a local area network, a point-to-point connection, etc., or any combination thereof.
  • the local device 301 and the local device 302 obtain the relevant parameters of the target neural network from the execution device 210, deploy the target neural network on the local device 301 and the local device 302, and use the target neural network for image classification or image processing, etc.
  • the target neural network can be directly deployed on the execution device 210; the execution device 210 obtains the data to be processed from the local device 301 and the local device 302, and classifies or processes the data to be processed according to the target neural network.
  • the above execution device 210 may also be a cloud device, in which case the execution device 210 may be deployed in the cloud; or, the above execution device 210 may also be a terminal device, in which case the execution device 210 may be deployed on the user terminal side, which is not limited in this embodiment of the present application.
  • the detection of surface defects is an important link in industrial quality inspection, and it is also a key step in controlling product quality. It can prevent defective products from entering the market and avoid the harm caused by using defective products. For example, in the railway scene, the parts of a train may be damaged or fail as the service life increases, and the surfaces of the parts may become defective. If the surface defects of the parts are not found in time and the train continues to run, a major accident may occur.
  • the detection of surface defects can also be applied in various fields such as power grid and manufacturing.
  • Computer vision-based fault detection algorithms can use neural network models to detect surface defects and help people quickly eliminate hidden dangers.
  • the training process of the neural network model requires a large amount of normal data and fault data.
  • the problem of small samples is often faced, that is to say, the number of faulty images is too small, which does not meet the data volume requirements for training the neural network model.
  • because the number of fault images is small, the fault images used to train the neural network model may not include images corresponding to some fault types; even if every fault type is included, the data for some fault types may be too scarce, so the trained neural network model has a limited ability to identify faults.
  • FIG. 6 shows a schematic flow chart of a fault image processing method.
  • a first edge image of a non-faulty image is extracted.
  • equipment without surface defects can be imaged to obtain non-faulty images. That is, the objects in non-faulty images are free of surface defects.
  • edge images may also be referred to as edge maps.
  • the edge image of the non-faulty image is the image obtained by extracting the edge of the non-faulty image.
  • an edge is the boundary between an image region with one attribute and a region with another attribute; it is where region attributes change abruptly and where image information is most concentrated.
  • the edge of the image contains rich information.
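For illustration, a crude edge map can be computed with a classical gradient threshold that marks the abrupt intensity changes described above; this is only a stand-in sketch, not the trained edge extraction model of the embodiments, and the image values are hypothetical:

```python
import numpy as np

def gradient_edges(img, thresh=1.0):
    """Mark pixels where intensity changes abruptly (a crude edge map)."""
    # absolute differences along each axis, padded so the shape is preserved
    gy = np.abs(np.diff(img, axis=0, prepend=img[:1]))
    gx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))
    return ((gx + gy) > thresh).astype(np.uint8)

img = np.zeros((5, 5))
img[:, 2:] = 9.0                 # a vertical boundary between two regions
edges = gradient_edges(img)
print(edges)                     # ones only along the boundary column
```

The resulting binary map keeps only the region boundaries, which is where the image information is most concentrated.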
  • the edge image of the non-faulty image is manually edited to obtain the edge image of the faulty image.
  • in the faulty image, only a local region is abnormal, and the other regions are the same as in the non-faulty image.
  • an image translation model is used to process the edge image of the faulty image to form a faulty image.
  • the neural network model can be trained according to the fault images obtained in S601-S603 and the non-fault images.
  • the trained neural network model can be used for surface defect detection.
  • because the non-faulty images are edited manually, the degree of automation is low and the cost is high.
  • an embodiment of the present application provides a method for generating a fault image.
  • FIG. 7 is a schematic flowchart of a method for generating a fault image provided by an embodiment of the present application.
  • the fault image generation method 700 includes S710 to S720.
  • a non-faulty image and a first faulty image are acquired; the non-faulty image records a first object without a fault, the first faulty image records a second object with a fault, and the first object and the second object are of different types.
  • the non-faulty image and the first faulty image are obtained by collecting objects of different types.
  • Types can also be called categories, referring to the kinds of things that have common characteristics. Objects of the same type have the same properties or characteristics, while objects of different types have different properties or characteristics. Properties of objects may include functions, materials, material types, colors, and the like. Different types of objects have different properties. For example, the material of the surface of the train compartment is different from that of the wall, the non-faulty image may be acquired from the surface of the train compartment, and the first faulty image may be acquired from the wall.
  • the fault pattern of the second object in the first fault image is transferred to the first object in the non-fault image (for example, by overlaying it on the first object in the non-fault image, or by superimposing layers), so as to obtain a second fault image.
  • the second failure image shows the first object in a failed state.
  • a first faulty image may record a wall crack
  • a non-faulty image may record a faultless cabin surface.
  • parts of the fault-free cabin surface can be adjusted using the cracks in the wall, so as to obtain an image of a cracked cabin surface.
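The transfer can be sketched with toy grayscale arrays; the values and names are hypothetical, and a real implementation may blend or superimpose layers rather than hard-replace pixels:

```python
import numpy as np

def transfer_fault(non_fault_img, fault_patch, top, left):
    """Overlay a fault pattern (e.g. a crack patch) onto a region of a
    non-faulty image to synthesize a faulty image."""
    out = non_fault_img.copy()
    h, w = fault_patch.shape
    out[top:top + h, left:left + w] = fault_patch   # replace the target region
    return out

surface = np.full((6, 6), 200)        # bright fault-free cabin surface (toy values)
crack = np.zeros((2, 3), dtype=int)   # dark crack pixels taken from a wall image
faulty = transfer_fault(surface, crack, top=2, left=1)
```

The original non-faulty image is left untouched; the returned copy shows the first object in a failed state, as described above.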
  • the non-faulty image can be processed according to the first faulty image to obtain the second faulty image. Therefore, when the number of faulty images of the same type is small and the number of non-faulty images is large, the non-faulty images can be adjusted by using different types of faulty images, thereby generating a larger number of faulty images.
  • objects of different types are recorded in the first fault image and the non-fault image; that is, the first fault image and the non-fault image are obtained by collecting different objects. Therefore, the images of the objects recorded in the non-faulty images can be adjusted by using the faults of different types of objects, which broadens the range of sources from which the first faulty image can be acquired and increases the flexibility of those sources. When the number of fault images corresponding to the object type recorded in the non-fault image is small, fault images of other types of objects can be used to generate fault images of that object type; the applicability of generating fault images is therefore strong, and the variety of fault types in the second fault image is increased.
  • the failure pattern may be derived from the first failure image.
  • the failure pattern may be an area of the second object failure in the first failure image.
  • the fault pattern may also be obtained by performing edge extraction on the faulty region of the second object in the first fault image.
  • Non-faulty images can be natural scene graphs or edge graphs.
  • an edge extraction model can be used to process the natural scene graph to obtain an edge map corresponding to the non-faulty image and an edge map corresponding to the first faulty image.
  • if an image is a natural scene map, its corresponding edge map is the edge map obtained by processing the image with the edge extraction model; if an image is itself an edge map, its corresponding edge map is the image itself.
  • the fault pattern in the edge map corresponding to the first fault image can be used to replace part of the region where the first object is located in the edge map corresponding to the non-fault image, and the image translation model is used to process the replaced edge map to obtain a natural scene map.
  • the second fault image may be a replaced edge map, or the second fault image may be a natural scene map processed by an image translation model.
  • the image translation model may be obtained by training using the first training data set.
  • the first training data set includes a third image and an edge map corresponding to the third image, and the third image is a natural scene map.
  • the object recorded in the third image may be of the same type as the first object. That is to say, the third image and the non-faulty image are acquired from the same type of object.
  • the image translation model can be a trained AI model.
  • the object recorded in the third image may or may not be faulty.
  • the edge map corresponding to the third image may be processed by using the initial image translation model to obtain a translated image; the initial image translation model is adjusted to minimize the difference between the translated image and the third image; the adjusted initial image translation model is then used to process the edge maps corresponding to other third images, until the difference gradually converges, so as to obtain the image translation model.
  • the third image and the non-faulty image are acquired from the same type of object.
  • the image translation model is trained by using the third image and the edge map corresponding to the third image, so that the image translation model is pertinent and the accuracy of the image translation model is improved.
  • the appearance of the fault pattern may be transformed, and the transformed fault pattern may be overlaid on the first object in the non-fault image to obtain the second fault image.
  • the appearance transformation may include style transformations such as deformation and changes in light and shade, where the deformation may be, for example, stretching, compression, and the like. This makes the adjustment of the non-faulty images more flexible and can increase the number and variety of the second faulty images that may be obtained.
  • the first faulty image can be overlaid on the target area of the non-faulty image to obtain the second faulty image.
  • the target area of the non-faulty image may be the first area determined according to the position information input by the user, or determined by using the area generation model.
  • Non-faulty images can be processed using a region generation model to obtain target regions.
  • the target area of the non-faulty image is determined by using the area generation model, and the target area of the non-faulty image is covered by the faulty image to obtain the second faulty image, which can improve efficiency and reduce labor costs.
  • the appearance of the fault pattern of the first fault image can be transformed so that the size of the transformed fault pattern is smaller than or equal to the size of the target area; the generated second fault image is therefore more accurate.
  • the region generation model may be an AI model obtained by training.
  • a plurality of training images may be acquired, in which objects of the same type as the first object are recorded without failure.
  • Region indication information may be acquired, where the region indication information is used to indicate regions in the training image that can cause failures.
  • the region generation model may be trained according to the plurality of training images and the region indication information.
  • FIG. 8 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • the image processing system 800 includes an edge extraction model 810 , an editing model 820 , and an image translation model 830 .
  • Image processing system 800 may also include region generation model 840 .
  • the edge extraction model 810, the image translation model 830, and the region generation model 840 may be neural network models obtained by training.
  • it can be a CNN model.
  • the image translation model 830 can use a GAN, and mature algorithms such as pix2pix, pix2pixHD, cycle-consistent generative adversarial networks (CycleGAN), and unsupervised image-to-image translation (UNIT) can be chosen.
  • the image processing system 800 is configured to process the input image according to the fault edge image to obtain the fault image.
  • the image input to the image processing system 800 may be a non-faulty image.
  • the edge extraction model 810 can perform edge extraction on the non-faulty image to obtain an edge image of the non-faulty image.
  • the edge image reflects the outline of the object in the non-faulty image in the form of lines.
  • the edge extraction model 810 may be a trained neural network model.
  • the edge extraction model 810 can be trained from public datasets.
  • the public dataset includes multiple natural scene images and annotated edge images corresponding to each image.
  • the natural scene images in the public dataset are processed using the initial edge extraction model to obtain training edge images. Based on the difference between the training edge image and the labeled edge image corresponding to that image, the parameters of the initial edge extraction model are adjusted to minimize this difference.
  • the difference between the training edge image and the annotated edge image corresponding to the natural scene image can be expressed as a loss value. Keep adjusting the model parameters until the loss value converges or the number of iterations exceeds the preset value.
  • the editing model 820 can edit and modify the target area in the edge image according to the faulty edge image, so as to obtain the second edge image.
  • the set of faulty edge images may include a plurality of faulty edge images.
  • data preparation may be performed to obtain a set of fault edge images.
  • Equipment with surface defects can be imaged to obtain images of faulty samples.
  • image acquisition may be performed on equipment with surface defects present.
  • fault sample images can also be obtained by searching and retrieving them on the Internet.
  • the fault sample image can be edge extracted, and the edge image corresponding to the faulty area in the edge extraction result can be regarded as a fault edge image.
  • edge extraction can be performed on the faulty region in the faulty sample image to obtain a faulty edge image.
  • the faulty sample images may be acquired from the same or different types of objects in the non-faulty images. That is to say, the set of fault edge images may include fault edge images obtained by processing fault sample images obtained by collecting different objects.
  • the editing model 820 can select the faulty edge images randomly or in a certain order from the set of faulty edge images, and edit and modify the target area in the edge image. For example, the editing model 820 may sequentially edit and modify the target area in the first edge image according to each faulty edge image in the set of faulty edge images.
  • the editing model 820 can perform style transformations such as deformation (such as stretching, compression, etc.), light and dark changes, etc. on the faulty edge images in the set of faulty edge images, and according to the style-transformed faulty edge images, the target area in the first edge image is modified. Make edits and modifications.
  • the way the style is transformed can be selected based on user input.
  • the user can be presented with an optional style transformation mode, and user input can be obtained, so as to determine the style transformation mode corresponding to the user input.
  • the style transformation method can also be determined in a default manner.
  • the styles of faults in the second edge image can be increased, and the diversity of faulty images can be increased.
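A minimal sketch of such style transformations, using nearest-neighbour stretching/compression plus a brightness shift; both the function and its parameters are hypothetical stand-ins for the transformations performed by the editing model 820:

```python
import numpy as np

def style_transform(patch, scale=1.0, brightness=0):
    """Stretch or compress a fault patch by nearest-neighbour resampling
    and shift its brightness."""
    h, w = patch.shape
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    rows = np.arange(nh) * h // nh          # source row for each output row
    cols = np.arange(nw) * w // nw          # source column for each output column
    resized = patch[rows][:, cols]          # nearest-neighbour resampling
    return np.clip(resized + brightness, 0, 255)

patch = np.array([[10, 20],
                  [30, 40]])
stretched = style_transform(patch, scale=2.0, brightness=5)   # stretch and lighten
compressed = style_transform(patch, scale=0.5)                # compress
```

Applying different scales and brightness shifts to the same fault edge image yields multiple distinct fault styles from a single sample.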
  • the image translation model 830 may process the second edge image to obtain the fault image. That is, the image translation model 830 is used to convert the line outline graph into a natural scene graph.
  • a natural scene graph can be a rendered graph (i.e., a color map).
  • Editing model 820 may determine the second edge image based on the faulty edge image, the first edge image, and the target area.
  • the target area can be obtained by processing the non-faulty image by using the area generation model 840, and the target area can also be determined according to user input information.
  • User input information can be used to indicate the target area.
  • the manner in which the user input information indicates the target area may be set by default, or may be selected by the user.
  • User input information may be used to indicate the location of one or more keypoints in the non-faulty image.
  • for a single key point, a rectangular target area with the key point as the intersection of its diagonals can be determined; the length of the target area is within a preset length range, the aspect ratio of the target area is within a preset aspect ratio range, and the length and width of the target area are respectively parallel to two adjacent sides of the non-faulty image.
  • for two key points, a rectangular target area with the two key points as non-adjacent vertices can be determined, and the length and width of the target area are respectively parallel to two adjacent sides of the non-faulty image.
  • a rectangle with the smallest area containing the multiple key points may be determined as the target area.
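The keypoint-to-rectangle logic for two or more key points can be sketched as follows; the function is hypothetical, and the single-keypoint case, which additionally needs a preset length and aspect ratio, is omitted:

```python
def target_region(keypoints, img_h, img_w):
    """Derive an axis-aligned rectangular target region from user key points.

    Two or more key points yield the smallest rectangle containing all of
    them; for exactly two points they become non-adjacent vertices.
    Returns (top, left, bottom, right), clipped to the image bounds.
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    left, right = max(0, min(xs)), min(img_w, max(xs))
    top, bottom = max(0, min(ys)), min(img_h, max(ys))
    return top, left, bottom, right

print(target_region([(3, 2), (7, 9)], img_h=20, img_w=20))  # (2, 3, 9, 7)
```

Because the rectangle is the tight bounding box of the key points, its sides are automatically parallel to the image edges, matching the constraint described above.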
  • when the user input information indicates only one key point, the obtained target area has higher diversity but lower rationality.
  • the editing model 820 can deform the faulty edge images in the set of faulty edge images, so that the size of the deformed faulty edge image is the same as the size of the target area. The editing model 820 can then replace the target area in the first edge image with the deformed faulty edge image to form the second edge image.
  • the region generation model 840 may be trained according to a plurality of training images and training region information in each training image.
  • the plurality of training images may include non-failure images processed by the system 800, and may include other images.
  • the target area can be determined according to the user input information corresponding to the non-faulty image, and the editing module 820 replaces the target area in the first edge image, obtained by the edge extraction module 810 processing the non-faulty image, with the faulty edge image to form the second edge image.
  • the first edge image, the second edge image, and the border of the target area can be displayed.
  • the user can judge the rationality of the target area, and adjust or delete unreasonable target areas.
  • the region generation model 840 can be trained according to the plurality of non-faulty images and the user input information corresponding to each non-faulty image. For example, the region generation model 840 may be trained when the number of acquired user input information exceeds a preset number (eg, 100).
  • Non-faulty images can be fed into the initial region generation model.
  • the initial region generation model processes non-faulty images to obtain training information. According to the difference between the training information and the annotation information corresponding to the non-faulty image, the parameters of the initial region generation model are adjusted to minimize the difference, thereby completing one iteration.
  • the labeling information corresponding to the non-faulty image may be user input information corresponding to the non-faulty image, or may be the target area indicated by the user input information.
  • the adjusted initial region generation model is used as the initial region generation model for the next iteration, and other non-faulty images are processed.
  • the adjusted initial region generation model can be used as the region generation model 840 .
  • the user input information obtained by the system 800 and the non-faulty images corresponding to the user input information can be used to train the region generation model 840.
  • the region generation model 840 can be used to process other non-faulty images to determine the target region of each non-faulty image.
  • the editing module 820 replaces the target area in the first edge image obtained by processing the non-faulty image by the edge extraction module 810 with the faulty edge image to form a second edge image.
  • the region generation model 840 may continue to be trained using more non-faulty images and user input information.
  • the system 800 uses the area generation model 840 to determine the target area of the non-faulty image, which can automatically complete the generation of the faulty image, improve the generation efficiency of the faulty image, and reduce labor costs.
  • the image translation model 830 may be obtained by training the initial image translation model using multiple non-faulty images and the first edge images obtained by the edge extraction module. That is to say, the training of the image translation model 830 may be performed in the process of generating the faulty image.
  • the edge extraction model 810 may be used to process multiple non-faulty images to obtain a first edge image corresponding to each non-faulty image, thereby obtaining data for training the image translation model 830 .
  • training the image translation model 830 requires multiple iterations, and each iteration includes: using the initial image translation model to process the first edge image to obtain a generated image; and, according to the difference between the generated image and the non-faulty image corresponding to the first edge image, adjusting the parameters of the initial image translation model to minimize the difference. After that, the adjusted initial image translation model can be used as the initial image translation model for the next iteration.
  • the iteration can be stopped, and the adjusted initial image translation model is used as the image translation model 830 .
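The iteration just described can be illustrated with a deliberately tiny stand-in: a single gain parameter plays the role of the initial image translation model, a flat list of pixels plays the role of the edge image, and the parameter is adjusted to shrink the difference between the generated image and the corresponding non-faulty image. All names here are hypothetical.

```python
# Toy sketch of the iterative training of the image translation model.
def train_translation(edge, target, lr=0.5, iters=100):
    g = 0.0                              # parameter of the toy "initial model"
    for _ in range(iters):
        gen = [g * e for e in edge]      # generated image for this iteration
        # gradient of the mean squared difference between generated image and
        # the non-faulty image corresponding to the edge image
        grad = sum(2 * (gv - t) * e for gv, t, e in zip(gen, target, edge)) / len(edge)
        g -= lr * grad                   # adjust to minimize the difference
    return g

edge = [0.1, 0.5, 0.9]
target = [0.2, 1.0, 1.8]                 # toy non-faulty image = 2 x edge image
g = train_translation(edge, target)
```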
  • the image translation model 830 is trained by using the non-faulty images, so that the image translation model 830 obtained by training can be better adapted to the application scenario of the system 800 , and the authenticity of the images output by the system 800 is improved.
  • the faulty sample images and the corresponding edges of the faulty sample images may also be used to train the image translation model 830 .
  • the accuracy of the fault image output by the image translation model 830 can be improved.
  • Image processing system 800 is capable of transferring the diversity of non-faulty images into faulty images. Using the image processing system 800, the first edge image of the non-faulty image and the faulty edge image are combined to form the second edge image, and image translation technology is used to process the second edge image to obtain the faulty image.
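The edge-compositing step above can be sketched in a few lines, assuming (purely for illustration) that both edge images are 2-D lists of 0/1 pixels and that the target area is an axis-aligned region anchored at (top, left):

```python
# Sketch of forming the second edge image: the target area of the first edge
# image is replaced with the faulty edge image; the input is left untouched.
def paste_region(first_edge, faulty_edge, top, left):
    second = [row[:] for row in first_edge]          # copy the first edge image
    for r, src_row in enumerate(faulty_edge):
        for c, v in enumerate(src_row):
            second[top + r][left + c] = v            # overwrite the target area
    return second

first_edge = [[0] * 4 for _ in range(4)]
faulty_edge = [[1, 1], [1, 1]]
second_edge = paste_region(first_edge, faulty_edge, 1, 1)
```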
  • FIG. 9 is a schematic flowchart of a method for generating a fault image provided by an embodiment of the present application.
  • the method 900 includes S910 to S930.
  • a non-faulty image and a first faulty image are acquired, in which the non-faulty image records the non-faulty first object, and the first faulty image records the faulty second object.
  • the non-faulty image is input into a region generation model, and a target region in the non-faulty image covered by the faulty pattern of the second object in the first faulty image is determined.
  • the target area of the non-faulty image is determined by using the region generation model, and the fault pattern of the first faulty image is overlaid on the target area of the non-faulty image to obtain the second faulty image, so that the determination of the target area no longer depends on manual work, which can improve efficiency and reduce labor costs.
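Method 900 (S910 to S930) can be summarized end to end as below. The region generation model is abstracted as any callable that returns a (top, left) anchor for the target region, and images are 2-D grayscale lists; all names are illustrative, not the actual model.

```python
# End-to-end sketch of method 900: locate the target region in the non-faulty
# image (S920), then cover it with the fault pattern to obtain the second
# faulty image (S930). The inputs of S910 are passed in as arguments.
def generate_fault_image(non_faulty, fault_pattern, region_model):
    top, left = region_model(non_faulty)             # S920: target region anchor
    second_faulty = [row[:] for row in non_faulty]
    for r, src_row in enumerate(fault_pattern):      # S930: overlay the fault
        for c, v in enumerate(src_row):              # pattern on the target area
            second_faulty[top + r][left + c] = v
    return second_faulty

non_faulty = [[255] * 3 for _ in range(3)]
result = generate_fault_image(non_faulty, [[0]], lambda img: (1, 1))
```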
  • the first fault image may be derived from the fourth image.
  • the first faulty image may be a faulty region in the fourth image.
  • the first faulty image may also be obtained by performing edge extraction on the faulty region in the fourth image.
  • the first faulty image is obtained according to the fourth image, and the first faulty image is not determined for a specific non-faulty image.
  • the type of the first object and the second object may be the same or different. That is, the non-faulty image and the first faulty image may be acquired from the same or different types of objects.
  • the region generation model can be trained as follows: acquire a plurality of training images, in which non-faulty objects of the same type as the first object are recorded; acquire region indication information, which is used to indicate the regions in the training images where a fault can occur; and train the region generation model according to the plurality of training images and the region indication information.
  • training the region generation model in this way makes the region generation model more targeted: the resulting model is suited to the type of the first object, which improves the accuracy of the target region determined by the region generation model.
  • the first faulty image can be shape-transformed, and the transformed fault pattern can be overlaid on the target area of the non-faulty image to obtain the second faulty image.
  • Shape transformations include size scaling or shading changes.
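The two transformations named above can be illustrated on a grayscale fault pattern stored as a 2-D list of 0-255 integers; the nearest-neighbour scaling and the gain-based shading change are simple stand-ins for whatever transformation the system actually applies.

```python
# Size scaling by nearest-neighbour sampling.
def scale_nearest(img, new_h, new_w):
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

# Shading change: multiply every pixel by a gain, clamped to the 0-255 range.
def adjust_shading(img, gain):
    return [[min(255, max(0, int(p * gain))) for p in row] for row in img]

pattern = [[10, 20], [30, 40]]
stretched = scale_nearest(pattern, 4, 4)   # scale the fault pattern up
darker = adjust_shading(pattern, 0.5)      # dim the fault pattern
```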
  • FIG. 10 is a schematic structural diagram of a neural network training system provided by an embodiment of the present application.
  • the neural network training system 3000 includes an acquisition module 3010 and a processing module 3020 .
  • the acquiring module 3010 is used for acquiring training images and region indication information. Non-faulty objects are recorded in the training images.
  • the region indication information is used to indicate the regions in the training image that can generate faults.
  • the type of object recorded in the training image may be the same as the type of the first object.
  • the processing module 3020 is configured to train the region generation model according to the training images and the region indication information.
  • System 3000 may further include a storage module, which may be used to store training images.
  • the acquisition module 3010 may read the training image in the storage module, or the acquisition module 3010 may receive the training image sent by the device where the storage module is located.
  • the acquisition module 3010 and the processing module 3020 can be deployed locally, and the storage module can be deployed locally or in the cloud.
  • FIG. 11 is a schematic structural diagram of an apparatus for generating a fault image provided by an embodiment of the present application.
  • the fault image generating apparatus 1100 includes an acquisition module 1110 and a processing module 1120 .
  • the acquiring module 1110 is configured to acquire a non-faulty image and a first faulty image, wherein the non-faulty image records the non-faulty first object, the first faulty image records the faulty second object, and the first object and the second object are of different types.
  • the processing module 1120 is configured to overlay the fault pattern of the second object in the first faulty image on the first object in the non-faulty image to obtain a second faulty image, where the second faulty image shows the first object in a faulty state.
  • the processing module 1120 is configured to input the non-faulty image into a region generation model, and determine the target region covered by the fault pattern in the non-faulty image.
  • the acquiring module 1110 is further configured to acquire a plurality of training images, in which non-faulty objects of the same type as the first object are recorded.
  • the acquiring module 1110 is further configured to acquire area indication information, where the area indication information is used to indicate an area in the training image where a fault can occur.
  • the processing module 1120 is further configured to train the region generation model according to the plurality of training images and the region indication information.
  • processing module 1120 is further configured to perform shape transformation on the fault pattern.
  • the processing module 1120 is further configured to overlay the transformed fault pattern on the target area in the non-fault image to obtain the second fault image.
  • the shape transformation includes dimensional stretching, compression, or shading.
  • the acquiring module 1110 is configured to acquire a non-faulty image and a first faulty image, where the non-faulty image records the non-faulty first object and the first faulty image records the faulty second object.
  • the processing module 1120 is configured to input the non-faulty image into an area generation model, and determine a target area covered by the fault pattern of the second object in the first faulty image in the non-faulty image.
  • the processing module 1120 is further configured to overlay the fault pattern on the target area in the non-faulty image to obtain a second faulty image, where the second faulty image shows the first object in a faulty state.
  • the first object and the second object are of different types.
  • the acquiring module 1110 is further configured to acquire a plurality of training images, in which non-faulty objects of the same type as the first object are recorded.
  • the acquiring module 1110 is further configured to acquire area indication information, where the area indication information is used to indicate an area in the training image where a fault can occur.
  • the processing module 1120 is further configured to train the region generation model according to the plurality of training images and the region indication information.
  • processing module 1120 is further configured to perform shape transformation on the fault pattern.
  • the processing module 1120 is further configured to overlay the transformed fault pattern on the target area of the non-fault image to obtain the second fault image.
  • the shape transformation includes dimensional stretching, compression, or shading.
  • the fault image generating apparatus 1100 may further include a storage module.
  • the storage module can be used to store the first faulty image, and can also be used to store the non-faulty image.
  • the acquisition module 1110 and the processing module 1120 can be deployed locally, and the storage module can be deployed locally or in the cloud.
  • FIG. 12 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • Computing device 1200 includes: bus 1202 , processor 1204 , memory 1206 , and communication interface 1208 . Communication between processor 1204 , memory 1206 and communication interface 1208 is via bus 1202 .
  • Computing device 1200 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 1200 .
  • the bus 1202 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one line is shown in FIG. 12, but it does not mean that there is only one bus or one type of bus.
  • bus 1202 may include pathways for communicating information between various components of computing device 1200 (e.g., memory 1206, processor 1204, communication interface 1208).
  • the processor 1204 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • Memory 1206 may include volatile memory, such as random access memory (RAM).
  • the memory 1206 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • Executable program codes are stored in the memory 1206, and the processor 1204 executes the executable program codes to implement the aforementioned fault image generation method.
  • the memory 1206 stores instructions for executing the fault image generation method.
  • the communication interface 1208 implements communication between the computing device 1200 and other devices or communication networks, using transceiver modules such as, but not limited to, network interface cards and transceivers.
  • although FIG. 12 only shows a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the computing device 1200 may also include other devices necessary for normal operation. Meanwhile, according to specific needs, those skilled in the art should understand that the computing device 1200 may further include hardware devices implementing other additional functions. In addition, those skilled in the art should understand that the computing device 1200 may include only the devices necessary for implementing the embodiments of the present application, and does not necessarily include all the devices shown in FIG. 12.
  • FIG. 13 is a schematic structural diagram of a computing device cluster provided by an embodiment of the present application.
  • the computing device cluster includes at least one computing device 1200 .
  • Instructions for performing the fault image generation method may be stored in memory 1206 in one or more computing devices 1200 in the computing device cluster.
  • one or more computing devices 1200 in the computing device cluster may also be used to execute part of the instructions of the fault image generation method.
  • a combination of one or more computing devices 1200 may collectively execute the instructions of the fault image generation method.
  • the memory 1206 in different computing devices 1200 in the computing device cluster may store different instructions for executing some steps in the fault image generation method.
  • Figure 14 shows one possible implementation. As shown in FIG. 14 , two computing devices 1200A and 1200B are connected through a communication interface 1208 . Instructions for performing the functions of the interaction unit 1262 and the processing unit 1266 are stored on memory in the computing device 1200A. Instructions for performing the functions of storage unit 1264 are stored on memory in computing device 1200B. In other words, the memory 1206 of the computing devices 1200A and 1200B collectively stores instructions for performing the fault image generation method.
  • the connection mode between the computing devices in the cluster shown in FIG. 14 may take into account that the fault image generation method provided by the present application needs to store a large amount of data collected by radar or cameras; therefore, the storage function is offloaded to computing device 1200B.
  • the functions of computing device 1200A shown in FIG. 14 may also be performed by multiple computing devices 1200.
  • the functions of computing device 1200B may also be performed by multiple computing devices 1200 .
  • one or more computing devices in a cluster of computing devices may be connected by a network.
  • the network may be a wide area network or a local area network, or the like.
  • Figure 15 shows one possible implementation. As shown in FIG. 15, two computing devices 1200C and 1200D are connected through a network. Specifically, the network is connected through a communication interface in each computing device.
  • the memory 1206 in the computing device 1200C has instructions to execute the interaction unit 1262. At the same time, the memory 1206 in the computing device 1200D stores instructions to execute the storage unit 1264 and the processing unit 1266 .
  • the connection mode between the computing devices in the cluster shown in FIG. 15 may take into account that the fault image generation method provided by the present application needs to store a large number of first fault images and perform a large number of calculations to determine the second fault image; therefore, the functions implemented by the storage unit 1264 and the processing unit 1266 are handed over to computing device 1200D for execution.
  • the functions of computing device 1200C shown in FIG. 15 may also be performed by multiple computing devices 1200.
  • the functions of computing device 1200D may also be performed by multiple computing devices 1200 .
  • Embodiments of the present application provide a computer-readable medium, where the computer-readable medium stores program code for device execution, and the program code includes instructions for executing the foregoing fault image generation method.
  • An embodiment of the present application provides a computer program product containing instructions; when the computer program product runs on a computer, it causes the computer to execute the aforementioned fault image generation method.
  • An embodiment of the present application provides a chip, where the chip includes a processor and a data interface, and the processor reads an instruction stored in a memory through the data interface, and executes the foregoing method for generating a fault image.
  • the processor in the embodiment of the present application may be a central processing unit (central processing unit, CPU), and the processor may also be other general-purpose processors, digital signal processors (digital signal processors, DSP), application-specific integrated circuits (application specific integrated circuit, ASIC), off-the-shelf programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • volatile memory may be random access memory (RAM), which acts as an external cache. By way of example but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware or any other combination.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) manner.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, or the like that contains one or more sets of available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media.
  • the semiconductor medium may be a solid state drive.
  • "at least one" means one or more, and "plurality" means two or more.
  • "at least one of the following items" or similar expressions refer to any combination of these items, including any combination of single items or plural items.
  • for example, at least one of a, b, or c can represent: a; b; c; a and b; a and c; b and c; or a, b, and c; where each of a, b, and c may be single or multiple.
  • the size of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division manners.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

A fault image generation method and apparatus. The fault image generation method includes: acquiring a non-faulty image and a first faulty image, where the non-faulty image records a non-faulty first object, the first faulty image records a faulty second object, and the first object and the second object are of different types; and transferring the fault pattern of the second object in the first faulty image onto the first object in the non-faulty image to obtain a second faulty image, where the second faulty image shows the first object in a faulty state. Part of an image of an object of one type is replaced using fault patterns recorded from objects of other types to obtain a second faulty image of the object of that type, which broadens and increases the flexibility of fault pattern acquisition and increases the diversity of fault types in the second faulty image.

Description

Fault image generation method and apparatus
Technical field
The present application relates to the field of computers, and more specifically, to a fault image generation method and apparatus.
Background
Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making. Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, AI basic theory, and so on.
Surface defect detection is an important part of industrial quality inspection and a key step in controlling product quality; it can prevent defective products from entering the market and avoid the harm caused by the use of defective products. Fault detection algorithms based on computer vision can use neural network models to implement surface defect detection and help people quickly eliminate hidden dangers. The training process of a neural network model requires a large amount of normal data and fault data. In practical applications, the amount of fault data is small, and it is difficult to meet the training requirements of neural network models.
Fault data can be obtained by manually adjusting normal data, but the labor cost is high and the efficiency is low. Therefore, how to improve the efficiency of obtaining fault data has become an urgent problem to be solved.
Summary
The present application provides a fault image generation method, which can reduce labor costs and improve processing efficiency.
According to a first aspect, a fault image generation method is provided, including: acquiring a non-faulty image and a first faulty image, where the non-faulty image records a non-faulty first object, the first faulty image records a faulty second object, and the first object and the second object are of different types; and transferring the fault pattern of the second object in the first faulty image onto the first object in the non-faulty image to obtain a second faulty image, where the second faulty image shows the first object in a faulty state.
When the number of faulty images is small, a second faulty image can be generated by using a non-faulty image and a first faulty image, so that the faulty images usable as training data include the first faulty image and the generated second faulty image, effectively increasing the number of training samples. The first object recorded in the non-faulty image and the second object recorded in the faulty image are not objects of the same type; that is, objects other than the first object can be used to process the non-faulty image recording the first object to generate the second faulty image. Expanding the sources of fault patterns is more conducive to increasing the number of training samples and effectively improves the efficiency of training sample generation.
The non-faulty image and the first faulty image are collected from objects of different types. The non-faulty image is adjusted according to a first faulty image determined from images collected of objects of other types; that is, faults of objects of other types can be used to adjust the non-faulty image of a given object, which broadens the acquisition range of the first faulty image and increases the flexibility of its sources. Therefore, when there are few images of faulty objects of the same type as the first object, a first faulty image of an object of another type and the non-faulty image of the first object can be used to generate a second faulty image of the first object, improving the flexibility of generating faulty images for a given type of object.
With reference to the first aspect, in some possible implementations, before the fault pattern of the second object in the first faulty image is transferred onto the first object in the non-faulty image to obtain the second faulty image, the method includes: inputting the non-faulty image into a region generation model to determine a target region in the non-faulty image to which the fault pattern is transferred.
Using the region generation model to determine the target region of the non-faulty image can improve efficiency and reduce labor costs.
With reference to the first aspect, in some possible implementations, before the non-faulty image is input into the region generation model to determine the target region in the non-faulty image to which the fault pattern is transferred, the method includes: acquiring a plurality of training images, in which non-faulty objects of the same type as the first object are recorded; acquiring region indication information, where the region indication information is used to indicate regions in the training images where a fault can occur; and training the region generation model according to the plurality of training images and the region indication information.
Training the region generation model with training images that record non-faulty objects of the same type as the first object, together with region indication information indicating the regions in the training images where a fault can occur, gives the region generation model higher accuracy when processing images of the type to which the first object belongs.
With reference to the first aspect, in some possible implementations, transferring the fault pattern of the second object in the first faulty image onto the first object in the non-faulty image to obtain the second faulty image includes: performing shape transformation on the first faulty image; and transferring the transformed first faulty image onto the target region in the non-faulty image to obtain the second faulty image.
Adjusting the non-faulty image with the transformed fault pattern makes the adjustment more flexible and can increase the number and diversity of the faulty images that may be obtained.
With reference to the first aspect, in some possible implementations, the shape transformation includes size stretching, compression, or shading changes.
According to a second aspect, a fault image generation method is provided, including: acquiring a non-faulty image and a first faulty image, where the non-faulty image records a non-faulty first object and the first faulty image records a faulty second object; inputting the non-faulty image into a region generation model to determine a target region in the non-faulty image to which the fault pattern of the second object in the first faulty image is transferred; and transferring the fault pattern onto the target region in the non-faulty image to obtain a second faulty image, where the second faulty image shows the first object in a faulty state.
Using the region generation model to determine the target region of the non-faulty image, and transferring the first faulty image onto that target region to obtain the second faulty image, means that the determination of the target region no longer depends on manual work, which can improve efficiency and reduce labor costs.
With reference to the second aspect, in some possible implementations, the first object and the second object are of different types.
Transferring a fault pattern recorded from an object of a type different from the first object onto the region where the first object is located in the non-faulty image to generate the second faulty image no longer requires the second object to be of the same type as the first object. This lowers the restriction on the type of object recorded in the first faulty image, improves the flexibility of acquiring the first faulty image, and broadens its sources. Therefore, even when there are few faulty images of the object type captured in the non-faulty image, faulty images of that object type can be generated, improving the applicability of the fault image generation method provided by the embodiments of the present application.
With reference to the second aspect, in some possible implementations, before the non-faulty image is input into the region generation model to determine the target region in the non-faulty image to which the fault pattern of the second object in the first faulty image is transferred, the method further includes: acquiring a plurality of training images and region indication information, where the training images record non-faulty objects of the same type as the first object, and the region indication information is used to indicate regions in the training images where a fault can occur; and training the region generation model according to the region indication information and the plurality of training images.
Training the region generation model according to training images recording objects of the same type as the first object and region indication information indicating the regions in the training images where a fault can occur makes the region generation model more targeted and suited to the type of the first object in the non-faulty image, improving the accuracy of the target region determined by the region generation model.
With reference to the second aspect, in some possible implementations, transferring the fault pattern onto the target region in the non-faulty image to obtain the second faulty image includes: transforming the fault pattern; and transferring the transformed fault pattern onto the target region of the non-faulty image to obtain the second faulty image.
Adjusting the non-faulty image with the transformed fault pattern makes the way the second faulty image is generated more flexible and can increase the number and diversity of the faulty images that may be obtained.
With reference to the second aspect, the shape transformation includes size stretching, compression, or shading changes.
According to a third aspect, a fault image generation apparatus is provided, including modules for executing the method in any one of the implementations of the first aspect or the second aspect.
According to a fourth aspect, an electronic device is provided, including a memory and a processor, where the memory is used to store program instructions; when the program instructions are executed in the processor, the processor is used to execute the method in any one of the implementations of the first aspect or the second aspect.
The processor in the fourth aspect may be a central processing unit (CPU), or a combination of a CPU and a neural network computing processor, where the neural network computing processor may include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), and so on. The TPU is an artificial intelligence accelerator application-specific integrated circuit fully customized by Google for machine learning.
According to a fifth aspect, a computer-readable medium is provided, which stores program code for device execution, where the program code includes instructions for executing the method in any one of the implementations of the first aspect or the second aspect.
According to a sixth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, it causes the computer to execute the method in any one of the implementations of the first aspect or the second aspect.
According to a seventh aspect, a chip is provided, where the chip includes a processor and a data interface; the processor reads, through the data interface, instructions stored in a memory and executes the method in any one of the implementations of the first aspect or the second aspect.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in any one of the implementations of the first aspect or the second aspect.
The chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
Brief description of the drawings
FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
FIG. 2 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present application.
FIG. 3 is a schematic structural diagram of another convolutional neural network provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of a hardware structure of a chip provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of a system architecture provided by an embodiment of the present application.
FIG. 6 is a schematic flowchart of a fault image generation method.
FIG. 7 is a schematic flowchart of a fault image generation method provided by an embodiment of the present application.
FIG. 8 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
FIG. 9 is a schematic flowchart of a fault image generation method provided by an embodiment of the present application.
FIG. 10 is a schematic structural diagram of a neural network training system provided by an embodiment of the present application.
FIG. 11 is a schematic structural diagram of a fault image generation apparatus provided by an embodiment of the present application.
FIG. 12 is a schematic structural diagram of a computing device according to an embodiment of the present application.
FIG. 13 is a schematic structural diagram of a computing device cluster according to an embodiment of the present application.
FIG. 14 is a schematic structural diagram of another computing device cluster according to an embodiment of the present application.
FIG. 15 is a schematic structural diagram of another computing device cluster according to an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Since the embodiments of the present application involve extensive application of neural networks, for ease of understanding, related terms and concepts of neural networks that may be involved in the embodiments of the present application are first introduced below.
(1) Neural network
A neural network may be composed of neural units. A neural unit may be a computing unit that takes x_s and an intercept of 1 as inputs, and the output of the computing unit may be:

h_{W,b}(x) = f(W^T x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)

where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce nonlinear characteristics into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by joining many single neural units of this kind together; that is, the output of one neural unit may be the input of another neural unit. The input of each neural unit may be connected to the local receptive field of the previous layer to extract features of the local receptive field, and the local receptive field may be a region composed of several neural units.
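The single neural unit defined above can be written out directly; as a non-authoritative illustration, a sigmoid is used for the activation function f.

```python
import math

# A neural unit: the weighted sum of inputs x_s plus the bias b, passed through
# a sigmoid activation f.
def neural_unit(xs, ws, b):
    s = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-s))    # f = sigmoid

out = neural_unit([1.0, 2.0], [0.0, 0.0], 0.0)   # zero weights and bias
```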
(2) Deep neural network
A deep neural network (DNN), also called a multi-layer neural network, can be understood as a neural network with multiple hidden layers. Dividing the DNN by the positions of different layers, the neural network inside the DNN can be divided into three classes: input layer, hidden layers, and output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are all hidden layers. The layers are fully connected; that is, any neuron in the i-th layer is necessarily connected to any neuron in the (i+1)-th layer.
Although the DNN looks complicated, the work of each layer is actually not complicated; simply put, it is the following linear relational expression:

\vec{y} = \alpha(W \vec{x} + \vec{b})

where \vec{x} is the input vector, \vec{y} is the output vector, \vec{b} is the offset vector, W is the weight matrix (also called coefficients), and \alpha() is the activation function. Each layer simply performs this operation on the input vector \vec{x} to obtain the output vector \vec{y}. Because the DNN has many layers, the numbers of coefficients W and offset vectors \vec{b} are also large. These parameters are defined in the DNN as follows. Taking the coefficient W as an example: suppose that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_{24}, where the superscript 3 represents the layer of the coefficient W, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4.
In summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_{jk}.
It should be noted that the input layer has no W parameters. In a deep neural network, more hidden layers enable the network to better portray complex situations in the real world. In theory, a model with more parameters has higher complexity and larger "capacity", which means it can accomplish more complex learning tasks. Training the deep neural network is the process of learning the weight matrices; the final goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors W of many layers).
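The per-layer operation above, y = alpha(W x + b), can be sketched with plain lists; a sigmoid stands in for the activation alpha, and the weight values are arbitrary examples.

```python
import math

# One fully connected layer: for each row of W, compute the weighted sum of the
# input vector x plus the bias, then apply a sigmoid activation.
def dense_layer(W, x, b):
    return [1.0 / (1.0 + math.exp(-(sum(wij * xj for wij, xj in zip(row, x)) + bi)))
            for row, bi in zip(W, b)]

# a two-layer forward pass: 3 inputs -> 2 hidden units -> 1 output
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0]]
b2 = [0.0]
y = dense_layer(W2, dense_layer(W1, [1.0, 2.0, 3.0], b1), b2)
```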
(3) Convolutional neural network
A convolutional neural network (CNN) is a deep neural network with a convolutional structure. A convolutional neural network contains a feature extractor composed of convolutional layers and subsampling layers, and this feature extractor can be regarded as a filter. A convolutional layer is a neuron layer in the convolutional neural network that performs convolution processing on the input signal. In a convolutional layer of a convolutional neural network, one neuron may be connected to only some of the neurons in adjacent layers. A convolutional layer usually contains several feature planes, and each feature plane may be composed of rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights are the convolution kernel. Sharing weights can be understood as meaning that the way image information is extracted is independent of position. The convolution kernel may be initialized in the form of a matrix of random size, and during training of the convolutional neural network the convolution kernel can obtain reasonable weights through learning. In addition, a direct benefit of sharing weights is reducing the connections between the layers of the convolutional neural network while also lowering the risk of overfitting.
(4) Recurrent neural networks (RNN) are used to process sequence data. In the traditional neural network model, from the input layer to the hidden layer and then to the output layer, the layers are fully connected, while the nodes within each layer are unconnected. Although this ordinary neural network solves many difficult problems, it is still powerless for many others. For example, to predict the next word of a sentence, the previous words generally need to be used, because the preceding and following words in a sentence are not independent. The reason an RNN is called a recurrent neural network is that the current output of a sequence is also related to the previous outputs. The specific manifestation is that the network memorizes the previous information and applies it to the calculation of the current output; that is, the nodes within the hidden layer are no longer unconnected but connected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous moment. In theory, an RNN can process sequence data of any length. The training of an RNN is the same as the training of a traditional CNN or DNN.
Since convolutional neural networks already exist, why are recurrent neural networks still needed? The reason is simple: in a convolutional neural network there is a premise that the elements are independent of each other, and the input and output are also independent, such as cats and dogs. But in the real world, many elements are interconnected, such as stock prices changing over time, or someone saying: "I like traveling, and my favorite place is Yunnan; I must go there when I have the chance." To fill in the blank here, humans all know the answer is "Yunnan", because humans infer from the context. But how can a machine do this? The RNN came into being. RNNs aim to give machines the ability to memorize like humans. Therefore, the output of an RNN needs to depend on the current input information and the historical memorized information.
(5) Loss function
In the process of training a deep neural network, because it is hoped that the output of the deep neural network is as close as possible to the value that is really desired to be predicted, the predicted value of the current network can be compared with the really desired target value, and the weight vector of each layer of the neural network is then updated according to the difference between the two (of course, there is usually an initialization process before the first update, that is, parameters are preconfigured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vector is adjusted to make the prediction lower, and adjustments continue until the deep neural network can predict the really desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the loss function or objective function, which are important equations used to measure the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so the training of the deep neural network becomes a process of reducing this loss as much as possible.
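The idea above can be made concrete with a toy example: a squared-error loss over a single weight, and one update step that moves the weight in the direction that lowers the loss. The numbers are arbitrary illustrations.

```python
# Squared error between the network's prediction and the desired target value.
def squared_error(pred, target):
    return (pred - target) ** 2

w, x, target = 0.0, 2.0, 4.0             # toy model: prediction = w * x
loss_before = squared_error(w * x, target)
grad = 2 * (w * x - target) * x          # d(loss)/dw for the squared error
w -= 0.1 * grad                          # one update step toward lower loss
loss_after = squared_error(w * x, target)
```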
(6) Back propagation algorithm
A neural network can use the error back propagation (BP) algorithm to correct the values of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, forward propagation of the input signal until the output produces an error loss, and the parameters in the initial neural network model are updated by back-propagating the error loss information, so that the error loss converges. The back propagation algorithm is a back propagation movement dominated by the error loss, aiming to obtain the optimal parameters of the neural network model, such as the weight matrices.
(7) Generative adversarial network
A generative adversarial network (GAN) is a deep learning model. The model includes at least two modules: one module is a generative model and the other is a discriminative model; the two modules learn by gaming with each other to produce better output. The basic principle of a GAN is as follows: taking a GAN that generates pictures as an example, suppose there are two networks, G (generator) and D (discriminator), where G is a network that generates pictures: it receives random noise z and generates a picture from this noise, denoted G(z); D is a discriminant network used to determine whether a picture is "real". In the ideal state, G can generate a picture G(z) that is indistinguishable from a real one, and it is difficult for D to determine whether the picture generated by G is real, that is, D(G(z)) = 0.5. In this way, an excellent generative model G is obtained, which can be used to generate pictures.
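The discriminator's side of this game can be sketched with the standard binary cross-entropy objective; this is a generic illustration of the GAN principle, not the model used in the present application. At the ideal equilibrium D(G(z)) = 0.5 and the loss sits at 2 ln 2.

```python
import math

# Discriminator loss: D is rewarded for assigning a high score d_real to real
# samples and a low score d_fake to generated samples G(z).
def discriminator_loss(d_real, d_fake):
    return -(math.log(d_real) + math.log(1.0 - d_fake))

confident = discriminator_loss(0.9, 0.1)    # D separates real from fake well
equilibrium = discriminator_loss(0.5, 0.5)  # D can no longer tell them apart
```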
如图1所示,本申请实施例提供了一种系统架构100。在图1中,数据采集设备160用于采集训练数据。针对本申请实施例的数据处理方法来说,训练数据可以包括训练图像、训练音频、训练视频、训练文本等。
在采集到训练数据之后,数据采集设备160将这些训练数据存入数据库130,训练设备120基于数据库130中维护的训练数据训练得到目标模型/规则101。
下面对训练设备120基于训练数据得到目标模型/规则101进行描述,训练设备120对输入的训练数据进行处理,将输出的训练信息与训练数据对应的标注信息进行对比,直到根据训练设备120输出的训练信息与训练数据对应的标注信息的差值小于一定的阈值,从而完成目标模型/规则101的训练。
上述目标模型/规则101能够用于实现本申请实施例的数据处理方法。本申请实施例中的目标模型/规则101具体可以为神经网络。需要说明的是,在实际的应用中,所述数据库130中维护的训练数据不一定都来自于数据采集设备160的采集,也有可能是从其他设备接收得到的。另外需要说明的是,训练设备120也不一定完全基于数据库130维护的训练数据进行目标模型/规则101的训练,也有可能从云端或其他地方获取训练数据进行模型训练,上述描述不应该作为对本申请实施例的限定。
根据训练设备120训练得到的目标模型/规则101可以应用于不同的系统或设备中,如应用于图1所示的执行设备110,所述执行设备110可以是终端,如手机终端,平板电脑,笔记本电脑,增强现实(augmented reality,AR)AR/虚拟现实(virtual reality,VR),车载终端等,还可以是服务器或者云端等。在图1中,执行设备110配置输入/输出(input/output,I/O)接口112,用于与外部设备进行数据交互,用户可以通过客户设备140向I/O接口112输入数据,所述输入数据在本申请实施例中可以包括:客户设备输入的待处理数据。
预处理模块113和预处理模块114用于对I/O接口112接收到的输入数据(如待处理数据)进行预处理，在本申请实施例中，也可以没有预处理模块113和预处理模块114(也可以只有其中的一个预处理模块)，而直接采用计算模块111对输入数据进行处理。
在执行设备110对输入数据进行预处理,或者在执行设备110的计算模块111执行计算等相关的处理过程中,执行设备110可以调用数据存储系统150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统150中。
最后,I/O接口112将处理结果返回给客户设备140,从而提供给用户。
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型/规则101,该相应的目标模型/规则101即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。
在图1中所示情况下,用户可以手动给定输入数据,该手动给定可以通过I/O接口112提供的界面进行操作。另一种情况下,客户设备140可以自动地向I/O接口112发送输入数据,如果要求客户设备140自动发送输入数据需要获得用户的授权,则用户可以在客户设备140中设置相应权限。用户可以在客户设备140查看执行设备110输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备140也 可以作为数据采集端,采集如图所示输入I/O接口112的输入数据及输出I/O接口112的输出结果作为新的样本数据,并存入数据库130。当然,也可以不经过客户设备140进行采集,而是由I/O接口112直接将如图所示输入I/O接口112的输入数据及输出I/O接口112的输出结果,作为新的样本数据存入数据库130。
值得注意的是,图1仅是本申请实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图1中,数据存储系统150相对执行设备110是外部存储器,在其它情况下,也可以将数据存储系统150置于执行设备110中。
如图1所示,根据训练设备120训练得到目标模型/规则101,该目标模型/规则101在本申请实施例中可以是本申请中的神经网络,具体的,本申请实施例使用神经网络可以为CNN,深度卷积神经网络(deep convolutional neural networks,DCNN),循环神经网络(recurrent neural network,RNN)等等。
由于CNN是一种非常常见的神经网络,下面结合图2重点对CNN的结构进行详细的介绍。如上文的基础概念介绍所述,卷积神经网络是一种带有卷积结构的深度神经网络,是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。作为一种深度学习架构,CNN是一种前馈(feed-forward)人工神经网络,该前馈人工神经网络中的各个神经元可以对输入其中的数据作出响应。下面以输入的数据为图像为例进行说明。
本申请实施例的数据处理方法具体采用的神经网络的结构可以如图2所示。在图2中,卷积神经网络(CNN)200可以包括输入层210,卷积层/池化层220(其中池化层为可选的),以及神经网络层230。其中,输入层210可以获取待处理数据,并将获取到的待处理数据交由卷积层/池化层220以及后面的神经网络层230进行处理,可以得到数据的处理结果。下面以图像处理为例对图2中的CNN 200中内部的层结构进行详细的介绍。
卷积层/池化层220:
卷积层:
如图2所示卷积层/池化层220可以包括如示例221-226层,举例来说:在一种实现中,221层为卷积层,222层为池化层,223层为卷积层,224层为池化层,225为卷积层,226为池化层;在另一种实现方式中,221、222为卷积层,223为池化层,224、225为卷积层,226为池化层。即卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。
下面将以卷积层221为例,介绍一层卷积层的内部工作原理。
卷积层221可以包括很多个卷积算子,卷积算子也称为核,其在数据处理中的作用相当于一个从输入图像中提取特定信息的过滤器,卷积算子本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,在对图像进行卷积操作的过程中,权重矩阵通常在输入图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素……这取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。该权重矩阵的大小应该与图像的大小相关,需要注意的是,权重矩阵的纵深维度(depth dimension)和输入图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会 延伸到输入图像的整个深度。因此,和一个单一的权重矩阵进行卷积会产生一个单一纵深维度的卷积化输出,但是大多数情况下不使用单一权重矩阵,而是应用多个尺寸(行×列)相同的权重矩阵,即多个同型矩阵。每个权重矩阵的输出被堆叠起来形成卷积图像的纵深维度,这里的维度可以理解为由上面所述的“多个”来决定。不同的权重矩阵可以用来提取图像中不同的特征,例如一个权重矩阵用来提取图像边缘信息,另一个权重矩阵用来提取图像的特定颜色,又一个权重矩阵用来对图像中不需要的噪点进行模糊化等。该多个权重矩阵尺寸(行×列)相同,经过该多个尺寸相同的权重矩阵提取后的卷积特征图的尺寸也相同,再将提取到的多个尺寸相同的卷积特征图合并形成卷积运算的输出。
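上述"多个尺寸相同的权重矩阵、输出沿纵深维度堆叠"以及步长stride对输出尺寸的影响，可以用下面一段基于numpy的示意代码说明。其中无填充时输出边长为(n-k)//stride+1；图像与卷积核均为随机示意数据。

```python
import numpy as np

def conv_out_size(n, k, stride):
    """无填充时的输出边长：(n - k) // stride + 1。"""
    return (n - k) // stride + 1

def conv_multi_kernel(image, kernels, stride=1):
    """用多个尺寸相同的权重矩阵分别卷积，再沿纵深维度堆叠输出。"""
    n = image.shape[0]
    k = kernels[0].shape[0]
    o = conv_out_size(n, k, stride)
    maps = []
    for ker in kernels:
        out = np.zeros((o, o))
        for i in range(o):
            for j in range(o):
                patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
                out[i, j] = np.sum(patch * ker)
        maps.append(out)
    return np.stack(maps, axis=-1)   # 纵深维度 = 权重矩阵的个数

image = np.random.rand(8, 8)
kernels = [np.random.rand(3, 3) for _ in range(4)]
features = conv_multi_kernel(image, kernels, stride=2)
print(features.shape)   # (3, 3, 4)
```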
这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以用来从输入图像中提取信息,从而使得卷积神经网络200进行正确的预测。
当卷积神经网络200有多个卷积层的时候,初始的卷积层(例如221)往往提取较多的一般特征,该一般特征也可以称之为低级别的特征;随着卷积神经网络200深度的加深,越往后的卷积层(例如226)提取到的特征越来越复杂,比如高级别的语义之类的特征,语义越高的特征越适用于待解决的问题。
池化层:
由于常常需要减少训练参数的数量,因此卷积层之后常常需要周期性的引入池化层,在如图2中220所示例的221-226各层,可以是一层卷积层后面跟一层池化层,也可以是多层卷积层后面接一层或多层池化层。在图像处理过程中,池化层的唯一目的就是减少图像的空间大小。池化层可以包括平均池化算子和/或最大池化算子,以用于对输入图像进行采样得到较小尺寸的图像。平均池化算子可以在特定范围内对图像中的像素值进行计算产生平均值作为平均池化的结果。最大池化算子可以在特定范围内取该范围内值最大的像素作为最大池化的结果。另外,就像卷积层中用权重矩阵的大小应该与图像尺寸相关一样,池化层中的运算符也应该与图像的大小相关。通过池化层处理后输出的图像尺寸可以小于输入池化层的图像的尺寸,池化层输出的图像中每个像素点表示输入池化层的图像的对应子区域的平均值或最大值。
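平均池化与最大池化对子区域的取值方式，可以用下面一段基于numpy的示意代码说明：输出图像的每个像素对应输入图像一个子区域的平均值或最大值。示例图像为假设数据。

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """将图像划分为 size×size 的子区域，
    每个子区域取最大值（最大池化）或平均值（平均池化）。"""
    h, w = image.shape
    oh, ow = h // size, w // size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = patch.max() if mode == "max" else patch.mean()
    return out

image = np.array([[1., 2., 5., 6.],
                  [3., 4., 7., 8.],
                  [0., 0., 1., 1.],
                  [0., 4., 1., 1.]])
print(pool2d(image, mode="max"))   # 4×4 图像缩小为 2×2
print(pool2d(image, mode="avg"))
```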
神经网络层230:
在经过卷积层/池化层220的处理后,卷积神经网络200还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层220只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或其他相关信息),卷积神经网络200需要利用神经网络层230来生成一个或者一组所需要的类的数量的输出。因此,在神经网络层230中可以包括多层隐含层(如图2所示的231、232至23n)以及输出层240,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像识别,图像分类,图像超分辨率重建等等。
在神经网络层230中的多层隐含层之后,也就是整个卷积神经网络200的最后层为输出层240,该输出层240具有类似分类交叉熵的损失函数,具体用于计算预测误差,一旦整个卷积神经网络200的前向传播(如图2由210至240方向的传播为前向 传播)完成,反向传播(如图2由240至210方向的传播为反向传播)就会开始更新前面提到的各层的权重值以及偏差,以减少卷积神经网络200的损失,及卷积神经网络200通过输出层输出的结果和理想结果之间的误差。
本申请实施例的数据处理方法具体采用的神经网络的结构可以如图3所示。在图3中,卷积神经网络(CNN)200可以包括输入层110,卷积层/池化层120(其中池化层为可选的),以及神经网络层130。与图2相比,图3中的卷积层/池化层120中的多个卷积层/池化层并行,将分别提取的特征均输入给全神经网络层130进行处理。
需要说明的是,图2和图3所示的卷积神经网络仅作为一种本申请实施例的数据处理方法的两种可能的卷积神经网络的示例,在具体的应用中,本申请实施例的数据处理方法所采用的卷积神经网络还可以以其他网络模型的形式存在。
图4为本申请实施例提供的一种芯片的硬件结构,该芯片包括神经网络处理器50。该芯片可以被设置在如图1所示的执行设备110中,用以完成计算模块111的计算工作。该芯片也可以被设置在如图1所示的训练设备120中,用以完成训练设备120的训练工作并输出目标模型/规则101。如图2和图3所示的卷积神经网络中各层的算法均可在如图4所示的芯片中得以实现。
神经网络处理器NPU 50作为协处理器挂载到主中央处理器(central processing unit,CPU)(host CPU)上,由主CPU分配任务。NPU的核心部分为运算电路503,控制器504控制运算电路503提取存储器(权重存储器或输入存储器)中的数据并进行运算。
在一些实现中,运算电路503内部包括多个处理单元(process engine,PE)。在一些实现中,运算电路503是二维脉动阵列。运算电路503还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路503是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器502中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器501中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)508中。
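上述"部分结果累加到累加器"的过程，可以用下面一段numpy示意代码模拟：逐步取矩阵数据做部分积，累加得到C=A×B。这只是对运算电路行为的软件层面示意，并非硬件实现。

```python
import numpy as np

def matmul_accumulate(A, B):
    """模拟运算电路：按部分积逐步累加计算 C = A @ B，
    每一步的部分结果累加到累加器（accumulator）中。"""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    acc = np.zeros((m, n))           # 累加器
    for t in range(k):               # 每次取A的一列与B的一行做部分积
        acc += np.outer(A[:, t], B[t, :])
    return acc

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
C = matmul_accumulate(A, B)
print(np.allclose(C, A @ B))   # True
```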
向量计算单元507可以对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。例如,向量计算单元507可以用于神经网络中非卷积/非FC层的网络计算,如池化(pooling),批归一化(batch normalization),局部响应归一化(local response normalization)等。
在一些实现中，向量计算单元507能将经处理的输出的向量存储到统一缓存器506。例如，向量计算单元507可以将非线性函数应用到运算电路503的输出，例如累加值的向量，用以生成激活值。在一些实现中，向量计算单元507生成归一化的值、合并值，或二者均有。在一些实现中，处理过的输出的向量能够用作运算电路503的激活输入，例如用于在神经网络中的后续层中的使用。
统一存储器506用于存放输入数据以及输出数据。
存储单元访问控制器505（direct memory access controller，DMAC）用于将外部存储器中的输入数据搬运到输入存储器501和/或统一存储器506，将外部存储器中的权重数据存入权重存储器502，以及将统一存储器506中的数据存入外部存储器。
总线接口单元(bus interface unit,BIU)510,用于通过总线实现主CPU、DMAC和取指存储器509之间进行交互。
与控制器504连接的取指存储器(instruction fetch buffer)509,用于存储控制器504使用的指令;
控制器504，用于调用取指存储器509中缓存的指令，实现控制该运算加速器的工作过程。
一般地,统一存储器506,输入存储器501,权重存储器502以及取指存储器509均为片上(On-Chip)存储器,外部存储器为该NPU外部的存储器,该外部存储器可以为双倍数据率同步动态随机存储器(double data rate synchronous dynamic random access memory,简称DDR SDRAM)、高带宽存储器(high bandwidth memory,HBM)或其他可读可写的存储器。
其中,图2和图3所示的卷积神经网络中各层的运算可以由运算电路503或向量计算单元507执行。
上文中介绍的图1中的执行设备110能够执行本申请实施例的数据处理方法的各个步骤,图2和图3所示的CNN模型和图4所示的芯片也可以用于执行本申请实施例的数据处理方法的各个步骤。下面结合附图对本申请实施例的神经网络训练的方法和本申请实施例的数据处理方法进行详细的介绍。
如图5所示,本申请实施例提供了一种系统架构300。该系统架构包括本地设备301、本地设备302以及执行设备210和数据存储系统250,其中,本地设备301和本地设备302通过通信网络与执行设备210连接。
执行设备210可以由一个或多个服务器实现。可选的,执行设备210可以与其它计算设备配合使用,例如:数据存储器、路由器、负载均衡器等设备。执行设备210可以布置在一个物理站点上,或者分布在多个物理站点上。执行设备210可以使用数据存储系统250中的数据,或者调用数据存储系统250中的程序代码来实现本申请实施例的数据处理的方法。
具体地,执行设备210可以执行本申请实施例提供的数据处理方法的各个步骤。
用户可以操作各自的用户设备(例如本地设备301和本地设备302)与执行设备210进行交互。每个本地设备可以表示任何计算设备,例如个人计算机、计算机工作站、智能手机、平板电脑、智能摄像头、智能汽车或其他类型蜂窝电话、媒体消费设备、可穿戴设备、机顶盒、游戏机等。
每个用户的本地设备可以通过任何通信机制/通信标准的通信网络与执行设备210进行交互,通信网络可以是广域网、局域网、点对点连接等方式,或它们的任意组合。
在一种实现方式中,本地设备301、本地设备302从执行设备210获取到目标神经网络的相关参数,将目标神经网络部署在本地设备301、本地设备302上,利用该目标神经网络进行图像分类或者图像处理等等。
在另一种实现中,执行设备210上可以直接部署目标神经网络,执行设备210通过从本地设备301和本地设备302获取待处理数据,并根据目标神经网络对待处理数 据进行分类或者其他类型的处理。
上述执行设备210也可以为云端设备,此时,执行设备210可以部署在云端;或者,上述执行设备210也可以为终端设备,此时,执行设备210可以部署在用户终端侧,本申请实施例对此并不限定。
对表面缺陷的检测是工业质检中的重要环节,也是把控产品质量的关键步骤,可以避免有缺陷的产品流入市场,避免在有缺陷的产品使用过程中带来危害。例如,在铁路场景中,列车的零部件随着使用寿命的增加可能会发生破损、故障,零部件的表面出现缺陷,如果没有及时发现零部件的表面出现缺陷,则列车继续运行过程中可能会发生重大事故。对表面缺陷的检测也可以应用在电网、制造等多种领域。
基于计算机视觉的故障检测算法可以利用神经网络模型实现表面缺陷检测，帮助人们快速排除隐患。神经网络模型的训练过程需要大量的正常数据和故障数据。
在实际应用中，往往面临小样本问题，也就是说故障图像数量太少，不符合训练神经网络模型的数据量需求。故障图像数量较少时，用于训练神经网络模型的故障图像可能不包括某些故障类型对应的图像；或者即使包括各个故障类型，某些故障类型的数据也太少，从而使得训练得到的神经网络模型对故障的识别能力有限。
图6示出了一种故障图像处理方法的示意性流程图。
在S601,利用边缘提取模型,提取非故障图像的第一边缘图像。
可以对不存在表面缺陷的设备进行图像采集,以得到非故障图像。也就是说,非故障图像中的物体可以是不存在表面缺陷的。
边缘图像也可以称为边图像。非故障图像的边缘图像是对非故障图像进行边缘提取后得到的图像。边缘是图像中一个属性区域和另一个属性区域的交接处，是区域属性发生突变的地方，也是图像信息最集中的地方，图像的边缘包含着丰富的信息。
在S602，人工编辑非故障图像的第一边缘图像，以得到故障图像的边缘图像。故障图像中，只有局部区域异常，其他区域与非故障图像相同。
在S603,利用图像翻译模型对故障图像的边缘图像进行处理,以形成故障图像。
从而,可以根据S601-S603得到的故障图像,以及非故障图像,训练神经网络模型。训练得到的神经网络模型可以用于进行表面缺陷检测。
方法600中,通过人工对非故障图像进行编辑,自动化程度低,成本较高。
为了解决上述问题，本申请实施例提供了一种故障图像生成方法。
图7是本申请实施例提供的一种故障图像生成方法的示意性流程图。
故障图像生成方法700包括S710至S720。
在S710,获取非故障图像(non-faulty image)和第一故障图像,非故障图像中记录了未发生故障的第一物体,第一故障图像中记录了发生故障的第二物体,第一物体和第二物体的类型不同。
也就是说,非故障图像和第一故障图像是对不同类型的物体进行采集得到的。
类型也可以称为类别,是指具有共同特征的事物所形成的种类。相同类型的对象具有相同的性质或特点,而不同类型的对象某些性质或特点不同。对象的性质可以包括功能、材料、材料类型、颜色等。不同类型的对象具有不同的性质。例如,列车车厢表面与墙体的材料不同,非故障图像可以是对列车车厢表面进行采集得到的,第一 故障图像可以是对墙面进行采集得到的。
在S720,将第一故障图像中第二物体的故障图案迁移至非故障图像中的第一物体上(例如:覆盖于非故障图像中的第一物体上,或者图层的叠加),从而获得第二故障图像。第二故障图像展示故障状态下的第一物体。
例如,第一故障图像可以记录墙面裂纹,非故障图像可以记录无故障的车厢表面。可以利用墙面的裂纹对无故障的车厢表面的部分区域进行调整,以得到带有裂纹的车厢表面的图像。
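S720中"将故障图案覆盖于非故障图像的目标区域"这一步，可以用下面一段简化代码示意：假设非故障图像与故障图案均为灰度numpy数组，直接在像素层面替换目标区域。函数名transfer_fault及各参数均为示意性假设；实际实施中替换通常在边缘图上进行，再经图像翻译模型生成自然场景图。

```python
import numpy as np

def transfer_fault(non_faulty, fault_patch, top, left):
    """将故障图案覆盖到非故障图像中以(top, left)为左上角的目标区域，
    得到第二故障图像；不修改原始非故障图像。"""
    result = non_faulty.copy()
    ph, pw = fault_patch.shape
    result[top:top+ph, left:left+pw] = fault_patch
    return result

non_faulty = np.zeros((6, 6))       # 示意：无故障的车厢表面
fault_patch = np.ones((2, 3))       # 示意：来自墙面裂纹的故障图案
faulty = transfer_fault(non_faulty, fault_patch, top=2, left=1)
print(faulty.sum())   # 6.0，仅目标区域被改写
```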
通过S710至S720,可以根据第一故障图像对非故障图像进行处理,得到第二故障图像。从而,在同一类型的故障图像的数量较小而非故障图像的数量较多时,可以利用不同类型的故障图像对非故障图像进行调整,从而生成数量较多的故障图像。
第一故障图像与非故障图像中记录有不同类型的物体,即第一故障图像与非故障图像是对不同的对象进行采集得到的。从而,可以利用不同类型物体的故障对非故障图像中记录的物体的图像进行调整,提高第一故障图像获取的广泛性,增加第一故障图像来源的灵活性。从而在非故障图像记录的物体类型对应的故障图像数量较少的情况下,可以利用其他类型物体的故障图像生成该物体类型的故障图像,生成故障图像的适用性较强,增加了第二故障图像中故障类型的多样性。
故障图案可以是根据第一故障图像得到的。故障图案可以是第一故障图像中的第二物体故障的区域。故障图案也可以是对第一故障图像中的第二物体故障的区域进行边缘提取得到的。
非故障图像可以是自然场景图,也可以是边缘图。
当非故障图像和第一故障图像中的任一个为自然场景图时,可以利用边缘提取模型对自然场景图进行处理,以得到非故障图像对应的边缘图和第一故障图像对应的边缘图。
应当理解,一个图像为自然场景图的情况下,该图像对应的边缘图可以是边缘提取模型对该图像处理得到的边缘图;一个图像为边缘图,该图像对应的边缘图为该图像。
可以利用第一故障图像对应的边缘图中的故障图案替换非故障图像对应的边缘图中第一物体所在的部分区域,并利用图像翻译模型对替换后的边缘图进行处理,以得到自然场景图。第二故障图像可以是替换后的边缘图,或者,第二故障图像可以是图像翻译模型处理得到的自然场景图。
图像翻译模型可以是利用第一训练数据集合训练得到的。第一训练数据集合包括第三图像以及第三图像对应的边缘图，第三图像为自然场景图。第三图像中记录的物体与第一物体的类型可以相同。也就是说，第三图像与非故障图像是对相同类型的对象进行采集得到的。图像翻译模型可以是训练得到的AI模型。第三图像中记录的物体可以有故障，也可以无故障。
可以利用初始图像翻译模型对第三图像对应的边缘图进行处理,以得到翻译图像。调整初始图像翻译模型,以最小化翻译图像与第三图像之间的差异。使用调整后的初始图像翻译模型对其他第三图像对应的边缘图进行处理,直到差异逐渐收敛,以得到图像翻译模型。
第三图像与非故障图像是对相同类型的对象进行采集得到的。利用第三图像与第三图像对应的边缘图训练图像翻译模型,使图像翻译模型具有针对性,提高图像翻译模型的准确度。
在S720,可以对故障图案进行外形变换,并将变换后的第一故障图像覆盖于非故障图像中的第一物体上,得到第二故障图像。
外形变换可以包括形变、明暗变化等样式变换,形变例如可以是拉伸、压缩等。从而使得对非故障图像的调整更为灵活,可以增加可能得到的第二故障图像的数量和多样性。
可以将第一故障图像覆盖非故障图像的目标区域,以得到第二故障图像。
非故障图像的目标区域可以是根据用户输入的位置信息确定的，也可以是利用区域生成模型确定的。
可以利用区域生成模型对非故障图像进行处理,以得到目标区域。利用区域生成模型确定非故障图像的目标区域,并将故障图像覆盖非故障图像的目标区域,以得到第二故障图像,能够提高效率,降低人工成本。
根据区域生成模型确定的目标区域，可以对第一故障图像的故障图案进行外形变换，以使得变换后的故障图案的尺寸小于或等于目标区域的尺寸。从而，使得生成的第二故障图像更为准确。
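使故障图案的尺寸匹配目标区域的放缩，可以用最近邻插值做一个简化示意。下面的代码是基于numpy的假设性草图，目标区域尺寸为示意设定；实际实施中还可以包含拉伸、压缩、明暗变化等其他外形变换。

```python
import numpy as np

def resize_nearest(patch, out_h, out_w):
    """最近邻放缩：对故障图案做尺寸变换，
    使其尺寸不超过目标区域的尺寸。"""
    in_h, in_w = patch.shape
    rows = np.arange(out_h) * in_h // out_h   # 每个输出行对应的输入行
    cols = np.arange(out_w) * in_w // out_w   # 每个输出列对应的输入列
    return patch[rows][:, cols]

patch = np.array([[1., 2.],
                  [3., 4.]])
target_h, target_w = 4, 4          # 目标区域尺寸（假设）
resized = resize_nearest(patch, target_h, target_w)
print(resized.shape)   # (4, 4)
```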
区域生成模型可以是训练得到的AI模型。
可以获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体。可以获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域。之后,可以根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
图8是本申请实施例提供的一种图像处理系统的示意性结构图。
图像处理系统800包括边缘提取模型810、编辑模型820、图像翻译模型830。图像处理系统800还可以包括区域生成模型840。
其中,边缘提取模型810、图像翻译模型830、区域生成模型840可以是训练得到的神经网络模型。例如,可以是CNN模型。图像翻译模型830可以采用GAN,可以选择pix2pix、pix2pixHD、循环对抗生成网络(cycle-consistent generative adversarial networks,CycleGAN)、无监督图像转换(unsupervised image-to-image translation,UNIT)等成熟算法。
图像处理系统800用于根据故障边缘图像,对输入的图像进行处理,以得到故障图像。输入图像处理系统800的图像可以是非故障图像。
边缘提取模型810可以对非故障图像进行边缘提取,以得到非故障图像的边缘图像。边缘图像以线条的形式反映非故障图像中的对象的轮廓。
边缘提取模型810可以是训练得到的神经网络模型。可以根据公开数据集训练边缘提取模型810。公开数据集包括多个自然场景图像以及每个图像对应的标注边缘图像。
利用初始边缘提取模型对公开数据集中的自然场景图像进行处理,以得到训练边缘图像。根据训练边缘图像与该图像对应的标注边缘图像之间的差异,调整初始边缘 提取模型的参数,以最小化该差异。训练边缘图像与该自然场景图像对应的标注边缘图像之间的差异可以表示为损失值。不断调整模型参数,直到该损失值收敛或者迭代次数超过预设值。
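边缘提取模型本身是训练得到的神经网络；为直观说明"边缘是属性突变处"，下面用经典的Sobel梯度算子做一个非学习式的简化示意，仅作为对边缘提取行为的说明，并非边缘提取模型810的实际实现。示例图像为假设数据。

```python
import numpy as np

def sobel_edges(image, thresh=1.0):
    """用Sobel梯度幅值近似边缘提取：
    梯度幅值大的位置即图像属性发生突变处。"""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = image.shape
    edges = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = image[i:i+3, j:j+3]
            gx = np.sum(patch * kx)      # 水平方向梯度
            gy = np.sum(patch * ky)      # 垂直方向梯度
            edges[i, j] = np.hypot(gx, gy)
    return (edges > thresh).astype(float)

image = np.zeros((6, 6))
image[:, 3:] = 1.0                   # 左暗右亮，中间有一条竖直边缘
edge_map = sobel_edges(image)
print(edge_map)                      # 边缘位置为1，其余为0
```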
编辑模型820可以根据故障边缘图像对边缘图像中目标区域进行编辑和修改,从而得到第二边缘图像。
故障边缘图像集合可以包括多个故障边缘图像。在利用系统800进行数据处理之前,可以进行数据准备,以获取故障边缘图像集合。
可以对存在表面缺陷的设备进行图像采集，以获取故障样本图像。例如，可以在需要应用系统800的项目所使用的设备中，对存在表面缺陷的设备进行图像采集。也可以在互联网中查询、检索，以获取故障样本图像。可以对故障样本图像进行边缘提取，并将边缘提取结果中发生故障的区域对应的边缘图像作为一个故障边缘图像。或者，可以对故障样本图像中发生故障的区域进行边缘提取，以得到故障边缘图像。
故障样本图像可以是对与非故障图像中相同或不同类型的对象进行采集得到的。也就是说，故障边缘图像集合可以包括对不同对象采集得到的故障样本图像进行处理得到的故障边缘图像。
编辑模型820可以在故障边缘图像集合中随机或按照一定顺序选取故障边缘图像,对边缘图像中的目标区域进行编辑和修改。例如,编辑模型820可以根据故障边缘图像集合中的各个故障边缘图像,依次对第一边缘图像中的目标区域进行编辑和修改。
编辑模型820可以对故障边缘图像集合中的故障边缘图像进行形变(如拉伸、压缩等)、明暗变化等样式变换,并根据样式变换后的故障边缘图像,对第一边缘图像中的目标区域进行编辑和修改。
样式变换的方式,可以根据用户输入进行选择。例如,可以向用户展示可选的样式变换的方式,并获取用户输入,从而确定用户输入对应的样式变换的方式。或者,也可以按照默认的方式确定样式变换的方式。
通过对故障边缘图像进行样式变换,可以增加第二边缘图像中故障的样式,增加故障图像的多样性。
图像翻译模型830可以对第二边缘图像进行处理,以得到故障图像。也就是说,图像翻译模型830用于将线条轮廓图转换为自然场景图。自然场景图可以是渲染图(即色彩图)。
编辑模型820可以根据故障边缘图像、第一边缘图像和目标区域,确定第二边缘图像。其中,目标区域可以利用区域生成模型840对非故障图像进行处理得到,目标区域也可以根据用户输入信息确定。
用户输入信息可以用于指示目标区域。用户输入信息指示目标区域的方式可以是默认设置的,也可以由用户进行选择。用户输入信息可以用于指示一个或多个关键点在非故障图像中的位置。
如果用户输入信息仅指示一个关键点在非故障图像中的位置,则可以确定以该关键点为对角线的交点的矩形目标区域,目标区域的长度在长度预设范围内,目标区域的长宽比在长宽比预设范围内,目标区域的长和宽分别与非故障图像相邻的两个边平行。
如果用户输入信息指示两个关键点在非故障图像中的位置,则可以确定以该两个关键点为不相邻顶点的矩形目标区域,目标区域的长和宽分别与非故障图像相邻的两个边平行。
如果用户输入信息指示超过两个关键点在非故障图像中的位置,可以确定包含该多个关键点的面积最小的矩形为目标区域。
应当理解,相比于用户输入信息指示两个关键点的方式,用户输入信息通过仅指示一个关键点以指示目标区域的方式,得到的目标区域的多样性较高,但合理性较低。
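上述根据关键点确定矩形目标区域的三种情况，可以用下面一段简化代码示意。函数名region_from_keypoints、默认尺寸等均为示意性假设；实际实施中单点情况下的尺寸应落在长度预设范围与长宽比预设范围内。

```python
def region_from_keypoints(points, default_h=4, default_w=6):
    """根据关键点确定矩形目标区域，返回 (top, left, height, width)。
    一个点：以该点为矩形对角线交点，使用假设的默认尺寸；
    两个点：以两点为不相邻顶点；
    更多点：取包含全部关键点的面积最小的矩形。"""
    if len(points) == 1:
        (y, x), = points
        return (y - default_h // 2, x - default_w // 2, default_h, default_w)
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    top, left = min(ys), min(xs)
    return (top, left, max(ys) - top, max(xs) - left)

print(region_from_keypoints([(10, 10)]))                 # 单点 + 默认尺寸
print(region_from_keypoints([(2, 3), (8, 11)]))          # 两点为对角顶点
print(region_from_keypoints([(1, 2), (5, 9), (3, 4)]))   # 多点最小包围矩形
```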
编辑模型820可以对故障边缘图像集合中的故障边缘图像进行形变,可以使得形变后的故障边缘图像的尺寸与目标区域的尺寸相同。从而,编辑模型820可以以形变后的故障边缘图像代替非故障图像中的目标区域,形成第二边缘图像。
区域生成模型840可以是根据多个训练图像和每个训练图像中的训练区域信息训练得到的。该多个训练图像可以包括系统800处理的非故障图像,也可以包括其他图像。
在系统800对大量非故障图像进行处理的情况下，可以先根据非故障图像对应的用户输入信息，确定目标区域，编辑模型820将边缘提取模型810对非故障图像处理得到的第一边缘图像中的目标区域替换为故障边缘图像，以形成第二边缘图像。
在非故障图像、第一边缘图像、第二边缘图像中,可以显示目标区域的边框。从而,用户可以判断该目标区域的合理性。对于不合理的目标区域进行调整或删除。
在系统800获取较多的非故障图像以及每个非故障图像对应的用户输入信息之后,可以根据该多个非故障图像以及每个非故障图像对应的用户输入信息,训练区域生成模型840。例如,可以在获取的用户输入信息的数量超过预设数量(如100)的情况下,训练区域生成模型840。
可以将非故障图像输入初始区域生成模型。初始区域生成模型对非故障图像进行处理,得到训练信息。根据训练信息以及该非故障图像对应的标注信息之间的差异,调整初始区域生成模型的参数,以使得差异最小化,从而完成一次迭代。非故障图像对应的标注信息可以是非故障图像对应的用户输入信息,也可以是该用户输入信息指示的目标区域。将调整后的初始区域生成模型作为初始区域生成模型进行下一次迭代,对其他非故障图像进行处理。当非故障图像对应的训练信息和标注信息之间的差异逐渐收敛,或迭代次数达到预设值,调整后的初始区域生成模型可以作为区域生成模型840。
也就是说，可以在系统800根据用户输入信息对非故障图像进行调整以生成故障图像的过程中，利用系统800获取的用户输入信息以及该用户输入信息对应的非故障图像，训练区域生成模型840。
在训练得到区域生成模型840之后，可以利用区域生成模型840对其他非故障图像进行处理，以确定每个非故障图像的目标区域。编辑模型820将边缘提取模型810对非故障图像处理得到的第一边缘图像中的目标区域替换为故障边缘图像，以形成第二边缘图像。
如果区域生成模型840确定的目标区域不符合要求,可以利用更多的非故障图像和用户输入信息对区域生成模型840继续进行训练。
系统800利用区域生成模型840确定非故障图像的目标区域,可以自动化的完成对故障图像的生成,提高故障图像的生成效率,降低人工成本。
图像翻译模型830可以是利用多个非故障图像,以及边缘提取模型处理得到的第一边缘图像对初始图像翻译模型进行训练得到的。也就是说,对图像翻译模型830的训练,可以是在生成故障图像的过程中进行的。
可以利用边缘提取模型810对多个非故障图像进行处理，以得到每个非故障图像对应的第一边缘图像，从而得到用于训练图像翻译模型830的数据。训练图像翻译模型830需要进行多次迭代，每次迭代包括：利用初始图像翻译模型对第一边缘图像进行处理，以得到生成图像；根据生成图像与该第一边缘图像对应的非故障图像之间的差异，调整初始图像翻译模型的参数，以使得差异最小化。之后，可以将调整后的初始图像翻译模型作为初始图像翻译模型进行下一次迭代。在非故障图像与生成图像之间的差异小于预设值或迭代次数达到预设次数时，可以停止迭代，并将调整后的初始图像翻译模型作为图像翻译模型830。
利用非故障图像训练图像翻译模型830,使得训练得到的图像翻译模型830能够更好的适应于系统800的应用场景,提高系统800输出的图像的真实性。
在一些实施例中,故障样本图像以及故障样本图像对应的边缘也可以用于训练图像翻译模型830。从而,可以提高图像翻译模型830输出的故障图像的准确度。
图像处理系统800能够将非故障图像的多样性迁移到故障图像中。利用图像处理系统800,利用非故障图像的第一边缘图像和故障边缘图像,形成第二边缘图像,并利用图像翻译技术对第二边缘图像进行处理,得到故障图像。
仅对非故障图像的目标区域,利用故障边缘图像进行调整,提高故障图像的合理性和可控性。
图9是本申请实施例提供的一种故障图像生成方法的示意性流程图。
方法900包括S910至S930。
在S910,获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体。
在S920,将所述非故障图像输入区域生成模型,确定所述非故障图像中被所述第一故障图像中所述第二物体的故障图案覆盖的目标区域。
在S930,将所述故障图案覆盖于所述非故障图像中的所述目标区域上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
通过S910至S930,利用区域生成模型,确定第一图像的目标区域,并利用第一故障图像覆盖非故障图像的目标区域,以得到第二故障图像,使得目标区域的确定不再依赖于人工,能够提高效率,降低人工成本。
第一故障图像可以是根据第四图像得到的。第一故障图像可以是第四图像中的存在故障的区域，也可以是对第四图像中的存在故障的区域进行边缘提取得到的。根据第四图像得到第一故障图像时，第一故障图像不是针对某个特定的非故障图像而确定的。
第一物体和第二物体的类型可以相同或不同。即非故障图像和第一故障图像可以是对相同或不同类型的对象进行采集得到的。
可以训练区域生成模型。获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体。获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域。根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
根据记录有未发生故障的、与第一物体同类型的物体的训练图像,以及指示训练图像中能够产生故障的区域,训练区域生成模型,使得区域生成模型更具有针对性,生成的区域生成模型适用于第一物体的类型,提高区域生成模型确定的目标位置的准确度。
为了提高第一故障图像的适用性，提高第二故障图像的多样性，可以对故障图案进行外形变换，并将变换后的所述故障图案覆盖于所述非故障图像的目标区域上，获得第二故障图像。
外形变换包括尺寸放缩或明暗变化等。
上文结合图1至图9的描述了本申请实施例提供的方法实施例,下面结合图10至图12,描述本申请实施例的装置实施例。应理解,方法实施例的描述与装置实施例的描述相互对应,因此,未详细描述的部分可以参见上文的描述。
图10是本申请实施例提供的一种神经网络训练系统的示意性结构图。
神经网络训练系统3000包括获取模块3010、处理模块3020。
获取模块3010用于获取训练图像和区域指示信息。训练图像中记录有未发生故障的物体。区域指示信息用于指示训练图像中能够产生故障的区域。
训练图像中记录的物体的类型可以与第一物体的类型相同。
处理模块3020用于根据所述训练图像和所述区域指示信息，训练所述区域生成模型。
系统3000可以还包括存储模块,存储模块可以用于存储训练图像。获取模块3010可以在存储模块中读取训练图像,或者,获取模块3010可以接收存储模块所在的设备发送的训练图像。
获取模块3010、处理模块3020可以部署在本地，存储模块可以部署在本地或云端。
图11是本申请实施例提供的一种故障图像生成装置的示意性结构图。
故障图像生成装置1100包括获取模块1110、处理模块1120。
在一些实施例中,获取模块1110用于,获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体,所述第一物体和所述第二物体的类型不同。
处理模块1120用于,将所述第一故障图像中所述第二物体的故障图案覆盖于所述非故障图像中的所述第一物体上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
可选地,处理模块1120用于,将所述非故障图像输入区域生成模型,确定所述非故障图像中所述故障图案覆盖的目标区域。
可选地,获取模块1110还用于,获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体。
获取模块1110还用于,获取区域指示信息,所述区域指示信息用于指示所述训练 图像中能够产生故障的区域。
处理模块1120还用于,根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
可选地,处理模块1120还用于,对所述故障图案进行外形变换。
处理模块1120还用于,将变换后的所述故障图案覆盖于所述非故障图像中的目标区域上,获得所述第二故障图像。
可选地,所述外形变换包括尺寸拉伸、压缩或明暗变化。
在另一些实施例中,获取模块1110用于,获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体。
处理模块1120用于,将所述非故障图像输入区域生成模型,确定所述非故障图像中所述第一故障图像中所述第二物体的故障图案覆盖的目标区域。
处理模块1120还用于,将所述故障图案覆盖于所述非故障图像中的所述目标区域上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
可选地,所述第一物体和所述第二物体的类型不同。
可选地,获取模块1110还用于,获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体。
获取模块1110还用于,获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域。
处理模块1120还用于,根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
可选地,处理模块1120还用于,对所述故障图案进行外形变换。
处理模块1120还用于,将变换后的所述故障图案覆盖于所述非故障图像的所述目标区域上,获得所述第二故障图像。
可选地,所述外形变换包括尺寸拉伸、压缩或明暗变化。
故障图像生成装置1100还可以包括存储模块。存储模块可以用于存储第一故障图像，也可以用于存储非故障图像。获取模块1110、处理模块1120可以部署在本地，存储模块可以部署在本地或云端。
图12是本申请实施例的计算设备的示意性结构图。
计算设备1200包括:总线1202、处理器1204、存储器1206和通信接口1208。处理器1204、存储器1206和通信接口1208之间通过总线1202通信。计算设备1200可以是服务器或终端设备。应理解,本申请不限定计算设备1200中的处理器、存储器的个数。
总线1202可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示，图12中仅用一条线表示，但并不表示仅有一根总线或一种类型的总线。总线1202可包括在计算设备1200各个部件(例如，存储器1206、处理器1204、通信接口1208)之间传送信息的通路。
处理器1204可以包括中央处理器(central processing unit,CPU)、图形处理器 (graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。
存储器1206可以包括易失性存储器(volatile memory)，例如随机存取存储器(random access memory,RAM)。存储器1206还可以包括非易失性存储器(non-volatile memory)，例如只读存储器(read-only memory,ROM)，快闪存储器，机械硬盘(hard disk drive,HDD)或固态硬盘(solid state drive,SSD)。存储器1206中存储有可执行的程序代码，处理器1204执行该可执行的程序代码以实现前述故障图像生成方法。具体的，存储器1206上存有用于执行故障图像生成方法的指令。
通信接口1208使用例如但不限于网络接口卡、收发器一类的收发模块，来实现计算设备1200与其他设备或通信网络之间的通信。
应注意，尽管上述计算设备1200仅仅示出了存储器、处理器、通信接口，但是在具体实现过程中，本领域的技术人员应当理解，计算设备1200还可以包括实现正常运行所必须的其他器件。同时，根据具体需要，本领域的技术人员应当理解，计算设备1200还可包括实现其他附加功能的硬件器件。此外，本领域的技术人员应当理解，计算设备1200也可仅仅包括实现本申请实施例所必须的器件，而不必包括图12中所示的全部器件。
图13是本申请实施例提供的一种计算设备集群的示意性结构图。
如图13所示,所述计算设备集群包括至少一个计算设备1200。计算设备集群中的一个或多个计算设备1200中的存储器1206中可以存有用于执行故障图像生成方法的指令。
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备1200也可以用于执行故障图像生成方法的部分指令。换言之,一个或多个计算设备1200的组合可以共同执行故障图像生成方法的指令。
需要说明的是,计算设备集群中的不同的计算设备1200中的存储器1206可以存储不同的指令,用于执行故障图像生成方法中的部分步骤。
图14示出了一种可能的实现方式。如图14所示,两个计算设备1200A和1200B通过通信接口1208实现连接。计算设备1200A中的存储器上存有用于执行交互单元1262和处理单元1266的功能的指令。计算设备1200B中的存储器上存有用于执行存储单元1264的功能的指令。换言之,计算设备1200A和1200B的存储器1206共同存储用于执行故障图像生成方法的指令。
图14所示的计算设备集群之间的连接方式可以是考虑到本申请提供的故障图像生成方法需要对雷达或相机采集的大量数据进行存储。因此,考虑将存储功能交由计算设备1200B执行。
应理解，图14中示出的计算设备1200A的功能也可以由多个计算设备1200完成。同样，计算设备1200B的功能也可以由多个计算设备1200完成。
在一些可能的实现方式中,计算设备集群中的一个或多个计算设备可以通过网络连接。其中,所述网络可以是广域网或局域网等等。图15示出了一种可能的实现方式。如图15所示,两个计算设备1200C和1200D之间通过网络进行连接。具体地,通过各个计算设备中的通信接口与所述网络进行连接。在这一类可能的实现方式中,计算设备1200C中的存储器1206中存有执行交互单元1262的指令。同时,计算设备1200D 中的存储器1206中存有执行存储单元1264和处理单元1266的指令。
图15所示的计算设备集群之间的连接方式可以是考虑到本申请提供的故障图像生成方法需要对大量第一故障图像进行存储,并执行大量的计算确定第二故障图像,因此考虑将存储单元1264和处理单元1266实现的功能交由计算设备1200D执行。
应理解，图15中示出的计算设备1200C的功能也可以由多个计算设备1200完成。同样，计算设备1200D的功能也可以由多个计算设备1200完成。
本申请实施例提供一种计算机可读介质,该计算机可读介质存储用于设备执行的程序代码,该程序代码包括用于执行前文所述的故障图像生成方法。
本申请实施例提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行前文所述的故障图像生成方法。
本申请实施例提供一种芯片,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,执行前文所述的故障图像生成方法。
应理解,本申请实施例中的处理器可以为中央处理单元(central processing unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
还应理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的随机存取存储器(random access memory,RAM)可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
上述实施例,可以全部或部分地通过软件、硬件、固件或其他任意组合来实现。当使用软件实现时,上述实施例可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令或计算机程序。在计算机上加载或执行所述计算机指令或计算机程序时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以为通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机 能够存取的任何可用介质或者是包含一个或多个可用介质集合的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质。半导体介质可以是固态硬盘。
应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A,B可以是单数或者复数。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系,但也可能表示的是一种“和/或”的关系,具体可参考前后文进行理解。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法 的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (23)

  1. 一种故障图像生成方法,其特征在于,包括:
    获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体,所述第一物体和所述第二物体的类型不同;
    将所述第一故障图像中所述第二物体的故障图案迁移至所述非故障图像中的所述第一物体上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
  2. 根据权利要求1所述的方法,其特征在于,所述将所述第一故障图像中所述第二物体的故障图案迁移至所述非故障图像中的所述第一物体上,获得第二故障图像之前,所述方法包括:
    将所述非故障图像输入区域生成模型,确定所述非故障图像中所述故障图案迁移的目标区域。
  3. 根据权利要求2所述的方法,其特征在于,所述将所述非故障图像输入区域生成模型,确定所述非故障图像中所述故障图案迁移的目标区域之前,所述方法包括:
    获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体;
    获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域;
    根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
  4. 根据权利要求1至3中任一项所述的方法，其特征在于，所述将所述第一故障图像中所述第二物体的故障图案迁移至所述非故障图像中的所述第一物体上，获得第二故障图像包括：
    对所述故障图案进行外形变换;
    将变换后的所述故障图案迁移至所述非故障图像中的目标区域上,获得所述第二故障图像。
  5. 根据权利要求4所述的方法,其特征在于,所述外形变换包括尺寸拉伸、压缩或明暗变化。
  6. 一种故障图像生成方法,其特征在于,所述方法包括:
    获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体;
    将所述非故障图像输入区域生成模型，确定所述非故障图像中被所述第一故障图像中所述第二物体的故障图案覆盖的目标区域；
    将所述故障图案迁移至所述非故障图像中的所述目标区域上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
  7. 根据权利要求6所述的方法,其特征在于,所述第一物体和所述第二物体的类型不同。
  8. 根据权利要求6或7所述的方法,其特征在于,在所述将所述非故障图像输入区域生成模型,确定所述非故障图像的目标区域之前,所述方法还包括:
    获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体;
    获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域;
    根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
  9. 根据权利要求6至8中任一项所述的方法,其特征在于,所述将所述故障图案迁移至所述非故障图像中的所述目标区域上,获得第二故障图像包括:
    对所述故障图案进行外形变换;
    将变换后的所述故障图案迁移至所述非故障图像的所述目标区域上,获得所述第二故障图像。
  10. 根据权利要求9所述的方法,其特征在于,所述外形变换包括尺寸拉伸、压缩或明暗变化。
  11. 一种故障图像生成系统,其特征在于,包括获取模块和处理模块;
    所述获取模块,用于获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体,所述第一物体和所述第二物体的类型不同;
    所述处理模块,用于将所述第一故障图像中所述第二物体的故障图案迁移至所述非故障图像中的所述第一物体上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
  12. 根据权利要求11所述的系统,其特征在于,所述处理模块,用于将所述非故障图像输入区域生成模型,确定所述非故障图像中所述故障图案迁移的目标区域。
  13. 根据权利要求12所述的系统,其特征在于,
    所述获取模块还用于,获取多张训练图像,所述训练图像中记录有未发生故障的、与第一物体同类型的物体;
    所述获取模块还用于,获取区域指示信息,所述区域指示信息用于指示所述训练图像中能够产生故障的区域;
    所述处理模块，用于根据所述多张训练图像和所述区域指示信息，训练所述区域生成模型。
  14. 根据权利要求11-13中任一项所述的系统,其特征在于,所述处理模块,用于对所述第一故障图像进行外形变换;将变换后的第一故障图像迁移至所述非故障图像中的目标区域上,获得第二故障图像。
  15. 根据权利要求14所述的系统,其特征在于,所述变换包括拉伸、压缩或明暗变化。
  16. 一种故障图像生成系统,其特征在于,包括获取模块和处理模块;
    所述获取模块用于,获取非故障图像和第一故障图像,所述非故障图像中记录了未发生故障的第一物体,所述第一故障图像中记录了发生故障的第二物体;
    所述处理模块,用于将所述非故障图像输入区域生成模型,确定所述非故障图像中的目标区域;将所述第一故障图像迁移至所述非故障图像中的所述目标区域上,获得第二故障图像,所述第二故障图像展示故障状态下的所述第一物体。
  17. 根据权利要求16所述的系统,其特征在于,所述第一物体和所述第二物体的类型不同。
  18. 根据权利要求16或17所述的系统,其特征在于,所述获取模块,用于获取多张训练图像和区域指示信息,所述训练图像中记录有未发生故障、与第一物体同类型的物体,所述区域指示信息用于指示所述训练图像中能够产生故障的区域;所述处理模块,用于根据所述多张训练图像和所述区域指示信息,训练所述区域生成模型。
  19. 根据权利要求16至18中任一项所述的系统,其特征在于,所述处理模块还用于,对所述第一故障图像进行外形变换;将变换后的第一故障图像迁移至所述非故障图像的所述目标区域上,获得所述第二故障图像。
  20. 根据权利要求16至19中任一项所述的系统,其特征在于,所述外形变换包括拉伸、压缩或明暗变化。
  21. 一种计算设备集群,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器;
    所述至少一个计算设备的处理器用于执行所述至少一个计算设备的存储器中存储的指令,以使得所述计算设备集群执行如权利要求1至10中任一所述的方法。
  22. 一种包含指令的计算机程序产品,其特征在于,当所述指令被计算机设备集群运行时,使得所述计算机设备集群执行如权利要求的1至10中任一所述的方法。
  23. 一种计算机可读存储介质,其特征在于,包括计算机程序指令,当所述计算机程序指令由计算设备集群执行时,所述计算设备集群执行如权利要求1至10中任一所述的方法。
PCT/CN2021/139429 2021-04-20 2021-12-18 故障图像生成方法与装置 WO2022222519A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21937741.3A EP4307217A1 (en) 2021-04-20 2021-12-18 Fault image generation method and apparatus
US18/482,906 US20240104904A1 (en) 2021-04-20 2023-10-08 Fault image generation method and apparatus

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110423683.1 2021-04-20
CN202110423683 2021-04-20
CN202110914815.0 2021-08-10
CN202110914815.0A CN115294007A (zh) 2021-04-20 2021-08-10 故障图像生成方法与装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/482,906 Continuation US20240104904A1 (en) 2021-04-20 2023-10-08 Fault image generation method and apparatus

Publications (1)

Publication Number Publication Date
WO2022222519A1 true WO2022222519A1 (zh) 2022-10-27

Family

ID=83723541

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139429 WO2022222519A1 (zh) 2021-04-20 2021-12-18 故障图像生成方法与装置

Country Status (3)

Country Link
US (1) US20240104904A1 (zh)
EP (1) EP4307217A1 (zh)
WO (1) WO2022222519A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102216085A (zh) * 2008-09-16 2011-10-12 阿姆斯特郎世界工业公司 具有长的重复长度的片材和多种图案的砖
US20120050377A1 (en) * 2010-08-27 2012-03-01 Masashi Ueshima Defective recording element correction parameter selection chart, defective recording element correction parameter determination method and apparatus, and image forming apparatus
CN110909669A (zh) * 2019-11-19 2020-03-24 湖南国奥电力设备有限公司 基于图像检测的地下电缆故障确定方法和装置
CN111091546A (zh) * 2019-12-12 2020-05-01 哈尔滨市科佳通用机电股份有限公司 铁路货车钩尾框折断故障识别方法

Also Published As

Publication number Publication date
EP4307217A1 (en) 2024-01-17
US20240104904A1 (en) 2024-03-28

Similar Documents

Publication Publication Date Title
WO2020253416A1 (zh) 物体检测方法、装置和计算机存储介质
WO2020216227A9 (zh) 图像分类方法、数据处理方法和装置
US20220092351A1 (en) Image classification method, neural network training method, and apparatus
WO2021043168A1 (zh) 行人再识别网络的训练方法、行人再识别方法和装置
WO2021120719A1 (zh) 神经网络模型更新方法、图像处理方法及装置
WO2021227726A1 (zh) 面部检测、图像检测神经网络训练方法、装置和设备
WO2021147325A1 (zh) 一种物体检测方法、装置以及存储介质
WO2021043273A1 (zh) 图像增强方法和装置
US20210398252A1 (en) Image denoising method and apparatus
WO2021155792A1 (zh) 一种处理装置、方法及存储介质
US20220215227A1 (en) Neural Architecture Search Method, Image Processing Method And Apparatus, And Storage Medium
WO2022052601A1 (zh) 神经网络模型的训练方法、图像处理方法及装置
WO2022001805A1 (zh) 一种神经网络蒸馏方法及装置
US20220148291A1 (en) Image classification method and apparatus, and image classification model training method and apparatus
WO2021018245A1 (zh) 图像分类方法及装置
CN111914997B (zh) 训练神经网络的方法、图像处理方法及装置
WO2021164750A1 (zh) 一种卷积层量化方法及其装置
WO2021218517A1 (zh) 获取神经网络模型的方法、图像处理方法及装置
WO2021018251A1 (zh) 图像分类方法及装置
US20230281973A1 (en) Neural network model training method, image processing method, and apparatus
CN111768415A (zh) 一种无量化池化的图像实例分割方法
WO2023165361A1 (zh) 一种数据处理方法及相关设备
CN110705564A (zh) 图像识别的方法和装置
WO2020062299A1 (zh) 一种神经网络处理器、数据处理方法及相关设备
WO2023207531A1 (zh) 一种图像处理方法及相关设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937741

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2021937741

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2021937741

Country of ref document: EP

Effective date: 20231012

NENP Non-entry into the national phase

Ref country code: DE