CN112950460A - Technology for image style migration - Google Patents
Technology for image style migration
- Publication number
- CN112950460A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a technique for image style migration, comprising an experimental environment and an algorithm study. The experimental platform runs the Linux CentOS-78 operating system, and the graphics card is an NVIDIA Tesla V100. The algorithm study includes preprocessing the self-made data set and training the model. The operation flow of the whole system is as follows: 1. Install the Ubuntu16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output pictures. The main aim of the invention is to improve the accuracy of image style conversion and promote the development of artificial intelligence.
Description
Technical Field
The invention belongs to the field of artificial-intelligence-based image processing and relates to an image style migration technique.
Background
Image style migration applies certain characteristics, or the style, of one picture to another picture so as to convert it into a designated image style. Traditional non-parametric image style migration methods are mainly based on physical-model rendering and texture synthesis. Although these methods achieve reasonable results, they can extract only the low-level features of an image, not its high-level abstract features, so the synthesized result is unsatisfactory for images with complex colors and textures. In recent years, deep learning has developed rapidly with the wave of artificial intelligence and has shown remarkable results in processing massive data; its ability to learn from and process data even exceeds human performance in some fields. While studying texture synthesis with convolutional neural networks, Gatys et al. discovered that the statistical characteristics of the feature maps in a convolutional neural network can reflect the style of an image, while the feature maps themselves, as deep feature representations of the input image, reflect its content. A randomly initialized image can then be adjusted by iterative optimization into an image whose style resembles a famous painting but whose content is that of an ordinary photograph, which gave rise to the concept of the migration network. Johnson et al. subsequently developed this work further and proposed the concept of the transformation network.
Although image stylization has achieved good results, some problems remain unsolved. The first is speed: even with the most advanced transformation network schemes, training a model usually takes tens of minutes. The second is that the style of the migrated result is not pronounced enough. The results are rough and hardly meet practical requirements, so there is clear room for improvement. A fast and accurate algorithm is therefore urgently needed to solve these problems.
Disclosure of Invention
The invention provides an image style migration system based on an optimized GAN algorithm, which addresses the low speed, difficult training, and low accuracy of existing algorithms. The specific scheme is as follows:
In a first aspect, the present application provides a new image style migration method, including:
The data sets are the public data sets monet2photo and summer2winter_yosemite, provided by the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley.
Building the environment, including Anaconda, the PyCharm development environment, OpenCV, TensorFlow, Bottleneck, NumPy, olefile, xlwt, zict, atomicwrites, and other installation packages.
Constructing the Darknet and CUDA parallel computing framework for receiving and processing data, and carrying out image style migration according to the optimized CycleGAN algorithm provided by this patent.
Training the model. The environment required by the system is configured before training, and training can start once the versions have been checked and found correct. The main process is to feed the data set into the optimized CycleGAN for training.
Evaluating the output result.
In a second aspect, the present application provides an image style migration system, including:
Experimental environment: the operating system is Linux CentOS-78, and the graphics card is a Tesla V100 GPU.
Algorithm study: images in the two data sets monet2photo and summer2winter_yosemite undergo image style migration through an optimized CycleGAN algorithm. An Inception-v3 network is added to the original CycleGAN algorithm, which addresses the poor representation that the original algorithm suffers because of its residual network.
The operation flow of the whole system is as follows: 1. Install the Ubuntu16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output pictures.
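The version check that precedes training could be sketched roughly as follows. This is a minimal illustration rather than the patent's own tooling: the helper names (`version_matches`, `find_mismatches`, `detected_python`) and the dictionary keys are hypothetical, and the CUDA and cuDNN versions are assumed to be supplied by the caller, since the text does not say how they are read.

```python
import sys

# Versions stated in the operation flow. The keys are hypothetical labels
# chosen for this sketch, not names used by the patent.
EXPECTED = {"python": "3.7", "cuda": "10.0", "cudnn": "7.5"}


def version_matches(expected: str, actual: str) -> bool:
    """Return True when `actual` begins with the dotted components of `expected`."""
    want = expected.split(".")
    have = actual.split(".")
    return have[: len(want)] == want


def find_mismatches(expected: dict, detected: dict) -> dict:
    """Return {name: (expected, detected)} for each missing or differing component."""
    problems = {}
    for name, want in expected.items():
        have = detected.get(name)
        if have is None or not version_matches(want, have):
            problems[name] = (want, have)
    return problems


def detected_python() -> str:
    """Version of the running interpreter as 'major.minor.micro'."""
    return "{}.{}.{}".format(*sys.version_info[:3])


if __name__ == "__main__":
    # The CUDA and cuDNN strings below are placeholders (assumption); a real
    # check would obtain them from the installed toolkit.
    detected = {"python": detected_python(), "cuda": "10.0", "cudnn": "7.5.0"}
    for name, (want, have) in find_mismatches(EXPECTED, detected).items():
        print(f"{name}: expected {want}, found {have}")
```

Comparing dotted components rather than string prefixes keeps a version such as 3.70 from being accepted in place of 3.7.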
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic diagram of the overall framework of the image style migration system provided in an example of the present application. Fig. 2 is the overall design scheme of the software system, Fig. 3 is the Inception-v3 module diagram, Fig. 4 is the optimized CycleGAN network structure diagram, Fig. 5 is the loss function diagram, and Fig. 6 is the accuracy diagram.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As can be seen from FIG. 1, the application is based on the Linux CentOS-78 operating system.
CPU: the CPU has 8 GB of memory and a powerful ALU (arithmetic logic unit) that can complete arithmetic calculations in very few clock cycles and supports 64-bit double precision. Double-precision floating-point additions and multiplications take 1 to 3 clock cycles. The CPU clock frequency is very high, reaching 1.5323 GHz (gigahertz, 10^9 Hz). The powerful data processing capability of the central processing unit effectively improves the working efficiency of the computer. During data processing, the CPU is simple to operate and, based on the instruction tasks issued by the user, maps the control instructions entered by the user onto the CPU's execution of those tasks.
GPU: the NVIDIA Tesla V100 is an advanced data center GPU on the market today, built to accelerate artificial intelligence, high-performance computing, and graphics. The NVIDIA Tesla V100 accelerator is based on the new Volta GV100 GPU, and the GV100 is the first processor to break the 100 TFLOPS barrier in deep learning performance. GV100 combines CUDA cores and Tensor Cores, providing the performance of an AI (artificial intelligence) supercomputer in a GPU. With a Tesla V100 accelerated system, AI models that used to require several weeks of computing resources can now be trained in a few days. With training time shortened so substantially, AI assisted by the NVIDIA Tesla V100 accelerator can now tackle a variety of new problems.
Programming language: one of Python's design goals is to make code highly readable. Its design uses, wherever possible, the punctuation marks and English words common in other languages, so that code looks neat. Unlike static languages such as C and Pascal, it does not require repeated declaration statements, and its syntax has fewer special cases and surprises. This makes it a language for scripting and rapid application development on most platforms.
CUDA (Compute Unified Device Architecture) is a computing platform introduced by the graphics card vendor NVIDIA. CUDA is a general-purpose parallel computing architecture from NVIDIA that enables GPUs to solve complex computational problems. The version used in this patent is CUDA 10.1 with cuDNN 7.5.
The overall design scheme of the software system is shown in fig. 2, which mainly comprises the following steps:
Step 1: build the environment, including Anaconda, the PyCharm development environment, OpenCV, TensorFlow, Bottleneck, NumPy, olefile, xlwt, zict, atomicwrites, and other installation packages.
Step 2: feed the digitally vectorized pictures into the optimized CycleGAN for training. The input and output image sizes are unified to 256 × 256 pixels. The batch size is 2, training runs for 300 epochs, and a checkpoint is saved every 5 epochs. The initial learning rate is set to 0.0002 and is gradually decreased after epoch 150. Gradient descent during training is optimized with the Adam algorithm. FIG. 3 is a diagram of the Inception-v3 module, and FIG. 4 is a diagram of the optimized CycleGAN network.
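The training schedule in Step 2 can be illustrated with a small sketch. The constants come from the text; the shape of the decay is an assumption, because the text says only that the rate is "gradually decreased" after epoch 150. The sketch assumes a linear decay to zero by epoch 300, and the function names are hypothetical.

```python
INITIAL_LR = 0.0002     # initial learning rate (Step 2)
TOTAL_EPOCHS = 300      # number of training epochs (Step 2)
DECAY_START = 150       # the rate is constant up to this epoch (Step 2)
CHECKPOINT_EVERY = 5    # checkpoint interval in epochs (Step 2)


def learning_rate(epoch: int) -> float:
    """Learning rate for a 1-based epoch index.

    Assumption: linear decay from INITIAL_LR at DECAY_START down to zero at
    TOTAL_EPOCHS. The source does not specify the decay curve.
    """
    if epoch <= DECAY_START:
        return INITIAL_LR
    remaining = max(TOTAL_EPOCHS - epoch, 0)
    return INITIAL_LR * remaining / (TOTAL_EPOCHS - DECAY_START)


def should_checkpoint(epoch: int) -> bool:
    """True on every CHECKPOINT_EVERY-th epoch (1-based)."""
    return epoch % CHECKPOINT_EVERY == 0


if __name__ == "__main__":
    for epoch in (1, 150, 225, 300):
        print(epoch, learning_rate(epoch), should_checkpoint(epoch))
```

Under these assumptions a 300-epoch run saves 60 checkpoints, and the rate at epoch 225 is half the initial value.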
Step 4: test whether the recognition accuracy of the model file meets the expected requirement, adjust the algorithm parameters according to the experimental results, and verify and compare. The evaluation indices are the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
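As a rough illustration of the two evaluation indices, the sketch below computes PSNR and a simplified SSIM for grayscale images given as nested lists. It rests on several assumptions: pixel values lie in 0 to 255, SSIM is computed over one global window (the usual formulation averages over local windows), and the stabilizing constants use the common defaults K1 = 0.01 and K2 = 0.03. The function names are hypothetical, and the code is written in plain Python for self-containment; array libraries would normally be used.

```python
import math


def _flatten(image):
    """Flatten a 2-D nested list of pixel values into a list of floats."""
    return [float(p) for row in image for p in row]


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    a, b = _flatten(reference), _flatten(test)
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """SSIM as luminance * contrast * structure over a single global window."""
    a, b = _flatten(x), _flatten(y)
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n
    sd_a, sd_b = math.sqrt(var_a), math.sqrt(var_b)
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    c3 = c2 / 2.0
    luminance = (2 * mu_a * mu_b + c1) / (mu_a ** 2 + mu_b ** 2 + c1)
    contrast = (2 * sd_a * sd_b + c2) / (var_a + var_b + c2)
    structure = (cov + c3) / (sd_a * sd_b + c3)
    return luminance * contrast * structure
```

Identical images give an infinite PSNR and an SSIM of 1; larger differences lower both values.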
FIG. 5 is a graph of loss function, and FIG. 6 is an Accuracy graph.
Claims (2)
1. A technique for image style migration, characterized in that the experimental platform comprises a CPU, a GPU, a programming language, and CUDA.
CPU: the CPU has 8 GB of memory and a powerful ALU (arithmetic logic unit) that can complete arithmetic calculations in very few clock cycles and supports 64-bit double precision. Double-precision floating-point additions and multiplications take 1 to 3 clock cycles. The CPU clock frequency is very high, reaching 1.5323 GHz (gigahertz, 10^9 Hz). The powerful data processing capability of the central processing unit effectively improves the working efficiency of the computer. During data processing, the CPU is simple to operate and, based on the instruction tasks issued by the user, maps the control instructions entered by the user onto the CPU's execution of those tasks.
GPU: the NVIDIA Tesla V100 is an advanced data center GPU on the market today, built to accelerate artificial intelligence, high-performance computing, and graphics. The NVIDIA Tesla V100 accelerator is based on the new Volta GV100 GPU, and the GV100 is the first processor to break the 100 TFLOPS barrier in deep learning performance. GV100 combines CUDA cores and Tensor Cores, providing the performance of an AI (artificial intelligence) supercomputer in a GPU. With a Tesla V100 accelerated system, AI models that used to require several weeks of computing resources can now be trained in a few days. With training time shortened so substantially, AI assisted by the NVIDIA Tesla V100 accelerator can now tackle a variety of new problems.
Programming language: one of Python's design goals is to make code highly readable. Its design uses, wherever possible, the punctuation marks and English words common in other languages, so that code looks neat. Unlike static languages such as C and Pascal, it does not require repeated declaration statements, and its syntax has fewer special cases and surprises. This makes it a language for scripting and rapid application development on most platforms.
CUDA (Compute Unified Device Architecture) is a computing platform introduced by the graphics card vendor NVIDIA. CUDA is a general-purpose parallel computing architecture from NVIDIA that enables GPUs to solve complex computational problems. The version used in this patent is CUDA 10.1 with cuDNN 7.5.
2. The system according to claim 1, wherein images of one style can be converted to images of another style. The selected algorithm adds an Inception-v3 module to CycleGAN, the algorithm model in the network uses a fully convolutional neural network, and training and evaluation are carried out on a data set. The operation flow is: 1. Install the Ubuntu16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output pictures. The ReLU function is used as the activation function in this process, as shown in equation 1. The loss function L_cyc is the cycle consistency loss, and the Adam optimizer is selected for optimization, as shown in equation 2. The evaluation indices are the peak signal-to-noise ratio (PSNR), shown in equation 3, where an image I of m × n pixels and a noisy image K are given and max_I is the maximum value of the image color, and the structural similarity (SSIM), shown in equation 4. SSIM compares images X and Y in terms of luminance (l), contrast (c), and structure (s); its value ranges from 0 to 1, and the closer it is to 1, the better the effect.
$\mathrm{SSIM}(X, Y) = l(X, Y) \times c(X, Y) \times s(X, Y)$ (formula 4).
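Equations 1 to 3 referenced in claim 2 are not reproduced in this text. For orientation only, the standard textbook forms of these quantities are given below, together with the usual definitions of the three SSIM factors. These are assumed forms, not copied from the patent's own formulas: the generators $G: X \to Y$ and $F: Y \to X$ follow common CycleGAN notation, and $\mu$, $\sigma$, $\sigma_{XY}$, $C_1$, $C_2$, $C_3$ denote the means, standard deviations, covariance, and small stabilizing constants of the usual SSIM definition.

```latex
\begin{align*}
% Equation 1: ReLU activation
\mathrm{ReLU}(x) &= \max(0, x) \\[4pt]
% Equation 2: cycle consistency loss
L_{cyc}(G, F) &= \mathbb{E}_{x \sim p_{data}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
               + \mathbb{E}_{y \sim p_{data}(y)}\big[\lVert G(F(y)) - y \rVert_1\big] \\[4pt]
% Equation 3: PSNR for an m x n image I and a noisy image K
\mathrm{MSE} &= \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big[I(i, j) - K(i, j)\big]^2 \\
\mathrm{PSNR} &= 10 \log_{10}\!\left(\frac{max_I^{\,2}}{\mathrm{MSE}}\right) \\[4pt]
% Factors of formula 4
l(X, Y) &= \frac{2\mu_X \mu_Y + C_1}{\mu_X^2 + \mu_Y^2 + C_1}, \quad
c(X, Y) = \frac{2\sigma_X \sigma_Y + C_2}{\sigma_X^2 + \sigma_Y^2 + C_2}, \quad
s(X, Y) = \frac{\sigma_{XY} + C_3}{\sigma_X \sigma_Y + C_3}
\end{align*}
```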
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110323511.7A CN112950460A (en) | 2021-03-26 | 2021-03-26 | Technology for image style migration |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112950460A (en) | 2021-06-11 |
Family
ID=76228339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110323511.7A Pending CN112950460A (en) | 2021-03-26 | 2021-03-26 | Technology for image style migration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112950460A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107369159A (en) * | 2017-06-29 | 2017-11-21 | Dalian University of Technology | Threshold segmentation method based on multifactor two-dimensional gray histogram |
WO2019042139A1 (en) * | 2017-08-29 | 2019-03-07 | BOE Technology Group Co., Ltd. | Image processing method, image processing apparatus, and a neural network training method |
CN112487999A (en) * | 2020-12-02 | 2021-03-12 | Xi'an University of Posts and Telecommunications | Remote sensing image robust feature extraction method based on cycleGAN |
Non-Patent Citations (2)
Title |
---|
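Peng Yanfei et al.: "Image style migration based on cycle-consistent generative adversarial networks", Computer Engineering & Science * |
Wang Lu et al.: "Research and implementation of a style migration algorithm based on deep learning", Intelligent Computer and Applications * |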
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210611 |