CN112950460A - Technology for image style migration - Google Patents

Technology for image style migration

Publication number
CN112950460A
Authority
CN
China
Prior art keywords
version
training
image
cuda
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110323511.7A
Other languages
Chinese (zh)
Inventor
梅琪琪
王洪博
祝忠明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Technology
Original Assignee
Chengdu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Technology
Priority to CN202110323511.7A
Publication of CN112950460A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a technology for image style migration comprising an experimental environment and an algorithm study. The experimental platform runs the Linux CentOS-78 operating system with an NVIDIA Tesla V100 graphics card. The algorithm study covers preprocessing a self-made data set and training the model. The operation flow of the whole system is as follows: 1. Install the Ubuntu 16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output images. The main aim of the invention is to improve the accuracy of image style conversion and promote the development of artificial intelligence.

Description

Technology for image style migration
Technical Field
The invention belongs to the field of image processing based on artificial intelligence and relates to an image style migration technology.
Background
Image style migration applies characteristics or styles of one picture to another picture, converting it into a designated image style. Traditional non-parametric image style migration methods are mainly based on physical-model rendering and texture synthesis. Although these methods achieve reasonable results, they can only extract low-level image features, not high-level abstract features, and the final synthesis is unsatisfactory for images with complex colors and textures. In recent years, deep learning has developed rapidly on the wave of artificial intelligence and has proved remarkably effective at processing massive data; its ability to learn from and process data even exceeds human performance in some fields. While studying texture synthesis with convolutional neural networks, Gatys et al. found that the statistics of the feature maps in a convolutional neural network can reflect the style of an image, while the feature maps themselves, being deep feature representations of the input image, reflect its content. A randomly initialized image can then be adjusted by iterative optimization until its style resembles a famous painting while its content remains that of an ordinary photograph, which led to the concept of the migration network. Johnson et al. subsequently developed this network and proposed the concept of the transformation network.
Although image stylization now achieves good results, some problems remain unsolved. The first is speed: even with the most advanced transformation network scheme, training a model usually takes dozens of minutes. The second is that the style of the migrated result is not expressed clearly enough. Both leave obvious room for improvement, and the results are rough and do not meet practical requirements. A fast and accurate algorithm is therefore urgently needed to solve these problems.
Disclosure of Invention
The invention provides an image style migration system based on an optimized GAN algorithm, which addresses the low speed, difficult training, and low accuracy of existing algorithms. The specific scheme is as follows:
In a first aspect, the present application provides a new image style migration method, comprising:
Using the public data sets monet2photo and summer2winter_yosemite, provided by the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley.
Building the environment, including Anaconda, the PyCharm development environment, OpenCV, TensorFlow, Bottleneck, numpy, olefile, xlwt, zict, atomicwrites, and other installation packages.
Constructing the Darknet and CUDA parallel computing framework for receiving and processing data, and performing image style migration with the optimized CycleGAN algorithm proposed in this patent.
Training the model. The environment required by the system is configured before training, and training starts once the versions have been checked and found correct. The main process is to feed the data set into the optimized CycleGAN for training.
Evaluating the output results.
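The version check that precedes training can be sketched as follows. This is a minimal illustration, not the patent's own tooling: the function names and dictionary keys are hypothetical, the expected values are the versions listed in the abstract (Python 3.7, CUDA 10.0, cuDNN 7.5), and the CUDA and cuDNN version strings are assumed to be supplied by the caller from whatever source is available on the machine.

```python
import sys

# Versions stated in the abstract; the keys are hypothetical labels.
EXPECTED = {"python": "3.7", "cuda": "10.0", "cudnn": "7.5"}


def version_matches(found: str, expected: str) -> bool:
    """Return True if `found` starts with the dotted components of `expected`."""
    found_parts = found.split(".")
    expected_parts = expected.split(".")
    return found_parts[: len(expected_parts)] == expected_parts


def find_mismatches(found: dict, expected: dict = EXPECTED) -> dict:
    """Map each component to (found, expected) where the versions differ."""
    mismatches = {}
    for name, want in expected.items():
        have = found.get(name)
        if have is None or not version_matches(have, want):
            mismatches[name] = (have, want)
    return mismatches


def current_python_version() -> str:
    """Version string of the running interpreter, e.g. '3.7.4'."""
    return "{}.{}.{}".format(*sys.version_info[:3])
```

Training would start only when `find_mismatches(...)` returns an empty dictionary. Comparing dotted components rather than raw string prefixes keeps a version such as 3.70 from being accepted as 3.7.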
In a second aspect, the present application provides an image style migration system, comprising:
Experimental environment: the operating system is Linux CentOS-78, and the graphics card is a Tesla V100 GPU.
Algorithm study: images in the two data sets monet2photo and summer2winter_yosemite undergo image style migration through an optimized CycleGAN algorithm. An Inception-v3 network is added to the original CycleGAN algorithm to address the poor style representation caused by the residual network in the original algorithm.
The operation flow of the whole system is as follows: 1. Install the Ubuntu 16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output images.
Drawings
To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are only embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the overall framework of the image style migration system provided in an example of the present application. Fig. 2 is the overall design scheme of the software system, Fig. 3 is the Inception-v3 module diagram, Fig. 4 is the optimized CycleGAN network structure diagram, Fig. 5 is the loss function graph, and Fig. 6 is the accuracy graph.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As can be seen from FIG. 1, the application is based on the Linux CentOS-78 operating system.
CPU: The CPU has 8 GB of memory and a powerful ALU (arithmetic logic unit) that completes arithmetic calculations in very few clock cycles at up to 64-bit double precision. Double-precision floating-point addition and multiplication take 1 to 3 clock cycles. The CPU clock frequency is very high, reaching 1.5323 GHz (gigahertz, 10^9 Hz). The powerful data processing capability of the central processing unit effectively improves the working efficiency of the computer. During data processing it does more than carry out simple operations: on the basis of the instruction tasks issued by the computer user, it matches the control instructions entered by the user to the CPU's execution of those tasks.
GPU: NVIDIA Tesla V100 is an advanced data center GPU for accelerating artificial intelligence, high-performance computing, and graphics. The NVIDIA Tesla V100 accelerator is based on the Volta GV100 GPU, the first processor to break the 100 TFLOPS barrier in deep learning performance. GV100 combines CUDA cores and Tensor Cores, providing the performance of an AI (artificial intelligence) supercomputer in a GPU. With a Tesla V100 accelerated system, AI models that previously required several weeks of computing resources can complete training in only a few days. With training time shortened so substantially, AI assisted by the NVIDIA Tesla V100 accelerator can now tackle a variety of new problems.
Programming language: One of the design goals of Python is to make code highly readable. Its design uses, as far as possible, the punctuation marks and English words commonly used in other languages, so that code looks clean and tidy. Unlike static languages such as C and Pascal, it does not require repeated declaration statements, and its syntax has fewer special cases and surprises. This makes it a suitable language for scripting and rapid application development on most platforms.
CUDA (Compute Unified Device Architecture) is a computing platform introduced by the graphics card vendor NVIDIA. It is a general-purpose parallel computing architecture that enables GPUs to solve complex computational problems. The version used in this patent is CUDA 10.1 with cuDNN 7.5.
The overall design scheme of the software system is shown in fig. 2, which mainly comprises the following steps:
Step 1: Build the environment, including Anaconda, the PyCharm development environment, OpenCV, TensorFlow, Bottleneck, numpy, olefile, xlwt, zict, atomicwrites, and other installation packages.
Step 2: Feed the digitally vectorized pictures into the optimized CycleGAN for training. The input and output image sizes are unified to 256 × 256 pixels. The batch size is 2, training runs for 300 epochs, and a checkpoint is saved every 5 epochs. The initial learning rate is set to 0.0002 and decreases gradually after epoch 150. Gradient descent during training is optimized with the Adam algorithm. FIG. 3 is the Inception-v3 module diagram, and FIG. 4 is the optimized CycleGAN network structure diagram.
Step 4: Test whether the accuracy of the model file meets the expected requirement, adjust the parameters of the algorithm according to the experimental results, and verify and compare. The evaluation metrics are peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
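The training schedule described in Step 2 can be sketched as follows. The constants come from the text (initial learning rate 0.0002, 300 epochs, decay after epoch 150, a checkpoint every 5 epochs). The shape of the decay is an assumption: the text only says the rate decreases gradually, so a linear decay that stays just above zero at the final epoch is used here for illustration. The function names are hypothetical.

```python
INITIAL_LR = 0.0002     # initial learning rate from Step 2
TOTAL_EPOCHS = 300      # number of training epochs from Step 2
DECAY_START = 150       # the rate decreases once the epoch exceeds this value
CHECKPOINT_EVERY = 5    # a checkpoint is saved every 5 epochs


def learning_rate(epoch: int) -> float:
    """Learning rate for a 1-based epoch number.

    Constant for epochs 1..150, then a linear decay (assumed shape)
    that reaches INITIAL_LR / 151 at epoch 300.
    """
    if epoch <= DECAY_START:
        return INITIAL_LR
    decay_epochs = TOTAL_EPOCHS - DECAY_START
    remaining = TOTAL_EPOCHS - epoch + 1
    return INITIAL_LR * remaining / (decay_epochs + 1)


def should_checkpoint(epoch: int) -> bool:
    """True on every fifth epoch (5, 10, ..., 300)."""
    return epoch % CHECKPOINT_EVERY == 0
```

In a training loop, `learning_rate(epoch)` would be passed to the Adam optimizer at the start of each epoch and `should_checkpoint(epoch)` consulted at its end.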
FIG. 5 is the loss function graph, and FIG. 6 is the accuracy graph.
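The two evaluation metrics can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's evaluation code: images are taken to be equally sized nested lists of grayscale pixel values, the maximum pixel value defaults to 255, and SSIM is computed once over the whole image with the commonly used constants K1 = 0.01 and K2 = 0.03, whereas practical implementations usually average SSIM over local windows. The function names are hypothetical.

```python
import math


def _flatten(image):
    """Flatten a nested list of pixel rows into a list of floats."""
    return [float(p) for row in image for p in row]


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    a, b = _flatten(reference), _flatten(test)
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value * max_value / mse)


def global_ssim(image_x, image_y, max_value=255.0):
    """SSIM from whole-image statistics (simplification: no sliding window).

    Uses the usual combined form of luminance, contrast and structure
    with C3 = C2 / 2.
    """
    x, y = _flatten(image_x), _flatten(image_y)
    if not x or len(x) != len(y):
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) * (v - mu_x) for v in x) / n
    var_y = sum((v - mu_y) * (v - mu_y) for v in y) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Higher PSNR and an SSIM closer to 1 both indicate that the compared images are closer to each other.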

Claims (2)

1. A technology for image style migration, characterized in that the experiment platform comprises a CPU, a GPU, a programming language, and CUDA.
CPU: The CPU has 8 GB of memory and a powerful ALU (arithmetic logic unit) that completes arithmetic calculations in very few clock cycles at up to 64-bit double precision. Double-precision floating-point addition and multiplication take 1 to 3 clock cycles. The CPU clock frequency is very high, reaching 1.5323 GHz (gigahertz, 10^9 Hz). The powerful data processing capability of the central processing unit effectively improves the working efficiency of the computer. During data processing it does more than carry out simple operations: on the basis of the instruction tasks issued by the computer user, it matches the control instructions entered by the user to the CPU's execution of those tasks.
GPU: NVIDIA Tesla V100 is an advanced data center GPU for accelerating artificial intelligence, high-performance computing, and graphics. The NVIDIA Tesla V100 accelerator is based on the Volta GV100 GPU, the first processor to break the 100 TFLOPS barrier in deep learning performance. GV100 combines CUDA cores and Tensor Cores, providing the performance of an AI (artificial intelligence) supercomputer in a GPU. With a Tesla V100 accelerated system, AI models that previously required several weeks of computing resources can complete training in only a few days. With training time shortened so substantially, AI assisted by the NVIDIA Tesla V100 accelerator can now tackle a variety of new problems.
Programming language: One of the design goals of Python is to make code highly readable. Its design uses, as far as possible, the punctuation marks and English words commonly used in other languages, so that code looks clean and tidy. Unlike static languages such as C and Pascal, it does not require repeated declaration statements, and its syntax has fewer special cases and surprises. This makes it a suitable language for scripting and rapid application development on most platforms.
CUDA (Compute Unified Device Architecture) is a computing platform introduced by the graphics card vendor NVIDIA. It is a general-purpose parallel computing architecture that enables GPUs to solve complex computational problems. The version used in this patent is CUDA 10.1 with cuDNN 7.5.
2. The system according to claim 1, wherein images of one style can be converted to images of another style. The selected algorithm adds an Inception-v3 module to CycleGAN, the algorithm model in the network uses a fully convolutional neural network, and training and evaluation are carried out on a data set as follows: 1. Install the Ubuntu 16.0 operating system and configure Anaconda3, Python 3.7, CUDA 10.0, and cuDNN 7.5. 2. Feed the digitally vectorized data set into the optimized CycleGAN algorithm for training. 3. Compare and evaluate the output images. The ReLU function is used as the activation function in this process, as shown in formula 1. The loss function L_cyc is the cycle consistency loss, and the Adam optimizer is selected for optimization, as shown in formula 2. One evaluation metric is the peak signal-to-noise ratio (PSNR), as shown in formula 3, where an image I of m × n pixels and a noise image K are given and MAX_I is the maximum possible pixel value of the image. The other is the structural similarity (SSIM), as shown in formula 4. SSIM compares images X and Y in terms of luminance (l), contrast (c), and structure (s); its value ranges from 0 to 1, and the closer it is to 1, the better the result.
[Formulas 1 to 3: figures FDA0002993677940000011, FDA0002993677940000012, and FDA0002993677940000013]
$\mathrm{SSIM}(X, Y) = l(X, Y) \cdot c(X, Y) \cdot s(X, Y)$ (formula 4)
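Formulas 1 to 3 are referenced in claim 2 but do not survive as text here. For reference, the standard definitions of the quantities the claim names are given below. Whether the formulas in the original publication take exactly these forms is an assumption, as is the reading that formula 2 is the cycle consistency loss. The symbols G and F (the two generators mapping between the image domains) follow the usual CycleGAN notation and do not appear in this text.

```latex
% ReLU activation (standard definition)
f(x) = \max(0, x)

% Cycle consistency loss (standard CycleGAN form; G: X -> Y, F: Y -> X)
L_{cyc}(G, F) =
    \mathbb{E}_{x \sim p_{data}(x)} \left[ \lVert F(G(x)) - x \rVert_1 \right]
  + \mathbb{E}_{y \sim p_{data}(y)} \left[ \lVert G(F(y)) - y \rVert_1 \right]

% PSNR for an m x n image I and a noise image K
MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i, j) - K(i, j) \right]^2
PSNR = 10 \cdot \log_{10} \left( \frac{MAX_I^2}{MSE} \right)
```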
CN202110323511.7A 2021-03-26 2021-03-26 Technology for image style migration Pending CN112950460A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110323511.7A CN112950460A (en) 2021-03-26 2021-03-26 Technology for image style migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110323511.7A CN112950460A (en) 2021-03-26 2021-03-26 Technology for image style migration

Publications (1)

Publication Number Publication Date
CN112950460A (en) 2021-06-11

Family

ID=76228339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110323511.7A Pending CN112950460A (en) 2021-03-26 2021-03-26 Technology for image style migration

Country Status (1)

Country Link
CN (1) CN112950460A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107369159A (en) * 2017-06-29 2017-11-21 大连理工大学 Threshold segmentation method based on multifactor two-dimensional gray histogram
WO2019042139A1 (en) * 2017-08-29 2019-03-07 京东方科技集团股份有限公司 Image processing method, image processing apparatus, and a neural network training method
CN112487999A (en) * 2020-12-02 2021-03-12 西安邮电大学 Remote sensing image robust feature extraction method based on cycleGAN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
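彭晏飞 et al.: "Image style transfer based on cycle generative adversarial networks", Computer Engineering & Science *
王鹿 et al.: "Research and implementation of style transfer algorithms based on deep learning", Intelligent Computer and Applications *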

Similar Documents

Publication Publication Date Title
US11574097B2 (en) Deep learning based identification of difficult to test nodes
US11507846B2 (en) Representing a neural network utilizing paths within the network to improve a performance of the neural network
US20190340499A1 (en) Quantization for dnn accelerators
CN111401406B (en) Neural network training method, video frame processing method and related equipment
EP3888012A1 (en) Adjusting precision and topology parameters for neural network training based on a performance metric
CN113273082A (en) Neural network activation compression with exception block floating point
CN113348474A (en) Neural network activation compression with non-uniform mantissas
US11972354B2 (en) Representing a neural network utilizing paths within the network to improve a performance of the neural network
CN108171328B (en) Neural network processor and convolution operation method executed by same
CN111414915B (en) Character recognition method and related equipment
CN113065997B (en) Image processing method, neural network training method and related equipment
CN114004352B (en) Simulation implementation method, neural network compiler and computer readable storage medium
CN115080749B (en) Weak supervision text classification method, system and device based on self-supervision training
CN114283347A (en) Target detection method, system, intelligent terminal and computer readable storage medium
CN114529785B (en) Model training method, video generating method and device, equipment and medium
de Prado et al. Automated design space exploration for optimized deployment of dnn on arm cortex-a cpus
CN113627421B (en) Image processing method, training method of model and related equipment
CN117351299A (en) Image generation and model training method, device, equipment and storage medium
CN112950460A (en) Technology for image style migration
CN116362301A (en) Model quantization method and related equipment
US20220129755A1 (en) Incorporating a ternary matrix into a neural network
CN115310596A (en) Convolution operation method, convolution operation device, storage medium and electronic equipment
CN114219091A (en) Network model reasoning acceleration method, device, equipment and storage medium
CN114299517A (en) Image processing method, apparatus, device, storage medium, and computer program product
Zhao et al. Design and Development of Image Recognition Toolkit Based on Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210611