CN110544218A - Image processing method, device and storage medium

Image processing method, device and storage medium

Info

Publication number
CN110544218A
Authority
CN
China
Prior art keywords
image
training
character
model
transparency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910829520.6A
Other languages
Chinese (zh)
Other versions
CN110544218B
Inventor
陈锡显
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910829520.6A priority Critical patent/CN110544218B/en
Publication of CN110544218A publication Critical patent/CN110544218A/en
Application granted granted Critical
Publication of CN110544218B publication Critical patent/CN110544218B/en
Legal status: Active


Classifications

    • G06N 3/084 — Computing arrangements based on biological models; neural networks; learning methods; backpropagation, e.g. using gradient descent
    • G06T 3/04 — Geometric image transformations in the plane of the image; context-preserving transformations, e.g. by using an importance map
    • G06T 5/90 — Image enhancement or restoration; dynamic range modification of images or parts thereof
    • G06T 7/90 — Image analysis; determination of colour characteristics
    • G06T 2207/10024 — Image acquisition modality; color image
    • G06T 2207/20081 — Special algorithmic details; training; learning
    • G06T 2207/20084 — Special algorithmic details; artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image processing method, apparatus, device and storage medium. The method includes: acquiring the color feature and the transparency feature of a character image to be processed and the color feature and the transparency feature of a template character image; performing transfer learning on a trained neural network model based on the color feature and the transparency feature of the template character image to obtain an adjusted neural network model; and processing the color feature and the transparency feature of the character image to be processed through the adjusted neural network model to obtain a target character image whose display effect is the same as that of the template character image. With the method and the apparatus, an RGBA artistic-word effect image carrying transparency information can be generated directly.

Description

Image processing method, device and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an image processing method, an image processing apparatus, and a storage medium.
Background
At present, with the improvement of living standards, people's aesthetic requirements are becoming higher and higher. Artistic words, i.e. characters rendered with an artistic effect, are one of the essential publicity tools for social, political and economic activities. For example, artistic words can be added to political slogans, commercial advertisements and the like to attract attention. The artistic appeal of such lettering also beautifies people's lives and brings them aesthetic enjoyment.
More complicated artistic words are generally designed and drawn by hand by professional designers, and their effects cannot be applied to other characters by a program, so a large amount of manpower and material resources is required.
Disclosure of Invention
The embodiments of the present application provide an image processing method, an image processing apparatus and a storage medium, which can directly generate an RGBA artistic-word effect image carrying transparency information.
The technical scheme of the embodiment of the application is realized as follows:
an embodiment of the present application provides an image processing method, including:
acquiring color characteristics and transparency characteristics of a character image to be processed and color characteristics and transparency characteristics of a template character image;
performing transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model;
And processing the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain a target character image, wherein the display effect of the target character image is the same as that of the template character image.
an embodiment of the present application provides an image processing apparatus, the apparatus including:
The first acquisition module is used for acquiring the color characteristic and the transparency characteristic of the character image to be processed and the color characteristic and the transparency characteristic of the template character image;
The adjusting module is used for carrying out transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model;
And the processing module is used for processing the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain a target character image, wherein the display effect of the target character image is the same as that of the template character image.
An embodiment of the present application provides an image processing apparatus, which at least includes:
A memory for storing executable instructions;
and the processor is used for realizing the method provided by the embodiment of the application when executing the executable instructions stored in the memory.
The embodiment of the application provides a storage medium, which stores executable instructions and is used for causing a processor to execute the executable instructions so as to realize the method provided by the embodiment of the application.
The embodiment of the application has the following beneficial effects:
When the character image to be processed is converted into the target character image, the character image to be processed and the template character image, both in RGBA format, are first acquired, and their color features and transparency features are extracted. The trained neural network model is then subjected to transfer learning based on the color feature and the transparency feature of the template character image to obtain an adjusted neural network model, so that the character image to be processed can be converted by the adjusted neural network model into a target character image having the same display effect as the template character image. Because the image processed by the adjusted neural network carries transparency information, a target character image with transparency information can be generated directly. The method therefore allows target character images to be generated in batches and improves the efficiency of generating artistic words.
Drawings
FIG. 1A is an alternative schematic structural diagram of a system architecture provided in an embodiment of the present application;
FIG. 1B is a diagram showing a comparison between an original character and an artistic word;
FIG. 1C is a comparison diagram illustrating the RGB and RGBA superposition effects;
FIG. 1D is a comparison diagram of an artistic word created manually and an artistic word generated automatically in the related art;
FIG. 1E is another comparison diagram of an artistic word created manually and an artistic word generated automatically in the related art;
FIG. 1F is a schematic diagram of a network architecture of the image processing method according to an embodiment of the present application;
FIG. 1G is a schematic diagram of another network architecture of the image processing method according to an embodiment of the present application;
FIG. 2 is an alternative schematic structural diagram of an apparatus provided in an embodiment of the present application;
FIG. 3A is a schematic flowchart of an implementation of the image processing method according to an embodiment of the present application;
FIG. 3B is a schematic flowchart of another implementation of the image processing method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a process for training a neural network model according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of yet another implementation of the image processing method according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of an implementation of a method for generating an artistic word according to an embodiment of the present application;
FIG. 7A is a diagram illustrating an artistic-word effect generated by the method for making artistic words according to an embodiment of the present application;
FIG. 7B is a diagram illustrating another artistic-word effect generated by the method for making artistic words according to an embodiment of the present application;
FIG. 8 is an effect diagram obtained by adding a generated artistic word to a Banner background image.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application will be described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it should be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where there is no conflict.
Where the terms "first", "second" and "third" appear in the specification, they are used merely to distinguish between similar items and do not indicate a particular ordering of the items. It should be understood that "first", "second" and "third" may be interchanged in a particular order or sequence where permitted, so that the embodiments of the application described herein can be implemented in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Artistic words: character forms that have been artistically processed and deformed by professional font designers so that the style matches the meaning of the characters. Artistic words are beautiful, interesting, easy to recognize and eye-catching, and are character deformations with a pattern or decorative meaning. Starting from the meaning, shape and structure of a Chinese character, an artistic word applies reasonable deformation and decoration to its strokes and structure so as to produce a beautiful variant of the character.
2) Banner: a banner advertisement embedded in a mobile App interface, a website page or the like, which is a main form of advertising for the promotion of internet products. It is generally an attractive picture designed by professional designers, composed of elements such as a background, a commodity, advertising copy and publicity copy.
3) RGB: red Green Blue (Blue), which represents a common pixel representation, such as the format of bmp, where such RGB information-only pictures are overlaid in front of one another, can cause occlusion.
4) RGBA: a color space representing Red, Green, Blue and Alpha (transparency). It can be regarded as the RGB model with extra information added, namely transparency information. The A in RGBA stands for Alpha and characterizes the transparency of a pixel; as the value increases from 0 to 255, the pixel transitions from completely transparent (invisible) to completely opaque. A typical picture format carrying this channel is png. When such a picture is placed together with another picture, the transparency determines how the pixels of the two pictures are fused and superimposed.
5) Transparency channel: an 8-bit grayscale channel that records the transparency feature of an image in 256 levels of gray, defining transparent, opaque and translucent regions, where white represents opaque, black represents transparent, and gray represents translucent.
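For illustration only, the following is a sketch of how such a transparency value controls the superposition of a foreground pixel over a background pixel (the standard "over" blending, stated here as background for the definitions above rather than as a formula recited in this application):

    composited = a * foreground_rgb + (1 - a) * background_rgb,   where a = A / 255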
6) Transfer learning: a machine learning method in which a model developed for task A is reused as the starting point when developing a model for task B. Transfer learning includes Zero-shot Learning, One-shot Learning and Few-shot Learning.
7) Zero-shot learning: the training set contains no samples of a certain class, but if a sufficiently powerful mapping can be learned, features of the new class can still be obtained even though the class was never seen during training.
8) One-shot learning: every class in the training set has samples, but only very few of them (one or a handful). In this case, a generalized mapping can be learned on a larger data set and then updated on the small data set. It can also be understood as follows: a neural network model is trained with a large amount of training data, and its parameters can then be updated with only a small number of samples.
In order to better understand the image processing method provided in the embodiment of the present application, a batch production scheme of art words and the disadvantages thereof in the related art will be described first.
FIG. 1A is a schematic diagram of a Banner containing artistic words. As shown in FIG. 1A, the phrase "absolutely necessary" is an artistic word with a special effect, which greatly enhances the design of the Banner. If an artistic-word effect designed by a designer in Photoshop or through a hollowed-out spliced base map could be rapidly applied to other characters in batches, the designer would not need to re-make it. FIG. 1B is a comparison diagram of an original character and an artistic word; it can be seen from FIG. 1B that the artistic word 113 has a rich texture effect relative to the original character 111, and the artistic word 112 has shadow, edge and other effects relative to the original character, with the shape of the character also changed (e.g., widened). In the related art, about 60% of the more complicated artistic-word effects are drawn manually by designers, so the effects cannot be applied to other characters by a program; after the artistic effect for one character has been drawn, the same effect for a new character has to be drawn again by the designer, which requires a large amount of manpower and material resources.
FIG. 1C is a comparison diagram showing the superposition effects of RGB and RGBA. In FIG. 1C, the blue portion is the background, 121 is the effect of superimposing an RGB image on the background image, and 122 is the effect of superimposing an RGBA image on the background image. As shown in FIG. 1C, directly superimposing an RGB image on the background image does not yield an artistic word. However, existing artistic-word making methods can only intelligently generate RGB images, so the result cannot be directly composited with the background picture of a Banner.
In the related art, schemes for generating RGB-format artistic words mainly fall into rule-based methods and deep-learning methods.
1) The best-performing rule-based method is T-Effects, i.e. "Awesome Typography: Statistics-Based Text Effects Transfer".
2) The best-performing deep-learning method is TET-GAN, i.e. "TET-GAN: Text Effects Transfer via Stylization and Destylization".
FIG. 1D is a comparison diagram of an artistic word made manually and an artistic word generated automatically in the related art. Part 131 of FIG. 1D shows a given character "art" and the corresponding manually made artistic word; when a new character such as "P" is given, the related art intelligently generates an artistic word 132 with the same effect as "art". Comparing 131 with 132 shows that a large amount of speckle noise exists in the automatically generated artistic word 132.
The defects of the artistic-word making methods in the related art mainly include the following points:
first, none of the currently disclosed methods can produce a transparency map, and therefore they cannot meet the product requirements;
second, in terms of generation quality, the current technical schemes for automatically generating artistic words produce considerable speckle noise;
third, automatically generating artistic words with the related art is prone to generation errors.
FIG. 1E is another comparison diagram of an artistic word made manually and an artistic word generated automatically in the related art. Part 141 of FIG. 1E shows a given character "han" and the corresponding manually made artistic word; when a new character such as "commander" is given, the related art intelligently generates an artistic word 142 intended to have the same effect as "han". Comparing 141 with 142 shows that the background color and the outline color of the automatically generated artistic word 142 are reversed, i.e. the automatic generation method produces an error rather than the correct result.
In the related art, the generated RGB-format artistic word can be combined with matting and clipping to obtain an RGBA-format artistic word, but the defects mainly include:
first, the value of the A channel is not estimated accurately and takes only the values 0 and 255;
second, many artistic-word effect maps, for example white text effects on white backgrounds, are difficult to separate;
third, fine matting, for example of flame-style text, is difficult;
fourth, a trained matting model is difficult to apply to matting new artistic words, because the features that distinguish the characters from the background in the new artistic words differ greatly from those of the artistic words in the previous training set.
Based on this, the embodiment of the present application provides an image processing method which, building on an existing RGB artistic-word generation model, uses an Attention mechanism to extract and generate the Alpha channel, and reshapes the outline of the artistic word according to the extracted Alpha. In this way, RGBA artistic words can be generated intelligently, speckle noise outside the artistic word can be reduced, practical artistic words can be generated in batches, and manpower and material resources are saved.
An exemplary application of the apparatus implementing the embodiment of the present application is described below, and the apparatus provided in the embodiment of the present application may be implemented as a server. In the following, an exemplary application will be described that encompasses a server when the apparatus is implemented as a server.
Referring to fig. 1F, fig. 1F is a schematic diagram of a network architecture of the image processing method according to the embodiment of the present application, and as shown in fig. 1F, the network architecture at least includes a terminal 100, a server 200, and a network 300. To support an exemplary application, the terminal 100 is connected to the server 200 through a network 300, and the network 300 may be a wide area network or a local area network, or a combination of the two, and uses a wireless link to realize data transmission.
When the image processing method provided by the embodiment of the present application is implemented with the network architecture shown in FIG. 1F, a user may, through the terminal 100, send to the server a text image with the target display effect as a reference image, together with a text image that needs to be converted to the target display effect as the image to be processed. After receiving the image to be processed and the reference image, the server performs one-shot learning on the trained neural network model using the reference image to adjust the parameters of the neural network model, and inputs the image to be processed into the adjusted neural network model, thereby obtaining a target image with the target display effect. After obtaining the target image, the server 200 may send it to the terminal 100, and the terminal 100, after receiving it, may display the target image on its own graphical interface.
In the embodiment of the application, the reference image and the image to be processed are RGBA images, so the final processed image is also a target image carrying transparency information; it can therefore be displayed as a separate image or directly superimposed on a background image.
fig. 1G is a schematic diagram of another network architecture of the image processing method according to the embodiment of the present application, as shown in fig. 1G, the network architecture includes only the terminal 100. After acquiring user requirements, that is, acquiring a template image with a target display effect and an image to be processed which needs to be converted into the target display effect, the terminal 100 performs one-shot learning on the trained neural network model through the template image to adjust parameters of the neural network model, and inputs the image to be processed into the adjusted neural network model, so as to obtain the target image with the target display effect. After obtaining the target image, the terminal 100 may display the target image on its own graphical interface.
The apparatus provided in the embodiments of the present application may be implemented as hardware or a combination of hardware and software, and various exemplary implementations of the apparatus provided in the embodiments of the present application are described below.
The server 200 may be a single server, or a server cluster, cloud computing center or the like formed by multiple servers. From the exemplary structure of the server 200 shown in FIG. 2, other exemplary structures of the server 200 can be foreseen, so the structure described here should not be considered limiting; for example, some of the components described below may be omitted, or components not described below may be added to meet the special needs of certain applications.
The server 200 shown in FIG. 2 includes: at least one processor 210, a memory 240, at least one network interface 220, and a user interface 230. The components of the server 200 are coupled together by a bus system 250. It will be appreciated that the bus system 250 is used to enable communication among these components. In addition to a data bus, the bus system 250 includes a power bus, a control bus and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 250 in FIG. 2.
The user interface 230 may include a display, a keyboard, a mouse, a touch-sensitive pad, a touch screen, and the like.
The memory 240 may be either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM). The volatile Memory may be Random Access Memory (RAM). The memory 240 described in embodiments herein is intended to comprise any suitable type of memory.
the memory 240 in the embodiment of the present application is capable of storing data to support the operation of the server 200. Examples of such data include: any computer program for operating on server 200, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program may include various application programs.
As an example of the method provided by the embodiment of the present application implemented by software, the method provided by the embodiment of the present application may be directly embodied as a combination of software modules executed by the processor 210, the software modules may be located in a storage medium located in the memory 240, and the processor 210 reads executable instructions included in the software modules in the memory 240, and completes the method provided by the embodiment of the present application in combination with necessary hardware (for example, including the processor 210 and other components connected to the bus 250).
by way of example, the Processor 210 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor or the like.
methods of implementing embodiments of the present application will be described in connection with the foregoing exemplary application and implementations of apparatus implementing embodiments of the present application.
In order to better understand the method provided by the embodiment of the present application, artificial intelligence, each branch of artificial intelligence, and the application field related to the method provided by the embodiment of the present application are explained first.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
the artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like. The directions will be described below.
Computer Vision (CV) is a science that studies how to make machines "see"; more specifically, it refers to using cameras and computers instead of human eyes to perform machine vision such as recognition, tracking and measurement of targets, and to perform further image processing so that the result is an image more suitable for human eyes to observe or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.
The key technologies of Speech Technology are Automatic Speech Recognition (ASR), speech synthesis (TTS) and voiceprint recognition. Enabling computers to listen, see, speak and feel is the development direction of future human-computer interaction, and speech is expected to become one of the most promising modes of human-computer interaction.
natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question and answer, knowledge mapping, and the like.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how computers simulate or implement human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning and inductive learning.
Autonomous driving technology generally includes high-precision maps, environment perception, behavior decision-making, path planning, motion control and other technologies, and has broad application prospects.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiment of the application relates to artificial intelligence technologies such as computer vision and machine learning, and is specifically explained by the following embodiments.
Referring to FIG. 3A, FIG. 3A is a schematic flowchart of an implementation of the image processing method provided in an embodiment of the present application. The method may be applied to an image processing device, which may be the server 200 in FIG. 1F or the terminal 100 in FIG. 1G, and is described with reference to the steps shown in FIG. 3A.
step S101, the image processing equipment acquires the color characteristic and the transparency characteristic of the character image to be processed and the color characteristic and the transparency characteristic of the template character image.
Here, the text image to be processed and the template text image are both RGBA images. Both contain text characters, which may be Chinese characters, English letters, or of course Japanese, Korean and so on.
The template character image refers to a character image with a certain artistic effect, that is, the text characters in the template character image are not characters in a conventional format, but have a certain artistic effect, for example, the text characters can have a flame effect, a stone-like texture effect, and the like. The text characters in the character image to be processed are conventional characters without artistic effect.
The color feature, which may be referred to as color feature information or color information, refers to RGB information of the image, that is, each channel value of R, G, B channels of each pixel point in the image. The transparency feature, which may also be referred to as transparency information, refers to the channel value of the a channel in the image.
After the character image to be processed and the template character image are acquired, the image processing device can perform feature extraction on the images to extract color features and transparency features.
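As a minimal sketch of this feature extraction (Python with Pillow and NumPy; the function name and file names are illustrative assumptions, not part of the embodiment):

    import numpy as np
    from PIL import Image

    def extract_color_and_transparency(path):
        """Split an RGBA text image into its color feature (RGB) and transparency feature (A)."""
        rgba = np.asarray(Image.open(path).convert("RGBA"))  # (H, W, 4) uint8 array
        color_feature = rgba[..., :3]         # R, G, B channel values of each pixel
        transparency_feature = rgba[..., 3]   # A channel value of each pixel
        return color_feature, transparency_feature

    # e.g. applied to the character image to be processed and to the template character image:
    # src_color, src_alpha = extract_color_and_transparency("to_be_processed.png")
    # tpl_color, tpl_alpha = extract_color_and_transparency("template.png")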
Step S102, the image processing equipment acquires a trained neural network model.
Here, the trained neural network model is trained using at least a training image having a transparency characteristic.
When step S102 is implemented by the server 200 in fig. 1F, it may be that the server acquires a neural network model trained by itself using a training image with transparency characteristics.
When step S102 is implemented by the terminal 100 in fig. 1G, the terminal 100 may acquire, from the server 200, a neural network model trained by the server using a training image with transparency characteristics, or the terminal 100 may acquire a neural network model trained by itself using a training image with transparency characteristics.
That is to say, the trained neural network model may be trained by a server or a terminal, but since the computation of training the neural network is large and the requirement on the computing capability of the device is high, the neural network model is generally trained by the server.
and step S103, the image processing equipment performs transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain the adjusted neural network model.
Here, the artistic effect in the training text images used when the neural network model was trained generally differs from the artistic effect in the template text image. For example, if the text in the training text images has a flame effect, the trained neural network can only convert a conventional text image into a text image with a flame effect; if a user now wants to generate a text image with a stone texture effect, it obviously cannot be generated with the trained neural network as it stands.
At this time, transfer learning needs to be performed on the trained neural network model using the color feature and the transparency feature of the template text image so as to adjust the parameters of the trained neural network, so that the adjusted neural network model can process the character image to be processed into a target character image with the same display effect as the template text image.
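As a hedged sketch of what such a transfer-learning (one-shot) adjustment could look like in practice (PyTorch-style pseudocode; the four-channel input, the two-output model interface, the L1 loss, the learning rate, the number of update steps, and the use of a plain rendering of the template character as the model input are all illustrative assumptions, not values disclosed in this embodiment):

    import torch
    import torch.nn.functional as F

    def adapt_to_template(model, tpl_color, tpl_alpha, tpl_src_color, tpl_src_alpha,
                          steps=200, lr=1e-4):
        """Fine-tune the trained model on a single (plain source, template) pair.

        tpl_color / tpl_alpha:         color and transparency features of the template character image
        tpl_src_color / tpl_src_alpha: features of the plain (non-stylised) version of the same character
        All tensors are float, shaped (1, C, H, W) with C = 3 for color and C = 1 for transparency.
        """
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        source = torch.cat([tpl_src_color, tpl_src_alpha], dim=1)   # 4-channel RGBA-style input
        for _ in range(steps):
            pred_color, pred_alpha = model(source)                  # assumed output signature
            loss = F.l1_loss(pred_color, tpl_color) + F.l1_loss(pred_alpha, tpl_alpha)
            optimizer.zero_grad()
            loss.backward()     # back-propagate the difference to adjust the trained parameters
            optimizer.step()
        return model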
And step S104, the image processing equipment processes the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain a target character image.
Here, the display effect of the target text image is the same as the display effect of the template text image.
When the step S104 is implemented, the color feature and the transparency feature of the character image to be processed may be input into the adjusted neural network model, and the color feature and the transparency feature of the character image to be processed are processed by the adjusted neural network model, so that the target character image with the same display effect as the template character image can be obtained.
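A corresponding inference sketch under the same assumed interface (the conversion between 0-255 features and normalized tensors is illustrative):

    import numpy as np
    import torch

    @torch.no_grad()
    def generate_target_image(model, color_feature, transparency_feature):
        """Process the to-be-processed features with the adjusted model and return an RGBA target image."""
        rgba = np.concatenate([color_feature, transparency_feature[..., None]], axis=-1)  # (H, W, 4)
        x = torch.from_numpy(rgba).float().permute(2, 0, 1).unsqueeze(0) / 255.0
        pred_color, pred_alpha = model(x)                     # assumed output signature
        out = torch.cat([pred_color, pred_alpha], dim=1)      # (1, 4, H, W)
        out = (out.clamp(0.0, 1.0) * 255.0).to(torch.uint8)
        return out.squeeze(0).permute(1, 2, 0).cpu().numpy()  # target character image with an A channel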
In the image processing method provided in the embodiment of the present application, the to-be-processed text image and the template text image, both in RGBA format, are first obtained, and their color features and transparency features are extracted. The trained neural network model is then subjected to transfer learning based on the color feature and the transparency feature of the template text image to obtain an adjusted neural network model, so that the to-be-processed text image can be converted by the adjusted model into a target text image with the same display effect as the template text image. Because the image processed by the adjusted network carries transparency information, an RGBA target text image with transparency information can be generated directly. The method therefore allows target character images to be generated in batches, improves the efficiency of generating artistic words, and allows the RGBA-format target character image to be composited directly with a background image, improving the efficiency of image synthesis.
In some embodiments, before step S101, a preset neural network model may be trained through steps S21 to S24 shown in fig. 4 to obtain a trained neural network model:
Step S21, acquiring actual color features and actual transparency features of the training text images.
The display effect of all the training text images is the same. In some embodiments, before step S21, a plurality of training text images further need to be obtained; they may be obtained from the network, designed manually, or generated by a design tool or design software, although such a tool or software can only generate training text images with a single display effect. After the plurality of training text images are obtained, feature extraction is performed on each training text image so as to obtain its actual color feature and actual transparency feature.
it should be noted that, in the embodiment of the present application, the actual color feature of the text image refers to a color feature extracted from the text image, and correspondingly, the actual transparency feature of the text image refers to a transparency feature extracted from the text image.
and step S22, acquiring source text images corresponding to the training text images, and acquiring the actual color characteristic and the actual transparency characteristic of the source text images.
Here, the training text image is a text image with a certain artistic display effect, for example a flame effect, a flower effect or a stone texture effect, rather than a conventional text image. In order to obtain the mapping relationship from a conventional source text image to the training text image, the actual color feature and the actual transparency feature of the source text image corresponding to the training text image also need to be obtained as part of the training data.
And step S23, determining the actual color features and the actual transparency features of the training character images and the actual color features and the actual transparency features of the source character images as training data.
And step S24, training a preset neural network model based on the training data to obtain the trained neural network model.
Here, in the embodiment of the present application, a convolutional neural network model may be used, and further, a U-Net neural network model may be used.
When the step S24 is implemented, the preset neural network model may be trained based on the training data, and when the neural network model can learn how to change from the source text image to the training text image, the trained neural network model may be considered to be obtained.
In some embodiments, the neural network model includes at least: a first skeleton extraction model, a second skeleton extraction model, a source character restoration model, a target character generation model and a transparency generation model. Correspondingly, step S24 shown in FIG. 4 can be implemented by the following steps:
Step S241, training the first skeleton extraction model, the second skeleton extraction model and the source character restoration model according to the actual color feature of the first source character image and the actual color feature of the first training character image in the training data, so as to realize shape retention;
Here, step S241 may be implemented by:
Step S2411, inputting the actual color feature of the first source text image in the training data into the first skeleton extraction model to obtain source text skeleton information.
Here, the source text skeleton information may be the contour information of each stroke of the text characters contained in the source text image, or the centerline information of each stroke.
Step S2412, inputting the actual color characteristics of the first training character image in the training data into a second skeleton extraction model to obtain first training character skeleton information.
Here, the first source text image and the first training text image are corresponding, that is, the text characters in the first source text image are the same as the text characters in the first training image. For example, if the text character in the source text image is "yes", the text character in the first training text image is also "yes", that is, the first training text image contains the artistic word of "yes".
Accordingly, the training text skeleton information may be contour information of each stroke in text characters included in the training text image, and may also be centerline information of each stroke. It should be noted that the source text skeleton information corresponds to the training text skeleton information, that is, if the source text skeleton information is the contour information of each stroke in the text characters contained in the source text image, the training text skeleton information is the contour information of each stroke in the text characters contained in the training text image; if the source text skeleton information is the centerline information of each stroke in the text characters contained in the source text image, the training text skeleton information is the centerline information of each stroke in the text characters contained in the training text image.
Step S2413, performing prediction processing based on the source text skeleton information and the first training text skeleton information through the source text restoration model to obtain predicted color features of the first source text image.
Here, when step S2413 is implemented, the source text skeleton information and the first training text skeleton information are each input into the source text restoration model, and a predicted color feature of the first source text image is obtained from each.
Step S2414, performing back propagation on the difference between the predicted color feature and the actual color feature of the first source text image in the first skeleton extraction model, the second skeleton extraction model and the source text restoration model to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the source text restoration model.
Here, during training, the difference between the predicted color feature and the actual color feature of the first source text image is propagated backward through the first skeleton extraction model, the second skeleton extraction model and the source text restoration model to update their parameters (e.g., weights and thresholds), until the difference between the predicted color feature and the actual color feature of the first source text image satisfies a preset first optimization target. Through steps S2411 to S2414, a first skeleton extraction model, a second skeleton extraction model and a source text restoration model that achieve shape retention are obtained.
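A condensed sketch of one such shape-retention update (PyTorch-style; the three sub-models are assumed to be image-to-image networks such as U-Nets, and the L1 loss is an illustrative stand-in for the "difference" that is back-propagated):

    import torch
    import torch.nn.functional as F

    def shape_retention_step(skel_src, skel_trn, restore, optimizer, src_color, trn_color):
        """One training step of S241.

        skel_src:  first skeleton extraction model  (source color feature -> source text skeleton)
        skel_trn:  second skeleton extraction model (training/artistic color feature -> skeleton)
        restore:   source text restoration model    (skeleton -> predicted source color feature)
        src_color: actual color feature of the first source text image,   tensor (N, 3, H, W)
        trn_color: actual color feature of the first training text image, tensor (N, 3, H, W)
        """
        src_skeleton = skel_src(src_color)   # source text skeleton information (S2411)
        trn_skeleton = skel_trn(trn_color)   # first training text skeleton information (S2412)
        # each skeleton is fed into the restoration model to predict the source color (S2413)
        loss = F.l1_loss(restore(src_skeleton), src_color) + F.l1_loss(restore(trn_skeleton), src_color)
        optimizer.zero_grad()
        loss.backward()                      # back-propagate through all three models (S2414)
        optimizer.step()
        return loss.item()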
Step S242, training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color characteristics of the first source character image, the actual color characteristics of the first training character image and the actual color characteristics of the second training character image in the training data to realize texture generation;
Here, step S242 may be implemented by:
step S2421, inputting the color characteristics of the second training character image into the second skeleton extraction model to obtain second training character skeleton information.
here, the second training character image is an image different from the first training character image. The second skeleton extraction model in step S2421 may be the second skeleton extraction model trained in step S241.
step S2422, performing prediction processing based on the source character skeleton information and the second training character skeleton information through a target character generation model to obtain the predicted color characteristics of the first training character image.
Here, in step S2422, when implemented, the source text skeleton information and the second training text skeleton information are spliced, and the spliced skeleton information is input into the target text generation model, so as to obtain the predicted color feature of the first training text image.
The source text skeleton information and the second training text skeleton information are represented as matrices, so splicing them means concatenating the two matrices; correspondingly, the spliced skeleton information is the concatenated matrix obtained by joining the two matrices.
step S2423, the difference value of the predicted color feature and the actual color feature of the first training character image is reversely propagated in the first skeleton extraction model, the second skeleton extraction model and the target character generation model, so that the parameters of the first skeleton extraction model, the second skeleton extraction model and the target character generation model are updated.
Here, when implemented, step S2423 propagates the difference between the predicted color feature and the actual color feature of the first training text image backward through the first skeleton extraction model, the second skeleton extraction model and the target text generation model to update their parameters (e.g., weights and thresholds), until the difference satisfies a preset second optimization target. Through steps S2421 to S2423, a first skeleton extraction model, a second skeleton extraction model and a target text generation model that achieve texture generation are obtained.
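A matching sketch for this texture-generation stage (same assumptions as the previous sketch; torch.cat along the channel dimension stands in for the splicing of the two skeleton matrices):

    import torch
    import torch.nn.functional as F

    def texture_generation_step(skel_src, skel_trn, generate, optimizer,
                                src_color, train1_color, train2_color):
        """One training step of S242.

        generate:     target text generation model (spliced skeletons -> predicted artistic color)
        train1_color: actual color feature of the first training text image (the target effect)
        train2_color: actual color feature of a different, second training text image
        """
        src_skeleton = skel_src(src_color)                        # source text skeleton information
        ref_skeleton = skel_trn(train2_color)                     # second training text skeleton (S2421)
        spliced = torch.cat([src_skeleton, ref_skeleton], dim=1)  # splice the two skeletons (S2422)
        pred_color = generate(spliced)                            # predicted color of the first training image
        loss = F.l1_loss(pred_color, train1_color)
        optimizer.zero_grad()
        loss.backward()                                           # back-propagate (S2423)
        optimizer.step()
        return loss.item()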
And step S243, training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color characteristics of the first source character image, the actual color characteristics and the actual transparency characteristics of the first training character image so as to realize transparency information generation.
Here, step S243 may be implemented by:
step S2431, performing prediction processing based on the actual color feature of the first source character image and the actual color feature of the first training character image through a transparency generation model, performing back propagation on the obtained difference value between the predicted transparency feature of the first training character image and the actual transparency feature of the first training character image in the transparency generation model, and updating the parameter of the transparency generation model.
Here, when step S2431 is implemented, the actual color feature of the first source text image and the actual color feature of the first training text image are spliced and combined to obtain a combined actual color feature, which is then input into the transparency generation model for prediction processing to obtain a predicted transparency feature of the first training text image. The difference between the predicted transparency feature and the actual transparency feature of the first training text image is then propagated backward through the transparency generation model to update its parameters, until the difference reaches a third optimization target.
Step S2432, obtaining a first correlation feature based on the predicted color feature and the predicted transparency feature of the first training text image.
Here, since the predicted color feature and the predicted transparency feature of the first training text image are both in matrix form, step S2432 can be implemented by taking the element-wise (Hadamard) product of the predicted color matrix corresponding to the predicted color feature and the predicted transparency matrix corresponding to the predicted transparency feature, obtaining a first correlation matrix, i.e. the first correlation feature. It should be noted that the predicted color matrix and the predicted transparency matrix have the same dimensions.
Step S2433, obtaining a second correlation feature based on the actual color feature and the actual transparency feature of the first training text image.
Here, when step S2433 is implemented, the element-wise product of the actual color matrix corresponding to the actual color feature of the first training text image and the actual transparency matrix corresponding to the actual transparency feature is taken, obtaining a second correlation matrix, i.e. the second correlation feature.
Step S2434, propagating the difference between the first correlation feature and the second correlation feature backward through the first skeleton extraction model, the second skeleton extraction model and the target text generation model so as to update the parameters of these models.
Here, step S2434 can be implemented by propagating the difference between the first correlation feature and the second correlation feature backward through the first skeleton extraction model, the second skeleton extraction model and the target text generation model to update their parameters, until the difference satisfies a fourth optimization target. The first skeleton extraction model and the second skeleton extraction model used in step S2434 are those obtained in steps S241 and S242.
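A sketch of this transparency stage, including the element-wise products used for the correlation features (illustrative; in the embodiment the transparency loss and the correlation loss drive different sub-models, which is simplified here into a single optimizer step over all parameters):

    import torch
    import torch.nn.functional as F

    def transparency_step(alpha_net, skel_src, skel_trn, generate, optimizer,
                          src_color, train1_color, train1_alpha, train2_color):
        """One training step of S243.

        alpha_net:    transparency generation model (spliced color features -> predicted A channel)
        train1_alpha: actual transparency feature of the first training text image, tensor (N, 1, H, W)
        """
        # S2431: predict the transparency feature from the spliced actual color features
        pred_alpha = alpha_net(torch.cat([src_color, train1_color], dim=1))
        alpha_loss = F.l1_loss(pred_alpha, train1_alpha)

        # S2432 - S2434: couple the regenerated color with transparency via element-wise products
        spliced = torch.cat([skel_src(src_color), skel_trn(train2_color)], dim=1)
        pred_color = generate(spliced)
        first_correlation = pred_color * pred_alpha        # predicted color x predicted transparency
        second_correlation = train1_color * train1_alpha   # actual color x actual transparency
        correlation_loss = F.l1_loss(first_correlation, second_correlation)

        optimizer.zero_grad()
        (alpha_loss + correlation_loss).backward()
        optimizer.step()
        return alpha_loss.item(), correlation_loss.item()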
Through steps S241 to S243, a neural network model capable of transforming the source text image into the training text image is obtained, and the image it generates is an RGBA image carrying transparency information.
In some embodiments, when the preset neural network is trained to obtain the trained neural network model, a batch training method may be applied to the first skeleton extraction model, the second skeleton extraction model, the source character restoration model, the target character generation model and the transparency generation model according to the actual color features and actual transparency features of the source character images and of the training character images. That is, the training data required by all the models are read in at one time, and the parameters of each model are adjusted by the training algorithm until the whole neural network model converges to a specified precision. This overcomes the phenomenon of network forgetting and speeds up training.
Through steps S101 to S104, the target character image in RGBA format is obtained. The target character image may be output as a separate image, or it may be added to a background image so that the characters are prominently displayed. Thus, in some embodiments, as shown in fig. 3B, after step S104, the method further comprises:
And step S105, acquiring the placement positions of the background image and the target character image.
And S106, overlapping the target character image to the background image based on the placement position to obtain a composite image.
Here, since the target text image is an RGBA image having transparency information, when image synthesis is performed, the target text image is directly superimposed on the background image without performing processing such as matting.
step S107, outputting the composite image.
Here, outputting the composite image may be displaying the composite image on a graphical interface of an image processing apparatus, or transmitting the composite image to a terminal.
Through steps S105 to S107, the generated target character image can be directly superimposed on the background image to obtain a composite image, without performing matting processing on the target character image, so that the efficiency of image synthesis can be improved.
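As a minimal illustration (not part of this embodiment's implementation), the following numpy sketch performs the direct superposition of steps S105 to S106: because the target character image carries an alpha channel, standard alpha compositing places it on the background at the given placement position without any matting.

import numpy as np

def overlay_rgba(background_rgb, text_rgba, top, left):
    """Alpha-composite an RGBA character image onto an RGB background at the given placement position."""
    h, w = text_rgba.shape[:2]
    alpha = text_rgba[..., 3:4].astype(np.float32) / 255.0
    region = background_rgb[top:top + h, left:left + w].astype(np.float32)
    blended = alpha * text_rgba[..., :3].astype(np.float32) + (1.0 - alpha) * region
    background_rgb[top:top + h, left:left + w] = blended.astype(np.uint8)
    return background_rgb

# Example: a 100 x 200 target character image placed at row 50, column 80 of a white background.
background = np.full((400, 600, 3), 255, dtype=np.uint8)
target_text = np.zeros((100, 200, 4), dtype=np.uint8)   # fully transparent placeholder image
composite = overlay_rgba(background, target_text, top=50, left=80)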
Based on the foregoing embodiments, an embodiment of the present application further provides an image processing method, and fig. 5 is a schematic flow chart illustrating an implementation of the image processing method according to the embodiment of the present application, as shown in fig. 5, the method includes:
step S301, the terminal acquires character information to be processed and template character images.
Here, the template text image may be a text image that is artificially designed to have a certain artistic effect. The text information to be processed may be text characters intended to be transformed to have the same display effect as the template text image, and in some embodiments, may also be text images with the conventional display effect of the text characters.
And step S302, the terminal sends the character information to be processed and the template character image to a server.
here, the terminal transmits the to-be-processed text information and the template text image to the server to request the server to convert the text characters in the to-be-processed text information into a text image having the same effect as the template text image.
Step S303, the server obtains the color characteristic and the transparency characteristic of the character image to be processed and the color characteristic and the transparency characteristic of the template character image.
Here, when the terminal sends text characters to be processed rather than an image, the server needs to convert the characters to be processed into the character image to be processed based on a preset conversion condition before step S303.
And the server extracts the features of the character image to be processed and the template character image to obtain the color feature and the transparency feature of the character image to be processed and the color feature and the transparency feature of the template character image.
Step S304, the server acquires the trained neural network model.
Here, the trained neural network model may be obtained by the server itself through training with at least a training image having transparency features, or it may be obtained through such training by another computer device.
And S305, the server performs transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model.
Here, when step S305 is implemented, the server may perform one-shot learning or few-shot learning on the trained neural network model based on the color feature and the transparency feature of the template character image, so as to obtain the adjusted neural network model.
and S306, processing the color characteristic and the transparency characteristic of the character image to be processed by the server through the adjusted neural network model to obtain a target character image.
Here, since the adjusted neural network model is obtained by adapting the trained neural network model with the template character image, the target character image obtained by processing the color feature and the transparency feature of the character image to be processed through the adjusted neural network model has the same display effect as the template character image.
And step S307, the server sends the target character image to the terminal.
Here, the terminal may directly output and display the target text image after receiving the target text image, and may further synthesize the target text image and the background image through steps S308 to S309 to obtain a synthesized image, so that the target text in the synthesized image is more noticeable.
Step S308, the terminal acquires the placement positions of the background image and the target character image.
Step S309, the terminal synthesizes the target character image and the background image based on the position information to obtain a synthetic image;
in step S310, the terminal outputs the composite image.
It should be noted that, in the above-described image processing method, the explanation of the same steps or concepts as those in other embodiments may refer to the description in other embodiments.
In the image processing method provided by the embodiment of the application, after acquiring the character information to be processed and the template character image, the terminal sends them to the server. The server performs transfer learning on the trained neural network model according to the template character image to obtain an adjusted neural network model, and processes the character image to be processed through the adjusted neural network model, thereby obtaining a target character image with the same display effect as the template character image. The target character image is an image in RGBA format: it can be output and displayed independently, or directly superimposed onto a background image to obtain a composite image that does not occlude the background. In this way, character images with artistic display effects can be generated quickly and in batches, and the generated character images can be directly superimposed onto background images, which reduces the processing difficulty and further improves the processing efficiency.
Based on the foregoing embodiments, an embodiment of the present application further provides an image processing method applied to the network architecture shown in fig. 1F, where the method includes:
and step 41, the terminal acquires the character image to be processed and the template character image.
Here, the to-be-processed text image acquired by the terminal may be obtained according to the to-be-processed text character input by the user. For example, the character to be processed input by the user may be converted into a character image to be processed according to a preset conversion rule.
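As one possible example of such a preset conversion rule (an assumption for illustration, not the rule used in this embodiment), the characters to be processed could be rendered onto a blank canvas with the Pillow library:

from PIL import Image, ImageDraw, ImageFont

def characters_to_image(text, size=(256, 128), font_path=None, font_size=64):
    """Render the input characters on a plain canvas as a character image to be processed."""
    image = Image.new("RGBA", size, (255, 255, 255, 255))
    draw = ImageDraw.Draw(image)
    font = ImageFont.truetype(font_path, font_size) if font_path else ImageFont.load_default()
    draw.text((10, 10), text, fill=(0, 0, 0, 255), font=font)
    return image

to_be_processed = characters_to_image("Sample")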
and step 42, the terminal acquires the color characteristic and the transparency characteristic of the character image to be processed and the color characteristic and the transparency characteristic of the template character image.
And the terminal extracts the features of the character image to be processed and the template character image to obtain the color feature and the transparency feature of the character image to be processed and the color feature and the transparency feature of the template character image.
and 43, the terminal acquires the trained neural network model from the server.
Here, the trained neural network model is obtained by the server training at least using a training image with transparency characteristics, and the training process may refer to steps S21 to S24.
and step 44, the terminal performs transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model.
Here, when step 44 is implemented, the terminal may perform one-shot learning or few-shot learning on the trained neural network model based on the color feature and the transparency feature of the template character image, so as to obtain the adjusted neural network model.
And step 45, the terminal processes the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain the target character image.
Here, since the adjusted neural network model is obtained by adapting the trained neural network model with the template character image, the target character image obtained by processing the color feature and the transparency feature of the character image to be processed through the adjusted neural network model has the same display effect as the template character image.
Similarly, the terminal may directly output and display the target character image after generating the target character image, or may further synthesize the target character image and the background image in steps 46 to 48 to obtain a synthesized image and output the synthesized image.
And step 46, the terminal acquires the background image and the position information of the target character image in the background image.
Step 47, the terminal synthesizes the target character image and the background image based on the position information to obtain a synthesized image;
And 48, outputting the composite image by the terminal.
It should be noted that, in the above-described image processing method, the explanation of the same steps or concepts as those in other embodiments may refer to the description in other embodiments.
In the image processing method provided by the embodiment of the application, after acquiring the character image to be processed and the template character image, the terminal extracts the color features and transparency features of both images, performs transfer learning on the trained neural network model acquired from the server according to the template character image to obtain an adjusted neural network model, and processes the character image to be processed through the adjusted neural network model to obtain a target character image with the same display effect as the template character image. The target character image is an image in RGBA format: it can be output and displayed independently, or directly superimposed onto a background image to obtain a composite image that does not occlude the background. In this way, character images with artistic display effects can be generated quickly and in batches, the generated character images can be directly superimposed onto background images, and the processing difficulty is reduced so that the processing efficiency can be further improved. In addition, the terminal can realize this processing through an application program, which provides a better human-computer interaction experience for the user.
In some embodiments, the terminal may instead send the template character image to the server. The server obtains the color feature and transparency feature of the template character image, adjusts the trained neural network model according to these features to obtain an adjusted neural network model, and then sends the adjusted neural network model to the terminal. The terminal can then convert the character image to be processed, as determined by the user, into a target character image with the same display effect as the template character image through the adjusted neural network model. The obtained target character image may be output separately, or it may be further superimposed onto a background image to obtain and output a composite image.
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
The embodiment of the application provides a technical scheme that can intelligently generate RGBA art words and that outperforms traditional methods in terms of the RGB effect. In the embodiment of the application, an Attention mechanism is used to extract and generate an Alpha channel on top of an existing RGB artistic word generation model; the artistic word outline is then reshaped according to the extracted Alpha and enhanced with additional data. With the method provided by the embodiment of the application, RGBA artistic words can be generated directly and intelligently, speckle noise outside the artistic words is reduced, and errors produced for certain artistic words can be avoided, so that usable artistic words can be generated in batches, reducing manpower and material resources and providing strong product differentiation and competitiveness for related intelligent design products.
Fig. 6 is a schematic flow chart of an implementation of a method for generating an artistic word in an embodiment of the present application, and as shown in fig. 6, the method includes the following steps:
Step S601, obtaining an artistic word effect diagram in an RGBA format.
Here, the artistic word effect diagram in RGBA format contains the RGB information of the artistic word as well as its transparency feature A. In the embodiment of the present application, the RGB information of the artistic word in the RGBA diagram is denoted y and the corresponding transparency feature is denoted ym. Meanwhile, the original characters corresponding to the artistic words also need to be obtained, and their RGB information and transparency features are denoted x and xm respectively. Thus x, xm, y and ym constitute the training data.
In the present embodiment, a model needs to be established to learn how to transform [x, xm] into [y, ym]. The artistic word effect diagrams in RGBA format in step S601 may be downloaded from the Internet, generated by some algorithm, or made manually. The number of artistic word effect diagrams is not limited; however, to ensure that a usable trained model can be obtained from the training data, 3,000 artistic word effect diagrams in RGBA format are used in the embodiment of the present application.
Step S602, shape retention.
Here, by combining a general U-net, three neural network models Gx, Ex and Ey can be defined. The main skeleton information of the original character x and of the artistic character y is extracted through Ex(x) and Ey(y) respectively, and the original character is restored through Gx(Ex(x)) or Gx(Ey(y)) to obtain a restored character x^. The parameters of each neural network model are trained with the training data so as to reduce the difference between the restored character x^ and the real original character x, until that difference meets the optimization target.
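The following Python/PyTorch sketch illustrates the shape-keeping objective described above with simplified stand-in architectures for Ex, Ey and Gx (the real models are U-net based); layer sizes, the L1 loss and the training step are assumptions made for the example.

import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

Ex = conv_block(3, 16)   # skeleton encoder for the original character x
Ey = conv_block(3, 16)   # skeleton encoder for the artistic character y
Gx = nn.Sequential(conv_block(16, 16), nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid())  # restores x

optimizer = torch.optim.Adam(
    list(Ex.parameters()) + list(Ey.parameters()) + list(Gx.parameters()), lr=1e-4)

def shape_keeping_step(x, y):
    """Restore the original character from either skeleton and pull x^ toward the real x."""
    x_hat_from_x = Gx(Ex(x))
    x_hat_from_y = Gx(Ey(y))
    loss = nn.functional.l1_loss(x_hat_from_x, x) + nn.functional.l1_loss(x_hat_from_y, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()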
step S603, texture generation.
Given an original character x, another original character x~, and the artistic character y~ corresponding to x~, when step S603 is implemented, a spliced matrix is obtained by splicing Ex(x) and Ey(y~), and the spliced matrix is used as the input of a neural network model Gy to generate the artistic character y corresponding to the original character x. That is, the artistic word corresponding to x is generated through the network combination Gy([Ex(x), Ey(y~)]), and the error between the generated artistic word and the real artistic word y is reduced by continuously training the neural network models, so that trained models Gy, Ex and Ey are finally obtained and the artistic word corresponding to an original character can be generated through them.
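Continuing the previous sketch, the texture-generation step can be illustrated as follows; Gy's architecture and the L1 loss are again assumptions, and Ex and Ey are the skeleton encoders from the shape-keeping sketch (in the full method their parameters are updated together with Gy, which is omitted here for brevity).

import torch
import torch.nn as nn

Gy = nn.Sequential(
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
)
optimizer_gy = torch.optim.Adam(Gy.parameters(), lr=1e-4)

def texture_generation_step(x, y_ref, y_true, Ex, Ey):
    """Generate the artistic character for x from Ex(x) spliced with Ey(y_ref) and reduce the gap to y_true."""
    spliced = torch.cat([Ex(x), Ey(y_ref)], dim=1)   # splicing matrix [Ex(x), Ey(y~)]
    y_pred = Gy(spliced)                             # generated artistic character for x
    loss = nn.functional.l1_loss(y_pred, y_true)     # error against the real artistic character y
    optimizer_gy.zero_grad()
    loss.backward()
    optimizer_gy.step()
    return loss.item()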
In step S604, transparency is generated.
Here, when step S604 is implemented, the transparency ym may be generated by a neural network model S([x, y]), and the gap between S([x, y]) and ym is continuously reduced by training the model S. In addition, the error between the element-by-element product of y and the predicted transparency and the element-by-element product of y and ym also needs to be reduced by training the network, where the element-by-element product refers to multiplying two matrices entry by entry.
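A corresponding sketch of the transparency step, with an assumed architecture for S and an assumed L1 loss, predicts ym from the spliced [x, y] and adds the element-by-element product term:

import torch
import torch.nn as nn

S = nn.Sequential(
    nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
)
optimizer_s = torch.optim.Adam(S.parameters(), lr=1e-4)

def transparency_step(x, y, ym_true):
    """Predict ym from the spliced [x, y] and also tie the element-by-element product y * ym to its target."""
    ym_pred = S(torch.cat([x, y], dim=1))                        # S([x, y])
    loss_alpha = nn.functional.l1_loss(ym_pred, ym_true)         # gap between the prediction and ym
    loss_prod = nn.functional.l1_loss(y * ym_pred, y * ym_true)  # element-by-element product term
    loss = loss_alpha + loss_prod
    optimizer_s.zero_grad()
    loss.backward()
    optimizer_s.step()
    return loss.item()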
In step S605, One-shot learning is performed.
Through the implementation of steps S601 to S604, a certain amount of artistic words of the corresponding style is required as training data for each style of artistic word to be generated. In practice, however, a designer typically produces only one or two samples of each style. In this case, the neural network model obtained after training on the 3,000 images through steps S601 to S604 can be stored. When a pair of samples of a new style is available, the stored neural network model can be iterated multiple times on the new-style samples to adjust its parameters and obtain an adjusted neural network model, and the adjusted model can then be used to generate artistic words of that style.
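The one-shot adaptation can be sketched as a short fine-tuning loop over the stored models on the single new-style pair; the step count, the learning rate and the combined loss function passed in are assumptions for illustration.

import torch

def one_shot_finetune(models, loss_fn, x_new, y_new, ym_new, steps=200, lr=1e-5):
    """Iterate the stored pre-trained models on a single new-style sample pair to adjust their parameters."""
    params = [p for m in models for p in m.parameters()]
    optimizer = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        loss = loss_fn(models, x_new, y_new, ym_new)   # combined objective of steps S602 to S604
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return models   # adjusted models, used to generate artistic words of the new style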
In step S606, the generation result is output.
Fig. 7A and 7B are diagrams illustrating artistic word effects generated by the method for making artistic words according to the embodiment of the present application. In fig. 7A, 701 is an artificially designed artistic word and 702 is an artistic word generated by the method provided by the embodiment of the present application; in fig. 7B, 711 is an artificially designed artistic word and 712 is an artistic word generated by the method provided by the embodiment of the present application. Comparing 701 with 702 and 711 with 712 shows that the artistic words generated by the method provided by the embodiment of the present application have no redundant noise points, achieve the same effect as the artificially designed artistic words, and exhibit no generation errors.
In addition, the artistic word effect diagrams 702 and 712 generated by the embodiment of the application are RGBA effect diagrams with transparency features, so they can be dragged directly onto the background diagram of a Banner to achieve non-occluding display. FIG. 8 shows the effect of adding an artistic word (a Chinese promotional phrase) with a flame effect to the background of a Banner.
Existing products that generate Banners through artificial intelligence, such as the Ali Luban system, do not provide a function for making artistic words. The embodiment of the application therefore provides a method for generating RGBA artistic words, which lays a foundation for further automation, batch production and intelligence in subsequent intelligent design, and can provide significant product differentiation and competitiveness for design products. Moreover, artistic words can be made quickly, effectively and in batches through the embodiment of the application, greatly reducing manpower and material resources. In addition, because the artistic words generated by the embodiment of the application have transparency features, the artistic word making tool derived from the embodiment of the application can be used as an independent product and can enable intelligent Banner design, thereby bringing considerable economic benefits.
An exemplary structure of the software modules is described below, and in some embodiments, as shown in fig. 2, the software modules in the apparatus 440, that is, the image processing apparatus 80, may include:
The first obtaining module 81 is configured to obtain color features and transparency features of a text image to be processed and color features and transparency features of a template text image;
A second obtaining module 82, configured to obtain a trained neural network model, where the trained neural network model is obtained by training at least with a training image having transparency features;
The adjusting module 83 is configured to perform transfer learning on the trained neural network model based on the color features and transparency features of the template text images to obtain an adjusted neural network model;
And the processing module 84 is configured to process the color feature and the transparency feature of the character image to be processed through the adjusted neural network model to obtain a target character image, where a display effect of the target character image is the same as a display effect of the template character image.
In some embodiments, the apparatus further comprises:
the third acquisition module is used for acquiring the actual color characteristics and the actual transparency characteristics of a plurality of training character images, wherein the display effects of the training character images are the same;
the fourth acquisition module is used for acquiring source text images corresponding to the plurality of training text images and acquiring actual color characteristics and actual transparency characteristics of the source text images;
The first determining module is used for determining the actual color features and the actual transparency features of the training character images and the actual color features and the actual transparency features of the source character images as training data;
And the training module is used for training a preset neural network model based on the training data to obtain the trained neural network model.
In some embodiments, the neural network model includes at least: a first skeleton extraction model, a second skeleton extraction model, a source character restoration model, a target character generation model and a transparency generation model, and the training module is further configured to:
training the first skeleton extraction model, the second skeleton extraction model and the source character restoration model according to the actual color characteristics of the first source character image and the actual color characteristics of the first training character image in the training data so as to realize shape maintenance;
training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color features of the first source text image, the actual color features of the first training character image and the actual color features of the second training character image in the training data to realize texture generation;
And training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color characteristic of the first source text image, the actual color characteristic of the first training character image and the actual transparency characteristic so as to realize transparency information generation.
in some embodiments, the training module is further to:
inputting actual color characteristics of a first source text image in the training data into a first skeleton extraction model to obtain source text skeleton information;
Inputting the actual color characteristics of the first training character image in the training data into a second skeleton extraction model to obtain first training character skeleton information;
Performing prediction processing based on the source character skeleton information and the first training character skeleton information through a source character restoration model to obtain the predicted color feature of the first source character image;
And back-propagating the difference between the predicted color feature and the actual color feature of the first source character image in the first skeleton extraction model, the second skeleton extraction model and the source character restoration model so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the source character restoration model.
In some embodiments, the training module is further to:
Inputting the color characteristics of the second training character image into a second skeleton extraction model to obtain second training character skeleton information;
Performing prediction processing based on the source character skeleton information and the second training character skeleton information through a target character generation model to obtain predicted color characteristics of the first training character image;
And carrying out back propagation on the difference value of the predicted color characteristic and the actual color characteristic of the first training character image in the first skeleton extraction model, the second skeleton extraction model and the target character generation model so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the target character generation model.
In some embodiments, the training module is further to:
predicting the actual color characteristic of the first source text image and the actual color characteristic of the first training text image through a transparency generation model, and performing backward propagation on the difference value of the predicted transparency characteristic of the first training text image and the actual transparency characteristic of the first training text image in the transparency generation model to update the parameter of the transparency generation model;
obtaining a first correlation characteristic based on the predicted color characteristic and the predicted transparency characteristic of the first training character image;
Obtaining a second correlation characteristic based on the actual color characteristic and the actual transparency characteristic of the first training character image;
And performing back propagation on the first skeleton extraction model, the second skeleton extraction model and the target character generation model based on the difference value of the first association characteristic and the second association characteristic so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the target character generation model.
In some embodiments, the apparatus further comprises:
The fifth acquisition module is used for acquiring a background image;
the synthesis module is used for superposing the target character image to the background image to obtain a synthesized image;
And the output module is used for outputting the composite image.
As an example of the method provided by the embodiment of the present application being implemented in hardware, the method may be directly executed by the processor 410 in the form of a hardware decoding processor, for example, implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
embodiments of the present application provide a storage medium having stored therein executable instructions that, when executed by a processor, will cause the processor to perform methods provided by embodiments of the present application, for example, as illustrated in fig. 3A, 3B, 4, and 5.
In some embodiments, the storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disk, or a CD-ROM; or it may be any device including one or any combination of the above memories.
in some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to a file in a file system, and may be stored as part of a file that holds other programs or data, for example in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).
by way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (10)

1. an image processing method, characterized in that the method comprises:
Acquiring color characteristics and transparency characteristics of a character image to be processed and color characteristics and transparency characteristics of a template character image;
Obtaining a trained neural network model, wherein the trained neural network model is obtained by training at least by using a training image with transparency characteristics;
performing transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model;
and processing the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain a target character image, wherein the display effect of the target character image is the same as that of the template character image.
2. The method of claim 1, further comprising:
Acquiring actual color characteristics and actual transparency characteristics of a plurality of training character images, wherein the display effect of each training character image is the same;
Acquiring source text images corresponding to the training text images, and acquiring actual color characteristics and actual transparency characteristics of the source text images;
Determining the actual color features and the actual transparency features of the training character images and the actual color features and the actual transparency features of the source character images as training data;
And training a preset neural network model based on the training data to obtain the trained neural network model.
3. The method of claim 2, wherein the neural network model comprises at least: a first skeleton extraction model, a second skeleton extraction model, a source character restoration model, a target character generation model and a transparency generation model, and wherein the training a preset neural network model based on the training data to obtain the trained neural network model comprises:
Training the first skeleton extraction model, the second skeleton extraction model and the source character restoration model according to the actual color characteristics of the first source character image and the actual color characteristics of the first training character image in the training data;
Training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color features of the first source character image, the actual color features of the first training character image and the actual color features of the second training character image in the training data;
and training the first skeleton extraction model, the second skeleton extraction model and the target character generation model according to the actual color characteristic of the first source text image, the actual color characteristic and the actual transparency characteristic of the first training character image.
4. the method of claim 3, wherein the training the first skeleton extraction model, the second skeleton extraction model, and the source text recovery model according to the actual color features of the first source text image and the actual color features of the first training text image in the training data comprises:
Inputting actual color characteristics of a first source text image in the training data into a first skeleton extraction model to obtain source text skeleton information;
Inputting the actual color characteristics of the first training character image in the training data into a second skeleton extraction model to obtain first training character skeleton information;
performing prediction processing based on the source character skeleton information and the first training character skeleton information through a source character restoration model to obtain the predicted color feature of the first source character image;
and back-propagating the difference between the predicted color feature and the actual color feature of the first source character image in the first skeleton extraction model, the second skeleton extraction model and the source character restoration model so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the source character restoration model.
5. the method of claim 4, wherein training the first skeleton extraction model, the second skeleton extraction model, and the target text generation model according to the actual color features of the first source text image, the actual color features of the first training text image, and the actual color features of the second training text image in the training data comprises:
Inputting the color characteristics of the second training character image into a second skeleton extraction model to obtain second training character skeleton information;
Performing prediction processing based on the source character skeleton information and the second training character skeleton information through a target character generation model to obtain predicted color characteristics of the first training character image;
and carrying out back propagation on the difference value of the predicted color characteristic and the actual color characteristic of the first training character image in the first skeleton extraction model, the second skeleton extraction model and the target character generation model so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the target character generation model.
6. The method of claim 5, wherein the training the first skeleton extraction model, the second skeleton extraction model, and the target text generation model according to the actual color features of the first source text image, the actual color features of the first training text image, and the actual transparency features comprises:
Predicting the actual color characteristic of the first source text image and the actual color characteristic of the first training text image through a transparency generation model, and performing backward propagation on the difference value of the predicted transparency characteristic of the first training text image and the actual transparency characteristic of the first training text image in the transparency generation model to update the parameter of the transparency generation model;
obtaining a first correlation characteristic based on the predicted color characteristic and the predicted transparency characteristic of the first training character image;
obtaining a second correlation characteristic based on the actual color characteristic and the actual transparency characteristic of the first training character image;
And performing back propagation on the first skeleton extraction model, the second skeleton extraction model and the target character generation model based on the difference value of the first association characteristic and the second association characteristic so as to update the parameters of the first skeleton extraction model, the second skeleton extraction model and the target character generation model.
7. The method according to any one of claims 1 to 6, further comprising:
acquiring the placement positions of a background image and the target character image;
Based on the placement position, the target character image is superposed into the background image to obtain a composite image;
And outputting the composite image.
8. An image processing apparatus, characterized in that the apparatus comprises:
The first acquisition module is used for acquiring the color characteristic and the transparency characteristic of the character image to be processed and the color characteristic and the transparency characteristic of the template character image;
the second acquisition module is used for acquiring a trained neural network model, wherein the trained neural network model is obtained by training at least by using a training image with transparency characteristics;
The adjusting module is used for carrying out transfer learning on the trained neural network model based on the color characteristic and the transparency characteristic of the template character image to obtain an adjusted neural network model;
and the processing module is used for processing the color characteristic and the transparency characteristic of the character image to be processed through the adjusted neural network model to obtain a target character image, wherein the display effect of the target character image is the same as that of the template character image.
9. An image processing apparatus characterized by comprising:
A memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 7 when executing executable instructions stored in the memory.
10. A storage medium having stored thereon executable instructions for causing a processor to perform the method of any one of claims 1 to 7 when executed.
CN201910829520.6A 2019-09-03 2019-09-03 Image processing method, device and storage medium Active CN110544218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910829520.6A CN110544218B (en) 2019-09-03 2019-09-03 Image processing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910829520.6A CN110544218B (en) 2019-09-03 2019-09-03 Image processing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110544218A true CN110544218A (en) 2019-12-06
CN110544218B CN110544218B (en) 2024-02-13

Family

ID=68711190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910829520.6A Active CN110544218B (en) 2019-09-03 2019-09-03 Image processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110544218B (en)



Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110025694A1 (en) * 2009-07-30 2011-02-03 Ptucha Raymond W Method of making an artistic digital template for image display
US20110029635A1 (en) * 2009-07-30 2011-02-03 Shkurko Eugene I Image capture device with artistic template design
US20110029914A1 (en) * 2009-07-30 2011-02-03 Whitby Laura R Apparatus for generating artistic image template designs
US20190096093A1 (en) * 2016-06-29 2019-03-28 Panasonic Intellectual Property Management Co., Ltd. Image processing apparatus and image processing method
US20190156526A1 (en) * 2016-12-28 2019-05-23 Shanghai United Imaging Healthcare Co., Ltd. Image color adjustment method and system
CN107025457A (en) * 2017-03-29 2017-08-08 腾讯科技(深圳)有限公司 A kind of image processing method and device
WO2018177237A1 (en) * 2017-03-29 2018-10-04 腾讯科技(深圳)有限公司 Image processing method and device, and storage medium
CN109285112A (en) * 2018-09-25 2019-01-29 京东方科技集团股份有限公司 Image processing method neural network based, image processing apparatus
CN109410141A (en) * 2018-10-26 2019-03-01 北京金山云网络技术有限公司 A kind of image processing method, device, electronic equipment and storage medium
CN109829925A (en) * 2019-01-23 2019-05-31 清华大学深圳研究生院 A kind of method and model training method for extracting clean prospect in scratching figure task

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAI Haijuan; ZHOU Wei; WANG Cunrui; WANG Lei: "Font style transfer method based on generative adversarial networks", Journal of Dalian Minzu University, no. 03 *
CHEN Yuanyuan; YUAN Huanli; SHI Qishuang: "Handwritten digit recognition based on neural networks", Intelligent Computer and Applications, no. 03 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177449B (en) * 2019-12-30 2021-11-05 深圳市商汤科技有限公司 Multi-dimensional information integration method based on picture and related equipment
CN111177449A (en) * 2019-12-30 2020-05-19 深圳市商汤科技有限公司 Multi-dimensional information integration method based on picture and related equipment
WO2021136178A1 (en) * 2020-01-03 2021-07-08 京东方科技集团股份有限公司 Electronic device and interaction method therefor, and computer-readable storage medium
CN111369481A (en) * 2020-02-28 2020-07-03 当家移动绿色互联网技术集团有限公司 Image fusion method and device, storage medium and electronic equipment
CN111931928A (en) * 2020-07-16 2020-11-13 成都井之丽科技有限公司 Scene graph generation method, device and equipment
CN111931928B (en) * 2020-07-16 2022-12-27 成都井之丽科技有限公司 Scene graph generation method, device and equipment
CN111968048A (en) * 2020-07-30 2020-11-20 国网智能科技股份有限公司 Method and system for enhancing image data of few samples in power inspection
CN111968048B (en) * 2020-07-30 2024-03-26 国网智能科技股份有限公司 Method and system for enhancing image data of less power inspection samples
CN113299250B (en) * 2021-05-14 2022-05-27 漳州万利达科技有限公司 Image display method and device and display equipment
CN113450267A (en) * 2021-05-14 2021-09-28 桂林电子科技大学 Transfer learning method capable of rapidly acquiring multiple natural degradation image restoration models
CN113299250A (en) * 2021-05-14 2021-08-24 漳州万利达科技有限公司 Image display method and device and display equipment
WO2023284738A1 (en) * 2021-07-12 2023-01-19 上海交通大学 Method and system for beautifying image
CN113421214A (en) * 2021-07-15 2021-09-21 北京小米移动软件有限公司 Special effect character generation method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN110544218B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN110544218B (en) Image processing method, device and storage medium
Li et al. Layoutgan: Synthesizing graphic layouts with vector-wireframe adversarial networks
CN113761153A (en) Question and answer processing method and device based on picture, readable medium and electronic equipment
CN113191375A (en) Text-to-multi-object image generation method based on joint embedding
CN115393872B (en) Method, device and equipment for training text classification model and storage medium
CN113506377A (en) Teaching training method based on virtual roaming technology
CN114969282B (en) Intelligent interaction method based on rich media knowledge graph multi-modal emotion analysis model
CN113762039A (en) Information matching method and related device for traffic sign board
CN112184582A (en) Attention mechanism-based image completion method and device
CN111540032A (en) Audio-based model control method, device, medium and electronic equipment
CN117078790B (en) Image generation method, device, computer equipment and storage medium
Yu et al. Mask-guided GAN for robust text editing in the scene
CN111507259B (en) Face feature extraction method and device and electronic equipment
CN117033609A (en) Text visual question-answering method, device, computer equipment and storage medium
CN116823596A (en) Driving state image data set augmentation method and device
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
CN116958323A (en) Image generation method, device, electronic equipment, storage medium and program product
Wang et al. Computer-Aided Traditional Art Design Based on Artificial Intelligence and Human-Computer Interaction
CN115908639A (en) Transformer-based scene image character modification method and device, electronic equipment and storage medium
CN116957669A (en) Advertisement generation method, advertisement generation device, computer readable medium and electronic equipment
CN113673567B (en) Panorama emotion recognition method and system based on multi-angle sub-region self-adaption
Xu Immersive display design based on deep learning intelligent VR technology
CN112634456B (en) Real-time high-realism drawing method of complex three-dimensional model based on deep learning
CN114399708A (en) Video motion migration deep learning system and method
CN112836467A (en) Image processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant