CN117894026A - Text image generation method and device and electronic equipment - Google Patents

Text image generation method and device and electronic equipment

Info

Publication number
CN117894026A
CN117894026A
Authority
CN
China
Prior art keywords
image
character
font
target
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410023510.4A
Other languages
Chinese (zh)
Inventor
詹孟学
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bright Jupiter Private Ltd
Original Assignee
Bright Jupiter Private Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bright Jupiter Private Ltd filed Critical Bright Jupiter Private Ltd
Priority to CN202410023510.4A priority Critical patent/CN117894026A/en
Publication of CN117894026A publication Critical patent/CN117894026A/en
Pending legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)

Abstract

Embodiments of the present invention provide a text image generation method and apparatus, and an electronic device, relating to the technical field of artificial intelligence. The method includes: determining a character, belonging to a target font, to be generated as a target character; acquiring a character image of the target character from the character images of a first preset font as a character image to be utilized; and inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image, belonging to the target font, corresponding to the target character. The image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and character images, belonging to the target font, corresponding to the sample images; the third preset font and the second preset font are different fonts. By means of this scheme, the efficiency of generating characters belonging to the target font is improved.

Description

Text image generation method and device and electronic equipment
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for generating a text image, and an electronic device.
Background
A language often contains a large number of characters; in Chinese, as many as several thousand characters are in common use. When a set of target fonts with a specific style needs to be generated, the characters belonging to the target font must be designed manually, one by one, for this large number of characters. For example, when a user wants to generate a set of personalized fonts in their own handwriting style, the user is required to write out every character one by one. It can be seen that the process of generating the characters belonging to the target font is time-consuming and labor-intensive, and its efficiency is low.
Therefore, how to improve the efficiency of generating characters belonging to the target font is a problem to be solved.
Disclosure of Invention
The embodiment of the invention aims to provide a character image generation method, a character image generation device and electronic equipment, so as to improve the efficiency of generating characters belonging to a target font. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a text image generating method, where the method includes:
determining characters belonging to a target font to be generated as target characters;
acquiring a character image of the target character from each character image of a first preset font, and taking the character image as a character image to be utilized;
inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image, belonging to the target font, corresponding to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and character images, belonging to the target font, corresponding to the sample images; the third preset font and the second preset font are different fonts.
Optionally, before the determining the text to be generated belonging to the target font, the method further includes:
acquiring a character image of a first appointed character from each character image of the first preset font, and taking the character image as a first sample image;
acquiring a character image belonging to the target font corresponding to the first appointed character as a first truth image;
inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
and calculating model loss based on the first predicted image and the first truth image so as to carry out model parameter adjustment on the image conversion model to be trained, and obtaining the image conversion model after training.
Optionally, the pre-training process of the image conversion model includes:
acquiring a font image of a second designated text from each text image of the second preset font as a second sample image;
acquiring a font image of the second designated text from each text image of the third preset font as a second truth image;
inputting the second sample image into an image conversion model to be pre-trained to obtain a second predicted image output by the image conversion model to be pre-trained;
and calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model, thereby obtaining a pre-trained image conversion model.
Optionally, the determining the text to be generated belonging to the target font as the target text includes:
acquiring unified codes representing characters belonging to a target font to be generated as target codes;
the obtaining the text image of the target text from each text image of the first preset font comprises the following steps:
and acquiring the character image of the character represented by the target code from each character image of the first preset font.
Optionally, the determining the text to be generated belonging to the target font as the target text includes:
determining a character from characters to be contained in a character library belonging to the target font to be constructed, and taking the character as the target character;
after the text image to be utilized is input into a pre-trained image conversion model to obtain the text image belonging to the target font corresponding to the target text, the method further comprises:
determining the next character from the characters to be contained as a target character, and returning to execute the step of obtaining the character image of the target character from the character images of the first preset fonts as a character image to be utilized until the character image belonging to the target font corresponding to the characters to be contained is obtained;
and constructing a word stock belonging to the target font based on the obtained text image.
In a second aspect, an embodiment of the present invention provides a text image generating apparatus, including:
the target character acquisition module is used for determining characters to be generated, belonging to the target fonts, as target characters;
the to-be-utilized character image acquisition module is configured to acquire the character image of the target character from the character images of the first preset font as the character image to be utilized;
the character image generation module is configured to input the character image to be utilized into a pre-trained image conversion model to obtain a character image, belonging to the target font, corresponding to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and character images, belonging to the target font, corresponding to the sample images; the third preset font and the second preset font are different fonts.
Optionally, the apparatus further comprises:
the first sample image acquisition module is used for acquiring a character image of a first appointed character from each character image of the first preset font before the target character acquisition module determines the character belonging to the target font to be generated, and the character image is used as a first sample image;
the first truth image acquisition module is used for acquiring a text image belonging to the target font corresponding to the first appointed text as a first truth image;
the first input module is used for inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
And the first parameter adjusting module is used for calculating model loss based on the first predicted image and the first truth image so as to perform model parameter adjustment on the image conversion model to be trained and obtain the image conversion model after training.
Optionally, the apparatus further comprises:
the second sample image acquisition module is used for acquiring a font image of a second designated text from each text image of the second preset font in the pre-training process of the image conversion model, and taking the font image of the second designated text as a second sample image;
the second truth image acquisition module is used for acquiring the font image of the second designated text from each text image of the third preset font as a second truth image;
the second input module is used for inputting the second sample image into an image conversion model to be pre-trained to obtain a second predicted image output by the image conversion model to be pre-trained;
and the second parameter adjusting module is used for calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model and obtain a pre-trained image conversion model.
Optionally, the target text obtaining module is specifically configured to:
Acquiring unified codes representing characters belonging to a target font to be generated as target codes;
the text image acquisition module to be utilized is specifically configured to:
and acquiring the character image of the character represented by the target code from each character image of the first preset font.
Optionally, the target text obtaining module is specifically configured to:
determining a character from characters to be contained in a character library belonging to the target font to be constructed, and taking the character as the target character;
the character image generating module is further configured to determine, after the character image belonging to the target font corresponding to the target character is obtained by inputting the character image to be utilized into a pre-trained image conversion model, a next character from all characters to be included, and as a target character, trigger the character image obtaining module to execute the step of obtaining the character image of the target character from all character images of the first preset font, and as a character image to be utilized, until the character image belonging to the target font corresponding to each character to be included is obtained;
and the word stock construction module is used for constructing a word stock belonging to the target font based on the obtained text image.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any one of the steps of the character image generation method when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the steps of any one of the above-mentioned text image generation methods.
The embodiment of the invention also provides a computer program product containing instructions, which when run on a computer, cause the computer to execute the text image generating method.
The embodiment of the invention has the beneficial effects that:
according to the character image generation method provided by the embodiment of the invention, the image conversion model is obtained by training based on the sample image selected from the first preset fonts and the character image belonging to the target fonts corresponding to the sample image, so that the image conversion model can learn the association relationship between the first preset fonts and the fonts of the target fonts, and further can acquire the character image of the target characters from each character image of the first preset fonts as the character image to be utilized when the target characters belonging to the target fonts need to be generated; and inputting the character image to be utilized into the image conversion model, so that the character image belonging to the target font corresponding to the target character can be automatically generated. Therefore, according to the scheme, the characters belonging to the target font can be obtained in a mode of generating the image by utilizing the image conversion model, so that the efficiency of generating the characters belonging to the target font is improved.
And the image conversion model is also based on a second preset font and a third preset font, and the two different fonts are pre-trained, so that the association relation between the different fonts can be learned, and the image conversion model is trained based on the sample image and the text image belonging to the target font corresponding to the sample image after pre-training, so that the text image of the first preset font can be more accurately converted into the text image belonging to the target font by using the image conversion model.
Of course, it is not necessary for any one product or method of practicing the invention to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art will be briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present invention; other drawings may be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a flowchart of a text image generating method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a text image generating method according to an embodiment of the present invention;
FIG. 3 is another flowchart of a text image generating method according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a text image generating device according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
In order to improve the efficiency of generating characters of a target font, the embodiment of the invention provides a character image generation method which can be applied to electronic equipment such as a computer, a server and the like. The method comprises the following steps:
determining characters belonging to a target font to be generated as target characters;
acquiring a character image of the target character from each character image of a first preset font, and taking the character image as a character image to be utilized;
inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image, belonging to the target font, corresponding to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and character images, belonging to the target font, corresponding to the sample images; the third preset font and the second preset font are different fonts.
In this embodiment, the text belonging to the target font can be obtained by using the image conversion model in a manner of generating an image, thereby improving the efficiency of generating the text belonging to the target font. And the image conversion model is also based on a second preset font and a third preset font, and the two different fonts are pre-trained, so that the association relation between the different fonts can be learned, and the image conversion model is trained based on the sample image and the text image belonging to the target font corresponding to the sample image after pre-training, so that the text image of the first preset font can be more accurately converted into the text image belonging to the target font by using the image conversion model.
The text image generating method provided by the embodiment of the invention is described in detail below with reference to the accompanying drawings, and as shown in fig. 1, the text image generating method may include the following steps:
s101, determining characters belonging to a target font to be generated as target characters;
The target font is a font of a specific style to be generated; it may be an artistic font designed by a designer or a person's handwritten font, but is not limited thereto.
For example, when a user needs to generate a character library, in their own handwriting style, containing all commonly used characters, the target character can be any of those commonly used characters; when a calligrapher's handwriting needs to be imitated, to reproduce characters the calligrapher never wrote or whose written forms have been lost, the target characters are the characters to be imitated.
S102, acquiring a character image of a target character from each character image of a first preset font, and taking the character image as a character image to be utilized;
The first preset font may be any existing font, for example, regular script (Kaiti), Song typeface (Songti), or bold (Heiti). For example, as shown in fig. 2, when the target character is determined to be the character meaning "language", the character image of that character in an existing character library, such as the regular-script font GB2312, may be acquired as the character image to be utilized.
In an implementation manner, the electronic device may first obtain, as a target code, a unicode representing a word to be generated that belongs to the target font; and further acquiring the character image of the character represented by the target code from each character image of the first preset font.
Specifically, an existing character library generally records, for each of a number of characters, a character code, such as its Unicode code point, and a corresponding bitmap representing the character's form. The electronic device may therefore acquire the Unicode code of the target character as the target code, and then acquire the bitmap corresponding to the target code from the character library of the first preset font as the character image to be utilized; alternatively, processing such as image enhancement and resolution conversion may be performed on the acquired bitmap as required to obtain the character image to be utilized.
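The lookup step just described can be sketched as follows, modeling a character library as a mapping from Unicode code points to bitmaps. All names here (`get_image_to_be_utilized`, `kaiti_library`) and the tiny 2×2 bitmaps are hypothetical stand-ins for a real font file:

```python
def get_image_to_be_utilized(target_char, font_library):
    """Acquire the bitmap of target_char from a character library keyed by
    Unicode code point (a hypothetical stand-in for a real font library)."""
    target_code = ord(target_char)  # the "target code": the character's Unicode
    bitmap = font_library.get(target_code)
    if bitmap is None:
        raise KeyError(f"U+{target_code:04X} not found in the preset font")
    return bitmap

# Toy "first preset font": code point -> 2x2 bitmap (0 = white, 1 = black).
kaiti_library = {ord("永"): [[1, 0], [0, 1]]}
image = get_image_to_be_utilized("永", kaiti_library)
```

A real implementation would render the glyph from a TrueType/OpenType font at the resolution the model expects, but the indexing-by-code-point idea is the same.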
S103, inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image, belonging to the target font, corresponding to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and character images, belonging to the target font, corresponding to the sample images; the third preset font and the second preset font are different fonts.
The image conversion model may be a neural network model built on AIGC (Artificial Intelligence Generated Content) techniques, i.e., techniques that generate related content, with an appropriate generalization ability, through learning from and recognition of existing data. Such models include the Stable Diffusion model, the PixArt-alpha model, and the like.
The image conversion model may be pre-trained in advance based on a second preset font and a third preset font, so as to learn the association relationship between different fonts; the second preset font and the third preset font are different fonts, and in one implementation the second preset font may be the same as the first preset font. The model is then trained based on the sample images selected from the first preset font and the character images, belonging to the target font, corresponding to the sample images, so that the model learns the association relationship between the first preset font and the target font.
After the character image to be utilized is acquired, it can be input into the image conversion model, so that the image conversion model outputs the character image, belonging to the target font, corresponding to the target character. Still taking fig. 2 as an example, after the regular-script image of the "language" character is acquired, it may be input into the image conversion model, which then outputs the character image belonging to the target font, for example, the same character in the user's handwriting style.
In this embodiment, the image conversion model is obtained by training based on a sample image selected from a first preset font and a text image belonging to a target font corresponding to the sample image, so that the image conversion model can learn an association relationship between the first preset font and the font of the target font, and further can obtain a text image of the target text from each text image of the first preset font when the target text belonging to the target font needs to be generated, and the text image is used as the text image to be utilized; and inputting the character image to be utilized into the image conversion model, so that the character image belonging to the target font corresponding to the target character can be automatically generated. Therefore, according to the scheme, the characters belonging to the target font can be obtained in a mode of generating the image by utilizing the image conversion model, so that the efficiency of generating the characters belonging to the target font is improved.
And the image conversion model is also based on a second preset font and a third preset font, and the two different fonts are pre-trained, so that the association relation between the different fonts can be learned, and the image conversion model is trained based on the sample image and the text image belonging to the target font corresponding to the sample image after pre-training, so that the text image of the first preset font can be more accurately converted into the text image belonging to the target font by using the image conversion model.
In an embodiment of the present invention, the training process of the image conversion model may be performed before determining the text to be generated and belonging to the target font in step S101, and in an implementation manner, the training process may include the following steps:
a1, acquiring a character image of a first appointed character from each character image of a first preset font, and taking the character image as a first sample image;
The first specified characters may include a plurality of pre-selected characters; when selecting them, characters with large morphological differences from one another should be chosen as far as possible, so as to improve the training effect of the model.
In this step, the electronic device may also acquire a unicode of the first specified text, and further acquire a corresponding bitmap from a font library of the first preset font according to the acquired unicode, as the first sample image.
A2, acquiring a character image belonging to the target font corresponding to the first designated character as a first truth image;
For the first specified characters, character images belonging to the target font may be acquired. For example, when the user needs to generate a set of fonts in their own handwriting style, the first specified characters may be written by the user: if the first specified characters include the character "Yong" (永), the user writes that character by hand. The electronic device may then obtain images of the handwritten first specified characters through human-computer interaction: for example, the user may write the first specified characters on paper, and the images of the handwritten characters are collected by an image collection device and uploaded to the electronic device; or the user may write the first specified characters directly on a touch panel that interacts with the electronic device. Of course, the present invention is not limited thereto.
Step A3, inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
and step A4, calculating model loss based on the first predicted image and the first truth image so as to carry out model parameter adjustment on the image conversion model to be trained, and obtaining the image conversion model after training.
In this embodiment, there may be a plurality of first specified characters. For each first specified character, a corresponding image may be acquired from the character images of the first preset font as the first sample image corresponding to that character, and input into the image conversion model to be trained to obtain a first predicted image corresponding to that character; the model loss is then calculated based on the first predicted image and the first truth image corresponding to that character, so as to adjust the model parameters of the image conversion model to be trained, for example by gradient descent. The above process is repeated until the model converges, yielding the trained image conversion model.
In one implementation, the image transformation model may also be pre-trained prior to training, and the pre-training process may include the steps of:
Step B1, acquiring a font image of a second designated text from each text image of the second preset font as a second sample image;
step B2, acquiring a font image of the second designated text from each text image of the third preset font as a second truth image;
step B3, inputting the second sample image into an image conversion model to be pre-trained, and obtaining a second predicted image output by the image conversion model to be pre-trained;
and step B4, calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model, and obtaining a pre-trained image conversion model.
The pre-training process is similar to the model training process described above. In the pre-training process, the second preset font and the third preset font are both existing fonts, and they are different from each other; for example, the second preset font may be regular script and the third preset font running script. In addition, the number of sample images used in pre-training can be far greater than that used in training, which greatly reduces the number of character images belonging to the target font that need to be acquired. For example, 10000 characters may be used in the pre-training process, while only 100 characters are needed in the training process.
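The two-stage schedule — abundant pre-training pairs between existing fonts, then scarce fine-tuning pairs toward the target font — can be sketched as below. The file names, pair counts, and the `two_stage_training`/`train_step` interface are all hypothetical, chosen only to mirror the 10000-versus-100 example above:

```python
def two_stage_training(model, pretrain_pairs, finetune_pairs, train_step):
    """Steps B1-B4 followed by A1-A4: first learn a generic font-to-font
    mapping from abundant pairs of existing fonts, then specialize on the
    scarce first-preset-font -> target-font pairs."""
    for src, dst in pretrain_pairs:    # pre-training stage (second -> third font)
        train_step(model, src, dst)
    for src, dst in finetune_pairs:    # training stage (first -> target font)
        train_step(model, src, dst)
    return model

# Hypothetical scale from the description: ~10000 characters for pre-training,
# only ~100 for training (the .png file names are made up).
pretrain_pairs = [(f"kaishu_{i}.png", f"xingshu_{i}.png") for i in range(10000)]
finetune_pairs = [(f"kaishu_{i}.png", f"target_{i}.png") for i in range(100)]

model = {"updates": 0}
two_stage_training(model, pretrain_pairs, finetune_pairs,
                   lambda m, s, d: m.update(updates=m["updates"] + 1))
```

The point of the schedule is that only the 100 fine-tuning targets need to be drawn by hand; the 10000 pre-training pairs come entirely from existing font files.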
In one implementation, the pre-training process may further include:
A character image of a third specified character is acquired from the character images of the second preset font as a test character image, the third specified character being different from the second specified characters; a character image of the third specified character is acquired from the character images of the third preset font as a third truth image; the test character image is input into the image conversion model to obtain a third predicted image output by the image conversion model; and the pre-training effect of the image conversion model is then determined based on the similarity between the third predicted image and the third truth image.
That is, a portion may be selected from the overall sample image as the training set and a portion as the test set during the pre-training process.
For example, 5000 characters may be selected from 10000 different characters; their character images are acquired from the character images of the second preset font as second sample images, and from the character images of the third preset font as second truth images, so that the second sample images and second truth images corresponding to these 5000 characters serve as the training set for adjusting the image conversion model. Meanwhile, for the remaining 5000 characters, their character images are acquired from the character images of the second preset font as test character images, and from the character images of the third preset font as third truth images; the test character images are input into the image conversion model to obtain third predicted images, and the pre-training effect of the image conversion model is determined according to the similarity between the third predicted images and the third truth images, so as to evaluate whether the pre-trained image conversion model is usable. If the evaluation result is that it is not usable, the model may be corrected by, for example, readjusting the model structure or replacing the training set.
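The split-and-evaluate procedure can be sketched as follows. The helper names are hypothetical, and the pixel-match metric is only a crude stand-in for whatever image-similarity measure an implementation would actually use:

```python
import random

def split_train_test(characters, train_size, seed=0):
    """Split the candidate characters into a training set and a held-out
    test set, as in the 5000/5000 example above."""
    chars = list(characters)
    random.Random(seed).shuffle(chars)
    return chars[:train_size], chars[train_size:]

def pixel_similarity(pred, truth):
    """Fraction of identical pixels between a third predicted image and the
    third truth image; a crude stand-in for a real similarity metric."""
    pixels = [(a, b) for rp, rt in zip(pred, truth) for a, b in zip(rp, rt)]
    return sum(a == b for a, b in pixels) / len(pixels)

train_chars, test_chars = split_train_test(range(10000), 5000)
```

In practice one would likely use a perceptual metric (e.g., SSIM or LPIPS) rather than exact pixel equality, since diffusion outputs rarely match a truth bitmap pixel-for-pixel.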
In this embodiment, the image conversion model to be trained is trained by acquiring a character image of a first specified character from the character images of the first preset font as a first sample image, and acquiring the character image of the first specified character that belongs to the target font as a first truth image. In this way, the image conversion model learns the association between the first preset font and the target font from only a small number of character images of the first specified characters, and the trained image conversion model can then be used to automatically generate characters belonging to the target font.
In an embodiment of the present invention, if a word stock belonging to a target font needs to be generated, as shown in fig. 3, the text image generating method may include the following steps:
S301, determining a character from the characters to be contained in a word stock belonging to the target font to be constructed, as a target character;
The characters to be contained in the word stock to be constructed can be set as required. For example, for Chinese characters, the word stock may contain the 6763 characters of GB2312, or the 97046 Chinese characters recorded in Unicode version 15.0.
S302, acquiring a character image of the target character from each character image of a first preset font, as a character image to be utilized;
S303, inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image belonging to the target font corresponding to the target character;
In this embodiment, steps S302-S303 are similar to steps S102-S103 and are not repeated here. After step S303, the next character is determined from the characters to be contained as the target character, and execution returns to step S302 of acquiring the character image of the target character from the character images of the first preset font as the character image to be utilized; once character images belonging to the target font have been obtained for all the characters to be contained, step S304 is executed;
S304, constructing a word stock belonging to the target font based on the obtained character images.
After the character image belonging to the target font corresponding to each character is obtained, the word stock can be constructed according to the unified code of each character and its character image belonging to the target font. Specifically, a font library file may be generated, for example a file in TTF (TrueType Font) format, so that the font library can be invoked on terminals such as computers and mobile phones; when such a terminal displays text, it can display the characters in the target font.
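The loop of steps S301-S304 can be sketched as below. This is a schematic outline only: `get_source_glyph`, `conversion_model`, and `write_font` are hypothetical callables, and a real TTF builder (e.g. based on fontTools plus a bitmap-to-outline tracer) would replace the trivial `write_font` shown in the usage note.

```python
def build_font_library(chars_to_include, get_source_glyph, conversion_model, write_font):
    """Steps S301-S304: generate a target-font glyph image for every character to
    be contained, keyed by its unified (Unicode) code point, then hand the
    collection to a font-library builder."""
    glyphs = {}
    for ch in chars_to_include:                      # S301: next target character
        source = get_source_glyph(ch)                # S302: glyph in the first preset font
        glyphs[ord(ch)] = conversion_model(source)   # S303: convert to the target font
    return write_font(glyphs)                        # S304: build the word stock (e.g. a TTF)
```

Keying the glyph table by code point mirrors the patent's use of each character's unified code when assembling the word stock.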
In this embodiment, the efficiency of generating characters belonging to the target font is improved. Furthermore, when a word stock needs to be built, character images belonging to the target font can be automatically generated for every character to be contained in the word stock, thereby improving the efficiency of building the word stock.
Based on the same inventive concept, the embodiment of the invention also provides a text image generating device, as shown in fig. 4, which comprises:
a target text obtaining module 401, configured to determine a text to be generated, which belongs to a target font, as a target text;
the to-be-utilized text image obtaining module 402 is configured to obtain a text image of the target text from each text image of a first preset font, as a to-be-utilized text image;
a text image generating module 403, configured to input the text image to be utilized into a pre-trained image conversion model, to obtain a text image corresponding to the target text and belonging to the target font; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the text images of the first preset font and the text images belonging to the target font corresponding to the sample images, the third preset font and the second preset font being different fonts.
Optionally, the apparatus further comprises:
a first sample image obtaining module, configured to obtain, from each text image of the first preset font, a text image of a first specified text as a first sample image before the target text obtaining module 401 determines a text to be generated that belongs to the target font;
the first truth image acquisition module is used for acquiring a text image belonging to the target font corresponding to the first appointed text as a first truth image;
the first input module is used for inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
and the first parameter adjusting module is used for calculating model loss based on the first predicted image and the first truth image so as to perform model parameter adjustment on the image conversion model to be trained and obtain the image conversion model after training.
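The patent leaves the concrete loss function unspecified; parameter adjustment only requires comparing the predicted image with the truth image. As one common choice for image-to-image conversion models, a per-pixel L1 (mean absolute error) loss can be sketched as follows, with images assumed here to be 2-D grids of grayscale values:

```python
def l1_loss(predicted, truth):
    """Mean absolute pixel difference between a predicted image and a truth
    image of equal size; smaller values indicate a closer match."""
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            total += abs(p - t)
            count += 1
    return total / count
```

In practice this scalar would drive gradient-based parameter adjustment of the image conversion model; adversarial or perceptual losses are equally plausible substitutes and are not ruled out by the patent.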
Optionally, the apparatus further comprises:
the second sample image acquisition module is used for acquiring a font image of a second designated text from each text image of the second preset font in the pre-training process of the image conversion model, and taking the font image of the second designated text as a second sample image;
The second truth image acquisition module is used for acquiring the font image of the second designated text from each text image of the third preset font as a second truth image;
the second input module is used for inputting the second sample image into an image conversion model to be pre-trained to obtain a second predicted image output by the image conversion model to be pre-trained;
and the second parameter adjusting module is used for calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model and obtain a pre-trained image conversion model.
Optionally, the target text obtaining module 401 is specifically configured to:
acquiring unified codes representing characters belonging to a target font to be generated as target codes;
the text image obtaining module 402 is specifically configured to:
and acquiring the character image of the character represented by the target code from each character image of the first preset font.
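Resolving a target code to its character image can be illustrated with a minimal sketch; the `{character: image}` table standing in for "each character image of the first preset font" is a hypothetical data structure, not the patent's storage format.

```python
def glyph_image_for_codepoint(target_code, font_glyphs):
    """Resolve a unified (Unicode) code point to its character, then look up
    that character's glyph image in a per-font {character: image} table."""
    ch = chr(target_code)
    if ch not in font_glyphs:
        raise KeyError("no glyph image for U+%04X (%s)" % (target_code, ch))
    return font_glyphs[ch]
```

For example, the target code 0x4E2D denotes the character 中, whose image would then be fetched from the first preset font's table.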
Optionally, the target text obtaining module 401 is specifically configured to:
determining a character from characters to be contained in a character library belonging to the target font to be constructed, and taking the character as the target character;
the text image generating module 403 is further configured to determine, after the text image corresponding to the target text is obtained by inputting the text image to be utilized into a pre-trained image conversion model, a next text from each text to be included, as a target text, trigger the text image obtaining module to execute the step of obtaining the text image of the target text from each text image of the first preset font, as a text image to be utilized, until a text image corresponding to each text to be included and belonging to the target font is obtained;
And the word stock construction module is used for constructing a word stock belonging to the target font based on the obtained text image.
The embodiment of the invention also provides an electronic device, as shown in fig. 5, comprising a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 communicate with each other through the communication bus 504;
a memory 503 for storing a computer program;
the processor 501 is configured to implement any of the above-described text image generating method steps when executing the program stored in the memory 503.
The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
In yet another embodiment of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements the steps of the text image generating method according to any one of the above embodiments.
In yet another embodiment of the present invention, a computer program product containing instructions that, when run on a computer, cause the computer to perform the text image generating method of any of the above embodiments is also provided.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present invention, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted from one computer-readable storage medium to another, for example, by wired (e.g., coaxial cable, optical fiber, digital Subscriber Line (DSL)), or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid State Disk (SSD)), etc.
It is noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the apparatus, electronic device, and computer-readable storage medium embodiments, the description is relatively brief since they are substantially similar to the method embodiments; for relevant details, reference may be made to the corresponding parts of the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A method for generating a text image, the method comprising:
determining characters belonging to a target font to be generated as target characters;
acquiring a character image of the target character from each character image of a first preset font, and taking the character image as a character image to be utilized;
inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image which belongs to the target font and corresponds to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and the character images belonging to the target font corresponding to the sample images, the third preset font and the second preset font being different fonts.
2. The method of claim 1, wherein prior to said determining text to be generated that belongs to the target font, the method further comprises:
Acquiring a character image of a first appointed character from each character image of the first preset font, and taking the character image as a first sample image;
acquiring a character image belonging to the target font corresponding to the first appointed character as a first truth image;
inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
and calculating model loss based on the first predicted image and the first truth image so as to carry out model parameter adjustment on the image conversion model to be trained, and obtaining the image conversion model after training.
3. The method according to claim 1 or 2, wherein the pre-training process of the image transformation model comprises:
acquiring a font image of a second designated text from each text image of the second preset font as a second sample image;
acquiring a font image of the second designated text from each text image of the third preset font as a second truth image;
inputting the second sample image into an image conversion model to be pre-trained to obtain a second predicted image output by the image conversion model to be pre-trained;
And calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model, thereby obtaining a pre-trained image conversion model.
4. The method according to claim 1, wherein the determining the text to be generated belonging to the target font as the target text includes:
acquiring unified codes representing characters belonging to a target font to be generated as target codes;
the obtaining the text image of the target text from each text image of the first preset font comprises the following steps:
and acquiring the character image of the character represented by the target code from each character image of the first preset font.
5. The method according to claim 1, wherein the determining the text to be generated belonging to the target font as the target text includes:
determining a character from characters to be contained in a character library belonging to the target font to be constructed, and taking the character as the target character;
after the text image to be utilized is input into a pre-trained image conversion model to obtain the text image belonging to the target font corresponding to the target text, the method further comprises:
determining the next character from the characters to be contained as a target character, and returning to execute the step of acquiring the character image of the target character from the character images of the first preset font as a character image to be utilized, until character images belonging to the target font corresponding to all the characters to be contained are obtained;
and constructing a word stock belonging to the target font based on the obtained text image.
6. A text image generating apparatus, the apparatus comprising:
the target character acquisition module is used for determining characters to be generated, belonging to the target fonts, as target characters;
the to-be-utilized character image acquisition module is used for acquiring the character image of the target character from each character image of the first preset font, as the character image to be utilized;
the character image generation module is used for inputting the character image to be utilized into a pre-trained image conversion model to obtain a character image which belongs to the target font and corresponds to the target character; wherein the image conversion model is obtained by pre-training based on a second preset font and a third preset font, and then training based on sample images selected from the character images of the first preset font and the character images belonging to the target font corresponding to the sample images, the third preset font and the second preset font being different fonts.
7. The apparatus of claim 6, wherein the apparatus further comprises:
the first sample image acquisition module is used for acquiring a character image of a first appointed character from each character image of the first preset font before the target character acquisition module determines the character belonging to the target font to be generated, and the character image is used as a first sample image;
the first truth image acquisition module is used for acquiring a text image belonging to the target font corresponding to the first appointed text as a first truth image;
the first input module is used for inputting the first sample image into an image conversion model to be trained to obtain a first predicted image output by the image conversion model to be trained;
and the first parameter adjusting module is used for calculating model loss based on the first predicted image and the first truth image so as to perform model parameter adjustment on the image conversion model to be trained and obtain the image conversion model after training.
8. The apparatus according to claim 6 or 7, characterized in that the apparatus further comprises:
the second sample image acquisition module is used for acquiring a font image of a second designated text from each text image of the second preset font in the pre-training process of the image conversion model, and taking the font image of the second designated text as a second sample image;
The second truth image acquisition module is used for acquiring the font image of the second designated text from each text image of the third preset font as a second truth image;
the second input module is used for inputting the second sample image into an image conversion model to be pre-trained to obtain a second predicted image output by the image conversion model to be pre-trained;
and the second parameter adjusting module is used for calculating model loss based on the second predicted image and the second truth image so as to carry out model parameter adjustment on the image conversion model and obtain a pre-trained image conversion model.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for carrying out the method steps of any one of claims 1-5 when executing a program stored on a memory.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-5.
CN202410023510.4A 2024-01-05 2024-01-05 Text image generation method and device and electronic equipment Pending CN117894026A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410023510.4A CN117894026A (en) 2024-01-05 2024-01-05 Text image generation method and device and electronic equipment


Publications (1)

Publication Number Publication Date
CN117894026A true CN117894026A (en) 2024-04-16

Family

ID=90648404



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination