CN116777734A

CN116777734A - Method, device, equipment and storage medium for generating background penetration image

Info

Publication number: CN116777734A
Application number: CN202310566415.4A
Authority: CN
Inventors: 王翔
Original assignee: Shenzhen Xingtong Technology Co ltd
Current assignee: Shenzhen Xingtong Technology Co ltd
Priority date: 2023-05-18
Filing date: 2023-05-18
Publication date: 2023-09-19

Abstract

The disclosure relates to a method, an apparatus, a device and a storage medium for generating a background penetration image. The method comprises the following steps: acquiring a foreground image to which a background penetration trace is to be added, and text information of the foreground image; selecting a basic penetration image matching the text information of the foreground image; cropping, from the basic penetration image, a penetration image matching the size of the foreground image; performing background preprocessing on the penetration image to obtain a background image; and fusing the foreground image with the background image to obtain a background penetration image. Training an image detection model with such images can markedly improve its discrimination capability.

Description

Method, device, equipment and storage medium for generating background penetration image

Technical Field

The disclosure relates to the technical field of image recognition, and in particular relates to a method, a device, equipment and a storage medium for generating a background penetration image.

Background

At present, text detection technology is widely applied to photo-based question search, intelligent grading, automatic text entry and the like. The main purpose of text detection is to locate text lines, including printed text lines and handwritten text lines. Unlike street signs, billboards and the like, the cropped images of teaching aids, test papers and homework used in intelligent grading suffer from the background penetration problem: content on the reverse side of the paper shows through onto the current page, which can greatly affect text line detection results. In the prior art, background penetration traces are generally removed by conventional image processing, and the processed image is then sent to a subsequent detection module. With the development of deep learning, however, a deep learning model can learn to distinguish the features of background penetration traces, provided sufficient data is available. Such a model needs to be trained with image samples containing background penetration traces. Most current image samples are collected manually, and their background penetration traces vary in form, so they improve the discrimination capability of the deep learning model only slightly.

Disclosure of Invention

In order to solve the technical problems described above or at least partially solve the technical problems described above, the present disclosure provides a method, an apparatus, a device, and a storage medium for generating a background penetration image.

According to an aspect of the present disclosure, there is provided a method for generating a background penetration image, including:

acquiring a foreground image to be added with a background penetration trace and text information of the foreground image;

selecting a basic penetration image matched with the text information of the foreground image;

cutting out a penetration image matched with the size of the foreground image from the basic penetration image;

performing background preprocessing on the penetration image to obtain a background image;

and fusing the foreground image and the background image to obtain a background penetration image.

According to another aspect of the present disclosure, there is provided a generation apparatus of a background penetration image, including:

the text information module is used for acquiring a foreground image to which the background penetration trace is to be added and text information of the foreground image;

the selection module is used for selecting a basic penetration image matched with the text information of the foreground image;

the cutting module is used for cutting the penetration image matched with the size of the foreground image from the basic penetration image;

the preprocessing module is used for carrying out background preprocessing on the penetration image to obtain a background image;

and the fusion module is used for fusing the foreground image and the background image to obtain a background penetration image.

According to another aspect of the present disclosure, there is provided an electronic device including:

a processor;

a memory for storing the processor-executable instructions;

the processor is used for reading the executable instructions from the memory and executing the instructions to realize the generation method of the background penetration image.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions, characterized in that the computer instructions, when run on a terminal device, cause the terminal device to implement the above-described background penetration image generation method.

Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages:

According to the method, apparatus, device and storage medium for generating a background penetration image, a foreground image to which a background penetration trace is to be added and text information of the foreground image are acquired; a basic penetration image matching the text information of the foreground image is selected; a penetration image matching the size of the foreground image is cropped from the basic penetration image; background preprocessing is performed on the penetration image to obtain a background image; and the foreground image and the background image are fused to obtain a background penetration image. In this way, both the text information and the size of the penetration image match the foreground image, and the background image obtained by preprocessing the penetration image becomes the background penetration trace in the fused background penetration image. Compared with the prior art, the background penetration trace matches the foreground image in text information and size, which largely avoids variation in the form of background penetration traces and makes the trace more similar to the foreground image. Using such background penetration images to train an image detection model can markedly improve its discrimination capability.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.

In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.

FIG. 1 is a photograph illustrating the background penetration phenomenon;

FIG. 2 is a flowchart of a method for generating a background penetration image according to an embodiment of the present disclosure;

FIG. 3 is a detailed flowchart of step S3 in an embodiment of the present disclosure;

FIG. 4 is a detailed flowchart of step S4 in an embodiment of the present disclosure;

FIG. 5 is a detailed flowchart of step S3 in another embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of an apparatus for generating a background penetration image according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

In order that the above objects, features and advantages of the present disclosure may be more clearly understood, embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below. It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.

It should be noted that the modifiers "a" and "a plurality of" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

At present, text detection technology is widely applied to photo-based question search, intelligent grading, automatic text entry and the like. The main purpose of text detection is to locate text lines, including printed text lines and handwritten text lines. Unlike street signs, billboards and the like, the cropped images of teaching aids, test papers and homework used in intelligent grading suffer from the background penetration problem: content on the reverse side of the paper shows through onto the current page, as shown in fig. 1, which can greatly affect text line detection results. In the prior art, background penetration traces are generally removed by conventional image processing, and the processed image is then sent to a subsequent detection module. With the development of deep learning, however, a deep learning model can learn to distinguish the features of background penetration traces, provided sufficient data is available. Such a model needs to be trained with image samples containing background penetration traces. Most current image samples are collected manually, and their background penetration traces vary in form, so they improve the discrimination capability of the deep learning model only slightly.

In order to solve the above-described problems, a method for generating a background penetration image provided by an embodiment of the present disclosure is described below. In the embodiment of the present disclosure, the method for generating the background penetration image may be performed by an electronic device or a server. The electronic device or the server is the generating end of the background penetration image in the embodiment of the disclosure. The electronic device may include a mobile phone, a tablet computer, a desktop computer, a notebook computer, and other devices with communication functions. The server may be a cloud server or a server cluster, or other devices with storage and computing functions. Note that the following embodiments are exemplarily explained with the electronic device as an execution subject.

Fig. 2 shows a flowchart of a method for generating a background penetration image provided by an embodiment of the present disclosure.

As shown in fig. 2, the method for generating the background penetration image includes the following steps:

S1: acquiring a foreground image to which a background penetration trace is to be added, and text information of the foreground image.

In this embodiment, an image set may be prepared in advance. Each image in the set is a question image cropped from a teaching aid, a test paper or homework, and contains no background penetration traces.

In some embodiments, the text information includes the average character size. Each image in the image set is processed with a single-character detection algorithm, the character detection result of each image is recorded, and the average width and average height of all character frames in each image are calculated and recorded as average_w and average_h.

One or more images are randomly selected from the image set as foreground images, and the average character sizes average_w and average_h of these images are acquired at the same time.
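The calculation of average_w and average_h can be sketched as follows. This is a minimal illustration only: the `(x, y, w, h)` tuple layout for character frames and the function name `average_char_size` are assumptions made for the example, and the single-character detector that produces the frames is not shown.

```python
from statistics import mean


def average_char_size(char_boxes):
    """Return (average_w, average_h) for character frames given as (x, y, w, h).

    The tuple layout is an assumption for this sketch; any detector output
    that exposes a width and a height per character would work the same way.
    """
    if not char_boxes:
        raise ValueError("no character frames were detected")
    average_w = mean(w for _, _, w, _ in char_boxes)
    average_h = mean(h for _, _, _, h in char_boxes)
    return average_w, average_h


# Hypothetical detection result for one image: three character frames.
boxes = [(10, 10, 20, 24), (40, 10, 22, 26), (70, 12, 18, 22)]
average_w, average_h = average_char_size(boxes)
```

Storing the two averages alongside each image lets the later selection step compare images without re-running detection.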

In other embodiments, the text information includes a text category. Taking teaching-aid applications as an example, the text category is the school subject: the text in each image of the image set is obtained with text detection and text recognition algorithms, and the subject to which the text belongs is classified with a natural language processing algorithm, for example as mathematics, Chinese, English, physics, chemistry, geography, politics, history or biology. The present disclosure does not limit the natural language processing algorithm used to classify the text.

One or more images are then randomly selected from the set of images as foreground images while the subject of each of the one or more images is acquired.

In other embodiments, both the average character size and the subject of each foreground image may be acquired.

In other application scenarios, the classification scheme for the text category may change accordingly. For example, for academic papers, the text category may be the academic discipline, such as physics, materials science or medicine, or the language, such as Chinese, English or French. As another example, for journals, the text category may be the topic, such as finance, science and technology, or practice.

S2: selecting a basic penetration image matching the text information of the foreground image.

In this embodiment, the basic penetration image may be selected from the image set described above.

In other embodiments, a separate set of basic penetration images may be prepared in advance. These are likewise question images cropped from teaching aids, test papers and homework, and contain no background penetration traces. In addition, the average character width and height of each basic penetration image must be calculated, or the subject to which each basic penetration image belongs must be identified.

In some embodiments, the text information includes the average character size, and this step comprises: selecting a basic penetration image whose average character size matches that of the foreground image, where the difference between the average character size of the selected basic penetration image and that of the foreground image is less than or equal to a preset value. For example, the average character width and height of the basic penetration image may each differ from those of the foreground image by less than 10 pixels.

In other embodiments, the text information includes a text category, and this step comprises: selecting a basic penetration image of the same text category as the foreground image.

In other embodiments, the selected basic penetration image may both have the same text category as the foreground image and have an average character size that differs from that of the foreground image by no more than the preset value.

For example, one foreground image to which a background penetration trace is to be added is selected in step S1, and one or more basic penetration images are selected from the set of basic penetration images according to the average character width and height (average_w and average_h) of the foreground image and the subject to which it belongs. The specific selection criteria are: the basic penetration image and the foreground image belong to the same subject, and their average character widths and heights differ by less than 10 pixels. The allowed pixel difference can be set according to actual requirements; the principle is to ensure that the characters of the basic penetration image are substantially the same size as those of the foreground image, that is, the characters on the current page and on the reverse page are substantially the same size, as in real user scenarios.
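The selection criteria in the example above can be sketched as a simple filter. The `ImageInfo` record, its field names and the function `select_base_images` are hypothetical names introduced for illustration; only the same-subject rule and the 10-pixel limit come from the description.

```python
from dataclasses import dataclass


@dataclass
class ImageInfo:
    """Hypothetical per-image record holding the text information from step S1."""
    name: str
    subject: str
    average_w: float
    average_h: float


def select_base_images(foreground, candidates, max_diff=10):
    """Keep candidates with the same subject whose average character width and
    height each differ from the foreground image by less than max_diff pixels."""
    return [
        c for c in candidates
        if c.subject == foreground.subject
        and abs(c.average_w - foreground.average_w) < max_diff
        and abs(c.average_h - foreground.average_h) < max_diff
    ]


foreground = ImageInfo("fg", "mathematics", 30, 32)
candidates = [
    ImageInfo("a", "mathematics", 35, 36),  # same subject, sizes close: kept
    ImageInfo("b", "English", 30, 32),      # different subject: rejected
    ImageInfo("c", "mathematics", 45, 32),  # width differs by 15: rejected
    ImageInfo("d", "mathematics", 30, 42),  # height differs by exactly 10: rejected
]
selected = select_base_images(foreground, candidates)
```

The `max_diff` parameter corresponds to the preset value, which the description says can be tuned to actual requirements.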

Regarding the subject, practice shows that the gain in discrimination capability of the trained image detection model is greater when the foreground image and the background penetration trace belong to the same subject than when they belong to different subjects. For example, when the foreground image is a mathematics image and the background penetration trace is also mathematics text, training the image detection model with the fused image yields a larger gain in discrimination capability than when the background penetration trace is text of another subject (such as English).

S3: cropping, from the basic penetration image, a penetration image matching the size of the foreground image.

In this embodiment, the basic penetration image is larger than the foreground image, that is, both the width and the height of the basic penetration image are larger than those of the foreground image, or either the width or the height is larger.

As shown in fig. 3, the present step specifically includes the following steps:

S301: setting a crop frame of the same size as the foreground image.

The crop frame is given the same width and height as the foreground image.

S302: traversing the basic penetration image using the crop frame.

The starting point and order in which the crop frame traverses the basic penetration image are not particularly limited, provided that the whole basic penetration image can be scanned. For example, the crop frame scans the basic penetration image from its top-left corner, from top to bottom and from left to right.

When the area occupied by characters in the crop frame reaches a preset proportion, step S303 is executed. The area occupied by characters is calculated as follows: the minimum rectangle enclosing each single character is taken as its character frame, and step S303 is executed when the ratio of the total area of the character frames inside the crop frame to the area of the crop frame reaches the preset proportion.

The preset proportion is not particularly limited and can be determined according to specific requirements; the principle is to ensure that a certain number of characters are cropped in step S303, so as to avoid cropping too few characters or a blank area. The preset proportion adopted in this embodiment is 30%.

If the basic penetration image has been fully traversed and the area occupied by characters in the crop frame has never reached the preset proportion, another basic penetration image is substituted and the process returns to step S302 to traverse again.

S303: cropping the area currently covered by the crop frame as a penetration image.

After the penetration image is cropped, it is associated with the corresponding foreground image and recorded. Another basic penetration image is then substituted, and the process returns to step S302 to traverse again.

In some embodiments, instead of substituting another basic penetration image, step S302 may continue on the current basic penetration image until it has been fully traversed, so that multiple penetration images can be cropped from one basic penetration image.
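Steps S301 to S303 can be sketched as a sliding-window search. Several details here are assumptions made for the example rather than part of the description: the function names, the `(x, y, w, h)` frame layout, the scan stride, and the choice to count only the part of each character frame that overlaps the crop frame.

```python
def overlap_area(box, frame):
    """Area of the intersection of two (x, y, w, h) rectangles."""
    bx, by, bw, bh = box
    fx, fy, fw, fh = frame
    w = min(bx + bw, fx + fw) - max(bx, fx)
    h = min(by + bh, fy + fh) - max(by, fy)
    return max(w, 0) * max(h, 0)


def find_crop(base_size, fg_size, char_boxes, min_ratio=0.3, stride=8):
    """Scan the basic penetration image row by row from the top-left corner and
    return the first crop frame whose character coverage reaches min_ratio.

    Returns None when no position qualifies, in which case the caller would
    move on to another basic penetration image.
    """
    base_w, base_h = base_size
    fg_w, fg_h = fg_size
    for y in range(0, base_h - fg_h + 1, stride):
        for x in range(0, base_w - fg_w + 1, stride):
            frame = (x, y, fg_w, fg_h)
            covered = sum(overlap_area(b, frame) for b in char_boxes)
            if covered / (fg_w * fg_h) >= min_ratio:
                return frame
    return None


# Hypothetical 100x100 base image whose characters fill the bottom-right corner.
crop = find_crop((100, 100), (40, 40), [(60, 60, 40, 40)], stride=10)
```

To crop several penetration images from one basic penetration image, as the last embodiment above allows, the loop could collect every qualifying frame instead of returning the first.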

S4: performing background preprocessing on the penetration image to obtain a background image.

As shown in fig. 4, this step specifically includes:

S401: horizontally flipping the penetration image.

One penetration image is randomly selected from the penetration images associated with the foreground image and flipped horizontally, so that its characters become mirrored characters, as they appear when seen from the reverse side of the paper.

S402: converting the flipped penetration image into a grayscale image.

The flipped penetration image is converted into a grayscale image using a grayscale conversion algorithm.

S403: converting the grayscale image into a binary image.

The grayscale image is converted into a binary image using an adaptive-threshold binarization algorithm.

S404: converting the binary image into an RGB three-channel image.

The binary image is converted into an RGB three-channel image using the grayscale-to-RGB conversion provided by a computer vision library (such as OpenCV).

S405: blurring the RGB three-channel image to obtain the background image.

The RGB three-channel image is blurred using a Gaussian blur algorithm to obtain the final background image.
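The five preprocessing steps can be sketched without external dependencies as follows. This is an illustration under stated assumptions, not the implementation used in the disclosure: images are nested lists of `(r, g, b)` tuples, the adaptive threshold is a simple local-mean rule, the blur is a fixed 3x3 Gaussian approximation, and all function names and parameter values are chosen for the example. In practice a library such as OpenCV would typically be used, for instance `cv2.flip`, `cv2.cvtColor`, `cv2.adaptiveThreshold` and `cv2.GaussianBlur`.

```python
def flip_horizontal(img):
    """S401: mirror every row of an image stored as rows of (r, g, b) tuples."""
    return [row[::-1] for row in img]


def to_gray(img):
    """S402: luma-weighted grayscale conversion."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in img]


def adaptive_binarize(gray, radius=2, offset=5):
    """S403: a pixel becomes 255 if it exceeds its local mean minus offset, else 0."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = sum(window) / len(window) - offset
            row.append(255 if gray[y][x] > threshold else 0)
        out.append(row)
    return out


def to_rgb(binary):
    """S404: replicate the single channel into three channels."""
    return [[(v, v, v) for v in row] for row in binary]


KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # 3x3 Gaussian approximation, sum 16


def gaussian_blur(img):
    """S405: 3x3 Gaussian blur; pixels outside the image repeat the nearest edge."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = [0, 0, 0]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    p = img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for c in range(3):
                        acc[c] += KERNEL[dy + 1][dx + 1] * p[c]
            row.append(tuple(round(a / 16) for a in acc))
        out.append(row)
    return out


def make_background(penetration_img):
    """Run S401 to S405 in order and return the background image."""
    gray = to_gray(flip_horizontal(penetration_img))
    return gaussian_blur(to_rgb(adaptive_binarize(gray)))


# Hypothetical 5x5 white image with one black pixel at the left edge of row 2.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
sample = [[WHITE] * 5 for _ in range(5)]
sample[2][0] = BLACK
background = make_background(sample)
```

After the flip, the dark pixel sits at the right edge of row 2, and the blur softens it into a gray smudge, which is the visual effect the background trace is meant to have.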

S5: fusing the foreground image with the background image to obtain a background penetration image.

In this embodiment, the foreground image and the background image are fused according to the following expression, and the background penetration image is obtained.

$N_{xy} = W_p \cdot P_{xy} + W_q \cdot Q_{xy}$

where $N_{xy}$ is the pixel value at coordinates (x, y) in the background penetration image, $P_{xy}$ is the pixel value at coordinates (x, y) in the foreground image, $Q_{xy}$ is the pixel value at coordinates (x, y) in the background image, and $W_p$ and $W_q$ are the weights of the foreground pixels and the background pixels, respectively.

A background penetration image containing a background penetration trace is thus obtained. The background penetration image is used as a training sample, and its detection frame annotation is the detection frame annotation of the foreground image.

Adjusting $W_p$ and $W_q$ controls how strongly the background trace shows through. Because the characters of the foreground image are always darker than those of the background image, the weight of the foreground pixels is larger than that of the background pixels: $W_p$ may range from 0.8 to 1, and $W_q$ may range from 0.1 to 0.4.

In some embodiments, $W_p$ and $W_q$ may also be adjusted according to actual needs, provided that the fused characters satisfy $W_p \cdot P_{xy} > W_q \cdot Q_{xy}$, that is, the characters of the foreground image remain darker than those of the background image.

Typically, the same $W_p$ and $W_q$ should be used for every pixel of a background penetration image to keep the whole image uniform. In some special cases, however, for example when character darkness varies considerably across different positions of the foreground image or of the background image, different values of $W_p$ and $W_q$ may be used in different regions.
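The weighted fusion of step S5 can be sketched as follows, using one pair of weights for the whole image. The function name `fuse`, the nested-list image layout and the default weights 0.9 and 0.2 are assumptions for the example (the defaults lie inside the ranges given above). Clipping the result to 255 is also an assumption, since the description does not say how sums above the 8-bit range are handled.

```python
def fuse(foreground, background, w_p=0.9, w_q=0.2):
    """Blend two same-sized images of (r, g, b) tuples per channel as
    N = w_p * P + w_q * Q, clipping each result to the 8-bit maximum of 255."""
    if (len(foreground) != len(background)
            or len(foreground[0]) != len(background[0])):
        raise ValueError("foreground and background must have the same size")
    return [
        [tuple(min(255, round(w_p * p + w_q * q)) for p, q in zip(fp, bp))
         for fp, bp in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
# One row, three pixels: blank paper, a foreground stroke, a background stroke.
fg = [[WHITE, BLACK, WHITE]]
bg = [[WHITE, WHITE, BLACK]]
fused = fuse(fg, bg)
```

In this example the foreground stroke stays dark while the background stroke comes out as light gray, so the show-through is visible but fainter than the real text. Raising `w_q` or lowering `w_p` makes the trace more pronounced. With OpenCV, `cv2.addWeighted` performs the same kind of saturated weighted sum on arrays.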

In some embodiments, the method further comprises the step of:

S6: training an image detection model using the background penetration image as a training sample.

In practical applications, a large number of background penetration images need to be generated by the method provided by the embodiments of the present disclosure in order to train the image detection model.

With the method provided by the embodiments of the present disclosure, a basic penetration image matching the text information of the foreground image is selected, and the penetration image cropped from it matches the size of the foreground image, so that both the text information and the size of the penetration image match the foreground image. The background image obtained by preprocessing the penetration image is fused with the foreground image to form the background penetration image, in which the background image becomes the background penetration trace. Compared with the prior art, the background penetration trace matches the foreground image in text information and size, which largely avoids variation in the form of background penetration traces and makes the trace more similar to the foreground image. Using such background penetration images to train the image detection model can markedly improve its discrimination capability.

Deep-learning-based image detection is data-driven. As model structures become increasingly mature, the quality of the image samples used for training becomes an important factor affecting model performance, where quality mainly refers to the complexity and representativeness of the samples. In the usage scenarios addressed by the present disclosure, there are generally three ways to obtain image samples: 1. images captured by users in actual applications; 2. images collected manually for this purpose; 3. images synthesized by an image algorithm. Of these three, images captured by users are the most representative. For some special scenes, however, such as the images with background penetration traces addressed by the present disclosure, the proportion of user-captured images is very low, yet the algorithm must still handle these scenes. In such cases, image samples of the special type need to be collected manually or synthesized by an image algorithm to improve the robustness of the image detection model in these scenes.

Compared with manual collection, synthesizing image samples with an image algorithm is more controllable, costs less and shortens the data collection cycle. Practice shows that, among synthesized image samples, the discrimination capability of the image detection model improves most noticeably when the background penetration trace is itself text, whereas other types of background penetration traces, such as lines and irregular curves, bring no obvious improvement. Practice also shows that in some user scenes the background penetration trace is extremely severe, for example when the paper is thin and the reverse side is written on in pen, so that the trace shows through almost as darkly as the characters to be detected in the foreground image. Adding ordinary background penetration images hardly improves the model's discrimination capability in this case; it improves further only when the added image samples have strong penetration and the penetrating characters belong to the same subject and are substantially the same size as the foreground characters. It can be considered that the image detection model learns not only the difference in darkness between the background penetration trace and the characters to be detected, but also the difference in character orientation between them.

Therefore, the embodiments of the present disclosure provide a method for generating a background penetration image for the special scene of background penetration. The method increases the diversity of training samples, thereby improving both the discrimination capability and the robustness of the image detection model.

In some embodiments, the size of the basic penetration image may be the same as, or even smaller than, that of the foreground image. Correspondingly, as shown in fig. 5, step S3 specifically includes the following steps:

s311: and setting a cutting frame with the size smaller than that of the foreground image.

And setting a cutting frame with the size smaller than that of the basic infiltration image according to the size of the basic infiltration image, wherein the size of the cutting frame is also smaller than that of the foreground image.

S312: traversing the base penetration image using the crop frame.

For example, the base osmotic image is scanned in a top-to-bottom, left-to-right order starting from the top left corner of the base osmotic image using a crop frame.

When the area occupied by the characters in the cutting frame reaches the preset proportion, step S313 is executed. Because the size of the crop frame is smaller than the foreground image, too few characters are avoided from being cropped, where the preset ratio needs to be increased appropriately, for example to 50%.

If the area occupied by the characters in the cutting frame does not reach the preset proportion until the basic penetration image is traversed, replacing another basic penetration image, and returning to the step S312 to traverse the scanning again.

S313: cropping the region where the cropping frame is currently located as the penetration image.

After the penetration image is cropped, it is associated with the corresponding foreground image and recorded.

S314: adding a blank border to the penetration image so that the penetration image has the same size as the foreground image.

The manner of adding the blank border can be set according to the actual application scenario: the blank border may be added on all sides of the penetration image, or on only one or two sides, so that the penetration image and the foreground image have the same size, which facilitates the subsequent image fusion.

After that, another base penetration image is substituted, and the process returns to step S312 to traverse again.
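Steps S311 to S314 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: the image is represented by a hypothetical binary character mask (1 for a character pixel, 0 otherwise), the "area occupied by characters" is taken to be the fraction of mask pixels set inside the frame, and the blank border is added evenly on all sides. The names `text_ratio`, `find_crop` and `pad_to_size` are invented for the example.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # assumed representation: 1 = character pixel, 0 = blank


def text_ratio(mask: Mask, top: int, left: int, h: int, w: int) -> float:
    """Fraction of the h x w frame at (top, left) covered by character pixels."""
    covered = sum(sum(row[left:left + w]) for row in mask[top:top + h])
    return covered / (h * w)


def find_crop(mask: Mask, h: int, w: int, min_ratio: float = 0.5,
              stride: int = 1) -> Optional[Tuple[int, int]]:
    """Scan from the top-left corner; return the first frame origin whose
    character ratio reaches min_ratio, or None if no frame qualifies."""
    rows, cols = len(mask), len(mask[0])
    for top in range(0, rows - h + 1, stride):
        for left in range(0, cols - w + 1, stride):
            if text_ratio(mask, top, left, h, w) >= min_ratio:
                return top, left
    return None  # the caller would then switch to another base penetration image


def pad_to_size(crop: Mask, out_h: int, out_w: int, fill: int = 0) -> Mask:
    """Centre the crop on a blank canvas of the foreground size (S314)."""
    h, w = len(crop), len(crop[0])
    top, left = (out_h - h) // 2, (out_w - w) // 2
    canvas = [[fill] * out_w for _ in range(out_h)]
    for r in range(h):
        canvas[top + r][left:left + w] = crop[r]
    return canvas
```

A caller would crop the rows and columns returned by `find_crop` and pass the result to `pad_to_size` with the foreground height and width. Padding on one or two sides only, as the text also allows, would just change how `top` and `left` are chosen.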

The embodiment of the present disclosure further provides an apparatus for implementing the above method for generating a background penetration image, which is described below with reference to fig. 6. In the embodiment of the present disclosure, the generating device of the background penetration image may be an electronic device or a server. The electronic device may include a mobile phone, a tablet computer, a desktop computer, a notebook computer, and other devices with communication functions. The server may be a cloud server or a server cluster, or other devices with storage and computing functions.

Fig. 6 shows a schematic structural diagram of a background penetration image generating apparatus according to an embodiment of the present disclosure.

As shown in fig. 6, the generating means of the background penetration image may include:

a text information module 610, configured to obtain a foreground image to which a background penetration trace is to be added, and text information of the foreground image;

a selection module 620, configured to select a basic penetration image that matches text information of the foreground image;

a cropping module 630, configured to crop a penetration image matching the size of the foreground image from the base penetration image;

a preprocessing module 640, configured to perform background preprocessing on the penetration image to obtain a background image;

a fusion module 650, configured to fuse the foreground image with the background image to obtain a background penetration image.

The generating apparatus provided by the embodiment of the disclosure selects a basic penetration image matched with the text information of the foreground image and then crops a penetration image matched with the size of the foreground image, so that both the text information and the size of the penetration image match the foreground image. The background image obtained by background preprocessing of the penetration image is fused with the foreground image to form the background penetration image, in which the background image becomes the background penetration trace. Compared with the prior art, the background penetration trace matches the foreground image in text information and size, which largely avoids inconsistent trace morphology and reduces the difference between the trace and the foreground image. Training the image detection model on such background penetration images can therefore significantly improve its discrimination capability.
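The data flow between modules 610 to 650 can be read as a simple pipeline. The sketch below only illustrates that composition; the class name `PenetrationGenerator`, the callable signatures and the opaque `Image` type are assumptions made for the example and are not defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

Image = Any  # placeholder type; a real system would use an image array


@dataclass
class PenetrationGenerator:
    """Hypothetical wiring of the five modules as injectable callables."""
    get_text_info: Callable[[Image], dict]                  # module 610
    select_base: Callable[[dict, Sequence[Image]], Image]   # module 620
    crop: Callable[[Image, Image], Image]                   # module 630: (base, foreground)
    preprocess: Callable[[Image], Image]                    # module 640
    fuse: Callable[[Image, Image], Image]                   # module 650: (foreground, background)

    def generate(self, foreground: Image, base_pool: Sequence[Image]) -> Image:
        info = self.get_text_info(foreground)
        base = self.select_base(info, base_pool)
        penetration = self.crop(base, foreground)
        background = self.preprocess(penetration)
        return self.fuse(foreground, background)
```

Passing the modules in as callables keeps each stage replaceable, for example to swap the cropping strategy depending on whether the base penetration image is larger or smaller than the foreground image.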

In some embodiments, the text information includes an average size of characters, and the selection module 620 is specifically configured to:

select a basic penetration image matching the average character size of the foreground image, wherein the difference between the average character size of the selected basic penetration image and that of the foreground image is less than or equal to a preset value.

In some embodiments, the text information includes text categories, and the selection module 620 is specifically configured to:

select a base penetration image having the same text category as the foreground image.
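The two selection criteria above (average character size within a preset value, and identical text category) can be sketched as a filter over candidate images. This is a minimal sketch with assumed names: the `Candidate` record, the pixel unit for character size and the category labels are hypothetical and serve only to illustrate the comparison.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """Hypothetical record describing one base penetration image."""
    name: str
    avg_char_size: float  # assumed unit: mean character height in pixels
    category: str         # assumed labels, e.g. "printed" or "handwritten"


def select_candidates(pool: List[Candidate], fg_avg_size: float,
                      max_diff: float,
                      fg_category: Optional[str] = None) -> List[Candidate]:
    """Keep candidates whose average character size differs from the
    foreground's by at most max_diff and, if a category is given,
    whose category equals the foreground's."""
    return [
        c for c in pool
        if abs(c.avg_char_size - fg_avg_size) <= max_diff
        and (fg_category is None or c.category == fg_category)
    ]
```

Leaving `fg_category` as `None` applies only the size criterion, which mirrors the fact that the two criteria are described as separate embodiments.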

In some embodiments, the base penetration image is larger in size than the foreground image, and the cropping module 630 comprises:

a frame setting unit, configured to set a cropping frame with the same size as the foreground image;

a traversing unit, configured to traverse the base penetration image using the cropping frame;

a cropping unit, configured to crop the region where the cropping frame is currently located as the penetration image when the area occupied by characters in the cropping frame reaches a preset proportion.

In some embodiments, the size of the base penetration image is less than or equal to that of the foreground image, and the cropping module 630 comprises:

a frame setting unit, configured to set a cropping frame with a size smaller than that of the foreground image;

a cropping unit, configured to crop the region where the cropping frame is currently located as the penetration image when the area occupied by characters in the cropping frame reaches a preset proportion;

an adding unit, configured to add a blank border to the penetration image so that the penetration image and the foreground image have the same size.

In some embodiments, the preset proportion is 30% or more.

In some embodiments, the fusion module 650 is specifically configured to:

fuse the foreground image and the background image according to the following formula to obtain a background penetration image:

$$N_{xy} = W_p \cdot P_{xy} + W_q \cdot Q_{xy}$$

In some embodiments, $W_p$ ranges from 0.8 to 1 and $W_q$ ranges from 0.1 to 0.4.
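The weighted fusion can be illustrated per pixel as below. This is a sketch under assumptions rather than the disclosed implementation: P is taken to be the foreground pixel and Q the background pixel (which the weight ranges suggest), images are grayscale nested lists, the default weights are arbitrary picks inside the stated ranges, and clipping the result to 0..255 is an added assumption because the weights may sum to more than 1.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 pixel values (assumed representation)


def fuse(foreground: Gray, background: Gray,
         w_p: float = 0.9, w_q: float = 0.2) -> Gray:
    """Per-pixel weighted sum N = w_p * P + w_q * Q, clipped to 0..255."""
    if len(foreground) != len(background) or any(
            len(f) != len(b) for f, b in zip(foreground, background)):
        raise ValueError("foreground and background must have the same size")
    return [
        [min(255, max(0, round(w_p * p + w_q * q)))
         for p, q in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]
```

The size check reflects why the penetration image is cropped or padded to the foreground size beforehand: the formula pairs pixels at identical coordinates.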

In some embodiments, the preprocessing module 640 includes:

a flipping unit, configured to horizontally flip the penetration image;

a grayscale unit, configured to convert the flipped penetration image into a grayscale image;

a binarization unit, configured to convert the grayscale image into a binary image;

a three-channel unit, configured to convert the binary image into an RGB three-channel image;

a blurring unit, configured to blur the RGB three-channel image to obtain a background image.
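The five preprocessing units can be sketched as a chain of small functions. The sketch makes several assumptions that the text does not specify: images are nested lists of (R, G, B) tuples, grayscale conversion uses the common ITU-R BT.601 luma weights, binarization uses a fixed threshold of 128, and a small box blur stands in for whatever blurring the disclosure intends. All function names are hypothetical.

```python
from typing import List, Tuple

RGB = List[List[Tuple[int, int, int]]]
Gray = List[List[int]]


def flip_horizontal(img: list) -> list:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def to_gray(img: RGB) -> Gray:
    """BT.601 luma approximation (an assumed choice of weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in img]


def binarize(gray: Gray, threshold: int = 128) -> Gray:
    """Fixed-threshold binarization; the threshold is an assumption."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


def to_rgb(binary: Gray) -> RGB:
    """Replicate the single channel into three identical channels."""
    return [[(v, v, v) for v in row] for row in binary]


def box_blur(img: RGB, radius: int = 1) -> RGB:
    """Mean filter over a window clamped at the image borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(tuple(sum(px[c] for px in window) // len(window)
                             for c in range(3)))
        out.append(row)
    return out


def make_background(penetration: RGB) -> RGB:
    """Flip, grayscale, binarize, expand to RGB, then blur."""
    return box_blur(to_rgb(binarize(to_gray(flip_horizontal(penetration)))))
```

The horizontal flip models the fact that content seen through the paper appears mirrored, and the final blur softens the hard binary edges before fusion.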

In some embodiments, the apparatus further comprises:

a training module, configured to train the image detection model using the background penetration image as a training sample.

The apparatus provided in this embodiment has the same implementation principle and technical effects as the foregoing method embodiment; for brevity, where this apparatus embodiment is silent, reference may be made to the corresponding content of the method embodiment.

The exemplary embodiments of the present disclosure also provide an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor for causing the electronic device to perform a method according to embodiments of the present disclosure when executed by the at least one processor.

The present disclosure also provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor of a computer, is for causing the computer to perform a method according to embodiments of the disclosure.

Referring to fig. 7, a block diagram of an electronic device 700, which may be a server or a client of the present disclosure and is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 7, the electronic device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

Various components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706, an output unit 707, a storage unit 708, and a communication unit 709. The input unit 706 may be any type of device capable of inputting information to the electronic device 700, and the input unit 706 may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 707 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. The storage unit 708 may include, but is not limited to, magnetic disks and optical disks. The communication unit 709 allows the electronic device 700 to exchange information/data with other devices through computer networks, such as the Internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth(TM) devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the various methods and processes described above. For example, in some embodiments, the method for generating a background penetration image may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. In some embodiments, the computing unit 701 may be configured to perform the method for generating a background penetration image by any other suitable means (e.g., by means of firmware).

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

As used in this disclosure, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for generating a background penetration image, comprising:

2. The method of claim 1, wherein the text information comprises an average size of characters;

the step of selecting a base penetration image matched with text information of a foreground image comprises the following steps:

3. The method of claim 1, wherein the text information comprises a text category;

4. The method of claim 1, wherein the base penetration image is larger in size than the foreground image;

the step of cropping a penetration image matching the size of the foreground image from the base penetration image comprises:

setting a cutting frame with the same size as the foreground image;

traversing the basic penetration image by utilizing a cutting frame;

and when the occupied area of the characters in the cutting frame reaches a preset proportion, cutting the current area where the cutting frame is positioned into a penetration image.

5. The method of claim 1, wherein the step of cropping the penetration image matching the size of the foreground image from the base penetration image comprises:

setting a cutting frame with the size smaller than that of the foreground image;

traversing the basic penetration image by utilizing a cutting frame;

when the occupied area of the characters in the cutting frame reaches a preset proportion, cutting the current area where the cutting frame is positioned into a penetration image;

and adding a blank frame for the penetration image to enable the penetration image to be the same as the foreground image in size.

6. The method according to claim 4 or 5, wherein the step of fusing the foreground image with the background image to obtain the background penetration image comprises:

$$N_{xy} = W_p \cdot P_{xy} + W_q \cdot Q_{xy}$$

7. The method of claim 6, wherein $W_p$ ranges from 0.8 to 1 and $W_q$ ranges from 0.1 to 0.4.

8. The method according to claim 4 or 5, wherein the predetermined proportion is 30% or more.

9. The method of claim 1, wherein the step of background preprocessing the penetration image to obtain a background image comprises:

horizontally flipping the penetration image;

converting the flipped penetration image into a gray level image;

converting the gray level image into a binary image;

converting the binary image into an RGB three-channel image;

and carrying out blurring processing on the RGB three-channel image to obtain a background image.

10. The method as recited in claim 1, further comprising:

and training the image detection model by taking the background penetration image as a training sample.

11. A background-penetration image generating apparatus, comprising:

12. An electronic device, the electronic device comprising:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of the preceding claims 1-10.

13. A non-transitory computer readable storage medium storing computer instructions which, when executed on a terminal device, cause the terminal device to implement the method of any of claims 1-10.