CN115578486A - Image generation method and device, electronic equipment and storage medium


Info

Publication number
CN115578486A
CN115578486A
Authority
CN
China
Prior art keywords
image
determining
preset position
candidate image
candidate
Prior art date
Legal status
Pending
Application number
CN202211283368.4A
Other languages
Chinese (zh)
Inventor
刘俊启 (Liu Junqi)
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202211283368.4A
Publication of CN115578486A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]

Abstract

The present disclosure provides an image generation method, which relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision and image processing. The specific implementation scheme is as follows: performing target detection on an original image to obtain a detection frame of at least one object in the original image; determining at least one first candidate image area in the original image according to the image boundary of the original image and the area boundary of the at least one detection frame; determining a second candidate image area related to a preset position from the at least one first candidate image area according to the size of the target information and one preset position in a preset position sequence; determining a target image area from the at least one second candidate image area according to a first distance between the target information of the second candidate image area and the detection frame; and adding the target information to the original image to obtain a target image. The present disclosure also provides an image generation apparatus, an electronic device, and a storage medium.

Description

Image generation method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision and image processing, and can be applied to information recommendation scenarios. More specifically, the present disclosure provides an image generation method, apparatus, electronic device, and storage medium.
Background
With the development of computer technology and search technology, users can add various information to images. For example, a user may add preset text or a preset image to an image and share the resulting image with other users.
Disclosure of Invention
The present disclosure provides an image generation method, apparatus, device, and storage medium.
According to an aspect of the present disclosure, there is provided an image generation method including: performing target detection on the original image to obtain a detection frame of at least one object in the original image, wherein the detection frame is used for indicating an image area where the object is located in the original image; determining at least one first candidate image area in the original image according to the image boundary of the original image and the area boundary of the at least one detection frame; determining a second candidate image area related to a preset position from the at least one first candidate image area according to the size of the target information and one preset position in a preset position sequence, wherein the preset position sequence comprises at least one preset position; determining a target image area from the at least one second candidate image area according to at least one first distance between the target information of the second candidate image area and the at least one detection frame; and adding the target information to the original image according to the preset position corresponding to the target image area to obtain the target image.
According to another aspect of the present disclosure, there is provided an image generation apparatus including: a target detection module, configured to perform target detection on the original image to obtain a detection frame of at least one object in the original image, wherein the detection frame is used for indicating an image area where the object is located in the original image; a first determining module, configured to determine at least one first candidate image region in the original image according to an image boundary of the original image and a region boundary of the at least one detection frame; a second determining module, configured to determine, according to the size of the target information and a preset position in a preset position sequence, a second candidate image region related to the preset position from the at least one first candidate image region, where the preset position sequence includes the at least one preset position; a third determining module, configured to determine a target image area from the at least one second candidate image area according to at least one first distance between the target information in the second candidate image area and the at least one detection frame; and an adding module, configured to add the target information to the original image according to the preset position corresponding to the target image area to obtain the target image.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method provided according to the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an exemplary system architecture to which the image generation method and apparatus may be applied, according to one embodiment of the present disclosure;
FIG. 2 is a flow diagram of an image generation method according to one embodiment of the present disclosure;
FIG. 3A is a schematic diagram of an original image according to one embodiment of the present disclosure;
FIG. 3B is a schematic diagram of a detection box according to one embodiment of the present disclosure;
FIGS. 3C and 3D are schematic diagrams of candidate image regions according to one embodiment of the present disclosure;
FIG. 3E is a schematic diagram of a target image according to one embodiment of the present disclosure;
FIG. 4 is a schematic illustration of a candidate image region according to another embodiment of the present disclosure;
FIG. 5 is a block diagram of an image generation apparatus according to one embodiment of the present disclosure; and
fig. 6 is a block diagram of an electronic device to which an image generation method may be applied according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As described above, text or images may be added to the image. For example, when adding text, the text may be added to a preset location in the original image (e.g., the lower left or lower right corner of the image). At this preset position, there may be some information in the original image, resulting in a possible conflict between the added text and the information in the original image, making the information in the original image difficult to recognize.
FIG. 1 is a schematic diagram of an exemplary system architecture to which the image generation method and apparatus may be applied, according to one embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. Network 104 is the medium used to provide communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired and/or wireless communication links, and so forth.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The backend management server may analyze and process the received data such as the user request, and feed back a processing result (for example, a web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the image generation method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the image generation apparatus provided by the embodiment of the present disclosure may be generally disposed in the server 105. The image generation method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the image generating apparatus provided in the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
Fig. 2 is a flow diagram of an image generation method according to one embodiment of the present disclosure.
As shown in fig. 2, the method 200 may include operations S210 to S250.
In operation S210, target detection is performed on the original image to obtain a detection frame of at least one object in the original image.
In an embodiment of the present disclosure, the detection box is used to indicate an image area where the object is located in the original image.
In the embodiment of the present disclosure, target detection may be performed on the original image using a deep learning model. For example, a deep learning model may be trained using sample images that include class labels and detection-frame labels, and the trained model may then be used to perform target detection on the original image.
In the embodiment of the present disclosure, performing target detection on the original image may also yield the category of each object. The categories of objects may include animals, goods, trademarks, two-dimensional codes, and the like. For example, the categories may include a spokesperson of a certain product, the product itself, a trademark of the product, a two-dimensional code associated with the product, and so on.
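As a rough illustration only (the disclosure does not prescribe a particular model), the following sketch performs operation S210 with an off-the-shelf detector from torchvision; the input path and the 0.5 score threshold are assumptions for the example.

```python
# A minimal sketch of operation S210 using an off-the-shelf detector.
# The patent does not specify a model; any detector that returns
# per-object boxes and classes would serve the same role.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("original.jpg").convert("RGB")  # hypothetical input path
with torch.no_grad():
    pred = model([to_tensor(image)])[0]

# Keep confident detections; each box is (x1, y1, x2, y2) in pixels.
detections = [
    (box.tolist(), int(label))
    for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"])
    if score >= 0.5
]
```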
In the embodiment of the present disclosure, the detection frame is used to indicate an image area where the object is located in the original image. For example, the detection box may include coordinates and dimensions. The coordinates may be coordinates of the top left vertex of the detection box and the dimensions may include the width and height of the detection box.
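For concreteness, the (top-left coordinates, width, height) encoding of a detection frame described above can be captured in a small helper type. This is a minimal sketch for illustration, not part of the disclosed embodiments; the later snippets in this description reuse this Box class.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its top-left vertex plus width and height,
    matching the (coordinates, size) encoding described above."""
    x: float
    y: float
    w: float
    h: float

    @property
    def right(self) -> float:
        return self.x + self.w

    @property
    def bottom(self) -> float:
        return self.y + self.h

    def intersects(self, other: "Box") -> bool:
        # True if the interiors of the two boxes overlap.
        return (self.x < other.right and other.x < self.right
                and self.y < other.bottom and other.y < self.bottom)
```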
In operation S220, at least one first candidate image region is determined in the original image according to the image boundary of the original image and the region boundary of the at least one detection frame.
For example, a rectangular first candidate image region may be determined using one image boundary and three region boundaries from three detection frames. For another example, another rectangular first candidate image region may be determined using two image boundaries and two region boundaries from two detection frames. It is to be understood that the shape of the candidate image area may be a rectangle or a polygon, which is not limited by the present disclosure.
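One hedged way to realize operation S220 for rectangular regions is a brute-force enumeration: candidate sides are taken only from image boundaries and detection-frame boundaries, and any rectangle that intersects a detection frame is discarded. This is a simplified sketch (it also returns non-maximal rectangles), not the patent's exact procedure; it reuses the Box helper above.

```python
def first_candidate_regions(width, height, detections):
    """Enumerate axis-aligned rectangles whose sides lie on image
    boundaries or detection-frame boundaries and that avoid every
    detection frame. Brute force; fine for a handful of boxes."""
    xs = sorted({0.0, float(width)}
                | {b.x for b in detections} | {b.right for b in detections})
    ys = sorted({0.0, float(height)}
                | {b.y for b in detections} | {b.bottom for b in detections})
    regions = []
    for i, x1 in enumerate(xs):
        for x2 in xs[i + 1:]:
            for j, y1 in enumerate(ys):
                for y2 in ys[j + 1:]:
                    cand = Box(x1, y1, x2 - x1, y2 - y1)
                    if not any(cand.intersects(d) for d in detections):
                        regions.append(cand)
    return regions
```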
In operation S230, a second candidate image region associated with a preset position is determined from the at least one first candidate image region according to the size of the target information and a preset position in the preset position sequence.
In an embodiment of the disclosure, the sequence of preset positions comprises at least one preset position. For example, the preset position sequence may include positions of a lower left corner of the original image, a lower right corner of the original image, an upper right corner of the original image, and an upper left corner of the original image.
In the disclosed embodiment, one or more first candidate image regions having a size greater than or equal to that of the target information may be determined from among the at least one first candidate image region. From these first candidate image regions, a first candidate image region including a preset position may be taken as a second candidate image region associated with the preset position.
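A minimal sketch of this selection step, under the assumption that a preset position is given as a corner point (x, y) of the image:

```python
def second_candidates(regions, preset_xy, target_w, target_h):
    """Among the first candidate regions, keep those big enough for the
    target information and containing the given preset position."""
    px, py = preset_xy
    return [
        r for r in regions
        if r.w >= target_w and r.h >= target_h
        and r.x <= px <= r.right and r.y <= py <= r.bottom
    ]
```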
In operation S240, a target image area is determined from the at least one second candidate image area according to at least one first distance between the target information of the second candidate image area and the at least one detection frame.
In the embodiment of the present disclosure, the target information may be added to a second candidate image area, and then at least one first distance between the target information and the at least one detection frame may be determined. If all of the first distances are greater than or equal to a first preset distance threshold, the second candidate image area may be used as a target image area.
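The distance test can be sketched as follows, where separation computes the gap between the target's bounding box and a detection frame (0 when they touch), and min_gap stands in for the first preset distance threshold; both names are illustrative assumptions, not from the source.

```python
def separation(a: Box, b: Box) -> float:
    """Gap between two boxes (0 if they touch or overlap)."""
    dx = max(b.x - a.right, a.x - b.right, 0.0)
    dy = max(b.y - a.bottom, a.y - b.bottom, 0.0)
    return (dx * dx + dy * dy) ** 0.5

def is_target_region(bounding_box: Box, detections, min_gap: float = 0.0) -> bool:
    # The region qualifies if the placed target does not overlap any
    # detection frame and keeps at least `min_gap` distance from each.
    return all(
        not bounding_box.intersects(d) and separation(bounding_box, d) >= min_gap
        for d in detections
    )
```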
In the embodiment of the present disclosure, for one preset position there may be at least one target image area. For example, there may be exactly one target image area, or there may be two or more.
In operation S250, the target information is added to the original image according to the preset position corresponding to the target image region to obtain a target image.
For example, the target information may be added to the target image area. The original image to which the target information is added may be taken as the target image.
Through the embodiments of the present disclosure, each object in the original image is identified, so that the added target information avoids key information (for example, the objects in the original image). The addition of target information can thus be completed intelligently, reducing the possibility of conflict between the target information and key information in the image. This helps to improve the operating efficiency of related image products, so that key information in images can spread effectively. It also helps artificial-intelligence software tools identify key information in images, increasing the number of times such information is effectively displayed, indirectly improving the experience of recognizing or searching with images, and enhancing the competitiveness of related products.
The image generation method provided by the present disclosure is described in detail below with reference to related embodiments.
Fig. 3A is a schematic diagram of an original image according to one embodiment of the present disclosure.
As shown in fig. 3A, an object 310, an object 320, an object 330, an object 340, and an object 350 may be included in the original image 300. For example, the object 330 may be an item of goods, which may be a volleyball. As shown by object 340, the brand of the item may be "AA". As shown by object 350, the promotional text for the item may be "A series of AA" and "classic look". The spokesperson for the item is shown as object 310 in FIG. 3A. As shown by object 320, the spokesperson's signature may be "Liu somebody".
Fig. 3B is a schematic diagram of a detection box according to one embodiment of the present disclosure.
In some embodiments, the above operation S210 may be performed on the original image 300, resulting in detection frames and categories of a plurality of objects in the original image 300. For example, object detection is performed on the original image 300, and a detection frame 311 of the object 310, a detection frame 321 of the object 320, a detection frame 331 of the object 330, a detection frame 341 of the object 340, and a detection frame 351 of the object 350 can be obtained. For another example, the original image 300 may have a width of 1000 and a height of 600. The top left vertex of the original image 300 is taken as the origin of the coordinate system. The coordinates of the top left vertex of the detection frame 311 may be (15, 2), the width of the detection frame 311 may be 419, and the height may be 594. The coordinates of the top left vertex of the detection frame 321 may be (365, 380), the width of the detection frame 321 may be 157, and the height may be 130. The coordinates of the top left vertex of the detection frame 331 may be (560, 279), the width of the detection frame 331 may be 193, and the height may be 251. The coordinates of the top left vertex of the detection frame 341 may be (494, 69), the width of the detection frame 341 may be 336, and the height may be 99. The coordinates of the top left vertex of the detection frame 351 may be (485, 181), the width of the detection frame 351 may be 352, and the height may be 91.
As another example, the category of the object 310 may be "spokesperson of the goods". The category of the object 320 may be "spokesperson's signature". The category of the object 330 may be "goods". The category of the object 340 may be "item identification". The category of the object 350 may be "promotional text for goods".
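For reference, the worked example of FIGS. 3A and 3B can be written down as data using the Box sketch from earlier:

```python
# The worked example from FIGS. 3A-3B as data, using the Box sketch above.
# Coordinates are (top-left x, top-left y, width, height) on a 1000x600 image.
W, H = 1000, 600
detections = [
    Box(15, 2, 419, 594),     # object 310: spokesperson
    Box(365, 380, 157, 130),  # object 320: spokesperson's signature
    Box(560, 279, 193, 251),  # object 330: goods
    Box(494, 69, 336, 99),    # object 340: item identification
    Box(485, 181, 352, 91),   # object 350: promotional text
]
```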
Fig. 3C and 3D are schematic diagrams of candidate image regions according to one embodiment of the present disclosure.
In some embodiments, in some implementations of operation S220 described above, determining at least one first candidate image region in the original image may include: determining a first candidate image region from at least two image boundaries and at least one region boundary. In the embodiment of the present disclosure, the first candidate image region comes from the image area of the original image other than the image areas indicated by the detection frames. For example, the candidate image region 361 may be determined according to the left image boundary, the lower image boundary, and the upper image boundary of the original image 300, and the left region boundary of the detection frame 311. For another example, the candidate image region 362 may be determined based on the right image boundary and the lower image boundary of the original image 300, the right region boundary of the detection frame 331, and the lower region boundary of the detection frame 351. For another example, the candidate image region 363 may be determined based on the right image boundary and the lower image boundary of the original image 300, the right region boundary of the detection frame 311, and the lower region boundary of the detection frame 331. It is to be understood that the candidate image regions 361 to 363 may all be first candidate image regions. It is also to be understood that the candidate image regions 361 to 363 are merely examples; other candidate image regions may also be determined, which are not described here.
In some embodiments, the at least one preset position is I preset positions, where I is an integer greater than or equal to 1. For example, with I = 4, the 1st preset position may be the lower left corner of the original image 300, the 2nd preset position the lower right corner, the 3rd preset position the upper left corner, and the 4th preset position the upper right corner.
In some embodiments of the foregoing operation S230, determining, according to the size of the target information and a preset position in the preset position sequence, a second candidate image region related to the preset position from the at least one first candidate image region may include: determining a first image boundary and a second image boundary related to the ith preset position from the plurality of image boundaries according to a plurality of second distances between the ith preset position in the preset position sequence and the plurality of image boundaries; determining N initial image regions comprising a part of the first image boundary and a part of the second image boundary from the at least one first candidate image region; and in response to determining that the size of an initial image region is greater than or equal to the size of the target information, determining the initial image region as a second candidate image region. Here, i is an integer greater than or equal to 1 and less than or equal to I, and N is an integer greater than or equal to 1.
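A minimal sketch of the boundary-selection step, assuming a preset position given as a corner point; the returned labels ("top"/"bottom", "left"/"right") are illustrative names, not from the source:

```python
def nearest_boundaries(preset_xy, width, height):
    """Pick the two image boundaries closest to a preset position
    (its "second distances"), e.g. ("bottom", "left") for the
    lower-left corner of the image."""
    px, py = preset_xy
    vertical = "left" if px <= width - px else "right"
    horizontal = "top" if py <= height - py else "bottom"
    return horizontal, vertical
```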
In the embodiment of the present disclosure, the number of the second candidate image regions may be M. M is an integer greater than or equal to 1.
For example, the target information may be target text. The target text may be, for example, "Zhao something". The bounding box of the target text may have a width of 158 and a height of 40.
For example, in the case of i = 1, the second distances between the 1st preset position (the lower left corner of the original image 300) and the left and lower image boundaries of the original image 300 are the smallest, so the lower image boundary may be the first image boundary related to the 1st preset position, and the left image boundary may be the second image boundary related to the 1st preset position. Of the three candidate image regions described above, the candidate image region 361 is defined by the left image boundary and the lower image boundary, so the candidate image region 361 may be used as an initial image region. The candidate image region 361 may have a width of 15, for example. Since this is smaller than the width of the bounding box of the target text, the candidate image region 361 cannot serve as a second candidate image region.
For another example, in the case of i = 2, the second distances between the 2nd preset position (the lower right corner of the original image 300) and the right and lower image boundaries of the original image 300 are the smallest, so the lower image boundary may be the first image boundary related to the 2nd preset position, and the right image boundary may be the second image boundary related to the 2nd preset position. Of the three candidate image regions described above, the candidate image region 362 is defined by the lower image boundary and the right image boundary, so the candidate image region 362 may serve as an initial image region. The candidate image region 362 may have a width of 240 and a height of 315, for example. Both are larger than those of the bounding box of the target text, so the candidate image region 362 may serve as a second candidate image region.
For another example, also in the case of i = 2, among the above three candidate image regions, the candidate image region 363 is determined by the lower image boundary and the right image boundary, so the candidate image region 363 may be another initial image region. The candidate image region 363 may have a width of 560 and a height of 70, for example. Both are larger than those of the bounding box of the target text, so the candidate image region 363 may also be a second candidate image region.
In some embodiments of the above operation S240, determining the target image region from the at least one second candidate image region according to the at least one first distance between the target information of the second candidate image region and the at least one detection frame may include: adding the target information to the m-th second candidate image region; determining whether an overlapping region exists between the bounding box of the target information and a detection frame according to the first distance between the bounding box and the detection frame; and, in response to determining that no overlapping region exists between the bounding box and the at least one detection frame, determining the m-th second candidate image region as a target image region.
In the disclosed embodiments, m is an integer greater than or equal to 1 and less than or equal to M. For example, for the 2nd preset position, the number of second candidate image regions may be 2. It will be appreciated that, for the 2nd preset position, M may be 2 and m can take the values 1 and 2. The candidate image region 362 may be regarded as the 1st second candidate image region, and the candidate image region 363 as the 2nd second candidate image region.
For example, the target text may be added to the candidate image region 362. Next, a first distance between the bounding box of the target text and each of the detection frames 311, 321, 331, 341, and 351 may be determined, and whether the bounding box overlaps a detection frame may be determined according to these first distances. In one example, for the 2nd preset position, the distance between the lower right corner of a detection frame and the upper left corner of the bounding box may be calculated as the first distance. If it is determined that the bounding box does not overlap any detection frame, the candidate image region 362 may be regarded as a target image region.
For another example, the target text may be added to the candidate image region 363. Next, a first distance between the bounding box of the target text and each of the detection frames 311, 321, 331, 341, and 351 may be calculated, and whether the bounding box overlaps a detection frame may be determined according to these first distances. If it is determined that the bounding box does not overlap any detection frame, the candidate image region 363 may also be used as a target image region. Through the embodiments of the present disclosure, whether the bounding box overlaps a detection frame can be determined efficiently from the distance between them, which improves the efficiency of image generation.
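The corner-to-corner first distance used in this example for the 2nd preset position can be sketched as follows (reusing the Box helper; the function name is illustrative):

```python
import math

def corner_first_distance(detection: Box, bounding: Box) -> float:
    # For the lower-right preset position: distance from the detection
    # frame's lower-right corner to the bounding box's upper-left corner.
    return math.hypot(bounding.x - detection.right,
                      bounding.y - detection.bottom)
```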
Next, the above operation S250 may be performed, resulting in the target image shown in fig. 3E.
Fig. 3E is a schematic diagram of a target image according to one embodiment of the present disclosure.
As shown in fig. 3E, the target image includes target text 370. There is no overlapping area between bounding box 371 of target text 370 and the multiple detection boxes. For example, the lower right vertex of bounding box 371 may be aligned with the lower right vertex of original image 300 to add target text 370 to original image 300.
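A hedged sketch of operation S250 for this lower-right alignment, using Pillow; the file paths, the default font, and the white fill are assumptions for the example:

```python
from PIL import Image, ImageDraw, ImageFont

def add_text_lower_right(path_in: str, path_out: str, text: str) -> None:
    image = Image.open(path_in).convert("RGB")
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default()
    # Measure the text's bounding box, then align its lower-right vertex
    # with the lower-right vertex of the image, as in FIG. 3E.
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    x = image.width - (right - left) - left
    y = image.height - (bottom - top) - top
    draw.text((x, y), text, font=font, fill=(255, 255, 255))
    image.save(path_out)
```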
It is to be understood that the method provided by the present disclosure has been explained above by taking as an example the case where the size of the target information is smaller than or equal to the size of some candidate image area. However, the present disclosure is not limited thereto, as described below.
It will be appreciated that the size of the target information may be larger than the size of any of the candidate image regions.
Based on this, in the embodiment of the present disclosure, in response to determining that the size of the target information is larger than the size of every first candidate image area, the size of the target information is reduced to obtain adjusted target information. A second candidate image area is then determined from the at least one first candidate image area according to the adjusted size of the target information and the preset position sequence.
For example, in the case where the target information is a target text, the size of each character in the target text may be reduced. For example, when the target information is a target image, the target image may be scaled to reduce the size of the target information.
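For the image case, the scaling step can be sketched with Pillow as follows (the helper name and the choice of LANCZOS resampling are assumptions; it only ever scales down):

```python
from PIL import Image

def shrink_to_fit(target: Image.Image, region_w: int, region_h: int) -> Image.Image:
    # Scale down (never up), preserving aspect ratio, so the adjusted
    # target information fits within the candidate region.
    scale = min(region_w / target.width, region_h / target.height, 1.0)
    size = (max(1, int(target.width * scale)), max(1, int(target.height * scale)))
    return target.resize(size, Image.LANCZOS)
```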
It is to be understood that the method of the present disclosure has been described in detail above by taking the candidate image region as an example of a rectangle. However, the disclosure is not limited thereto, and in the embodiment of the disclosure, the candidate image area may also be a circle or a polygon, which will be described in detail below.
Fig. 4 is a schematic diagram of a candidate image region according to another embodiment of the present disclosure.
As shown in fig. 4, the candidate image area 464 may be determined based on the lower image boundary of the original image 400, the right area boundary of the detection frame 411, the plurality of area boundaries of the detection frame 421, the lower area boundary of the detection frame 451, and the left area boundary of the detection frame 431. The candidate image area 464 is polygonal in shape.
Fig. 5 is a block diagram of an image generation apparatus according to one embodiment of the present disclosure.
As shown in fig. 5, the apparatus 500 may include an object detection module 510, a first determination module 520, a second determination module 530, a third determination module 540, and an addition module 550.
The target detection module 510 is configured to perform target detection on the original image to obtain a detection frame of at least one object in the original image. For example, the detection box is used to indicate an image area where the object is located in the original image.
A first determining module 520, configured to determine at least one first candidate image region in the original image according to the image boundary of the original image and the region boundary of the at least one detection frame.
A second determining module 530, configured to determine, according to the size of the target information and a preset position in the preset position sequence, a second candidate image region related to the preset position from the at least one first candidate image region. For example, the sequence of preset positions comprises at least one preset position.
A third determining module 540, configured to determine the target image area from the at least one second candidate image area according to at least one first distance between the target information of the second candidate image area and the at least one detection frame.
And an adding module 550, configured to add the target information to the original image according to a preset position corresponding to the target image region, so as to obtain the target image.
In some embodiments, the first determining module comprises: a first determining unit, configured to determine a first candidate image region according to at least two image boundaries and at least one region boundary, where the first candidate image region is from an image region in the original image except the image region indicated by the detection frame.
In some embodiments, the at least one preset position is I preset positions, I being an integer greater than or equal to 1, and the second determining module includes: a second determining unit, configured to determine a first image boundary and a second image boundary associated with an ith preset position from the plurality of image boundaries according to a plurality of second distances between the ith preset position in the preset position sequence and the plurality of image boundaries, where i is an integer greater than or equal to 1 and less than or equal to I; a third determining unit, configured to determine N initial image regions associated with a portion of the first image boundary and a portion of the second image boundary from the at least one first candidate image region, where N is an integer greater than or equal to 1; and a fourth determining unit, configured to determine the initial image area as one of the second candidate image areas in response to determining that the size of the initial image area is greater than or equal to the size of the target information.
In some embodiments, the at least one second candidate image region is M second candidate image regions, M being an integer greater than or equal to 1, and the third determining module includes: an adding unit, configured to add the target information to an mth second candidate image area, where m is an integer greater than or equal to 1 and less than or equal to M; a fifth determining unit, configured to determine whether an overlapping area exists between the bounding box of the target information and a detection frame according to the first distance between the bounding box and the detection frame; and a sixth determining unit, configured to determine the mth second candidate image area as a target image area in response to determining that there is no overlapping area between the bounding box and the at least one detection frame.
In some embodiments, the second determining module includes: a reducing unit, configured to reduce the size of the target information in response to determining that the size of the target information is larger than the size of every first candidate image area, to obtain adjusted target information; and a seventh determining unit, configured to determine a second candidate image region from the at least one first candidate image region according to the adjusted size of the target information and the preset position sequence.
In some embodiments, the shape of the first candidate image region is at least one of rectangular, polygonal, and circular.
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the personal information of the related user are all in accordance with the regulations of related laws and regulations and do not violate the good customs of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
In an embodiment of the present disclosure, an electronic device may include: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform methods provided in accordance with the present disclosure.
In embodiments of the present disclosure, a non-transitory computer readable storage medium has stored thereon computer instructions for causing a computer to perform a method provided in accordance with the present disclosure.
In embodiments of the present disclosure, the computer program product may comprise a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 6, the device 600 comprises a computing unit 601, which may perform various suitable actions and processes according to a computer program stored in a Read-Only Memory (ROM) 602 or loaded from a storage unit 608 into a Random Access Memory (RAM) 603. The RAM 603 may also store the various programs and data required for the operation of the device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 601 executes the respective methods and processes described above, such as the image generation method. For example, in some embodiments, the image generation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the image generation method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the image generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special- or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, which is not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (15)

1. An image generation method comprising:
performing target detection on an original image to obtain a detection frame of at least one object in the original image, wherein the detection frame is used for indicating an image area where the object is located in the original image;
determining at least one first candidate image area in the original image according to the image boundary of the original image and the area boundary of at least one detection frame;
determining a second candidate image area related to a preset position from at least one first candidate image area according to the size of the target information and the preset position in a preset position sequence, wherein the preset position sequence comprises at least one preset position;
determining a target image area from at least one of the second candidate image areas according to at least one first distance between the target information of the second candidate image area and at least one of the detection frames; and
and adding the target information to the original image according to the preset position corresponding to the target image area to obtain a target image.
2. The method of claim 1, wherein said determining at least one first candidate image region in the original image comprises:
and determining the first candidate image area according to at least two image boundaries and at least one area boundary, wherein the first candidate image area is from image areas except the image area indicated by the detection frame in the original image.
3. The method of claim 1, wherein at least one of the preset positions is I of the preset positions, I being an integer greater than or equal to 1,
the determining, according to the size of the target information and a preset position in a preset position sequence, a second candidate image region related to the preset position from at least one of the first candidate image regions includes:
determining a first image boundary and a second image boundary related to an ith preset position from a plurality of image boundaries according to a plurality of second distances between the ith preset position and the plurality of image boundaries in the preset position sequence, wherein i is an integer greater than or equal to 1 and less than or equal to I;
determining N initial image regions associated with a portion of the first image border and a portion of the second image border from at least one of the first candidate image regions, where N is an integer greater than or equal to 1; and
determining the initial image region as a second candidate image region in response to determining that the size of the initial image region is greater than or equal to the size of the target information.
4. The method of claim 1, wherein at least one of the second candidate image regions is M of the second candidate image regions, M being an integer greater than or equal to 1,
the determining a target image region from at least one of the second candidate image regions according to at least one first distance between the target information of the second candidate image region and at least one of the detection boxes comprises:
adding the target information to an mth second candidate image area, wherein m is an integer greater than or equal to 1 and less than or equal to M;
determining whether an overlapping area exists between an enclosing frame of the target information and a detection frame according to a first distance between the enclosing frame and the detection frame; and
in response to determining that there is no overlapping area between the bounding box and at least one of the detection boxes, determining the mth second candidate image area as one of the target image areas.
5. The method according to claim 1, wherein the determining, from at least one of the first candidate image regions, a second candidate image region associated with a preset position according to the size of the target information and the preset position in the preset position sequence comprises:
in response to determining that the size of the target information is larger than the size of any one of the first candidate image regions, reducing the size of the target information to obtain adjusted target information; and
and determining the second candidate image area from at least one first candidate image area according to the adjusted size of the target information and the preset position sequence.
6. The method of claim 1, wherein the first candidate image region is at least one of rectangular, polygonal, and circular in shape.
7. An image generation apparatus comprising:
the target detection module is used for carrying out target detection on an original image to obtain a detection frame of at least one object in the original image, wherein the detection frame is used for indicating an image area where the object is located in the original image;
a first determining module, configured to determine at least one first candidate image region in the original image according to an image boundary of the original image and a region boundary of at least one of the detection frames;
a second determining module, configured to determine, according to a size of the target information and a preset position in a preset position sequence, a second candidate image region related to the preset position from at least one of the first candidate image regions, where the preset position sequence includes at least one preset position;
a third determining module, configured to determine a target image region from at least one of the second candidate image regions according to at least one first distance between the target information of the second candidate image region and at least one of the detection frames; and
and the adding module is used for adding the target information to the original image according to the preset position corresponding to the target image area to obtain a target image.
8. The apparatus of claim 7, wherein the first determining means comprises:
a first determining unit, configured to determine the first candidate image area according to at least two image boundaries and at least one area boundary, where the first candidate image area is from an image area in the original image except the image area indicated by the detection frame.
9. The apparatus of claim 7, wherein at least one of the preset positions is I of the preset positions, I being an integer greater than or equal to 1,
the second determining module includes:
a second determining unit, configured to determine, according to a plurality of second distances between an ith preset position in the preset position sequence and a plurality of image boundaries, a first image boundary and a second image boundary related to the ith preset position from the plurality of image boundaries, where i is an integer greater than or equal to 1 and less than or equal to I;
a third determining unit, configured to determine N initial image regions associated with a portion of the first image boundary and a portion of the second image boundary from at least one of the first candidate image regions, where N is an integer greater than or equal to 1; and
a fourth determining unit for determining the initial image area as a second candidate image area in response to determining that the size of the initial image area is greater than or equal to the size of the target information.
10. The apparatus of claim 7, wherein at least one of the second candidate image regions is M of the second candidate image regions, M being an integer greater than or equal to 1,
the third determining module comprises:
an adding unit configured to add the target information to an mth second candidate image area, where m is an integer greater than or equal to 1 and less than or equal to M;
a fifth determining unit, configured to determine whether an overlapping area exists between an enclosing frame of the target information and the detection frame according to a first distance between the enclosing frame and the detection frame; and
a sixth determining unit, configured to determine the mth second candidate image region as one target image region in response to determining that there is no overlapping region between the bounding box and at least one of the detection boxes.
11. The apparatus of claim 7, wherein the second determining module comprises:
a reduction unit, configured to reduce the size of the target information in response to determining that the size of the target information is larger than the size of any of the first candidate image regions, to obtain adjusted target information; and
a seventh determining unit, configured to determine the second candidate image region from at least one of the first candidate image regions according to the adjusted size of the target information and the preset position sequence.
12. The apparatus of claim 7, wherein the shape of the first candidate image region is at least one of rectangular, polygonal, and circular.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN202211283368.4A 2022-10-19 2022-10-19 Image generation method and device, electronic equipment and storage medium Pending CN115578486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211283368.4A CN115578486A (en) 2022-10-19 2022-10-19 Image generation method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115578486A (en) 2023-01-06

Family

ID=84587548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211283368.4A Pending CN115578486A (en) 2022-10-19 2022-10-19 Image generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115578486A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116433767A (en) * 2023-04-18 2023-07-14 北京百度网讯科技有限公司 Target object detection method, target object detection device, electronic equipment and storage medium
CN116433767B (en) * 2023-04-18 2024-02-20 北京百度网讯科技有限公司 Target object detection method, target object detection device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination