CN118015153A - Method for generating digital image of pet and electronic equipment

Info

Publication number
CN118015153A
Authority
CN
China
Prior art keywords
pet
target
model
image
variety
Prior art date
Legal status
Pending
Application number
CN202311802067.2A
Other languages
Chinese (zh)
Inventor
徐波
尹攀
楼彬彬
范新龙
李大伟
徐策
文章
Current Assignee
Taobao China Software Co Ltd
Original Assignee
Taobao China Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Taobao China Software Co Ltd
Priority to CN202311802067.2A
Publication of CN118015153A
Legal status: Pending

Classifications

  • Processing Or Creating Images (AREA)

Abstract

The embodiment of the application discloses a method and an electronic device for generating a digital image of a pet, wherein the method comprises: in response to a request submitted by a user to generate images for a target pet, determining a plurality of original images of the target pet carried in the request; determining pet category and breed information of the target pet; determining appearance-feature description information corresponding to the pet category and breed information by querying a pre-established knowledge base; and inputting the plurality of original images and the appearance-feature description information into an image generation model for training, fixing the appearance-feature description information corresponding to the pet category and breed while a dedicated fine-tuning model is trained, so that the appearance features of the target pet and of the category and breed it belongs to are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet. The method ensures that the appearance features of the pet and of its breed are preserved in the generated image.

Description

Method for generating digital image of pet and electronic equipment
Technical Field
The application relates to the technical field of image generation, and in particular to a method for generating a digital image of a pet and an electronic device.
Background
Photo-processing technology is mature, but it has limitations in personalization and creativity for pets. Conventional filters and text-overlay functions can hardly meet the needs of pet owners who want to create unique, interesting photos. AI (Artificial Intelligence) drawing technology that has appeared in the industry, including large "text-to-image" and "image-to-image" models, can make up for the shortcomings of conventional technology in personalization and creativity, but existing AI drawing tools are generally general-purpose, and it is difficult for them to achieve high-quality AI drawing for pets.
Disclosure of Invention
The application provides a method and an electronic device for generating a digital image of a pet, which can ensure that the appearance features of the pet and of the breed it belongs to are preserved in the generated image, improving the quality of the generated image.
The application provides the following scheme:
a method of generating a digital representation of a pet, comprising:
Responding to a request submitted by a user for generating images for a target pet, and determining a plurality of original images about the target pet carried in the request;
determining pet category and variety information of the target pet;
determining appearance characteristic description information corresponding to the pet category and variety information by inquiring a pre-established knowledge base;
Inputting the plurality of original images and the appearance characteristic description information into an image generation model for training, wherein the image generation model consists of a basic model and a fine adjustment model, and solidifying the appearance characteristic description information corresponding to the pet category and variety in the process of training the special fine adjustment model for the target pet so as to record the appearance characteristics of the target pet and the category and variety of the target pet through the special fine adjustment model, and generating a digital image for the target pet.
Wherein the base model is further associated with a control model;
and generating a digital image for the target pet comprises:
determining a reference map, the reference map being an image related to a pet;
preprocessing the reference map with the control model, the preprocessing comprising: performing line-art extraction on the reference map so as to fix the pet action, posture and/or background features included in the reference map, and then generating, through the base model and the fine-tuning model, a digital image having both the appearance features of the target pet and of its category and breed, and the pet action, posture and/or background features of the reference map.
Wherein the method further comprises:
generating a digital profile for the target pet, wherein the digital profile comprises a basic digital image generated for the target pet;
and the determining a reference map comprises:
when generating the basic digital image, determining an image related to personification of a pet as the reference map.
Wherein the method further comprises:
providing a plurality of selectable scene/style theme options;
and the determining a reference map comprises:
determining, according to the selected target scene/style theme, a reference image related to the target scene/style and to pets as the reference map, so as to generate a corresponding scene-based/stylized digital image for the target pet.
Wherein the determining a reference map comprises:
generating the reference map using a text-to-image generation model.
Wherein the method further comprises:
performing quality detection on the original images provided by the user, the quality detection comprising: checking the sharpness and integrity of the original images, determining whether duplicates exist among the original images, and/or judging whether the original images correspond to the same target pet.
A pet information processing method, comprising:
receiving a plurality of original images of a target pet submitted by a user, together with pet category and breed information;
generating a digital profile for the target pet according to the plurality of original images and the pet category and breed information, wherein the digital profile comprises a digital image of the target pet, and the digital image contains the appearance features of the target pet and of its category and breed, as well as personified, scene-based and/or stylized characteristics.
An apparatus for generating a digital image of a pet, comprising:
a request receiving unit, configured to determine, in response to a request submitted by a user to generate images for a target pet, a plurality of original images of the target pet carried in the request;
a category and breed information determining unit, configured to determine the pet category and breed information of the target pet;
a knowledge base query unit, configured to determine the appearance-feature description information corresponding to the pet category and breed information by querying a pre-established knowledge base;
a model training unit, configured to input the plurality of original images and the appearance-feature description information into an image generation model for training, wherein the image generation model consists of a base model and a fine-tuning model, and while a dedicated fine-tuning model is trained for the target pet, the appearance-feature description information corresponding to the pet category and breed is fixed, so that the appearance features of the target pet and of its category and breed are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet.
A pet information processing apparatus, comprising:
an information receiving unit, configured to receive a plurality of original images of a target pet submitted by a user, together with pet category and breed information;
a digital profile generation unit, configured to generate a digital profile for the target pet according to the plurality of original images and the pet category and breed information, wherein the digital profile comprises a digital image of the target pet, and the digital image contains the appearance features of the target pet and of its category and breed, as well as personified, scene-based and/or stylized characteristics.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any one of the methods described above.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of any one of the methods described above.
According to the specific embodiments provided by the application, the application discloses the following technical effects:
According to the embodiment of the application, after a user submits a plurality of images of a pet, a digital image can be generated for the pet. To avoid losing or distorting the distinctive appearance features of a specific pet breed during image generation, a knowledge base of pet appearance features can be pre-established, recording the appearance-feature description information of a plurality of breeds under a plurality of pet categories; while a fine-tuning model is trained for a specific pet, the appearance features corresponding to its breed and category can be fixed using the information in the knowledge base, so that the appearance features of the pet and of its breed are preserved in the generated image, improving the quality of the generated image.
Of course, any single product implementing the application need not achieve all of the advantages described above at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a flow chart of a first method provided by an embodiment of the present application;
FIG. 3 is a schematic illustration of an interface provided by an embodiment of the present application;
FIG. 4 is a flow chart of a second method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the application, fall within the scope of protection of the application.
To facilitate understanding of the technical solution provided by the embodiments of the present application, it should first be explained that a specific engineering flow in the embodiments may be as follows: the user uploads several images of his or her pet (typically 8 to 20, etc.), including photographs taken from various angles and in various poses (of course, 360-degree photography as in three-dimensional reconstruction scenarios is not required), and may also upload the pet's category, breed and other information; a digital profile can then be created for the user's pet, including generating a digital image for the pet, and so on. The digital image may be of various kinds, including for example a basic digital image resembling a human ID photo; alternatively, the user may be offered a plurality of selectable scene/style themes and choose to create a scene-based/stylized digital image for his or her pet, and so on.
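To make the flow above concrete, the following is a minimal Python sketch of the data shapes such a tool might pass around; all field and type names are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class AvatarRequest:
    """A user's image-generation request; field names are assumed."""
    user_id: str
    image_paths: list[str]            # e.g. 8-20 photos from various angles
    pet_category: str | None = None   # e.g. "dog"; may instead be recognized
    pet_breed: str | None = None      # e.g. "labrador retriever"

@dataclass
class DigitalProfile:
    """The digital profile created for the pet."""
    pet_id: str
    base_avatar_path: str             # the ID-photo-like basic digital image
    # scene/style theme -> generated image path, e.g. "swimming" -> "...png"
    styled_avatars: dict[str, str] = field(default_factory=dict)
```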
Regarding the specific digital image, which may be generated by an image generation model, such a digital image may include content created by a large AI model, for example the background in a scene-based/stylized digital image, cartoonization of the face, personification of the body, and so on; at the same time, it needs to retain a certain similarity to the pet's original appearance features. Colloquially, assuming the user's pet is a dog, the generated digital image may be somewhat cartoonized, personified, scene-based, stylized, etc. through the large AI model, but the dog in it must still look like the user's own dog.
However, for pets as processing objects, since the differences between individuals, even in the same organ, can be large, it may be difficult to ensure the quality of the generated image if a general-purpose image generation model is used to generate the digital image.
For example, take face image processing. Since the differences between different people's facial features are usually not large, a general-purpose image generation model may use the notion of an "average face" when generating face images: first, a large number of common face images are averaged to obtain an "average face"; then, when generating a face image for a specific person, the image can be generated from the differences between that person's facial features and those of the "average face", and so on. In this case, if a digital image needs to be generated for a person, only a few photos of that person need to be uploaded; a fine-tuning model trained for that person can record his or her various appearance features, a digital image can then be generated based on the fine-tuning model, and the person's distinctive facial features are retained in the generated digital image, and so on.
It is difficult, however, to achieve this for pets in the manner described above. Taking pet dogs as an example, there are very many breeds, and there are often large differences between breeds in many respects. Consider just the ears: among ear types there are pricked ears (standing upright on both sides of the top of the head, with tips that are not too large), drop ears (hanging naturally on both sides of the cheeks), semi-pricked ears (the base stands upright on top of the head but the tip folds forward), and so on; the span of ear sizes is also large, as pricked ears are generally smaller and drop ears larger, some hanging only near the cheeks and some reaching the chest. As for coat types, there are curly coats, smooth coats, double coats, wiry coats, etc. Face shapes can likewise be classified as oval, long, heart-shaped, square, round, pear-shaped, diamond-shaped, and so on. Moreover, different dogs may also vary widely in size. Precisely because the differences in appearance features between dogs span such a wide range, if dogs were processed with an "average face" approach as for human faces, the generated image might fail to retain the original appearance features of the specific pet individual, and might even fail to retain the original features of its breed, ending up somewhere between breeds, which is obviously undesirable.
In view of the above, the embodiment of the present application provides a corresponding solution, in which a knowledge base may be pre-established. In the knowledge base, corresponding appearance-feature description information may be stored for pets of a plurality of different categories (for example, cats and dogs belong to different categories) and different breeds (for example, under the pet-dog category there may be Dalmatians, Bichon Frises, Chihuahuas, Bulldogs, Yorkshire Terriers and other breeds). Such appearance-feature description information may be expressed in text or the like. For example: "Labrador: ears triangular, hanging naturally on both sides of the head, set relatively far apart, in proportion to the head, usually just large enough to cover the inner corners of the dog's eyes; coat smooth, soft, and variable in color; head relatively broad and slightly rounded, full in structure, without sharp angles, relatively wide seen from the front, nearly equal in width and length", and so on.
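As a minimal sketch of how such a knowledge base might be keyed and queried, assuming a simple in-memory mapping; the entries and key scheme are illustrative, not prescribed by the patent:

```python
# Appearance-feature descriptions keyed by (category, breed); entries are
# illustrative condensations of the kind of text described above.
APPEARANCE_KNOWLEDGE_BASE: dict[tuple[str, str], str] = {
    ("dog", "labrador retriever"): (
        "ears triangular, hanging naturally on both sides of the head, "
        "set far apart, in proportion to the head; coat smooth, soft, "
        "variable in color; head broad, slightly rounded, without sharp "
        "angles, nearly equal in width and length"
    ),
    ("dog", "bichon frise"): "curly white double coat, rounded head, drop ears",
}

def lookup_appearance(category: str, breed: str) -> str | None:
    """Return the breed's appearance description for use as training text,
    or None if the category/breed pair is not recorded."""
    return APPEARANCE_KNOWLEDGE_BASE.get((category.lower(), breed.lower()))
```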
After the user uploads a plurality of images of his or her pet, and in particular before the digital image is generated, the pet's category and breed information can first be determined (it may be entered by the user, or recognized from the images, with the user confirming the recognition result if necessary); the knowledge base is then queried for the appearance-feature description corresponding to that category and breed, and a fine-tuning model is trained on the images together with the appearance-feature description, producing a fine-tuning model dedicated to that pet, so as to preserve its appearance features. In this way, the general features of the specific breed can be fixed during fine-tuning by the information recorded in the knowledge base, preventing the generated image from losing the features of the breed because those general features drift, and improving the quality of the generated image.
The fine-tuning model is a model used together with the base model of the image generation model and can be used to perceive features of processing objects for fine-grained classification purposes. For example, take SD (Stable Diffusion) as the base model: SD is a base model that generates images from text, and LoRA (Low-Rank Adaptation) is a fine-tuning model (a small model) superimposed on the SD base model. A prompt, that is, text or a question provided by the user or the system to the model, serves as input; the prompt may be a complete sentence, a question, a fragment, or simply keywords, and its function is to guide the model's generation, steering it toward output related to the prompt; a uniform style can also be added to the generated images. The embodiment of the application can exploit these properties and, for each specific pet individual, use a LoRA model to train on and learn the pet's appearance features, so as to generate images that are more realistic and better embody the appearance features of that specific pet individual. That is, a dedicated image generation model can be trained for a specific pet individual, but the parameters of the base model (such as SD) are kept unchanged during training, and only the parameters of the fine-tuning model (such as LoRA) are adjusted. Because the number of parameters of the LoRA model is relatively small, its training can be completed with only a small number of training samples (for example, around ten images), which improves efficiency and reduces implementation difficulty.
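The following is a minimal PyTorch sketch of the LoRA idea this paragraph describes, a frozen base layer plus a trainable low-rank update, applied to a single linear projection for brevity; a real SD pipeline would wrap the attention projections of the UNet this way rather than a standalone layer.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer with a trainable low-rank correction.

    Only lora_A and lora_B receive gradients, so the base weights stay
    unchanged, mirroring how the per-pet fine-tuning model is trained
    while the SD base model's parameters are kept fixed.
    """
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the base model weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the low-rank update (B @ A) x; B starts at zero,
        # so training begins exactly at the base model's behavior.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768))
# Only the LoRA parameters end up in the optimizer.
optimizer = torch.optim.AdamW(
    [p for p in layer.parameters() if p.requires_grad], lr=1e-4)
```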
In addition, as described above, when creating a digital image for a pet, besides retaining the pet's own appearance, processes such as cartoonization, personification, scene placement and stylization may be needed; to achieve this, a "base map + control model" approach can be used. The base map refers to a reference map: for example, if an image with a certain style needs to be generated, images with that style can be collected in advance as reference maps, and so on. In the embodiment of the application, the specific base map may have particular scene or style characteristics and may also contain animal-related content; for example, if an image of "a dog shopping" needs to be generated, existing images related to "a dog shopping" can be collected in advance as base maps, and so on.
The control model refers to a control model attached to the base model of the image generation model. Still taking the SD base model as an example, such a base model can generally be associated with a control model such as ControlNet, through which image generation is controlled. The control model may also be called a preprocessor: the reference map (the aforementioned base map) is preprocessed by it, and the map obtained after preprocessing is handed to the base model as a control map. For example, take image A as a reference map containing an owl. If, while image A is processed by the image generation model, the owl's pose in the generated image is expected to stay unchanged, the control model can be used to preprocess image A; such preprocessing may include edge detection or line-art extraction, obtaining the pose information from image A and fixing it, so that the pose is controlled to remain unchanged during image generation, and so on.
Based on the above principle, in the embodiment of the present application, the aforementioned base maps with scene, stylization, personification, cartoon and similar characteristics can be used to generate the pet's digital image. During generation, the base map can be preprocessed by the aforementioned control model; such preprocessing may include line-art extraction, etc., to extract and fix the pet action, posture and/or background features in it, so that when the digital image of a specific pet is generated by the image generation model, the pet's action, posture and/or the background stay consistent with the base map, yielding a scene-based, stylized, personified or cartoonized digital image of the pet.
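As an illustration of this preprocessing step, here is a short OpenCV sketch; the patent does not name a concrete preprocessor, so Canny edge detection stands in for the line-art extraction, producing the white-on-black control image that ControlNet-style models typically consume.

```python
import cv2

def extract_line_art(reference_path: str, out_path: str) -> None:
    """Approximate line-art extraction on a reference map with Canny edges;
    a production pipeline would likely use a dedicated lineart/edge
    preprocessor instead."""
    gray = cv2.imread(reference_path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, threshold1=100, threshold2=200)
    cv2.imwrite(out_path, edges)  # fixed pose/background lines for control

extract_line_art("reference.png", "reference_lineart.png")
```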
From a system architecture perspective, referring to fig. 1, an embodiment of the present application may provide a tool for generating a digital image for a pet, which may face the user in the form of a web page or mobile application, etc. Through the tool, a user can generate a digital image for his or her own pet: the user photographs the pet to obtain a plurality of images and uploads them into the tool, and optionally also uploads the pet's category, breed and other information. The tool can then generate a digital profile of the pet, which may include the digital image generated for the pet by the image generation model. To ensure the quality of the generated image, a knowledge base may be pre-established containing the appearance-feature descriptions of a plurality of categories and breeds of pets, so that these features can be fixed when a fine-tuning model is trained for a specific pet. By default, the digital image may be a basic digital image similar to a human ID photo. In addition, the user may be offered a variety of selectable scene/style themes and may choose to generate a digital image with specific scene/stylized features for his or her pet, and so on.
The following describes in detail the specific implementation scheme provided by the embodiment of the present application.
Example 1
First, this embodiment provides a method for generating a digital image of a pet from the perspective of the aforementioned tool. Referring to fig. 2, the method may specifically include:
S201: in response to a request submitted by a user to generate images for a target pet, determining a plurality of original images of the target pet carried in the request.
After the knowledge base is established, interaction with the user can take place. Specifically, if a user needs to create a digital image for his or her pet, the user may submit a plurality of pictures of the pet, etc.
Then, optionally, in order to ensure the quality of the generated image, quality detection may also be performed on the original images provided by the user, where the specific quality detection may include: checking the sharpness and integrity of the original images, determining whether duplicates exist among the original images, and/or judging whether the original images correspond to the same target pet.
Specifically, the uploaded pet images can undergo a sharpness check by an image processing algorithm to judge whether an image is blurred or noisy, ensuring image quality and sharpness. An image-similarity algorithm can check whether the same pet image was uploaded more than once, avoiding repeated use of an image so as to keep the pet generation result stable. In addition, a pet recognition algorithm can check whether the multiple uploaded images show the same pet, and whether an image contains multiple pets or is incomplete, to ensure the consistency and stability of the generated pet image quality. Through these quality checks, the pet images uploaded by the user can be guaranteed to meet the requirements of subsequent processing, and the generated images will have higher quality and better visual effect. The quality-check steps can run automatically after the user uploads the images, improving both the user experience and the quality of the generated images.
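A minimal sketch of two of the checks described above, a Laplacian-variance sharpness test and an average-hash duplicate test, assuming OpenCV; the thresholds are illustrative tunables, and the same-pet check would additionally need a recognition model, which is omitted here.

```python
import cv2

def is_sharp(path: str, threshold: float = 100.0) -> bool:
    """Sharpness check: the variance of the Laplacian is a common proxy
    for focus; low variance suggests a blurred upload."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var() > threshold

def average_hash(path: str, size: int = 8) -> int:
    """Tiny average-hash used to compare uploads for near-duplicates."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    small = cv2.resize(gray, (size, size))
    bits = (small > small.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def looks_duplicate(path_a: str, path_b: str, max_distance: int = 5) -> bool:
    # A small Hamming distance between hashes marks likely repeated images.
    diff = average_hash(path_a) ^ average_hash(path_b)
    return bin(diff).count("1") <= max_distance
```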
S202: and determining the pet category and variety information of the target pet.
After receiving the request of the user, the original image uploaded by the user is determined, and the pet type and variety information of the target pet can be determined. Specifically, the pet category and variety information may be determined in a variety of ways. For example, in one mode, the user may upload information about the category, variety, etc. of a specific pet in addition to the image, so that the category and variety to which the target pet belongs may be determined directly according to the information uploaded by the user. Of course, in an optional mode, the information of the sex, the age and the like of the pet can be uploaded, so that a more complete digital file can be built for the pet, and a digital image more similar to the characteristics of the sex and the age can be generated. Or in another mode, the information of the category, the variety and the like of the pet can be identified from the original image uploaded by the user through an algorithm, and the identification result can be confirmed by the user in specific implementation.
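The patent does not specify the recognition algorithm; as one hedged illustration, a pretrained image classifier fine-tuned on breed labels could serve, as in this torchvision sketch (BREEDS and the checkpoint path are placeholders):

```python
import torch
from PIL import Image
from torchvision import models, transforms

BREEDS = ["labrador retriever", "bichon frise", "chihuahua", "bulldog"]

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(num_classes=len(BREEDS))
model.load_state_dict(torch.load("breed_classifier.pt"))  # assumed weights
model.eval()

def recognize_breed(image_path: str) -> str:
    """Return the most likely breed for one upload; per the text above,
    the result can then be shown to the user for confirmation."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return BREEDS[int(logits.argmax())]
```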
S203: and determining the appearance characteristic description information corresponding to the pet category and variety information by inquiring a pre-established knowledge base.
In order to avoid that the finally generated image loses the characteristics of a specific variety, a knowledge base can be pre-established, and the knowledge base can comprise the appearance characteristic description information of different types and varieties of pets. Thus, after the category and variety information of the target pet is determined, the appearance characteristic description information of the pet corresponding to the specific category and variety can be inquired and determined from the knowledge base according to the category and variety information. As previously described, such appearance characteristic description information may be expressed in text form.
S204: inputting the plurality of original images and the appearance-feature description information into an image generation model for training, wherein the image generation model consists of a base model and a fine-tuning model, and fixing the appearance-feature description information corresponding to the pet category and breed while training the dedicated fine-tuning model for the target pet, so that the appearance features of the target pet and of its category and breed are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet.
After the appearance-feature description information of the pet corresponding to the specific category and breed is obtained, it can serve as input to the image generation model, where the image generation model may comprise a base model and a fine-tuning model. Specifically, the dedicated fine-tuning model can be produced for the target pet by adjusting and optimizing the parameters of the fine-tuning model while keeping the parameters of the base model unchanged. In this way, the appearance features of the target pet are recorded by the dedicated fine-tuning model, and using the appearance-feature description corresponding to the specific breed fixes those features during fine-tuning, preventing the generated image from deviating from the features of the specific breed.
After the above fine-tuning model has been trained, a digital image can be generated for the user's pet using it. The specific digital image may include a basic digital image similar to a human ID photo (which in this case involves personification of the pet image), a digital image with scene-based/stylized features, and so on. In a specific implementation, the image generation model may be used. To make the personified, scene-based, stylized and other aspects of the generated digital image's composition controllable, a "reference map + control model" approach can also be adopted. That is, some pet-related images with personified, scene-based, stylized and other aspects can be acquired in advance as base maps, and such reference maps can then be preprocessed by the control model, the preprocessing comprising: performing line-art extraction on the reference map so as to fix the pet action, posture and/or background features included in it, and then generating, through the base model and the fine-tuning model, a digital image having both the appearance features of the target pet and the pet action, posture and/or background features of the reference map. In other words, the generated digital image keeps the appearance features of the specific pet through the fine-tuning model, while its action, posture, background and so on are controlled through the control model.
The reference map may be collected from existing material on the network, or generated by a "text-to-image" large model, etc.; this is not limited here.
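Putting the pieces together, the following sketch assumes the Hugging Face diffusers library; the checkpoint IDs, the LoRA weight directory and the prompt text are illustrative, and the control image is the preprocessed line-art of the reference map.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Illustrative checkpoints; the patent names no concrete models.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("./pet_lora")  # the pet's dedicated fine-tuning model

control_image = load_image("reference_lineart.png")  # preprocessed base map
prompt = ("a labrador retriever swimming, triangular drooping ears, "
          "smooth soft coat")  # appearance text from the knowledge base
image = pipe(prompt, image=control_image, num_inference_steps=30).images[0]
image.save("pet_digital_avatar.png")
```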
For example, assume a user uploads a plurality of photos of a pet cat and submits its specific breed and other information. A digital profile can then be generated for the pet and a corresponding page provided, in which an avatar of the pet may be shown, as in fig. 3 (A); the avatar may of course be generated by the tool provided in the embodiment of the application, with some personification applied, for example an ID photo in "formal" attire. Multiple "digital outfits" may also be provided, including ID photos in different clothing, and so on. In addition, a plurality of scene/style theme options may be provided; for example, when the user selects "swimming", the scene required by the user can be determined to be "a cat swimming", and at this point, as shown in fig. 3 (B), several selectable reference maps may optionally be provided for the user to choose from. After a reference map is determined, a scene-based digital image as shown in fig. 3 (C) can be generated, in which not only the appearance features of the specific pet and of its breed are retained, but also personified, scene-based or stylized characteristics, and so on. For example, in the generated digital image, the specific pet's action, posture and so on can be kept identical to those in the reference map, and this effect can be obtained even if the pet in the reference map differs from the current target pet in category, breed, etc.
In summary, according to the embodiment of the application, after a user submits a plurality of images of a pet, a digital image can be generated for the pet. To avoid losing or distorting the distinctive appearance features of a specific pet breed during image generation, a knowledge base of pet appearance features can be pre-established, recording the appearance-feature description information of a plurality of breeds under a plurality of pet categories; while a fine-tuning model is trained for a specific pet, the appearance features corresponding to its breed and category can be fixed using the information in the knowledge base, so that the appearance features of the pet and of its breed are preserved in the generated image, improving the quality of the generated image.
Example two
In the second embodiment, a pet information processing method is provided mainly from the perspective of the business process chain. Referring to fig. 4, the method may include:
S401: receiving a plurality of original images of a target pet submitted by a user, together with pet category and breed information;
S402: generating a digital profile for the target pet according to the plurality of original images and the pet category and breed information, wherein the digital profile comprises a digital image of the target pet, and the digital image contains the appearance features of the target pet and of its category and breed, as well as personified, scene-based and/or stylized characteristics.
For parts not described in the second embodiment, reference may be made to the description of the first embodiment and other parts of the specification, which are not repeated here.
It should be noted that embodiments of the present application may involve the use of user data. In practical applications, user-specific personal data may be used in the solution described herein within the scope permitted by applicable laws and regulations and subject to compliance with them (for example, with the user's explicit consent and practical notification to the user, etc.).
Corresponding to the first embodiment, the embodiment of the application further provides an apparatus for generating a digital image of a pet. Referring to fig. 5, the apparatus may include:
a knowledge base establishing unit 501, configured to establish a pet appearance-feature knowledge base for storing the appearance-feature description information corresponding to a plurality of different breeds under a plurality of pet categories;
a request receiving unit 502, configured to determine, in response to a request submitted by a user to generate images for a target pet, a plurality of original images of the target pet and the pet category and breed information carried in the request;
a knowledge base query unit 503, configured to query the knowledge base according to the pet category and breed information and determine the corresponding appearance-feature description information;
a model training unit 504, configured to input the plurality of original images and the appearance-feature description information into an image generation model for training, wherein the image generation model consists of a base model and a fine-tuning model, and while a dedicated fine-tuning model is trained for the target pet, the appearance-feature description information corresponding to the category and breed is fixed, so that the appearance features of the target pet and of its category and breed are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet.
In a specific implementation, the base model may further be associated with a control model. When a digital image is generated for the target pet, a reference map can be determined, the reference map being an image related to a pet; the reference map is preprocessed by the control model, the preprocessing comprising: performing line-art extraction on the reference map so as to fix the pet action, posture and/or background features included in it, and then generating, through the base model and the fine-tuning model, a digital image having both the appearance features of the target pet and of its category and breed, and the pet action, posture and/or background features of the reference map.
In a specific implementation, the apparatus may further include:
a digital profile generating unit, configured to generate a digital profile for the target pet, wherein the digital profile comprises a basic digital image generated for the target pet; in this case, when generating the basic digital image, an image related to personification of a pet is determined as the reference map.
In addition, the apparatus may further include:
a scene/style theme option providing unit, configured to provide a plurality of selectable scene/style theme options;
in this case, a reference image related to the selected target scene/style may be determined as the reference map according to the selected theme, so as to generate a corresponding scene-based/stylized digital image for the target pet.
In a specific implementation, the reference map may be generated using a text-to-image model.
In addition, the apparatus may further include:
a quality detection unit, configured to perform quality detection on the original images provided by the user, the quality detection comprising: checking the sharpness and integrity of the original images, determining whether duplicates exist among the original images, and/or judging whether the original images correspond to the same target pet.
Corresponding to the second embodiment, the embodiment of the application further provides a pet information processing apparatus, which may include:
an information receiving unit, configured to receive a plurality of original images of a target pet submitted by a user, together with pet category and breed information;
a digital profile generation unit, configured to generate a digital profile for the target pet according to the plurality of original images and the pet category and breed information, wherein the digital profile comprises a digital image of the target pet, and the digital image contains the appearance features of the target pet and of its category and breed, as well as personified, scene-based and/or stylized characteristics.
In addition, the embodiment of the application further provides a computer readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the steps of the method of any one of the preceding method embodiments.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of any of the preceding method embodiments.
Fig. 6 illustrates the architecture of an electronic device. For example, device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, an aircraft, and so forth.
Referring to fig. 6, device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls overall operation of the device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods provided by the disclosed subject matter. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operations at the device 600. Examples of such data include instructions for any application or method operating on device 600, contact data, phonebook data, messages, pictures, videos, and the like. The memory 604 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 606 provides power to the various components of the device 600. The power components 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 600 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a Microphone (MIC) configured to receive external audio signals when the device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 614 includes one or more sensors for providing status assessments of various aspects of the device 600. For example, the sensor assembly 614 may detect the on/off state of the device 600 and the relative positioning of components such as the display and keypad of the device 600; the sensor assembly 614 may also detect a change in position of the device 600 or of a component of the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in temperature of the device 600. The sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate communication between the device 600 and other devices, either wired or wireless. The device 600 may access a wireless network based on a communication standard, such as WiFi, or a mobile communication network of 2G, 3G, 4G/LTE, 5G, etc. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 604, including instructions executable by processor 620 of device 600 to perform the methods provided by the disclosed subject matter. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for a system or system embodiment, since it is substantially similar to a method embodiment, the description is relatively simple, with reference to the description of the method embodiment being made in part. The systems and system embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The method and the electronic device for generating a digital image of a pet provided by the application have been described in detail above; specific examples have been used herein to explain the principle and implementation of the application, and the description of the embodiments is only intended to help understand the method and core idea of the application. Meanwhile, a person of ordinary skill in the art may make changes to the specific implementation and application scope in light of the ideas of the application. In view of the foregoing, the contents of this specification should not be construed as limiting the application.

Claims (10)

1. A method of generating a digital image of a pet, comprising:
in response to a request submitted by a user to generate images for a target pet, determining a plurality of original images of the target pet carried in the request;
determining pet category and breed information of the target pet;
determining appearance-feature description information corresponding to the pet category and breed information by querying a pre-established knowledge base;
inputting the plurality of original images and the appearance-feature description information into an image generation model for training, wherein the image generation model consists of a base model and a fine-tuning model, and fixing the appearance-feature description information corresponding to the pet category and breed while training a dedicated fine-tuning model for the target pet, so that the appearance features of the target pet and of the category and breed it belongs to are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet.
2. The method of claim 1, wherein
the base model is further associated with a control model;
and generating a digital image for the target pet comprises:
determining a reference map, the reference map being an image related to a pet;
preprocessing the reference map with the control model, the preprocessing comprising: performing line-art extraction on the reference map so as to fix the pet action, posture and/or background features included in the reference map, and then generating, through the base model and the fine-tuning model, a digital image having both the appearance features of the target pet and of its category and breed, and the pet action, posture and/or background features of the reference map.
3. The method as recited in claim 2, further comprising:
generating a digital profile for the target pet, wherein the digital profile comprises a basic digital image generated for the target pet;
wherein the determining a reference map comprises:
when generating the basic digital image, determining an image related to personification of a pet as the reference map.
4. The method as recited in claim 2, further comprising:
providing a plurality of selectable scene/style theme options;
wherein the determining a reference map comprises:
determining, according to the selected target scene/style theme, a reference image related to the target scene/style and to pets as the reference map, so as to generate a corresponding scene-based/stylized digital image for the target pet.
5. The method of claim 2, wherein
the determining a reference map comprises:
generating the reference map using a text-to-image generation model.
6. The method as recited in claim 1, further comprising:
performing quality detection on the original images provided by the user, the quality detection comprising: checking the sharpness and integrity of the original images, determining whether duplicates exist among the original images, and/or judging whether the original images correspond to the same target pet.
7. A pet information processing method, characterized by comprising:
receiving a plurality of original images of a target pet submitted by a user, together with pet category and breed information;
generating a digital profile for the target pet according to the plurality of original images and the pet category and breed information, wherein the digital profile comprises a digital image of the target pet, and the digital image contains the appearance features of the target pet and of its category and breed, as well as personified, scene-based and/or stylized characteristics.
8. An apparatus for generating a digital image of a pet, comprising:
a request receiving unit, configured to determine, in response to a request submitted by a user to generate images for a target pet, a plurality of original images of the target pet carried in the request;
a category and breed information determining unit, configured to determine the pet category and breed information of the target pet;
a knowledge base query unit, configured to determine the appearance-feature description information corresponding to the pet category and breed information by querying a pre-established knowledge base;
a model training unit, configured to input the plurality of original images and the appearance-feature description information into an image generation model for training, wherein the image generation model consists of a base model and a fine-tuning model, and while a dedicated fine-tuning model is trained for the target pet, the appearance-feature description information corresponding to the pet category and breed is fixed, so that the appearance features of the target pet and of its category and breed are recorded by the dedicated fine-tuning model, and a digital image is generated for the target pet.
9. A computer readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
10. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of any one of claims 1 to 7.
CN202311802067.2A 2023-12-25 2023-12-25 Method for generating digital image of pet and electronic equipment Pending CN118015153A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311802067.2A CN118015153A (en) 2023-12-25 2023-12-25 Method for generating digital image of pet and electronic equipment

Publications (1)

Publication Number Publication Date
CN118015153A 2024-05-10

Family

ID=90948992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311802067.2A Pending CN118015153A (en) 2023-12-25 2023-12-25 Method for generating digital image of pet and electronic equipment

Country Status (1)

Country Link
CN (1) CN118015153A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination