CN112734657B

CN112734657B - Cloud group photo method and device based on artificial intelligence and three-dimensional model and storage medium

Info

Publication number: CN112734657B
Application number: CN202011584123.6A
Authority: CN
Inventors: 杨文龙
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-12-28
Filing date: 2020-12-28
Publication date: 2023-04-07
Anticipated expiration: 2040-12-28
Also published as: CN112734657A

Abstract

The invention discloses a cloud group photo method and device based on artificial intelligence and a three-dimensional model, and a storage medium. The method comprises the following steps: selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to shooting requirements; determining the three-dimensional shooting scene and the postures of the participants' models; selecting an illumination condition and a shooting angle to shoot a photo to be processed in the three-dimensional shooting scene; and inputting the photo to be processed into a pre-trained photo realistic model to obtain a realistic photo. Optionally, the user-related data or models used in the above operations may be used only after the user authorizes them for the designated group photo. The embodiment of the invention has the following advantages: (1) The group photo looks close to a real, on-site photo and respects the real physical attributes of each person, so the result is better than that of direct image matting or direct GAN generation. (2) The method imposes no hard requirements on geographic position, the user's pose when photographed, or the synchronization of photographing; it requires only the user's authorization.

Description

Cloud group photo method and device based on artificial intelligence and three-dimensional model and storage medium

Technical Field

The invention relates to the technical field of computer software, in particular to a cloud group photo method and device based on artificial intelligence and a three-dimensional model and a storage medium.

Background

Due to the epidemic, people cannot go out freely, which inconveniences those who need to take part in graduation ceremonies or social activities, and at present there is no good method for taking a group photo remotely. Current methods basically piece together separate personal photos; the result lacks a shared scene, differences in body size, collective expressions and the like, does not feel like a group photo, and has little commemorative value. As shown in fig. 1, the existing cloud group photo solution has the following problems: (1) it is not formal; (2) the people in the photo share no uniform scene or background, so the result looks disordered; (3) the differences in body type and posture between people cannot be reflected; (4) even where there is a uniform background, there are no unified (or coordinated) expressions or actions (smiling together, shouting together, making faces together, and the like); (5) it cannot show that the group photo was authorized by each person.

Disclosure of Invention

In view of the foregoing technical drawbacks, embodiments of the present invention provide a cloud group photo method and apparatus based on artificial intelligence and a three-dimensional model, and a storage medium.

In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a cloud group photo method based on artificial intelligence and a three-dimensional model, including:

selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to shooting requirements;

determining a three-dimensional shooting scene according to the three-dimensional background scene model and the three-dimensional personnel model;

selecting an illumination condition and a shooting angle to shoot a picture to be processed in the three-dimensional shooting scene; the photo to be processed is a two-dimensional model picture comprising a plurality of group photo takers;

and receiving the authorization of each group photo participant, and inputting the photo to be processed into a photo realistic model trained in advance to obtain a realistic photo.

As a specific embodiment of the present application, obtaining the three-dimensional person model of each group photo participant specifically includes:

acquiring the personnel three-dimensional model of each group photo participant by sensor acquisition; or

generating it from a single or multiple two-dimensional pictures by adopting a deep learning algorithm.

As a specific embodiment of the present application, the determining a three-dimensional shooting scene specifically includes:

putting the three-dimensional person model into the three-dimensional background scene model;

in the background scene three-dimensional model, receiving the editing operation of each group photo participant on his or her personnel three-dimensional model to obtain an overall three-dimensional shooting scene; the editing operation covers the standing posture, gestures, expression, position and the like of each participant's three-dimensional person model in the background scene model.

Preferably, in a preferred embodiment of the present application, after obtaining the realistic photo, the cloud group photo method further includes:

receiving a beautifying operation from each group photo participant on his or her own image part in the realistic photo; the beautifying operation comprises beautifying processing, image fine adjustment or expression transformation;

and receiving each participant's modification or confirmation of the overall effect after the participants have beautified their own parts.

Further, in some preferred implementations of the present application, the cloud group photo method further includes training a photo realistic model and an expression transformation model, specifically:

acquiring real photos and the personal three-dimensional models of the group photo participants, and training with the real photos and the personal three-dimensional models to obtain the photo realistic model;

training with the real photos to obtain the expression transformation model;

the photo realistic model comprises a condition generator and a discriminator 1, wherein the discriminator 1 is a photo reality discriminator used for judging whether a photo is real; or

the photo realistic model comprises a condition generator, a discriminator 1 and a discriminator 2, wherein the discriminator 1 is a photo reality discriminator used for judging whether a photo is real, and the discriminator 2 is a face identity and position recognizer used for recognizing the faces and positions in the realistic picture and comparing them with a true value; the true value is determined by the three-dimensional shooting scene; or

the photo realistic model comprises a condition generator, a discriminator 1, a discriminator 2 and a discriminator 3, wherein the discriminator 1 is a photo reality discriminator used for judging whether a photo is real, and the discriminator 2 is a face identity and position recognizer used for recognizing the faces and positions in the realistic picture and comparing them with a true value; the true value is determined by the three-dimensional shooting scene; the discriminator 3 is a person attribute identifier used for identifying the sex, age, height and race of the corresponding person in the realistic picture.

Similarly and optionally, new discriminators may be added as needed to check that some other property of the rendered picture is correct.

In a second aspect, an embodiment of the present invention provides a cloud group photo device based on artificial intelligence and a three-dimensional model, including:

the acquiring unit is used for selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to a shooting requirement;

the determining unit is used for determining a three-dimensional shooting scene according to the background scene three-dimensional model and the personnel three-dimensional model;

the shooting unit is used for selecting the illumination condition and the shooting angle to shoot the picture to be processed in the three-dimensional shooting scene; the photo to be processed is a two-dimensional model picture comprising a plurality of group photo takers;

and the processing unit is used for receiving the authorization of each group photo participant, inputting the photo to be processed into a pre-trained photo realistic model and obtaining a realistic photo.

In a third aspect, an embodiment of the present invention further provides another cloud group photo apparatus based on artificial intelligence and a three-dimensional model, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.

In a fourth aspect, the present invention also provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions, which, when executed by a processor, cause the processor to perform the method of the first aspect.

The embodiment of the invention has the following advantages:

(1) The group photo looks close to one taken with everyone actually present, the user experience is good, and the result can even be better than an actual photograph.

(2) Compared with directly matting images together into a group photo, the method gives better and more realistic relative proportions (between people, between people and objects, and the like), better interactivity (such as occlusion, coordinated actions, hugging and the like between different participants) and a more consistent shadow effect.

(3) Compared with a photo generated directly by a GAN (generative adversarial network), the photo better matches the real physical attributes (height, posture and the like) of each person.

(4) The method has no hard requirements on geographic position, user photographing posture and photographing synchronism, and only needs user authorization.

Drawings

In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below.

FIG. 1 is an interface diagram of a cloud group photo in the prior art;

fig. 2 is a flowchart of a cloud group photo method based on artificial intelligence and three-dimensional model according to an embodiment of the present invention;

FIG. 3 is an exemplary diagram of a three-dimensional model of a human body;

FIG. 4 is an exemplary diagram of a scene modeling;

FIG. 5 is a flow chart of photo realistic model training;

FIG. 6 is a comparison graph of the effects of different expression transformation methods;

fig. 7 is a structural diagram of a cloud group photo device based on artificial intelligence and a three-dimensional model according to a first embodiment of the present invention;

fig. 8 is a structural diagram of a cloud group photo device based on artificial intelligence and a three-dimensional model according to a second embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The main technical principle of the invention is as follows:

capturing a three-dimensional model (containing information such as height and weight) of each person to be photographed (optionally while the person maintains a specific expression) using a monocular camera, a binocular camera or a depth camera;

placing the three-dimensional models of the individual people into a unified scene (background), which can be three-dimensional or two-dimensional;

selecting different shooting positions and angles to take multiple pictures;

using deep learning techniques to reconstruct the details of the captured pictures to form high-quality, sharp pictures;

if necessary, transforming the facial expressions in the photo (in whole or in part), for example to a unified style, using deep learning techniques;

optionally performing post-processing operations such as image post-processing or beautification.
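The steps above can be sketched as an ordered pipeline. The following is only a minimal illustration in Python: the stage names and the `plan_pipeline` function are hypothetical and do not come from the patent, which describes the steps but no programming interface. It shows that the first four stages always run, while expression transformation and post-processing are optional.

```python
from typing import List

# Hypothetical stage names mirroring the six steps listed above.
PIPELINE_STAGES = [
    "capture_person_models",   # monocular / binocular / depth camera
    "place_in_scene",          # unified 2D or 3D background
    "render_views",            # several shooting positions and angles
    "reconstruct_details",     # deep-learning detail reconstruction
    "expression_transform",    # optional, "if necessary"
    "post_process",            # optional beautification
]
OPTIONAL_STAGES = {"expression_transform", "post_process"}


def plan_pipeline(use_expression_transform: bool, use_post_process: bool) -> List[str]:
    """Return the stages to run, in order, skipping disabled optional ones."""
    enabled = {
        "expression_transform": use_expression_transform,
        "post_process": use_post_process,
    }
    return [
        stage
        for stage in PIPELINE_STAGES
        if stage not in OPTIONAL_STAGES or enabled[stage]
    ]
```

Calling `plan_pipeline(True, False)` would keep the expression transformation stage and drop post-processing.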

Referring to fig. 2, a cloud group photo method based on artificial intelligence and a three-dimensional model according to an embodiment of the present invention may include the following steps:

S101, selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to shooting requirements.

The shooting requirements include whether the photo is formal or relaxed, dress requirements, expressions and postures, and the like.

Specifically, each group photo participant acquires his or her personal three-dimensional model using a sensor, or the system generates the personal three-dimensional model of each participant from two-dimensional photos using a deep learning algorithm.

Wherein, the sensor collection can adopt various modes such as an optical scanner, a laser radar, a (monocular, binocular, etc.) camera or the fusion of several sensors.

Further, examples of three-dimensional models of a human body generated from a three-dimensional scan of a camera are shown in fig. 3 and 4.

Further, for the laser radar mode or the fusion mode of the laser radar and the camera, please refer to the following:

https://www.ednchina.com/news/5942.html

https://baijiahao.baidu.com/sid＝1680498261851454168&wfr＝spider&for＝pc

further, the three-dimensional model of the background scene is selected as follows:

the three-dimensional model for scene modeling may be a virtual three-dimensional scene, or may be a real scene captured using a sensor such as a camera or a lidar (in a manner similar to the above). The scene modeling is shown in fig. 4.

S102, determining a three-dimensional shooting scene according to the three-dimensional background scene model and the three-dimensional personnel model.

Specifically, step S102 includes:

placing the three-dimensional model of the person into the three-dimensional model of the background scene;

in the background scene three-dimensional model, receiving the editing operation of each group photo participant on his or her personnel three-dimensional model to obtain an overall three-dimensional shooting scene; the editing operation covers the standing posture, gestures, expression, position and the like of each participant's three-dimensional personnel model in the background scene model.

That is, each group photo participant selects the standing posture, gesture, expression, position, etc. of his or her own model in the background three-dimensional scene model.
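One possible way to represent the editable shooting scene of step S102 is sketched below. This is an assumption-laden illustration, not the patent's data model: the class names, field names and default values are hypothetical. The only behavior taken from the text is that each participant edits the stance, gesture, expression and position of his or her own model.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

EDITABLE_FIELDS = {"position", "stance", "gesture", "expression"}


@dataclass
class PersonPlacement:
    person_id: str
    position: Tuple[float, float, float]  # scene coordinates, units assumed
    stance: str = "standing"
    gesture: str = "none"
    expression: str = "neutral"


@dataclass
class ShootingScene:
    background_id: str
    placements: Dict[str, PersonPlacement] = field(default_factory=dict)

    def edit(self, editor_id: str, person_id: str, **changes) -> None:
        """Apply an edit; a participant may only edit his or her own model."""
        if editor_id != person_id:
            raise PermissionError("participants may only edit their own model")
        unknown = set(changes) - EDITABLE_FIELDS
        if unknown:
            raise ValueError(f"not editable: {sorted(unknown)}")
        placement = self.placements[person_id]
        for name, value in changes.items():
            setattr(placement, name, value)
```

Validating all field names before applying any change keeps a rejected edit from leaving the placement half-updated.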

S103, previewing the overall effect of the three-dimensional shooting scene, and receiving the modification or confirmation of the overall effect by the person in charge of the group photo.

Specifically, the overall effect, such as the position or standing posture of each person, is previewed and further modified or confirmed.

S104, selecting an illumination condition and a shooting angle to shoot a photo to be processed in the three-dimensional shooting scene.

The photo to be processed is a two-dimensional model picture comprising a plurality of group photo participants.
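Because the photo to be processed is rendered from the three-dimensional shooting scene, the image position of any model point follows from the chosen camera. The sketch below assumes an ideal pinhole camera that looks along the +z axis with no rotation and no lens distortion; these simplifications, the function name `project_point` and its parameters are illustrative assumptions and are not specified by the patent.

```python
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]


def project_point(
    point: Point3D,
    camera_position: Point3D,
    focal_px: float,
    image_size: Tuple[int, int],
) -> Optional[Tuple[float, float]]:
    """Project a scene point to pixel coordinates (u, v).

    Returns None when the point lies behind (or on) the camera plane.
    The image origin is the top-left corner, so v grows downward.
    """
    x = point[0] - camera_position[0]
    y = point[1] - camera_position[1]
    z = point[2] - camera_position[2]
    if z <= 0:
        return None
    width, height = image_size
    u = width / 2 + focal_px * x / z
    v = height / 2 - focal_px * y / z
    return (u, v)
```

Projecting, for example, the head position of each person model in this way yields the kind of position information that can later be compared with what a recognizer finds in the generated photo.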

S105, receiving the authorization of each group photo participant, and inputting the photo to be processed into the pre-trained photo realistic model to obtain a realistic photo.

Specifically, before step S105 is performed, the photo realistic model and the expression transformation model need to be trained.

The photo realistic model is a (conditional) generator model, in which the (conditional) generator can use a (conditional) VAE (Variational Autoencoder) or a (conditional) GAN (Generative Adversarial Network) method. Fig. 5 is an example of a method using GAN, in which there is one condition generator and three discriminators.

The discriminator 1 is a photo reality discriminator used for judging whether photos look real enough. It needs no label data and no one-to-one correspondence between training photos; real photos and generated (fake) realistic photos can be fed directly into training;

the discriminator 2 is composed of a face identity and position recognizer (trained on real photo data and used for recognizing the faces and their positions in a realistic photo) and a true value comparison module. The true value is the actual faces and positions in the model picture; because the model picture is generated from the three-dimensional model, this information can be calculated directly from the three-dimensional model. If the recognizer's result on the generated realistic photo is consistent with the true value, the generated photo has a high degree of realism.
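The true value comparison of the discriminator 2 can be illustrated with a small scoring function. This is a hedged sketch only: the patent does not define how consistency is measured, so the pixel tolerance, the dictionary layout and the name `identity_position_agreement` are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Pixel = Tuple[float, float]


def identity_position_agreement(
    truth: Dict[str, Pixel],
    detected: Dict[str, Pixel],
    max_pixel_error: float = 20.0,  # hypothetical tolerance
) -> float:
    """Fraction of people whose identity was recognized near the expected spot.

    `truth` maps person id -> face position computed from the 3D scene.
    `detected` maps person id -> face position reported by the recognizer.
    """
    if not truth:
        return 1.0
    matched = 0
    for person_id, (true_u, true_v) in truth.items():
        if person_id not in detected:
            continue  # identity not recognized at all
        found_u, found_v = detected[person_id]
        if math.hypot(found_u - true_u, found_v - true_v) <= max_pixel_error:
            matched += 1
    return matched / len(truth)
```

A score of 1.0 means every participant was recognized with the right identity at roughly the right place; lower scores indicate that the generator changed faces or moved people.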

The discriminator 3 is composed of a module, trained on real photo data, that identifies attributes of the corresponding person in the realistic photo, such as sex, age, height and race, together with a true value comparison. The true value is uploaded by the corresponding user.
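The comparison performed by the discriminator 3 can be sketched in the same spirit. The following is an illustrative assumption rather than the patent's method: numeric attributes are compared within a tolerance and all other attributes by equality, and the attribute keys and tolerance values are hypothetical.

```python
from typing import Any, Dict, Optional


def attribute_agreement(
    declared: Dict[str, Any],
    predicted: Dict[str, Any],
    tolerances: Optional[Dict[str, float]] = None,
) -> float:
    """Fraction of user-declared attributes that the identifier reproduces.

    `declared` holds the true values uploaded by the user; `predicted`
    holds what the attribute identifier read from the generated photo.
    """
    if tolerances is None:
        # Hypothetical tolerances: years for age, centimetres for height.
        tolerances = {"age": 5.0, "height_cm": 5.0}
    if not declared:
        return 1.0
    matches = 0
    for key, true_value in declared.items():
        if key not in predicted:
            continue
        guess = predicted[key]
        if key in tolerances:
            if abs(guess - true_value) <= tolerances[key]:
                matches += 1
        elif guess == true_value:
            matches += 1
    return matches / len(declared)
```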

In fig. 5, the photo reality discriminator is mandatory, and the other two discriminators are optional (the person attribute input is also optional). Specifically, during model training, the real photo set and the three-dimensional model information are first used to train the condition generator and the discriminators; after training is completed, only the (condition) generator is needed in actual use.
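How one mandatory and two optional discriminators might feed a single generator objective can be sketched as follows. The patent does not give a loss function, so this is an assumption: it uses the common non-saturating GAN generator term, -log D(G(x)), and adds a weighted penalty for each optional discriminator that is present. The function name, the weights and the use of agreement scores in [0, 1] are all hypothetical.

```python
import math
from typing import Optional


def generator_loss(
    realism_score: float,
    identity_agreement: Optional[float] = None,
    attribute_agreement: Optional[float] = None,
    w_identity: float = 1.0,   # hypothetical weight
    w_attribute: float = 1.0,  # hypothetical weight
) -> float:
    """Combine discriminator feedback into one scalar for the generator.

    `realism_score` is discriminator 1's probability that the photo is real.
    The two agreement scores come from discriminators 2 and 3 and may be
    omitted, matching the optional discriminators in the text.
    """
    eps = 1e-7
    clipped = min(max(realism_score, eps), 1.0 - eps)  # avoid log(0)
    loss = -math.log(clipped)
    if identity_agreement is not None:
        loss += w_identity * (1.0 - identity_agreement)
    if attribute_agreement is not None:
        loss += w_attribute * (1.0 - attribute_agreement)
    return loss
```

In a real training loop these terms would be differentiable tensors in a deep learning framework; plain floats are used here only to show how the optional terms combine.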

S106, receiving a beautifying operation from each group photo participant on his or her own image part in the realistic photo.

S107, receiving each participant's modification or confirmation of the overall effect after the participants have beautified their own parts.

Specifically, if necessary, post-processing operations such as beautification, image fine adjustment, or expression transformation are further performed.

The expression transformation can adopt an expression transformation model such as StarGAN. A comparison of the effects of different expression transformation methods is shown in fig. 6.
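StarGAN-style models condition the generator on a target domain label supplied alongside the input image. As a minimal sketch of how a unified expression could be requested for every participant, the code below builds one-hot target labels; the expression list and function names are hypothetical, and the model call itself is deliberately left out.

```python
from typing import Dict, List, Sequence

# Hypothetical expression domains; a trained model defines its own set.
EXPRESSIONS = ["neutral", "smile", "surprise"]


def one_hot_expression(target: str, expressions: Sequence[str] = EXPRESSIONS) -> List[float]:
    """Encode the target expression as a one-hot domain label."""
    if target not in expressions:
        raise ValueError(f"unknown expression: {target}")
    return [1.0 if name == target else 0.0 for name in expressions]


def unified_targets(person_ids: Sequence[str], target: str) -> Dict[str, List[float]]:
    """Give every participant the same target label (e.g. smiling together)."""
    return {person_id: one_hot_expression(target) for person_id in person_ids}
```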

Optionally, in the present invention, the group photo requires authorization from each participant (using a relatively secure authorization method such as fingerprint identification or a password) before that participant's data is used (that is, before the participant is added to the group photo). The authorization mainly covers the following: some models (for example, face models) are trained on real photos of users, and their output may involve users' private information; such models, for example the photo realistic model and the expression transformation model, need to be encrypted and can be used for a specified group photo or photo only after the user authorizes them.
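The per-photo authorization described above can be sketched as a simple gate that blocks generation until every participant has authorized the specific group photo. This is an illustration under assumptions: the class and method names are hypothetical, and the `credential_ok` flag stands in for whatever fingerprint or password check an implementation would actually perform.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class GroupPhotoSession:
    photo_id: str
    participants: Set[str]
    authorized: Set[str] = field(default_factory=set)

    def authorize(self, user_id: str, credential_ok: bool) -> None:
        """Record a participant's authorization for this photo only."""
        if user_id not in self.participants:
            raise KeyError(user_id)
        if credential_ok:
            self.authorized.add(user_id)

    def missing_authorizations(self) -> Set[str]:
        return self.participants - self.authorized

    def can_generate(self) -> bool:
        """True only when every participant has authorized this photo."""
        return not self.missing_authorizations()
```

Keeping the authorization set inside the session object ties each grant to one `photo_id`, which matches the requirement that a model be usable only for the specified group photo.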

The photo realistic model (whose face recognizer needs at least one photo of the user), the expression transformation model (which needs several photos of the user) and the face recognition model only require the user to upload a certain number of real photos (or, with the user's direct authorization, corresponding video frames collected from an earlier video conference). If the user's own three-dimensional model can be uploaded in advance for training the photo realistic model, the effect is better. It should be noted that this step is used only for model training and is optional: if the "face identity and position recognizer" or the "expression transformation model" is needed, the user must upload some photos of himself or herself in advance. These photos need not match the scene of the actual cloud group photo and can be taken beforehand.

Compared with the prior art, the cloud group photo method based on artificial intelligence and the three-dimensional model has the following advantages:

(4) The method has no hard requirements on the geographic position, the photographing posture of the user and the photographing synchronism, and only needs the authorization of the user.

Based on the same inventive concept, the embodiment of the invention also provides a cloud group photo device based on artificial intelligence and a three-dimensional model. As shown in fig. 7, the apparatus includes:

the acquiring unit 10 is used for selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to the shooting requirement;

the determining unit 11 is used for determining a three-dimensional shooting scene according to the three-dimensional background scene model and the three-dimensional personnel model;

the shooting unit 12 is used for selecting the illumination condition and the shooting angle to shoot the photo to be processed in the three-dimensional shooting scene; the photo to be processed is a two-dimensional model picture comprising a plurality of group photo takers;

and the processing unit 13 is used for receiving the authorization of each group photo participant, and inputting the photo to be processed into a pre-trained photo realistic model to obtain a realistic photo.

Specifically, the acquisition unit 10 is configured to:

generate a three-dimensional person model of each group photo participant from a single or a plurality of two-dimensional pictures by adopting a deep learning algorithm.

Specifically, the determination unit 11 is configured to:

Further, the processing unit 13 is further configured to:

previewing the overall effect of the three-dimensional shooting scene;

receiving a modification or confirmation of the overall effect by a group photo leader.

Further, the processing unit 13 is further configured to:

receive each participant's modification or confirmation of the overall effect after the participants have beautified their own parts.

Further, the device further comprises a training unit for training the photo realistic model and the expression transformation model, specifically comprising:

training an expression transformation model by adopting a plurality of real photos of the same person with different expressions;

The photo realistic model comprises a condition generator, a discriminator 1, a discriminator 2 and a discriminator 3, wherein the discriminator 1 is a photo reality discriminator used for judging whether a photo is real, and the discriminator 2 is a face identity and position recognizer used for recognizing the faces and positions in the realistic picture and comparing them with a true value; the true value is determined by the three-dimensional shooting scene; the discriminator 3 is a person attribute identifier used for identifying the sex, age, height and race of the corresponding person in the realistic picture.

Similarly and optionally, new discriminators can be added as needed to check that some other known property of the realistic picture is correct.

Optionally, as shown in fig. 8, the cloud group photo device based on artificial intelligence and three-dimensional model of the present invention may include: one or more processors 101, one or more input devices 102, one or more output devices 103, and memory 104, the processors 101, input devices 102, output devices 103, and memory 104 being interconnected via a bus 105. The memory 104 is used for storing a computer program comprising program instructions, the processor 101 being configured for invoking the program instructions for performing the methods of the above-described method embodiment parts.

It should be understood that, in the embodiment of the present invention, the processor 101 may be a Central Processing Unit (CPU), a GPU or an AI-specific chip, and may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor or any conventional processor.

The input device 102 may include a keyboard or the like, and the output device 103 may include a display (LCD or the like), a speaker, or the like.

The memory 104 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A portion of the memory 104 may also include non-volatile random access memory. For example, the memory 104 may also store device type information.

In a specific implementation, the processor 101, the input device 102, and the output device 103 described in the embodiment of the present invention may execute an implementation manner described in the embodiment of the cloud group photo method based on artificial intelligence and a three-dimensional model provided in the embodiment of the present invention, and details are not described here again.

It should be noted that, for a more detailed description of the device in this embodiment, please refer to the foregoing method embodiment, which is not described herein again.

Accordingly, an embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored, the computer program comprising program instructions that, when executed by a processor, implement the cloud group photo method based on artificial intelligence and a three-dimensional model described above.

The computer readable storage medium may be an internal storage unit of the system according to any of the foregoing embodiments, for example, a hard disk or a memory of the system. The computer readable storage medium may also be an external storage device of the system, such as a plug-in hard drive, Smart Media Card (SMC), Secure Digital (SD) card, flash memory card (Flash Card), etc. provided on the system. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the system. The computer-readable storage medium is used for storing the computer program and other programs and data required by the system. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.

Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the examples have been described generally in terms of their functions in the foregoing description in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A cloud group photo method based on artificial intelligence and a three-dimensional model is characterized by comprising the following steps:

training a photo realistic model and an expression transformation model;

receiving the authorization of each group photo, and inputting the photo to be processed into a pre-trained photo realistic model to obtain a realistic photo;

wherein training the photo realistic model and the expression transformation model specifically comprises:

training with real photos to obtain the expression transformation model;

the photo realistic model comprises a conditional generator, a discriminator 1, a discriminator 2 and a discriminator 3, wherein the discriminator 1 is a photo realism discriminator used for judging whether the photo after realism processing is real; the discriminator 2 is a face identity and position recognizer used for recognizing the faces and their positions in the photo after realism processing and comparing them with a true value, the true value being determined by the three-dimensional shooting scene; and the discriminator 3 is a person attribute discriminator used for identifying the sex, age, height and race of the corresponding person in the photo after realism processing.
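
As an illustration only, the following Python sketch shows one way the outputs of the three discriminators named above could be combined into a single training objective for the conditional generator. The class and function names, the weights, and the specific penalty forms (a non-saturating GAN term for realism, a squared position error with an identity penalty, and an attribute mismatch fraction) are assumptions made for this sketch; the claims do not specify them.

```python
import math
from dataclasses import dataclass


@dataclass
class DiscriminatorOutputs:
    """Hypothetical per-photo outputs of the three discriminators."""
    realism_prob: float      # discriminator 1: probability that the photo is real
    identity_correct: bool   # discriminator 2: recognized identity equals the true value
    face_position: tuple     # discriminator 2: recognized face position (x, y)
    attributes: dict         # discriminator 3: recognized sex, age, height, race


def generator_loss(out, true_position, true_attributes,
                   w_real=1.0, w_face=1.0, w_attr=1.0):
    """Weighted sum of three penalty terms (the weights are assumptions)."""
    # Realism term: non-saturating GAN generator loss, -log D(G(x)).
    realism = -math.log(max(out.realism_prob, 1e-12))

    # Face term: squared position error, plus 1 if the identity is wrong.
    # The true position comes from the three-dimensional shooting scene.
    dx = out.face_position[0] - true_position[0]
    dy = out.face_position[1] - true_position[1]
    face = dx * dx + dy * dy + (0.0 if out.identity_correct else 1.0)

    # Attribute term: fraction of true attributes that were not recovered
    # exactly (a crude choice for numeric attributes such as age).
    wrong = sum(1 for key, value in true_attributes.items()
                if out.attributes.get(key) != value)
    attr = wrong / len(true_attributes) if true_attributes else 0.0

    return w_real * realism + w_face * face + w_attr * attr
```

In this sketch a generated photo that all three discriminators accept incurs zero loss, and each kind of mismatch raises the loss independently, so the weights can trade realism against fidelity to the scene and to the person attributes.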

2. The cloud group photo method of claim 1, wherein acquiring the personnel three-dimensional model of each group photo participant specifically comprises:

generating the personnel three-dimensional model of each group photo participant from one or more two-dimensional pictures by using a deep learning algorithm.

3. The cloud group photo method of claim 1, wherein determining the three-dimensional shooting scene specifically comprises:

in the background scene three-dimensional model, receiving the editing operation of each group photo participant on the personnel three-dimensional model to obtain an overall three-dimensional shooting scene; the editing operation comprises setting the standing posture, the gesture, the expression and the position of the personnel three-dimensional model of each group photo participant in the background scene three-dimensional model.

4. The cloud group photo method of claim 3, wherein after determining the three-dimensional shooting scene, the cloud group photo method further comprises:

previewing the overall effect of the three-dimensional shooting scene;

5. The cloud group photo method of claim 4, wherein after obtaining the realistic photo, the cloud group photo method further comprises:

receiving a beautifying operation of each group photo participant on that participant's own portion of the realistic photo; the beautifying operation comprises beautification processing, image fine adjustment or expression transformation;

6. A cloud group photo device based on artificial intelligence and a three-dimensional model, characterized by comprising:

the training unit is used for training a photo realistic model and an expression transformation model;

the acquiring unit is used for selecting or acquiring a background scene three-dimensional model and a personnel three-dimensional model of each group photo participant according to the shooting requirement;

the determining unit is used for determining a three-dimensional shooting scene according to the background scene three-dimensional model and the personnel three-dimensional models;

the shooting unit is used for selecting the illumination condition and the shooting angle to shoot the photo to be processed in the three-dimensional shooting scene; the photo to be processed is a two-dimensional model picture comprising a plurality of group photo participants;

the processing unit is used for receiving the authorization of each group photo participant, and inputting the photo to be processed into the pre-trained photo realistic model to obtain a realistic photo;

wherein the training unit is specifically configured to:

the photo realistic model comprises a conditional generator, a discriminator 1, a discriminator 2 and a discriminator 3, wherein the discriminator 1 is a photo realism discriminator used for judging whether the photo after realism processing is real; the discriminator 2 is a face identity and position recognizer used for recognizing the faces and their positions in the photo after realism processing and comparing them with a true value, the true value being determined by the three-dimensional shooting scene; and the discriminator 3 is a person attribute discriminator used for identifying the sex, age, height and race of the corresponding person in the photo after realism processing.
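
The role of the processing unit, which runs the photo realistic model only after receiving authorization, can be sketched as follows. This Python fragment is a minimal sketch under assumptions: the function name `run_processing_unit`, the shape of the `authorizations` mapping, and the callable `realism_model` are hypothetical stand-ins, not an interface defined by the patent.

```python
def run_processing_unit(photo_to_process, participants, authorizations,
                        realism_model):
    """Apply the realism model only if every participant has authorized.

    participants:   iterable of participant identifiers in the photo
    authorizations: mapping from participant identifier to True/False
    realism_model:  callable standing in for the pre-trained model
    """
    # Collect every participant whose authorization is absent or refused.
    missing = sorted(p for p in participants
                     if not authorizations.get(p, False))
    if missing:
        raise PermissionError(
            "missing authorization from: " + ", ".join(missing))
    # All participants have authorized, so the photo may be processed.
    return realism_model(photo_to_process)
```

Treating an absent entry as a refusal is a design choice of this sketch: it means a participant who was never asked blocks processing in the same way as one who declined.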

7. A cloud group photo apparatus based on artificial intelligence and three-dimensional model, comprising a processor, an input device, an output device and a memory, wherein the processor, the input device, the output device and the memory are connected with each other, wherein the memory is used for storing a computer program, the computer program comprises program instructions, and the processor is configured to call the program instructions to execute the method according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1-5.