CN111640192A

CN111640192A - Scene image processing method and device, AR device and storage medium

Info

Publication number: CN111640192A
Application number: CN202010507588.5A
Authority: CN
Inventors: 王子彬; 孙红亮; 李炳泽
Original assignee: Shanghai Sensetime Intelligent Technology Co Ltd
Current assignee: Shanghai Sensetime Intelligent Technology Co Ltd
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2020-09-08

Abstract

The present disclosure provides a scene image processing method, apparatus, AR device, and storage medium, wherein the method comprises: acquiring a scene image containing a human body image of a user; extracting a human body image of a target user from the acquired scene image; identifying a scene category indicated by the acquired scene image, and determining a target Augmented Reality (AR) role special effect matched with the scene category; and fusing the extracted human body image of the target user with the target AR role special effect and then displaying the fused image. According to the method and the device, the human body image of the target user is fused with the target AR role special effect matched with the scene category in the dynamic image acquisition process, the dynamic perception environment and the interaction environment of the user are reserved while role playing is achieved, the displayed fused image is more real and vivid, the sense of reality and the sense of fusion of user perception are increased, and the user interaction experience is improved.

Description

Scene image processing method and device, AR device and storage medium

Technical Field

The present disclosure relates to the field of augmented reality technologies, and in particular, to a scene image processing method and apparatus, an AR device, and a storage medium.

Background

When visiting an exhibition or museum, a visitor usually needs to play some exhibition roles in the exhibition or museum and take a picture of a souvenir, for example, when visiting a palace, the visitor wants to change into a character role in the palace. In order to meet similar requirements of tourists, exhibition halls or museums usually make some role cards or provide clothes with the same shape of roles, and meet the requirements of role playing of the tourists by combining with the figure cards or wearing the clothes of corresponding roles.

However, in the group photo mode, the group photo obtained by the tourists is a static image, which only can meet the requirements of user role playing, but the played role lacks sense of reality and blending, and the interactive experience of the user is reduced.

Disclosure of Invention

The embodiment of the disclosure at least provides a scene image processing method, a scene image processing device and a scene image processing system, which are used for improving the blending sense and the reality sense of a user and improving the user experience of the user under the condition of realizing the role playing requirement of the user.

In a first aspect, an embodiment of the present disclosure provides a scene image processing method, including:

acquiring a scene image containing a human body image of a user;

extracting a human body image of a target user from the acquired scene image;

identifying a scene category indicated by the acquired scene image, and determining a target Augmented Reality (AR) role special effect matched with the scene category;

and fusing the extracted human body image of the target user with the target AR role special effect and then displaying the fused image.

In a possible implementation manner, the extracting unit is specifically configured to extract a human body image of a target user from an acquired scene image by using a human body recognition model trained in advance; the human body recognition model is obtained by training a sample image labeled with foreground data and background data.

In a possible implementation manner, fusing and displaying the extracted human body image of the target user and the target AR character special effect specifically includes:

determining real-time posture information of the human body image of the target user by using the human body recognition model;

adjusting the display posture corresponding to the target AR role special effect according to the determined real-time posture information;

and superposing the target AR role special effect after the display posture is adjusted on the extracted human body image of the target user.

In one possible embodiment, identifying a scene category indicated by the captured image of the scene includes:

extracting a background environment image from the scene image by using the human body recognition model;

and determining the scene type of the background environment image by using a pre-trained scene recognition model.

In a possible implementation manner, determining the augmented reality AR character special effect matched with the scene category specifically includes:

displaying all target AR role special effects corresponding to the identified scene types;

and responding to the selection operation of the user, and determining the target AR role special effect selected by the user as the target AR role special effect.

In a possible implementation manner, if there are a plurality of user human body images extracted from the scene image, before determining the target augmented reality AR character special effect matching the scene category, the method further includes:

displaying all the extracted human body images of the user;

and responding to the selection operation of the user, and determining the user human body image selected by the user as the target user human body image.

In a possible implementation manner, the method for processing a scene image provided by the embodiment of the present disclosure further includes:

receiving an image shooting instruction;

and intercepting and storing the currently displayed fusion image according to the received image shooting instruction.

In a second aspect, an embodiment of the present disclosure further provides a scene image processing apparatus, including:

the system comprises an acquisition unit, a display unit and a control unit, wherein the acquisition unit is used for acquiring a scene image containing a human body image of a user;

the extraction unit is used for extracting a human body image of a target user from the acquired scene image;

the identification unit is used for identifying the scene category indicated by the acquired scene image and determining the special effect of the target augmented reality AR role matched with the scene category;

and the display unit is used for fusing the extracted human body image of the target user with the target AR role special effect and then displaying the fused image.

In a possible implementation manner, the recognition unit is specifically configured to extract a human body image of the target user from the acquired scene image by using a human body recognition model trained in advance; the human body recognition model is obtained by training a sample image labeled with foreground data and background data.

In a possible implementation manner, the display unit is specifically configured to determine real-time posture information of a human body image of a target user by using the human body recognition model; adjusting the display posture corresponding to the target AR role special effect according to the determined real-time posture information; and superposing the target AR role special effect after the display posture is adjusted on the extracted human body image of the target user.

In a possible implementation manner, the recognition unit is specifically configured to extract a background environment image from the scene image by using the human body recognition model; and determining the scene type of the background environment image by using a pre-trained scene recognition model.

In a possible implementation manner, the display unit is further configured to display all target AR role special effects corresponding to the identified scene categories;

the identification unit is specifically configured to determine, in response to a selection operation of a user, that the AR character special effect selected by the user is the target AR character special effect.

In a possible implementation manner, an embodiment of the present disclosure provides a scene image processing apparatus, further including a determining unit, where:

the display unit is further configured to display all extracted user human body images before the recognition unit determines that the target Augmented Reality (AR) role special effect matched with the scene category exists if the extraction unit extracts a plurality of user human body images from the scene image by using a pre-trained human body recognition model;

the determining unit is used for responding to the selection operation of the user and determining the user human body image selected by the user as the target user human body image.

In a possible implementation manner, an embodiment of the present disclosure provides a scene image processing apparatus, further including:

a receiving unit configured to receive an image capturing instruction;

and the intercepting unit is used for intercepting and storing the currently displayed fusion image according to the received image shooting instruction.

In a third aspect, an embodiment of the present disclosure further provides an AR device, including: a processor and a memory coupled to each other, the memory storing machine-readable instructions executable by the processor, the machine-readable instructions being executable by the processor when the AR device is running to implement the method for processing images of a scene in the first aspect, or any of the possible implementations of the first aspect.

In a fourth aspect, this disclosed embodiment also provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps in the first aspect or any one of the possible implementation manners of the first aspect.

The scene image processing method and apparatus, the AR device and the storage medium provided by the embodiment of the disclosure, by extracting the human body image of the target user from the acquired scene image and determining the special effect of the target AR role matched with the scene type according to the scene type of the scene image knowledge point, further fusing the extracted human body image of the target user with the special effect of the target AR role and displaying the fused human body image, compared with the prior art that the user can only group with the figure card or the costume with the figure shape, the human body image of the target user is fused with the target AR role special effect matched with the scene category in the process of dynamic image acquisition, the role playing is realized, and simultaneously, the dynamic perception environment and the interaction environment of the user are reserved, so that the displayed fusion image is more real and vivid, the sense of reality and the sense of fusion of the perception of the user are increased, and the user interaction experience is improved.

Further, the scene image processing method provided by the embodiment of the disclosure may further determine real-time posture information of the target human body image in real time according to the acquired scene image, and adjust the display posture of the target AR character special effect according to the determined real-time posture information, so that the three-dimensional virtual may change along with the change of the posture of the target user, so that the fused image has more sense of reality.

In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments will be briefly described below, and the drawings herein incorporated in and forming a part of the specification illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, for those skilled in the art will be able to derive additional related drawings therefrom without the benefit of the inventive faculty.

Fig. 1a illustrates a flowchart of a scene image processing method provided by an embodiment of the present disclosure;

fig. 1b is a flowchart illustrating a specific method for matching a corresponding three-dimensional virtual image according to a background environment in a scene image processing method provided by the embodiment of the present disclosure;

fig. 2 is a flowchart illustrating a specific method for dynamically adjusting a display posture of a target AR character special effect according to a human body image posture in a scene image processing method provided in an embodiment of the present disclosure;

fig. 3 is a schematic diagram illustrating a scene image processing apparatus provided in an embodiment of the present disclosure;

fig. 4 shows a schematic diagram of an AR device provided by an embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.

The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.

Example one

When a user visits a scenic spot, an exhibition hall or a museum, there is a need to play and take a shadow of a role shown in the scenic spot, the exhibition hall or the museum. Scenic spots, exhibition halls or museums provide people-shaped cards or role-modeling clothes for users to use so as to meet the requirements of the users. However, in the above-mentioned image-keeping method, if the user wants to play different roles, the user needs to replace different sculpted garments or match different human-shaped cards, so that the user experience is reduced, and the image obtained by the user is a static image, so that the reality and fusion of role playing are reduced due to lack of dynamic display of perception environment.

In view of this, the present disclosure provides a scene image processing method, which retains dynamic display of a user perception environment on the premise of satisfying a user role playing requirement, improves reality and fusion of role playing, and improves user interaction experience.

Augmented Reality (AR) technology may be applied to an AR device, which may be any electronic device capable of supporting AR functions, including but not limited to AR glasses, a tablet computer, a smart phone, and the like. When the AR device is operated in a real scene, virtual objects superimposed in the real scene can be viewed by the AR device, such as when passing through some buildings or tourist attractions, virtual teletext introductions superimposed near buildings or tourist attractions can be seen by the AR device, the virtual image-text can be called as a virtual object, and the building or tourist attraction can be a real scene, in which, the virtual image-text introduction seen through the AR glasses can change along with the change of the orientation angle of the AR glasses, here, the virtual teletext presentation is related to the AR glasses position relationship, however, in other scenes, we want to see a more abundant and various virtual-real combined augmented reality scene, such as wanted to see, how this presentation effect is achieved will be described below with reference to the following specific embodiments for the discussion of the embodiments of the present disclosure.

To facilitate understanding of the present embodiment, first, a scene imaging method disclosed in the present disclosure is described in detail, where an execution main body of the method for generating AR scene content provided in the present disclosure may be the AR device, for example, the AR device may include devices with display functions and data processing capabilities, such as AR glasses, a tablet computer, a smart phone, and an intelligent wearable device.

In particular implementation, the scene image processing method provided in the embodiment of the present disclosure may be installed in the AR device as an independent application client, or may be a function implemented in an application client installed in the AR device, which is not limited in the embodiment of the present disclosure.

Referring to fig. 1a, an implementation flowchart of a scene image processing method provided in the embodiment of the present disclosure includes the following steps:

and S11, acquiring a scene image containing the human body image of the user.

In specific implementation, the application client implementing the scene image processing method provided by the embodiment of the disclosure calls a camera of the AR device to acquire a scene image in real time.

In an embodiment, if the AR device is an AR device provided in a scenic spot and interacting with a user, whether the scene image processing method provided by the present disclosure is triggered may be determined by detecting whether a human body image exists in a scene image acquired in a target area in real time, or after detecting a human body image, prompting the user whether to trigger the scene image processing method provided by the present disclosure through a dialog box, and after receiving a confirmation instruction of the user, starting to execute a process of the scene image processing method provided by the present disclosure.

In another embodiment, if the AR device is the user's own device, a camera is called according to a user operation instruction to start image acquisition to obtain a corresponding scene image.

And S12, extracting the human body image of the target user from the acquired scene image.

In one embodiment, the human body recognition model may be trained by using a sample image labeled with foreground attitude data and background environment data, and the human body image of the target user may be extracted from the acquired scene image by using the pre-trained human body recognition model. In order to improve the accuracy and precision of human body recognition model recognition, in the embodiment of the present disclosure, pixel-level accurate positioning in foreground elements, such as hair, eyes, neck, skin, lips, etc., may be performed, so that accurate feature information of facial features may be obtained through training.

Furthermore, the real-time postures of the human face corresponding to different angles can be obtained by training according to the rotation angles of the eyes, the mouth, the lips and the like relative to the forward image, for example, whether the human face forwards or turns the head leftwards for a certain angle or turns the head rightwards for a certain angle or the like.

Therefore, according to the human body recognition model obtained by training in the embodiment of the disclosure, not only the human body image in the image can be recognized, but also the real-time posture of the recognized human body image can be recognized.

In specific implementation, the convolutional neural network can be used for training the human body recognition model based on the labeled sample image.

It should be noted that, if there are a plurality of user human body images extracted from the scene image in step S12, all the extracted user human body images may be displayed on the display screen of the AR device for the user to select, and in response to the selection operation of the user, it is determined that the user human body image selected by the user is the target user human body image.

And S13, identifying the scene category indicated by the acquired scene image, and determining the target AR role special effect matched with the scene category.

In specific implementation, the AR role special effect can be made by using three-dimensional software and a three-dimensional engine. The AR role special effect can be made by a scenic spot, an exhibition hall or a museum and the like in an offline mode, and is stored in the AR device in the installation process of the application client. For example, the former can make the AR role special effect of wearing Qing dynasty official costume, flag clothes and the like, and the ocean can make the AR role special effect of mermaid and the like.

In order to improve the user perception experience and avoid reducing the user perception experience due to the fact that the scene image background is not matched with the target AR role special effect, for example, a fusion image of mermaid is presented for user fusion in the palace scene, in the embodiment of the present disclosure, the corresponding target AR role special effect may be matched according to the background environment in the scene image, as shown in fig. 1b, the method includes the following steps:

s131, extracting a background environment image from the scene image by using the human body recognition model.

In this step, a background environment image is segmented from the acquired scene image by using the trained human body recognition model.

S132, determining the scene type of the background environment image by using the pre-trained scene recognition model.

In this step, the trained scene recognition model may be used to determine the scene type to which the background environment belongs, for example, whether the background environment belongs to a museum scene or a marine museum scene.

In specific implementation, the sample image labeled with the scene category can be used, the convolutional neural network is adopted to train the scene recognition model, and the acquired real-time scene image is input into the scene recognition model to determine the corresponding scene category.

In some embodiments, for some scenic spots, a method of recognizing scenes by using pre-stored characteristic object features and the like can be further adopted to judge the scene category.

And S133, selecting a target AR character special effect from the AR character special effects corresponding to the identified scene types.

In specific implementation, a corresponding relationship between the scene category and the target AR role special effect may be established in advance, for example, the target AR role special effect corresponding to the palace museum may include an AR role special effect of wearing Qing dynasty official costume, flagging and the like, and the target AR role special effect corresponding to the ocean hall may include mermaid and the like.

In this way, after the scene category is identified, all AR role special effects corresponding to the identified scene category can be displayed on a display screen of the AR device for selection by a user, the AR role special effect selected by the user is determined to be the selected target AR role special effect in response to the selection operation of the user, and the target AR role special effect selected by the user is superposed on the human body image of the target user.

And S14, fusing the extracted human body image of the target user with the determined special effect of the target AR role and displaying the fused image.

In this step, the target human body image extracted in step S12 and the determined target AR character special effect are fused and then presented on the display screen of the AR device. In specific implementation, the target AR role special effect can be superposed on the human body image for display.

According to the scene image processing method provided by the embodiment of the disclosure, the human body image of the target user can be extracted from the collected scene image by using the human body recognition model obtained by pre-training, the current scene category is recognized according to the scene image, and the target AR role special effect matched with the recognized scene category is superposed on the extracted human body image of the target user to obtain the fusion image, so that the user keeps the dynamic real environment where the user is located while realizing role playing, the real perception environment is increased, the sense of reality and the sense of fusion of the user are improved, and the interaction experience of the user is improved.

Example two

In specific implementation, in order to make the fused image more vivid, in the embodiment of the present disclosure, the display posture of the target AR character special effect may be dynamically adjusted according to the posture of the human body image. As shown in fig. 2, it is a schematic diagram of an implementation flow of dynamically adjusting a display posture of a specific effect of a target AR character according to a posture of a human body image, and includes the following steps:

and S21, determining the real-time posture information of the human body image of the target user by using the human body recognition model.

In this step, the sample image labeled with the face pose data can be used for training the human body recognition model, and the model obtained by training can be used for recognizing the implementation pose information of the human body in the human body image.

In one embodiment, the real-time posture information of the target user human body image can be further determined according to the angle of the human face image and the like in the target human body image relative to the forward direction image.

And S22, adjusting the display posture corresponding to the target AR role special effect according to the determined real-time posture information.

In this step, the display posture of the target AR character special effect is adjusted according to the real-time posture information of the human body image determined in step S21, so that the target AR character special effect changes with the change of the human body posture, and the sense of incongruity of the fused image is reduced.

And S23, superposing the target AR role special effect after the display posture is adjusted on the extracted target user human body image.

According to the second embodiment disclosed by the application, the display posture of the special effect of the target AR role can be dynamically adjusted according to the real-time posture of the user, so that the fusion image of the human body image and the special effect of the target AR role is more vivid, and the sense of reality and the sense of fusion of the fusion image are further increased.

In some embodiments, the method for processing a scene image according to the embodiments of the present application may further include:

step one, receiving an image shooting instruction.

And step two, intercepting and storing the currently displayed fusion image according to the received image shooting instruction.

Therefore, the scene image processing method provided by the embodiment of the disclosure can also meet the requirement of special effect group photo memoization of the user and the AR role.

It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.

EXAMPLE III

Based on the same inventive concept, a scene image processing apparatus corresponding to the scene image processing method is also provided in the embodiments of the present disclosure, and because the principle of the apparatus in the embodiments of the present disclosure for solving the problem is similar to the scene image processing method in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated.

Referring to fig. 3, a schematic diagram of a scene image processing apparatus provided in an embodiment of the present disclosure is shown, where the apparatus includes:

an acquiring unit 31 for acquiring a scene image including a human body image of a user;

an extracting unit 32, configured to extract a target user human body image from the acquired scene image;

the recognition unit 33 is configured to recognize a scene category indicated by the acquired scene image, and determine a target Augmented Reality (AR) character special effect matched with the scene category;

and the display unit 34 is configured to fuse the extracted human body image of the target user with the target AR role special effect and display the fused image.

In a possible implementation manner, the recognition unit 33 is specifically configured to extract a human body image of the target user from the acquired scene image by using a human body recognition model trained in advance; the human body recognition model is obtained by training a sample image labeled with foreground data and background data.

In a possible implementation, the display unit 34 is specifically configured to determine real-time posture information of a human body image of a target user by using the human body recognition model; adjusting the display posture corresponding to the target AR role special effect according to the determined real-time posture information; and superposing the target AR role special effect after the display posture is adjusted on the extracted human body image of the target user.

In a possible implementation, the recognition unit 33 is specifically configured to extract a background environment image from the scene image by using the human body recognition model; and determining the scene type of the background environment image by using a pre-trained scene recognition model.

In a possible implementation manner, the display unit 34 is further configured to display all target AR character special effects corresponding to the identified scene categories;

the identifying unit 33 is specifically configured to determine, in response to a selection operation of a user, that the AR character special effect selected by the user is the target AR character special effect.

the display unit 34 is further configured to, if there are multiple user human body images extracted from the scene image by the extraction unit using a pre-trained human body recognition model, display all the extracted user human body images before the recognition unit determines the target augmented reality AR character special effect matching the scene category;

a receiving unit configured to receive an image capturing instruction;

Compared with the prior art that the user can only group with a human-shaped plate or a garment with a character shape, the human-shaped plate and the special effect of the target AR role can be fused in the dynamic image acquisition process, the real perception environment and interaction environment of the user can be kept while role playing is realized, the sense of reality and the sense of fusion of user perception are increased, and the user interaction experience is improved.

Moreover, according to the scene image processing method provided by the embodiment of the application, the display posture of the target AR role special effect can be further dynamically adjusted according to the change of the human body image posture, and the corresponding target AR role special effect is selected for fusion by recognizing the scene type, so that the fused image is more vivid and more realistic.

The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.

Example four

An AR device 40 is further provided in the embodiment of the present disclosure, as shown in fig. 4, and a schematic structural diagram of the AR device 40 provided in the embodiment of the present disclosure includes:

a processor 41 and a memory 42; the memory 42 stores machine readable instructions executable by the processor 41 which, when the AR device is run, are executed by the processor to perform the steps of: step S11, obtaining a scene image containing a human body image of a user; step S12, extracting a human body image of the target user from the acquired scene image; step S13, identifying the scene type indicated by the acquired scene image, and determining the special effect of the target augmented reality AR role matched with the scene type; and step S14, fusing the extracted human body image of the target user with the determined special effect of the target AR role and displaying the fused image.

The specific execution process of the instruction may refer to the steps of the scene image processing method in the embodiment of the present disclosure, and details are not repeated here.

EXAMPLE five

The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the scene image processing method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.

The computer program product of the scene image processing method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the scene image processing method described in the above method embodiments, which may be referred to specifically for the above method embodiments, and are not described herein again.

The embodiments of the present disclosure also provide a computer program, which when executed by a processor implements any one of the methods of the foregoing embodiments. The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling an AR device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A method for processing an image of a scene, comprising:

acquiring a scene image containing a human body image of a user;

extracting a human body image of a target user from the acquired scene image;

2. The method of claim 1, wherein extracting the target user body image from the acquired scene image comprises:

extracting a target user human body image from the acquired scene image by using a pre-trained human body recognition model; the human body recognition model is obtained by training a sample image labeled with foreground data and background data.

3. The method according to claim 1, wherein the step of fusing and displaying the extracted human body image of the target user and the target AR character special effect specifically comprises:

4. The method of claim 2, wherein identifying a scene category indicated by the captured scene image comprises:

5. The method according to claim 4, wherein determining the target Augmented Reality (AR) character special effect matching the scene category specifically comprises:

displaying multiple AR role special effects corresponding to the identified scene categories;

and responding to the selection operation of the user, and determining the AR role special effect selected by the user as the target AR role special effect.

6. The method of claim 1, wherein if there are a plurality of human body images of the user extracted from the scene image, before determining the special effect of the target Augmented Reality (AR) character matching the scene category, further comprising:

displaying all the extracted human body images of the user;

7. The method of any one of claims 1 to 6, further comprising:

receiving an image shooting instruction;

8. A scene image processing apparatus, comprising:

9. An AR device, comprising: an interconnected processor and memory, the memory storing machine-readable instructions executable by the processor, the processor being configured to execute machine-readable instructions stored in the memory, the machine-readable instructions, when executed by the processor, causing the processor to perform the steps of the image processing method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which, when executed by an AR device, performs the steps of the image processing method according to any one of claims 1 to 7.