CN111723806A - Augmented reality method and apparatus - Google Patents

Augmented reality method and apparatus

Info

Publication number
CN111723806A
CN111723806A (application CN201910207272.1A)
Authority
CN
China
Prior art keywords
image
virtual content
target object
target
augmented reality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910207272.1A
Other languages
Chinese (zh)
Inventor
刘享军
杨超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201910207272.1A priority Critical patent/CN111723806A/en
Publication of CN111723806A publication Critical patent/CN111723806A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/34 - Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour

Abstract

The embodiments of the application disclose an augmented reality method and apparatus. One embodiment of the method comprises: preprocessing an original image of a target object acquired in real time; determining a target area from the preprocessed image of the target object; determining virtual content corresponding to the target area; and applying the virtual content to the target area to generate an augmented reality image of the target object with the virtual content added. The embodiment enables a terminal device to perform augmented reality processing on the acquired image of the target object in real time.

Description

Augmented reality method and apparatus
Technical Field
The embodiments of the application relate to the field of computer technology, in particular to the field of image processing, and more particularly to an augmented reality method and apparatus.
Background
Augmented Reality (AR) is a technology that calculates the position and orientation of a camera in real time and superimposes virtual objects on a real scene, enhancing the user's interactive experience of the real environment by adding virtual digital information. The general augmented reality process is as follows: first, the camera pose is localized within the real scene; then a virtual object is registered into the real scene using computer graphics rendering, producing an application view in which virtual and real content are fused.
Disclosure of Invention
The embodiment of the application provides a method and a device for augmented reality.
In a first aspect, an embodiment of the present application provides an augmented reality method, the method comprising: preprocessing an original image of a target object acquired in real time; determining a target area from the preprocessed image of the target object; determining virtual content corresponding to the target area; and applying the virtual content to the target area to generate an augmented reality image of the target object with the virtual content added.
In some embodiments, determining the target region from the preprocessed image of the target object comprises: inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result comprising the target region, wherein the semantic segmentation model is used for segmenting the input image into a plurality of regions.
In some embodiments, the semantic segmentation model is a lightweight deep learning network.
In some embodiments, determining virtual content corresponding to the target area comprises: determining, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features.
In some embodiments, determining, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features comprises: obtaining the main color of the other regions from the color features of the pixels corresponding to those regions using a preset feature-value statistics method; querying a pre-configured color mapping relation to obtain a contrast color corresponding to the main color; and configuring the display color of the virtual content as the contrast color.
In some embodiments, prior to applying the virtual content to the target area, the method further comprises: and performing smoothing processing on the contour of the target area in the segmentation result.
In some embodiments, smoothing the contour of the target region comprises: and smoothing the contour of the target region by using a Gaussian smoothing algorithm comprising a preset Gaussian kernel.
In some embodiments, applying the virtual content to the target area to generate an augmented reality image of the target object with the virtual content added comprises: replacing the corresponding area in the original image of the target object with the target area to which the virtual content has been applied, thereby generating the augmented reality image of the target object with the virtual content added.
In a second aspect, an embodiment of the present application provides an augmented reality apparatus, including: a preprocessing unit configured to preprocess an original image of a target object acquired in real time; a target area determination unit configured to determine a target area from the image of the target object after the preprocessing; a virtual content determination unit configured to determine virtual content corresponding to the target area; a generating unit configured to apply the virtual content to the target region, generating an augmented reality image of the target object to which the virtual content is added.
In some embodiments, the target area determination unit is further configured to: and inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result comprising a target region, wherein the semantic segmentation model is used for segmenting the input image into a plurality of regions.
In some embodiments, the semantic segmentation model is a lightweight deep learning network.
In some embodiments, the virtual content determination unit is further configured to: determine, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features.
In some embodiments, the virtual content determination unit is further configured to: obtain the main color of the other regions from the color features of the pixels corresponding to those regions using a preset feature-value statistics method; query a pre-configured color mapping relation to obtain a contrast color corresponding to the main color; and configure the display color of the virtual content as the contrast color.
In some embodiments, the apparatus further comprises a contour processing unit configured to: the contour of the target area in the segmentation result is smoothed before the virtual content determination unit applies the virtual content to the target area.
In some embodiments, the contour processing unit is further configured to: and smoothing the contour of the target region by using a Gaussian smoothing algorithm comprising a preset Gaussian kernel.
In some embodiments, the generating unit is further configured to: replace the corresponding area in the original image of the target object with the target area to which the virtual content has been applied, generating an augmented reality image of the target object with the virtual content added.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device, on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation manner of the first aspect.
In a fourth aspect, the present application provides a computer-readable medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
According to the augmented reality method and apparatus provided by the embodiments of the application, the original image of the target object, received in real time from the image acquisition device, is preprocessed; a target area is determined from the preprocessed image of the target object; virtual content corresponding to the target area is determined; and finally the virtual content is applied to the target area to generate an augmented reality image of the target object with the virtual content added. In this way, the terminal device can perform augmented reality processing on the acquired image of the target object in real time.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the augmented reality method of one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of an augmented reality method according to the present application;
FIGS. 3A-3C are schematic diagrams of an application scenario of an augmented reality method according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of an augmented reality method according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an augmented reality device according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 in which the augmented reality method of one embodiment of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various client applications installed thereon, such as a web browser application, a shopping-like application, a search-like application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg compression standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg compression standard Audio Layer 4), laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 105 may be a server that provides various services, such as a background server that receives and saves the augmented reality image transmitted by the terminal apparatuses 101, 102, and 103.
It should be noted that the augmented reality method provided in the embodiment of the present application is generally executed by the terminal devices 101, 102, and 103, and accordingly, the augmented reality apparatus is generally disposed in the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of an augmented reality method according to the present application is shown. The augmented reality method comprises the following steps:
step 201, preprocessing an original image of a target object acquired in real time.
In this embodiment, an execution subject of the augmented reality method (for example, the terminal device shown in fig. 1) may be provided with an image capturing device, or may be in communication connection with the image capturing device in a wired connection manner or a wireless connection manner.
The image acquisition device can acquire the image of the target object in real time. The image of the target object acquired by the image acquisition device may be regarded as an original image of the target object.
The target object may be, for example, a person, a body part of a person (e.g., a hand or a head), or an animal, or it may be an inanimate object such as an automobile or a building.
The execution subject may receive in real time an original image of the target object acquired by the image acquisition device.
The execution subject may perform preprocessing on the original image of the target object. The preprocessing here may include geometric transformation of the image, image enhancement, and the like. The geometric transformation of the image may include, for example, geometric operations such as translation, transposition, mirroring, rotation, and scaling.
After preprocessing, the image of the target object satisfies preset image rules, for example with respect to its pixel dimensions and the position and angle of the target object within the image.
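As an illustration of this step, the following is a minimal preprocessing sketch in Python. The use of OpenCV, the target size, and the histogram-equalization enhancement are assumptions chosen for demonstration; they are not prescribed by this embodiment.

```python
# Illustrative preprocessing sketch; the exact transforms and target size are assumptions.
import cv2

def preprocess(original_bgr, target_size=(256, 256)):
    """Resize and enhance a raw camera frame so it satisfies preset image rules."""
    # Geometric transformation: scale the frame to the size expected downstream.
    resized = cv2.resize(original_bgr, target_size, interpolation=cv2.INTER_LINEAR)
    # Image enhancement: simple histogram equalization on the luminance channel.
    ycrcb = cv2.cvtColor(resized, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```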
Step 202, determining a target area from the preprocessed image of the target object.
In this embodiment, based on the preprocessed image of the target object obtained in step 201, the execution subject (for example, the terminal device shown in fig. 1) may apply various analysis processes to the preprocessed image of the target object, so as to determine the target area from the preprocessed image. The target region here may be a local region in the image of the target object.
In some application scenarios, the target object may be a human hand. The target region here may be, for example, a nail region extracted from a human hand image. In other application scenarios, the target object may be, for example, a head of a person, and the target region may be, for example, a hair region extracted from an image of the head of the person. In other application scenarios, the target object may be a human face, for example, and the target region may be a lip region determined from an image of the human face, for example.
In practice, the object to be processed may be determined in advance. The object to be processed here may be a partial component of the target object. The execution subject may segment the target object by various image segmentation methods, and then select an image of the object to be processed as the target region from the segmentation result.
The image segmentation method may be, for example, a threshold-based segmentation method, a region-based segmentation method, an edge-based segmentation method, a cluster analysis-based image segmentation method, a wavelet transform-based segmentation method, a neural network-based segmentation method, or the like.
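As a brief illustration of one of the listed approaches, the sketch below performs threshold-based segmentation using Otsu's method; the choice of this particular method, of OpenCV, and of a grayscale input are assumptions made only for this example.

```python
# Minimal threshold-based segmentation sketch (one of the listed options).
import cv2

def threshold_segment(gray_image):
    """Return a binary mask separating foreground candidates from the background."""
    _, mask = cv2.threshold(gray_image, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```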
It should be noted that the above-mentioned various image segmentation methods are well-known technologies that are widely researched and applied at present, and are not described herein again.
Step 203, determining virtual content corresponding to the target area.
In this embodiment, after the target area is obtained in step 202, the execution subject of the augmented reality method may determine the virtual content corresponding to the target area using various analysis methods.
The virtual content may include, for example, a virtual decoration, a display color of a virtual content region, and the like.
In some application scenarios, a plurality of virtual content selection items, such as a virtual decoration selection item, a plurality of display color selection items, and the like, may be provided in the interface displayed on the execution subject. The user may select one virtual content from the plurality of virtual content selection items. The execution body may determine the virtual content indicated by the selection operation of the user as the virtual content corresponding to the target area.
In some optional implementations of this embodiment, step 203 may include: determining, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features. The image features may include, for example, color features, texture features, shape features, spatial relationship features, and the like.
As an illustrative example, the target object may be a head of a person, and the original image of the target object is a head image of the target object captured by the image capturing device in real time. The head image includes a face image and a hair image. The hair can be determined in advance as an object to be processed, and the virtual hairstyle of the hair can be determined according to the texture feature and the shape feature of the face.
As another illustrative example, the target object may be a human hand. The original image of the target object can be a hand image acquired by the image acquisition device in real time. The human hand image includes a human hand and a fingernail on the hand. The nail may be determined in advance as an object to be processed. The execution subject may extract texture features, color features, and the like of other regions excluding the fingernails in the human hand image to determine the virtual content corresponding to the fingernail region. The virtual content here may include, for example, display colors, virtual decorations, and the like.
Further, determining, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features may include the following steps:
First, the main color of the other regions is obtained from the color features of the pixels corresponding to those regions using a preset feature-value statistics method.
The feature-value statistics method may, for example, compute the distribution of RGB values in the image and take the mean or median RGB value as the main color. Alternatively, the gray-level distribution of all pixels may be computed, and the gray level shared by the largest number of pixels taken as the gray level of the main color.
Second, a pre-configured color mapping relation is queried to obtain a contrast color corresponding to the main color.
The pre-configured color mapping relation may be a list storing a plurality of colors and the contrast color corresponding to each color, or it may be a functional relationship between the main color and the contrast color.
Finally, the display color of the virtual content is configured as the contrast color.
As an illustrative example, the target object may be a human hand, and the original image of the target object may be a human hand image acquired in real time. The fingernails may be determined in advance as the object to be processed. The execution body may extract the main color of the regions of the hand other than the fingernails and select the contrast color corresponding to that main color according to the pre-configured color mapping relation. The contrast color is then determined as the display color of the virtual content corresponding to the fingernail region.
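A minimal sketch of this color-matching step is given below. The statistic used (mean color), the crude warm/cool classification, and the small contrast-color lookup table are assumptions for illustration; the embodiment only requires some preset feature-value statistics method and a pre-configured color mapping relation.

```python
# Hedged sketch of dominant-color extraction and contrast-color lookup.
import numpy as np

# Hypothetical mapping from a coarse dominant-color class to a contrasting display color (BGR).
CONTRAST_MAP = {
    "warm": (255, 128, 0),   # contrast for warm-toned regions: a cool blue
    "cool": (0, 64, 255),    # contrast for cool-toned regions: a warm orange-red
}

def dominant_color(image_bgr, region_mask):
    """Mean color of the pixels outside the target region (mask == 0)."""
    pixels = image_bgr[region_mask == 0]
    return pixels.mean(axis=0)  # B, G, R averages

def pick_display_color(image_bgr, region_mask):
    b, g, r = dominant_color(image_bgr, region_mask)
    key = "warm" if r >= b else "cool"   # crude classification of the main color
    return CONTRAST_MAP[key]
```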
And step 204, applying the virtual content to the target area, and generating an augmented reality image of the target object added with the virtual content.
In the present embodiment, the execution subject described above may apply the virtual content to the target region by various methods, generating an augmented reality image of the target object to which the virtual content is added.
In some application scenarios, after the target region is determined from the preprocessed image of the target object in step 202, a binary mask map may be generated. In the mask map, the target area is white and the other areas are black. In these application scenarios, the virtual content may be applied within the white area of the mask map. The mask map to which the virtual content has been applied is then blended with the original image of the target object to obtain the augmented reality image of the target object with the virtual content added.
In some other application scenarios, applying the virtual content to the target area in step 204 to generate the augmented reality image of the target object with the virtual content added may include: replacing the corresponding area in the original image of the target object with the target area to which the virtual content has been applied, generating the augmented reality image of the target object with the virtual content added. In practice, after the target region is determined from the preprocessed image of the target object in step 202, a binary mask map may be generated, in which the target area is white and the other areas are black. The binary mask map is then transformed, by inverting the image processing applied when preprocessing the original image of the target object, into a processed binary mask map with the same pixel size as the original image. The virtual content is applied within the white (target) area of the processed binary mask map, and the corresponding area in the original image of the target object is replaced with the target area to which the virtual content has been applied.
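The following sketch illustrates one way the blending and replacement described above could be realized, assuming an 8-bit binary mask already resized back to the resolution of the original frame; the alpha value and the use of OpenCV are assumptions, not prescribed values.

```python
# Illustrative mask-based compositing step.
import cv2
import numpy as np

def composite(original_bgr, mask, display_color, alpha=0.8):
    """Replace the masked (target) region of the original frame with virtual content."""
    colored = original_bgr.copy()
    colored[mask > 0] = display_color                          # apply the virtual display color
    blended = cv2.addWeighted(colored, alpha, original_bgr, 1.0 - alpha, 0.0)
    # Keep the untouched pixels from the original frame outside the target region.
    return np.where(mask[..., None] > 0, blended, original_bgr)
```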
With continuing reference to figs. 3A-3C, figs. 3A-3C are schematic diagrams of an application scenario of an augmented reality method according to the present application. In this application scenario, a human hand is taken as the target object, and the nail region in the hand image is taken as the target region. The terminal device receives in real time an original image of the human hand captured by the image capturing device provided on it, and preprocesses the hand image to obtain the preprocessed hand image shown in fig. 3A. The nail region in fig. 3A is the target region 301. The terminal device may determine the target region 301 from the human hand image shown in fig. 3A, obtaining an image showing the target region as in fig. 3B (the white area in fig. 3B is the target region). The terminal device may then determine the virtual content corresponding to the target region and apply it to the target region, generating the augmented reality image of the target object with the virtual content added, as shown in fig. 3C. In the augmented reality image, the nail region 301' with the virtual content added is displayed.
The method provided by this embodiment of the application preprocesses the original image of the target object acquired in real time, determines a target area from the preprocessed image of the target object, determines virtual content corresponding to the target area, and finally applies the virtual content to the target area to generate an augmented reality image of the target object with the virtual content added. Because the terminal device performs this augmented reality processing on the acquired image of the target object in real time, the user can observe, through the augmented reality image, the effect of applying the physical counterpart of the virtual content to the target object.
With further reference to fig. 4, a flow 400 of yet another embodiment of an augmented reality method is shown. The augmented reality method flow 400 includes the following steps:
step 401, preprocessing an original image of a target object acquired in real time.
In this embodiment, step 401 may be the same as or similar to step 201 in the embodiment shown in fig. 2, and is not described herein again.
Step 402, inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result including a target region.
In this embodiment, a pre-trained semantic segmentation model may be preset in an execution subject (e.g., a terminal device shown in fig. 1) of the augmented reality method. In step 401, after the original image of the target object acquired by the image acquisition device is preprocessed, the preprocessed image of the target object is input into the pre-trained semantic segmentation model, and a segmentation result of the image of the target object is generated by the semantic segmentation model. The segmentation result may be a plurality of regions obtained by segmenting the image of the target object by the semantic segmentation model.
In some application scenarios, the object to be processed of the target object may be specified in advance. The object to be processed is reflected in the image, namely the target area.
The semantic segmentation model may be any of various existing semantic segmentation models, for example: an FCN-based semantic segmentation model, a SegNet-based semantic segmentation model, a dilated (atrous) convolution-based semantic segmentation model, a RefineNet-based semantic segmentation model, and the like. It should be noted that these semantic segmentation models are well-known technologies that are widely researched and applied at present, and are not described here again.
In some optional implementations of the present embodiment, the semantic segmentation model may be a lightweight deep learning network. The lightweight deep learning network may be, for example, an existing MobileNet V1 network, a MobileNet V2 network, or a deep learning network developed in the future and applied to a mobile terminal.
The lightweight deep learning network is mainly intended as a computation model for mobile terminals. It replaces the original standard convolution operation with a two-layer convolution operation, which reduces the number of computation parameters while preserving accuracy. The core idea is to factorize the standard convolution into a depthwise convolution followed by a 1 × 1 pointwise convolution. That is, the original convolutional layer is split into two convolutional layers: each filter of the former layer convolves only a single channel of the input, and the latter layer is responsible for combining, i.e., it combines the results of the preceding depthwise convolution.
Because the computation and parameter counts of the lightweight deep learning network are reduced, segmenting images with it on a terminal device consumes fewer resources, so a mobile device can segment images in real time and carry out the subsequent processing smoothly, which improves the user experience.
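A minimal PyTorch sketch of the depthwise-separable factorization described above is shown below; the channel counts, kernel size, and the BatchNorm/ReLU placement are example choices, not values prescribed by this embodiment.

```python
# Depthwise-separable convolution: depthwise 3x3 followed by 1x1 pointwise combination.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Depthwise: each filter convolves a single input channel (groups == in_channels).
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   padding=1, groups=in_channels, bias=False)
        # Pointwise: 1x1 convolution combines the per-channel results.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# Example: a 32-channel feature map mapped to 64 channels.
block = DepthwiseSeparableConv(32, 64)
features = block(torch.randn(1, 32, 128, 128))
```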
When performing semantic segmentation on an image of the target object with any of the semantic segmentation models described above, the model must first be trained. Training a semantic segmentation model may include the following steps. First, a training sample set is obtained; it includes a plurality of annotated images of the target object, where an annotation may include the contour of each region of the target object to be distinguished, or the contour of the target region. Second, geometric transformations are applied to the training samples, the transformed images of the target object are annotated, and the annotated transformed images are added to the training sample set to augment it. Third, using a machine learning method, the samples in the augmented training sample set are used as the input of an initial semantic segmentation model and the contour of the target region of the target object as its expected output, yielding the trained semantic segmentation model.
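A hedged sketch of this training step follows: a segmentation network is fit to annotated masks with a pixel-wise cross-entropy loss. The loss, optimizer, and hyperparameters are placeholder assumptions, not values specified by this embodiment.

```python
# Illustrative training loop for a semantic segmentation model.
import torch
import torch.nn as nn

def train_segmentation(model, data_loader, epochs=10, lr=1e-3):
    criterion = nn.CrossEntropyLoss()                  # pixel-level classification loss
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, masks in data_loader:              # masks: (N, H, W) integer class labels
            optimizer.zero_grad()
            logits = model(images)                     # (N, num_classes, H, W)
            loss = criterion(logits, masks)
            loss.backward()
            optimizer.step()
    return model
```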
In the present embodiment, the lightweight deep learning network described above includes an encoder portion and a decoder portion. The encoder portion may be a network modified from the MobileNet V2 network. Specifically, the last fully connected layer of the original MobileNet V2 network can be replaced by a convolutional layer, so that pixel-level segmentation can be realized and the lightweight deep learning network can be applied to pixel-level segmentation tasks.
The decoder portion has a hierarchical structure substantially corresponding to that of the encoder portion. In each layer, after the deconvolution operation, channel fusion is performed with the feature map of the same size from the feature-extraction (encoder) part, and the resulting new feature map is convolved again. The last layer performs fixed-parameter upsampling in a bilinear manner. It is worth noting that no normalization or activation function operations need to be performed after each convolution, since the role of the decoder portion is to restore low-resolution features to high-resolution features.
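The sketch below illustrates one decoder stage consistent with this description: deconvolution, channel fusion with the same-sized encoder feature map, a further convolution without normalization or activation, and a final fixed-parameter bilinear upsampling. The channel counts and the use of PyTorch are assumptions for illustration only.

```python
# Illustrative decoder stage with skip-connection fusion and bilinear upsampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderStage(nn.Module):
    def __init__(self, in_channels, skip_channels, out_channels):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(in_channels, skip_channels,
                                         kernel_size=2, stride=2)
        # After concatenating with the encoder feature map of the same size,
        # convolve again; no BatchNorm/ReLU, per the decoder design above.
        self.conv = nn.Conv2d(skip_channels * 2, out_channels, kernel_size=3, padding=1)

    def forward(self, x, encoder_feature):
        x = self.deconv(x)                              # restore spatial resolution
        x = torch.cat([x, encoder_feature], dim=1)      # channel fusion
        return self.conv(x)

def final_upsample(logits, out_size):
    # Fixed-parameter bilinear upsampling of the last layer's output.
    return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)
```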
It should be noted that the structures and the using methods of the lightweight deep learning networks MobileNet V1 and MobileNet V2 are well-known technologies that are widely researched and applied at present, and are not described herein again.
Step 403, determining virtual content corresponding to the target area.
In this embodiment, step 403 may be the same as or similar to step 203 in the embodiment shown in fig. 2, and is not described herein again.
Step 404, performing smoothing processing on the contour of the target area in the segmentation result.
In this embodiment, in order to avoid the edge aliasing phenomenon that occurs when virtual content (for example, display color) is directly applied to the target region, the contour of the target region obtained by segmentation in step 402 may be smoothed by using various conventional smoothing methods.
The smoothing algorithm may include, for example: a mean filtering algorithm, a median filtering algorithm, a bilateral filtering algorithm, etc. The smoothing algorithm is a well-known technical content widely studied and applied at present, and is not described herein in detail.
In some optional implementations of this embodiment, smoothing the contour of the target region in the segmentation result may include smoothing the contour of the segmented target region using a Gaussian smoothing algorithm with a preset Gaussian kernel. Specifically, for each pixel, its neighborhood pixel values are sampled and then weighted-averaged according to the Gaussian kernel to obtain the color value of that pixel.
In these optional implementations, an example of the preset Gaussian kernel is as follows:
0.0751 0.1238 0.0751
0.1238 0.2042 0.1238
0.0751 0.1238 0.0751
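As an illustration, the preset kernel listed above could be applied to the segmentation mask as follows; using OpenCV's filter2D for the neighborhood weighted average is an implementation assumption.

```python
# Sketch of the contour-smoothing step with the 3x3 kernel given above.
import cv2
import numpy as np

GAUSSIAN_KERNEL = np.array([[0.0751, 0.1238, 0.0751],
                            [0.1238, 0.2042, 0.1238],
                            [0.0751, 0.1238, 0.0751]], dtype=np.float32)

def smooth_mask(mask):
    """Weighted-average each pixel with its 3x3 neighborhood to soften the contour."""
    return cv2.filter2D(mask.astype(np.float32), -1, GAUSSIAN_KERNEL)
```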
the execution body may distribute the task of the smoothing Processing of the contour of the target region to a Graphics Processing Unit (GPU) provided in the execution body. Thereby reducing the amount of tasks of a Central Processing Unit (CPU) in the execution main body.
Through the Gaussian smoothing processing, the detail level of the contour of the target area obtained by segmentation can be reduced, and therefore the sawtooth phenomenon caused when virtual content is applied to the target area can be improved.
Step 405, applying the virtual content to the target area, generating an augmented reality image of the target object to which the virtual content is added.
In this embodiment, step 405 may be the same as step 204 in the embodiment shown in fig. 2, and is not described herein again.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the process 400 of the augmented reality method in this embodiment highlights inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain the segmentation result including the target region. This reduces the computation required to determine the target region from the image of the target object and increases the speed at which the augmented reality image of the target object is generated. In some application scenarios, the semantic segmentation model may be a lightweight semantic segmentation network, so that the augmented reality image of the target object can be generated fluently in real time on a mobile terminal.
In addition, the embodiment also highlights the step of smoothing the contour of the target area in the segmentation result, so that the scheme described in the embodiment enables the image of the target object in the augmented reality image to be visually integrated with the virtual content.
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an augmented reality apparatus, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied to various electronic devices.
As shown in fig. 5, the augmented reality apparatus 500 of the present embodiment includes: a preprocessing unit 501, a target area determining unit 502, a virtual content determining unit 503, and a generating unit 504. The preprocessing unit 501 is configured to preprocess an original image of a target object acquired in real time; a target area determination unit 502 configured to determine a target area from the image of the target object after the preprocessing; a virtual content determining unit 503 configured to determine virtual content corresponding to the target area; a generating unit 504 configured to apply the virtual content to the target region, generating an augmented reality image of the target object to which the virtual content is added.
In this embodiment, specific processes of the preprocessing unit 501, the target area determining unit 502, the virtual content determining unit 503, and the generating unit 504 of the augmented reality apparatus 500 and technical effects thereof may refer to related descriptions of step 201, step 202, step 203, and step 204 in the corresponding embodiment of fig. 2, respectively, and are not repeated herein.
In some optional implementations of the present embodiment, the target area determination unit 502 is further configured to: and inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result comprising a target region, wherein the semantic segmentation model is used for segmenting the input image into a plurality of regions.
In some optional implementations of this embodiment, the semantic segmentation model is a lightweight deep learning network.
In some optional implementations of this embodiment, the virtual content determining unit 503 is further configured to: determine, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features.
In some optional implementations of this embodiment, the virtual content determining unit 503 is further configured to: obtain the main color of the other regions from the color features of the pixels corresponding to those regions using a preset feature-value statistics method; query a pre-configured color mapping relation to obtain a contrast color corresponding to the main color; and configure the display color of the virtual content as the contrast color.
In some optional implementations of the present embodiment, the augmented reality apparatus 500 further includes a contour processing unit (not shown in the figure). The contour processing unit is configured to: the contour of the target area in the segmentation result is smoothed before the virtual content determination unit applies the virtual content to the target area.
In some optional implementations of this embodiment, the contour processing unit is further configured to: and smoothing the contour of the target region by using a Gaussian smoothing algorithm comprising a preset Gaussian kernel.
In some optional implementations of this embodiment, the generating unit 504 is further configured to: replace the corresponding area in the original image of the target object with the target area to which the virtual content has been applied, generating an augmented reality image of the target object with the virtual content added.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The computer system 600 may also include a Graphics Processor (GPU) 612. The CPU601, ROM 602, RAM 603, and GPU612 are connected to each other via a bus 604. An Input/Output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output section 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN (Local area network) card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a preprocessing unit, a target area determining unit, a virtual content determining unit, and a generating unit. The names of the units do not in some cases constitute a limitation on the units themselves, for example, the preprocessing unit may also be described as a "unit that preprocesses an original image of a target object acquired in real time".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: preprocessing an original image of a target object acquired in real time; determining a target area from the preprocessed image of the target object; determining virtual content corresponding to the target area; and applying the virtual content to the target area to generate an augmented reality image of the target object added with the virtual content.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (18)

1. An augmented reality method comprising:
preprocessing an original image of a target object acquired in real time;
determining a target area from the preprocessed image of the target object;
determining virtual content corresponding to the target area;
and applying the virtual content to the target area to generate an augmented reality image of the target object added with the virtual content.
2. The method of claim 1, wherein the determining a target region from the pre-processed original image of the target object comprises:
inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result comprising the target area, wherein the semantic segmentation model is used for segmenting the input image into a plurality of areas.
3. The method of claim 2, wherein the semantic segmentation model is a lightweight deep learning network.
4. The method of claim 1, wherein the determining virtual content corresponding to the target area comprises:
and determining virtual content matched with the image characteristics based on the image characteristics of other areas except the target area in the image of the target object.
5. The method of claim 4, wherein the determining, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features comprises:
obtaining the main colors of the other regions according to the color features of the pixels corresponding to the other regions and a preset feature value statistical method;
inquiring a pre-configured color mapping relation to obtain a contrast color corresponding to the main color;
and configuring the display color of the virtual content as the contrast color.
6. The method of claim 2, wherein prior to applying the virtual content to the target area, the method further comprises:
and carrying out smoothing processing on the contour of the target area in the segmentation result.
7. The method of claim 6, wherein the smoothing the contour of the target region comprises:
and smoothing the contour of the target region by using a Gaussian smoothing algorithm comprising a preset Gaussian kernel.
8. The method of claim 1, wherein the applying the virtual content to the target region, generating an augmented reality image of a target object to which the virtual content is added, comprises:
and replacing the target area to which the virtual content is applied with a corresponding area in the original image of the target object, and generating an augmented reality image of the target object to which the virtual content is added.
9. An augmented reality apparatus comprising:
a preprocessing unit configured to preprocess an original image of a target object acquired in real time;
a target area determination unit configured to determine a target area from the image of the target object after the preprocessing;
a virtual content determination unit configured to determine virtual content corresponding to the target area;
a generating unit configured to apply the virtual content to the target region, generating an augmented reality image of a target object to which the virtual content is added.
10. The apparatus of claim 9, wherein the target region determination unit is further configured to:
inputting the preprocessed image of the target object into a pre-trained semantic segmentation model to obtain a segmentation result comprising the target area, wherein the semantic segmentation model is used for segmenting the input image into a plurality of areas.
11. The apparatus of claim 10, wherein the semantic segmentation model is a lightweight deep learning network.
12. The apparatus of claim 9, wherein the virtual content determination unit is further configured to:
determine, based on image features of regions of the image of the target object other than the target region, virtual content matching the image features.
13. The apparatus of claim 12, wherein the virtual content determination unit is further configured to:
obtaining the main colors of the other regions according to the color features of the pixels corresponding to the other regions and a preset feature value statistical method;
inquiring a pre-configured color mapping relation to obtain a contrast color corresponding to the main color;
and configuring the display color of the virtual content as the contrast color.
14. The apparatus of claim 10, wherein the apparatus further comprises a contour processing unit configured to:
smoothing the contour of the target region in the segmentation result before the virtual content determination unit applies the virtual content to the target region.
15. The apparatus of claim 14, wherein the contour processing unit is further configured to:
and smoothing the contour of the target region by using a Gaussian smoothing algorithm comprising a preset Gaussian kernel.
16. The apparatus of claim 9, wherein the generating unit is further configured to:
replace the corresponding area in the original image of the target object with the target area to which the virtual content has been applied, generating an augmented reality image of the target object with the virtual content added.
17. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
18. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN201910207272.1A 2019-03-19 2019-03-19 Augmented reality method and apparatus Pending CN111723806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910207272.1A CN111723806A (en) 2019-03-19 2019-03-19 Augmented reality method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910207272.1A CN111723806A (en) 2019-03-19 2019-03-19 Augmented reality method and apparatus

Publications (1)

Publication Number Publication Date
CN111723806A true CN111723806A (en) 2020-09-29

Family

ID=72562344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910207272.1A Pending CN111723806A (en) 2019-03-19 2019-03-19 Augmented reality method and apparatus

Country Status (1)

Country Link
CN (1) CN111723806A (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102110379A (en) * 2011-02-22 2011-06-29 黄振强 Multimedia reading matter giving readers enhanced feeling of reality
US20140344762A1 (en) * 2013-05-14 2014-11-20 Qualcomm Incorporated Augmented reality (ar) capture & play
US8976191B1 (en) * 2014-03-13 2015-03-10 Qualcomm Incorporated Creating a realistic color for a virtual object in an augmented reality environment
CN106373197A (en) * 2016-09-06 2017-02-01 广州视源电子科技股份有限公司 Augmented reality method and augmented reality device
CN106445282A (en) * 2016-09-08 2017-02-22 掌阅科技股份有限公司 Interaction method based on augmented reality
US20180165888A1 (en) * 2016-12-13 2018-06-14 Alibaba Group Holding Limited Allocating virtual objects based on augmented reality
CN106845440A (en) * 2017-02-13 2017-06-13 山东万腾电子科技有限公司 A kind of augmented reality image processing method and system
CN106897982A (en) * 2017-02-23 2017-06-27 淮阴工学院 Real Enhancement Method based on the unmarked identification of image
CN107067474A (en) * 2017-03-07 2017-08-18 深圳市吉美文化科技有限公司 A kind of augmented reality processing method and processing device
CN106961595A (en) * 2017-03-21 2017-07-18 深圳市科漫达智能管理科技有限公司 A kind of video frequency monitoring method and video monitoring system based on augmented reality
CN107481327A (en) * 2017-09-08 2017-12-15 腾讯科技(深圳)有限公司 On the processing method of augmented reality scene, device, terminal device and system
CN107945163A (en) * 2017-11-23 2018-04-20 广州酷狗计算机科技有限公司 Image enchancing method and device
CN108711144A (en) * 2018-05-16 2018-10-26 上海白泽网络科技有限公司 augmented reality method and device
CN109215037A (en) * 2018-09-18 2019-01-15 Oppo广东移动通信有限公司 Destination image partition method, device and terminal device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
QIN YONG-XU ET AL.: "Campus Navigation System Based on Mobile Augmented Reality", 2013 6th International Conference on Intelligent Networks and Intelligent Systems (ICINIS), 24 July 2014 (2014-07-24), pages 139-142 *
CHANG MING: "Computer Graphics Algorithms and Applications", 30 October 2009, Wuhan: Huazhong University of Science and Technology Press, pages 325-329 *
WANG YIDING: "Industrial Image Processing", Xi'an: Xidian University Press, pages 39-41 *
XING LIGUO: "Analysis of the Research Status of Virtual-Real Occlusion Technology in Augmented Reality", Computer Knowledge and Technology, vol. 14, no. 27, 30 September 2018 (2018-09-30), pages 255-257 *

Similar Documents

Publication Publication Date Title
US10614574B2 (en) Generating image segmentation data using a multi-branch neural network
US11880977B2 (en) Interactive image matting using neural networks
US20190220983A1 (en) Image matting using deep learning
US11544905B2 (en) Method and apparatus for providing virtual clothing wearing service based on deep-learning
CN111787242B (en) Method and apparatus for virtual fitting
CN107622504B (en) Method and device for processing pictures
CN114187633B (en) Image processing method and device, and training method and device for image generation model
CN110378947B (en) 3D model reconstruction method and device and electronic equipment
US8670615B2 (en) Refinement of segmentation markup
CN109272543B (en) Method and apparatus for generating a model
CN113327278A (en) Three-dimensional face reconstruction method, device, equipment and storage medium
CN109118456B (en) Image processing method and device
CN111340921A (en) Dyeing method, dyeing apparatus, computer system and medium
CN112489169B (en) Portrait image processing method and device
CN112330527A (en) Image processing method, image processing apparatus, electronic device, and medium
CN109241930B (en) Method and apparatus for processing eyebrow image
CN114842120A (en) Image rendering processing method, device, equipment and medium
CN110827341A (en) Picture depth estimation method and device and storage medium
CN116798041A (en) Image recognition method and device and electronic equipment
CN116778015A (en) Model edge tracing method and device, electronic equipment and storage medium
CN115311403B (en) Training method of deep learning network, virtual image generation method and device
CN111107264A (en) Image processing method, image processing device, storage medium and terminal
CN111723806A (en) Augmented reality method and apparatus
CN114596203A (en) Method and apparatus for generating images and for training image generation models
CN113223128B (en) Method and apparatus for generating image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination