CN117041670A - Image processing method and related equipment

Info

Publication number
CN117041670A
Authority
CN
China
Prior art keywords
viewpoint
electronic device
frame image
image
face
Prior art date
Legal status
Granted
Application number
CN202311286705.XA
Other languages
Chinese (zh)
Other versions
CN117041670B (en)
Inventor
施嘉呈
王浩
李昌盛
王千
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Application filed by Honor Device Co Ltd
Priority to CN202311286705.XA
Publication of CN117041670A
Application granted
Publication of CN117041670B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/141Systems for two-way working between two video terminals, e.g. videophone

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the application provides an image processing method and related devices. The method includes: a first electronic device establishes video communication with a second electronic device; the first electronic device captures a first frame image with a camera, where the first frame image includes a first object and the viewpoint of the first object in the first frame image is a first viewpoint; the first electronic device sends a second frame image to the second electronic device, where the second frame image is obtained by the first electronic device converting the viewpoint of the first object in the first frame image, and in the second frame image the viewpoint of the first object is a second viewpoint; the second viewpoint is preset in the first electronic device and differs from the first viewpoint. In this way, images that meet the viewpoint requirement can be displayed on the electronic device and/or other electronic devices, improving the effectiveness of the user's facial-expression communication as well as the visual appeal of the video.

Description

Image processing method and related equipment
Technical Field
The present application relates to the field of terminal technologies, and in particular, to an image processing method and related devices.
Background
The electronic device may be provided with a camera; for example, the electronic device may be a personal computer (PC), a notebook computer, a mobile phone, or the like. A notebook computer may have a camera above its screen, enabling functions such as video conferencing and video calls.
However, in scenarios such as video conferences or video calls, the captured image angle of some users deviates substantially from the frontal-face angle, which degrades the appearance of the user image and affects the user experience.
Disclosure of Invention
The embodiment of the application provides an image processing method and related devices, applied to the field of terminal technologies. When an image is obtained whose viewpoint deviates substantially from a preset viewpoint, the viewpoint in the image is converted to the preset viewpoint, and an image that meets the viewpoint requirement is displayed on the electronic device and/or other electronic devices, improving the user experience.
In a first aspect, an embodiment of the present application provides an image processing method, applied to a first electronic device including a camera. The method includes: the first electronic device establishes video communication with a second electronic device; the first electronic device captures a first frame image with the camera, where the first frame image includes a first object and the viewpoint of the first object in the first frame image is a first viewpoint; the first electronic device sends a second frame image to the second electronic device, where the second frame image is obtained by the first electronic device converting the viewpoint of the first object in the first frame image, and in the second frame image the viewpoint of the first object is a second viewpoint; the second viewpoint is preset in the first electronic device and differs from the first viewpoint. In this way, the first electronic device can convert an image at the real viewpoint into an image at the preset viewpoint, obtaining an image that meets the viewpoint requirement; for example, the first electronic device can convert a face image shot from below the chin into a frontal face image, improving the effectiveness of the user's facial-expression communication as well as the visual appeal of the video.
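Purely as an illustration of the flow just described, the per-frame capture-convert-send loop could be sketched as follows; `camera`, `connection`, `convert_viewpoint`, and `preset_viewpoint` are hypothetical placeholders, not names defined by this application.

```python
def video_call_loop(camera, connection, convert_viewpoint, preset_viewpoint):
    """Per-frame pipeline of the first electronic device (a sketch)."""
    while connection.is_open():
        first_frame = camera.capture()          # first frame image, at the first viewpoint
        # Conversion yields the second frame image at the preset second viewpoint.
        second_frame = convert_viewpoint(first_frame, preset_viewpoint)
        connection.send(second_frame)           # sent to the second electronic device
```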
In one possible implementation, the first electronic device captures a third frame image with the camera, where the third frame image includes the first object and the viewpoint of the first object is a third viewpoint different from the first viewpoint; the first electronic device sends a fourth frame image to the second electronic device, where the fourth frame image is obtained by the first electronic device converting the viewpoint of the first object in the third frame image, and in the fourth frame image the viewpoint of the first object is the second viewpoint. In this way, when the user conducts video communication with the first electronic device, the first electronic device can convert any image at a non-preset viewpoint into an image at the preset viewpoint, obtaining an image that meets the viewpoint requirement; for example, the first electronic device can convert a non-frontal face image into a frontal face image, improving the effectiveness of the user's facial-expression communication as well as the visual appeal of the video.
In one possible implementation, a plurality of object feature templates of the first object are stored in the first electronic device, each object feature template being the feature template of the first object at one viewpoint. Before the first electronic device sends the second frame image to the second electronic device, the method further includes: the first electronic device identifies feature points of the first object in the first frame image, converts the feature points of the first object into a feature vector of the first object, and generates a first object feature template; when the similarity between the first object feature template and any one of the stored object feature templates is greater than or equal to a similarity threshold, the first electronic device converts the first frame image into the second frame image using the object feature template of the second viewpoint among the plurality of object feature templates. In this way, the first electronic device performs viewpoint conversion on the first frame image only after verifying the identity of the first object in it, which improves the accuracy and security of the image processing method of the embodiment of the application.
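The application does not specify the feature representation or the similarity measure; the sketch below assumes flattened landmark coordinates as the feature vector and cosine similarity with an assumed threshold of 0.8.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.8  # assumed value; the application does not specify one

def to_feature_template(feature_points):
    """Flatten detected feature points (an N x 2 array of pixel coordinates)
    into a feature vector, i.e. an object feature template."""
    return np.asarray(feature_points, dtype=np.float32).ravel()

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def verify_identity(frame_template, stored_templates):
    """Identity check: pass if the frame's template matches any stored
    per-viewpoint template at or above the threshold."""
    return any(cosine_similarity(frame_template, t) >= SIMILARITY_THRESHOLD
               for t in stored_templates)
```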
In one possible implementation, converting the first frame image into the second frame image using the object feature template of the second viewpoint among the plurality of object feature templates includes: the first electronic device fuses some of the feature points in the object feature template of the second viewpoint with the feature points of the first object in the first frame image to obtain the second frame image. In this way, the electronic device obtains the second frame image at the preset viewpoint, and thus an image that meets the viewpoint requirement.
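A minimal sketch of such a fusion, assuming feature points are stored as an N x 2 array of pixel coordinates; the choice of which points to take from the template (`indices`) and the blending weight (`alpha`) are assumptions, since the application only states that part of the template's feature points are fused with those of the frame.

```python
import numpy as np

def fuse_feature_points(template_points, frame_points, indices, alpha=0.5):
    """Blend selected points of the second-viewpoint template into the frame's points."""
    fused = frame_points.copy()
    fused[indices] = (alpha * template_points[indices]
                      + (1.0 - alpha) * frame_points[indices])
    return fused

# Example: fuse the first two of three (x, y) feature points.
template = np.array([[10.0, 12.0], [30.0, 12.0], [20.0, 40.0]])
frame = np.array([[11.0, 20.0], [29.0, 21.0], [20.0, 47.0]])
print(fuse_feature_points(template, frame, indices=[0, 1]))
```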
In one possible implementation, before the first electronic device establishes video communication with the second electronic device, the method further includes: the first electronic device displays a first interface including a first button, and the first button is in the off state. The state of the first button includes an on state and an off state: when the first button is on, the first electronic device outputs frame images at the preset viewpoint while the first object is in video communication; when the first button is off, the first electronic device outputs the frame images captured by the camera. The first electronic device receives a trigger operation on the first button and, in response, switches the first button from the off state to the on state. In this way, the user can decide whether to use the viewpoint conversion function of the first electronic device according to their own needs, improving the user experience.
In one possible implementation, the first interface further includes a second button used to instruct the first electronic device to capture images of the first object with the camera. After the first electronic device displays the first interface, the method further includes: the first electronic device receives a trigger operation on the second button and, in response, displays a second interface including the first object and prompt information; the prompt information includes prompt text and/or prompt voice and is used to instruct the first object to rotate the head; the first electronic device captures a plurality of images of the first object with the camera. In this way, the first electronic device can collect information about the first object in advance; the information collected in the setting scenario serves as the basis for viewpoint conversion and identity verification of the first object in the usage scenario, so that the first electronic device can accurately obtain the target face image.
In one possible implementation, after the first electronic device captures the plurality of images of the first object with the camera, the method further includes: for each image of the first object, the first electronic device identifies the feature points of the first object, converts them into a feature vector of the first object, and generates object feature templates at a plurality of viewpoints. In this way, the first electronic device obtains an object feature template of the first object at each viewpoint, enabling it to accurately verify the first object and output the target face image later.
In one possible implementation, the first interface further includes a first area in which a plurality of third buttons are displayed; each third button corresponds to a schematic diagram showing the object at one viewpoint, and the viewpoints differ from one schematic diagram to another. After the first electronic device displays the first interface, the method further includes: the first electronic device receives a trigger operation on the third button corresponding to the second viewpoint and, in response, sets the preset viewpoint to the second viewpoint; or, if the first electronic device receives no trigger operation on any third button, it sets the preset viewpoint to a default viewpoint, and the default viewpoint includes the second viewpoint. In this way, the user can set the preset viewpoint according to their own needs, the first electronic device can subsequently output images that meet it, and the user experience is improved.
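As a sketch only, the selection with a default fallback might look as follows; the names and the string representation of a viewpoint are hypothetical.

```python
DEFAULT_VIEWPOINT = "viewpoint 5"  # the frontal viewpoint, named the default later in the text

def resolve_preset_viewpoint(selected_viewpoint=None):
    """Return the viewpoint chosen via a third button, or the default if none was chosen."""
    return selected_viewpoint if selected_viewpoint is not None else DEFAULT_VIEWPOINT
```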
In one possible implementation, the method further includes: the first electronic device captures a fifth frame image with the camera, where the fifth frame image includes the first object and the viewpoint of the first object is a fourth viewpoint; based on the fifth frame image, the first electronic device recognizes that the head of the first object has turned sideways, and it sends a sixth frame image to the second electronic device, where the sixth frame image is obtained by the first electronic device converting the viewpoint of the first object in the fifth frame image, and in the sixth frame image the viewpoint of the first object is a fifth viewpoint different from the fourth viewpoint. The first electronic device captures a seventh frame image with the camera, where the seventh frame image includes the first object and the viewpoint of the first object is a sixth viewpoint different from the fourth viewpoint; based on the seventh frame image, the first electronic device recognizes that the head of the first object has turned sideways, and it sends an eighth frame image to the second electronic device, where the eighth frame image is obtained by the first electronic device converting the viewpoint of the first object in the seventh frame image, and in the eighth frame image the viewpoint of the first object is a seventh viewpoint different from the sixth viewpoint. Here, the vertical angle of the fourth viewpoint differs from that of the fifth viewpoint while their horizontal angles are the same; the vertical angle of the sixth viewpoint differs from that of the seventh viewpoint while their horizontal angles are the same; and the fifth viewpoint, the seventh viewpoint, and the second viewpoint share the same vertical angle. In other words, the conversion corrects the vertical angle while preserving the horizontal head turn. In this way, the first electronic device displays the user image more flexibly and vividly, improving the visual appeal of the video.
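Modeling a viewpoint as a (vertical angle, horizontal angle) pair, the angle relations above reduce to the following sketch; the function name and the degree values in the example are assumptions.

```python
def convert_viewpoint_for_head_turn(frame_viewpoint, preset_viewpoint):
    """Correct only the vertical angle when the head turns sideways.

    The output keeps the frame's horizontal angle (so the turn stays visible)
    and takes the preset viewpoint's vertical angle, matching the relations
    among the fourth/fifth, sixth/seventh, and second viewpoints.
    """
    _, frame_horizontal = frame_viewpoint
    preset_vertical, _ = preset_viewpoint
    return (preset_vertical, frame_horizontal)

# Example: head turned 20 degrees sideways, seen from 30 degrees below.
print(convert_viewpoint_for_head_turn((-30.0, 20.0), (0.0, 0.0)))  # -> (0.0, 20.0)
```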
In one possible implementation, the first frame image further includes a second object, and in the first frame image the viewpoint of the second object is the first viewpoint; the first object differs from the second object; the second frame image also includes the second object, and in the second frame image the viewpoint of the second object is the second viewpoint. In this way, when the user's hand appears in the frame, the first electronic device can perform viewpoint conversion on the face and the hand at the same time, outputting a more natural image that meets the viewpoint requirement and improving the user experience.
In one possible implementation, the first frame image further includes a second object whose viewpoint in the first frame image is the first viewpoint, and the first object differs from the second object; the second frame image also includes the second object and is obtained by the first electronic device fusing the image of the second object with the converted image of the first object, where the converted image of the first object is obtained by converting the viewpoint of the first object in the first frame image, and the image of the second object is obtained by cropping the second object out of the first frame image; in the second frame image, the viewpoint of the second object is the first viewpoint. In this way, the first electronic device can fuse the viewpoint-converted face with the unconverted hand to obtain a more natural second frame image, and the computing load of the first electronic device is also reduced to some extent.
In one possible implementation, the first frame image further includes a second object whose viewpoint in the first frame image is the first viewpoint, and the first object differs from the second object; the second frame image also includes the second object and is obtained by the first electronic device splicing the image of the second object with the converted image of the first object, where the converted image of the first object is obtained by converting the viewpoint of the first object in the first frame image, and the image of the second object is obtained by cropping the second object out of the first frame image; in the second frame image, the viewpoint of the second object is the first viewpoint. In this way, the first electronic device splices the viewpoint-converted face with the unconverted hand without fusing them, and in the second frame image the face occupies one area and the hand another, in a split-screen-like layout; this reduces the computing load of the first electronic device.
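The two composition variants above can be sketched as follows, assuming the face and hand regions have already been cropped and the face already viewpoint-converted; the (y0, y1, x0, x1) box format and the padding in the spliced variant are assumptions.

```python
import numpy as np

def compose_fused(frame, converted_face, hand_crop, face_box, hand_box):
    """Fusion variant: paste the viewpoint-converted face and the unconverted
    hand back into the frame at their original regions."""
    out = frame.copy()
    fy0, fy1, fx0, fx1 = face_box
    hy0, hy1, hx0, hx1 = hand_box
    out[fy0:fy1, fx0:fx1] = converted_face
    out[hy0:hy1, hx0:hx1] = hand_crop
    return out

def compose_spliced(converted_face, hand_crop):
    """Splicing variant: show face and hand side by side, split-screen style."""
    height = max(converted_face.shape[0], hand_crop.shape[0])
    def pad(img):  # pad the shorter crop at the bottom so both share one height
        return np.pad(img, ((0, height - img.shape[0]), (0, 0), (0, 0)))
    return np.concatenate([pad(converted_face), pad(hand_crop)], axis=1)
```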
In one possible implementation, the resolution of the second frame image is related to the network quality of the first electronic device, the network quality of the second electronic device, the computing capability of the first electronic device, and/or the computing capability of the second electronic device, and is positively correlated with each of these factors. In this way, the first electronic device and/or the second electronic device can adjust the image resolution according to network quality and/or computing capability, so that images can be displayed smoothly.
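The application only states the positive correlations; one assumed way to realize them is to let the weakest of the four factors pick from a resolution ladder, as sketched below (the ladder values and the [0, 1] scores are assumptions).

```python
def pick_resolution(net_q_local, net_q_remote, compute_local, compute_remote):
    """Choose the output resolution from the weakest link; each factor is a score in [0, 1]."""
    ladder = [(640, 360), (1280, 720), (1920, 1080)]  # assumed resolution ladder
    weakest = min(net_q_local, net_q_remote, compute_local, compute_remote)
    index = min(int(weakest * len(ladder)), len(ladder) - 1)
    return ladder[index]

print(pick_resolution(0.9, 0.4, 1.0, 0.8))  # -> (1280, 720), limited by the remote network
```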
In a second aspect, an embodiment of the present application provides an electronic device, which may also be referred to as a terminal device, a terminal (terminal), a User Equipment (UE), a Mobile Station (MS), a Mobile Terminal (MT), or the like. The terminal device may be a mobile phone, a smart television, a wearable device, a tablet (Pad), a computer with wireless transceiving function, a Virtual Reality (VR) terminal device, an augmented reality (augmented reality, AR) terminal device, a wireless terminal in industrial control (industrial control), a wireless terminal in unmanned driving (self-driving), a wireless terminal in teleoperation (remote medical surgery), a wireless terminal in smart grid (smart grid), a wireless terminal in transportation safety (transportation safety), a wireless terminal in smart city (smart city), a wireless terminal in smart home (smart home), or the like.
The electronic device includes a processor and a memory; the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory to cause the electronic device to perform the method performed by the first electronic device in any one of the possible implementations of the first aspect, or the method performed by the second electronic device in any one of the possible implementations of the first aspect.
In a third aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements a method performed by a first electronic device as described in any one of the possible implementations of the first aspect; or to implement a method performed by a second electronic device as described in any one of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run, causes a computer to perform a method performed by a first electronic device as described in any one of the possible implementations of the first aspect; or performing a method performed by a second electronic device as described in any one of the possible implementations of the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip comprising a processor for invoking a computer program in a memory to perform a method performed by a first electronic device as described in any one of the possible implementations of the first aspect; or performing a method performed by a second electronic device as described in any one of the possible implementations of the first aspect.
It should be understood that the second to fifth aspects of the present application correspond to the technical solution of the first aspect; the benefits obtained by each aspect and its corresponding possible implementations are similar and are not repeated here.
Drawings
FIG. 1 is a schematic view of an effect presented by an image at various viewpoints;
FIG. 2 is a schematic diagram of one possible implementation of a camera arrangement;
FIG. 3 is a schematic diagram of three possible low angle camera arrangements;
FIG. 4 is a schematic diagram of an image captured by a low-angle camera in a possible implementation;
fig. 5 is a schematic hardware structure of an electronic device 100 according to an embodiment of the present application;
fig. 6 is a software block diagram of an electronic device 100 according to an embodiment of the present application;
fig. 7 is an interface schematic diagram of a setting scenario provided in an embodiment of the present application;
fig. 8 is a schematic flow chart of a setting scenario provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of collecting facial information of a user according to an embodiment of the present application;
fig. 10 is a face image marked with face feature points according to an embodiment of the present application;
fig. 11 is a schematic diagram of a setting scenario provided in an embodiment of the present application;
FIG. 12A is an interface diagram of a usage scenario according to an embodiment of the present application;
FIG. 12B is an interface diagram of another usage scenario provided by an embodiment of the present application;
FIG. 13 is a schematic flow chart of a usage scenario provided in an embodiment of the present application;
fig. 14 is a schematic view of a scene of two kinds of viewpoint conversion according to an embodiment of the present application;
FIG. 15 is a schematic diagram of a usage scenario provided by an embodiment of the present application;
FIG. 16 is a view of face images at different resolutions according to an embodiment of the present application;
FIG. 17 is an interface diagram of a user panning scene provided by an embodiment of the present application;
fig. 18 is a schematic diagram of a user shaking scene provided by an embodiment of the present application;
FIG. 19 is a schematic diagram of an interface for synchronously displaying face and hand images of a user according to an embodiment of the present application;
fig. 20 is a flowchart of an image processing method according to an embodiment of the present application;
fig. 21 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
To facilitate a clear description of the technical solutions of the embodiments of the present application, the following briefly describes some terms and techniques involved:
1. Viewpoint: a viewpoint may be the viewpoint of a camera, i.e., the position of the shooting point when the camera photographs an object. For example, when a camera is arranged on an electronic device and the electronic device starts a video conference function or the like, the image the user sees is the image the camera captures from its own viewpoint. In the embodiment of the application, a viewpoint may also be the position of the shooting point when a virtual camera virtually photographs a three-dimensional model.
For example, fig. 1 shows the images presented at a plurality of viewpoints. Diagram a in fig. 1 shows nine viewpoints in different directions, each of which can be regarded as a shooting point of the camera; for example, the camera may photograph the user from the upper-left viewpoint 1, the directly-above viewpoint 2, the upper-right viewpoint 3, the left viewpoint 4, the front viewpoint 5, the right viewpoint 6, the lower-left viewpoint 7, the directly-below viewpoint 8, the lower-right viewpoint 9, and so on, and the electronic device obtains images as shown in diagram b in fig. 1, which shows the image at each viewpoint.
In diagram a in fig. 1, the camera is a high-angle camera when located at viewpoint 1, viewpoint 2, or viewpoint 3, and a low-angle camera when located at viewpoint 7, viewpoint 8, or viewpoint 9. In the embodiment of the application, the image shot by the camera at viewpoint 5 (the image of viewpoint 5) is a frontal face image.
The embodiment of the application shows only nine viewpoints by way of example; in practical applications the camera can be located at more viewpoints, and the embodiment of the application is not limited thereto.
In the embodiment of the application, a viewpoint may be a real viewpoint or a virtual viewpoint: the position of the camera's actual shooting point is a real viewpoint, and the position of the shooting point after viewpoint conversion of the real viewpoint is a virtual viewpoint.
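For the informal sketches in this description, a viewpoint can be modeled as a pair of angles relative to the frontal viewpoint 5; this representation and the example values below are assumptions, not definitions from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Viewpoint:
    """A viewpoint as (vertical, horizontal) angles in degrees relative to viewpoint 5."""
    vertical: float    # positive above the face, negative below
    horizontal: float  # positive to one side of the face, negative to the other

FRONT = Viewpoint(0.0, 0.0)           # viewpoint 5, the frontal face
LOWER_MIDDLE = Viewpoint(-30.0, 0.0)  # e.g. viewpoint 8; -30 degrees is an assumed value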
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates three possible relationships; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of" and similar expressions mean any combination of the items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
The "at … …" in the embodiment of the present application may be an instant when a certain situation occurs, or may be a period of time after a certain situation occurs, which is not particularly limited. In addition, the display interface provided by the embodiment of the application is only used as an example, and the display interface can also comprise more or less contents.
The electronic device may be provided with a camera; for example, the electronic device may be a personal computer (PC), a notebook computer, a mobile phone, or the like. A notebook computer may have a camera above its screen, enabling functions such as video conferencing and video calls.
By way of example, fig. 2 illustrates one possible implementation of a camera setup, as shown in fig. 2:
the electronic device 100 may be a notebook computer.
In the scenario shown in fig. 2, a camera 193 is disposed on the display screen of the electronic device 100, in the middle of the upper bezel of the display screen.
In an actual use scenario, when the user uses the camera function of the electronic device, the camera is roughly level with the user's face, as in the scenario shown in b of fig. 2, so the image captured by the camera reflects the user's facial expression well, such as the frontal face image in diagram b of fig. 2.
To increase the screen-to-body ratio of the display screen in a notebook computer, in some scenarios the notebook computer may use a low-angle camera to reduce the width of the upper bezel; for example, the low-angle camera may be disposed below the screen of the notebook computer or hidden in the keyboard.
Fig. 3 shows three low angle camera settings in a possible implementation, as shown in fig. 3:
In the scenario a in fig. 3, a camera 193 is disposed on the display screen of the electronic device 100, in the middle of the lower bezel of the display screen. It can be seen that, to increase the screen-to-body ratio of the display area, the upper, left, and right bezels of the electronic device 100 have all been made extremely narrow, leaving the upper bezel too narrow to house the camera 193; so as not to lose the shooting function of the electronic device 100, the camera 193 is placed at the lower bezel.
In the scenario shown in b of fig. 3, the display screen of the electronic device 100 is provided with a camera 193, which may also be located on the left or right side of the lower bezel of the display screen.
In the scenario shown in c of fig. 3, the camera 193 in the electronic device 100 is combined with a key of the keyboard, occupying a single key position; for example, the camera 193 is located in the row of keys closest to the screen. The user can switch between the camera function and the key function by a press or slide operation. In this way the upper, lower, left, and right bezels of the electronic device 100 can all be made extremely narrow, further increasing the screen-to-body ratio of the display area.
However, in practical applications, a low-angle camera may not meet the user's needs; users often encounter the following problems when using one:
For example, fig. 4 shows a schematic diagram of a scene shot by a low-angle camera.
Take as an example a camera disposed in a keyboard key of the electronic device: when the user uses the camera function, the camera sits lower than the user's face and shoots the user from below, so the user image obtained by the electronic device is distorted. For example, during a video conference the user usually needs to face the screen to follow the conference content, as shown in scenario a in fig. 4; the user image collected by the camera then exaggerates the user's chin, nose, and other features, the face does not face the camera, the effectiveness of facial-expression communication drops, and the image is less attractive, as shown by b in fig. 4. Alternatively, the user may lower their head to look at the camera so that it can capture a frontal face image, but then the user cannot watch the display screen, which affects the user experience.
In view of this, an embodiment of the present application provides an image processing method: when the electronic device obtains an image whose viewpoint deviates substantially from the preset viewpoint, it first converts the viewpoint to approximately the preset viewpoint, and after obtaining the viewpoint-adjusted image, displays it and/or sends it to other electronic devices for display, so that the electronic device and/or the other electronic devices can display images that meet the viewpoint requirement. For example, in scenario a shown in fig. 4, the electronic device can convert the face image shot from below the chin into a frontal face image, improving the effectiveness of the user's facial-expression communication as well as the visual appeal of the video.
In the embodiment of the application, the electronic equipment: may also be referred to as a terminal device, a terminal (terminal), a User Equipment (UE), a Mobile Station (MS), a Mobile Terminal (MT), etc. The terminal device may be a mobile phone, a smart television, a wearable device, a tablet (Pad), a computer with wireless transceiving function, a Virtual Reality (VR) terminal device, an augmented reality (augmented reality, AR) terminal device, a wireless terminal in industrial control (industrial control), a wireless terminal in unmanned driving (self-driving), a wireless terminal in teleoperation (remote medical surgery), a wireless terminal in smart grid (smart grid), a wireless terminal in transportation security (transportation safety), a wireless terminal in smart city (smart city), a wireless terminal in smart home (smart home), or the like.
A wearable device, also called a wearable smart device, is a general term for wearable devices developed by applying wearable technology to the intelligent design of everyday wear such as glasses, gloves, watches, clothing, and shoes. A wearable device is a portable device worn directly on the body or integrated into the user's clothing or accessories; it is not merely a hardware device, but realizes powerful functions through software support, data interaction, and cloud interaction. Broadly, wearable smart devices include full-featured, large-sized devices that can realize all or part of their functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only a certain type of application function and must be used together with other devices such as smartphones, for example various smart bracelets and smart jewelry for vital-sign monitoring.
In addition, in the embodiment of the application, the electronic device may also be a device in an internet of things (IoT) system. IoT is an important component of future information technology; its main technical feature is connecting things to a network through communication technology, thereby realizing human-machine interconnection and an intelligent network of interconnected things. The embodiment of the application does not limit the specific technology or device form of the electronic device.
In an embodiment of the present application, an electronic device may include a hardware layer, an operating system layer running above the hardware layer, and an application layer running above the operating system layer. The hardware layer includes hardware such as a central processing unit (central processing unit, CPU), a memory management unit (memory management unit, MMU), and a memory (also referred to as a main memory). The operating system may be any one or more computer operating systems that implement business processes through processes (processes), such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a windows operating system. The application layer comprises applications such as a browser, an address book, word processing software, instant messaging software and the like.
In the embodiment of the application, the electronic device may be a notebook computer, or the electronic device may be other electronic device modes, which is not limited in the embodiment of the application.
By way of example, fig. 5 shows a schematic structural diagram of an electronic device.
The electronic device may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earpiece interface 170D, keys 190, an indicator 192, a camera 193, and a display 194, among others.
It should be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device. In other embodiments of the application, the electronic device may include more or less components than illustrated, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
A memory may be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
In the embodiment of the present application, after collecting the user's face image, the processor 110 can convert the pixel points corresponding to the contours of the facial features in the face image into feature vectors, so as to generate a feature vector template; the memory in the processor 110 may be used to store feature vector templates.
The internal memory 121 may be used to store computer-executable program code that includes instructions. The internal memory 121 may include a storage program area and a storage data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data created during use of the electronic device (e.g., audio data, phonebook, feature vector templates of embodiments of the application, etc.), and so on. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like. The processor 110 performs various functional applications of the electronic device and data processing by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor. For example, in the embodiment of the present application, the processor may cause the electronic device to execute the image processing method provided in the embodiment of the present application by executing the instructions stored in the internal memory.
The electronic device implements display functions via a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information. The electronic device may implement shooting functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The camera 193 is used to capture still images or video. In some embodiments, the electronic device may include 1 or N cameras 193, N being a positive integer greater than 1.
In embodiments of the present application, during the process of the electronic device collecting user information, the camera 193 may collect images; during the video conference, the camera 193 may acquire images in real time.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. In some embodiments, the electronic device may include 1 or N display screens 194, N being a positive integer greater than 1.
In the embodiment of the present application, the display screen 194 may be used to display the face image after viewpoint conversion.
The software system of the electronic device 100 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiment of the present application takes a Windows system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
Fig. 6 is a software configuration block diagram of a terminal device according to an embodiment of the present application.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Windows system includes an application layer and an operating system (OS) layer.
As shown in fig. 6, the application layer includes a plurality of applications, for example a setting application and a video communication application. The setting application provides the entry for enabling the image processing method of the embodiment of the application; it can also be used to set the preset viewpoint of the object in the output image and to collect information about the object. The setting application may be, for example, a computer housekeeping application. The video communication application is used for video communication with other electronic devices and may be, for example, a video telephony application or a video conferencing application.
Take as an example that the object in the image is a human face. The OS layer includes the modules for collecting and processing face images. For example, the OS layer may include: an image acquisition module, a face recognition module, a feature comparison module, a viewpoint conversion module, and a database. The image acquisition module receives the face images uploaded by the camera of the hardware layer; the face recognition module converts a face image into a face feature template; the feature comparison module is used, for example, for user identity verification; the viewpoint conversion module converts the viewpoint of the face in a face image; and the database is used, for example, to store face feature templates.
It should be noted that the embodiment of the present application is illustrated only with a Windows system; in other operating systems (such as an Android system, an iOS system, etc.), the scheme of the present application can be implemented as long as the functions implemented by each functional module are similar to those of the embodiment of the present application.
Two possible implementation scenarios are described below in connection with fig. 6:
In one possible implementation, the electronic device runs the setting application, which can notify the underlying layers to start collecting the user's facial information. The camera collects face images at each viewpoint and reports them to the face recognition module through the image acquisition module; the face recognition module converts the face images of each viewpoint into face feature templates of each viewpoint and stores them into the memory through the database.
In another possible implementation, the electronic device runs the video communication application, which can notify the underlying layers to start capturing the image to be processed. The camera captures the image to be processed and reports it to the face recognition module through the image acquisition module; the face recognition module converts the image to be processed into a face feature template and passes it to the feature comparison module; the feature comparison module reads the per-viewpoint face feature templates from the memory through the database and compares them with the face feature template of the image to be processed to verify the user's identity; after verification succeeds, the feature comparison module notifies the viewpoint conversion module to perform viewpoint conversion; the viewpoint conversion module converts the original viewpoint of the portrait in the image to be processed into the preset viewpoint to obtain the target face image and reports it to the video communication application. Subsequently, the electronic device may display the target face image in the video communication application, or send it to the other electronic devices in video communication with it.
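The module interactions of both flows can be condensed into one sketch. Every interface here is hypothetical (the application defines the modules of fig. 6, not these signatures), and passing the frame through unchanged on a failed verification is an assumption.

```python
class ViewpointPipeline:
    """Sketch of the fig. 6 OS-layer flow: acquisition -> recognition ->
    comparison -> viewpoint conversion; all dependencies are injected."""

    def __init__(self, extract_template, database, converter, verify):
        self.extract_template = extract_template  # face recognition: image -> feature template
        self.database = database                  # dict: viewpoint -> face feature template
        self.converter = converter                # viewpoint conversion module
        self.verify = verify                      # feature comparison module

    def enroll(self, face_images_by_viewpoint):
        """Setting scenario: build and store one feature template per viewpoint."""
        for viewpoint, face_image in face_images_by_viewpoint.items():
            self.database[viewpoint] = self.extract_template(face_image)

    def process(self, raw_frame, preset_viewpoint):
        """Video-communication scenario: verify identity, then convert the viewpoint."""
        template = self.extract_template(raw_frame)
        if not self.verify(template, list(self.database.values())):
            return raw_frame  # assumed behavior: pass through when verification fails
        return self.converter(raw_frame, preset_viewpoint)
```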
The setting scenario of the image processing method in the embodiment of the present application will be described with reference to fig. 7. Fig. 7 is an interface schematic diagram of the image processing method according to an embodiment of the present application, as shown in fig. 7:
Take as an example a camera disposed on the lower bezel of the display screen of the electronic device; the electronic device is provided with a switch for the image processing method provided by the embodiment of the application. After the user triggers the switch, the electronic device can display the face image collected by the camera as a frontal face image.
In one possible implementation, the switch for enabling the image processing method of the embodiments of the present application may be placed in a computer housekeeping application. For example, the electronic device may launch the computer housekeeping application and display the interface shown in a of fig. 7. The interface shown in a of fig. 7 may include a number of setting options, such as home page, intelligent interconnect, multi-screen collaboration, system optimization, application recommendation, intelligent audio visual 701, troubleshooting, and the like; the setting options of the intelligent audio visual 701 may include a button 702 for camera settings, a button for display settings, a button for audio settings, and so on. The switch for enabling the image processing method of the embodiment of the present application may be provided among the camera-setting functions.
Illustratively, when the electronic device receives a trigger operation on the button 702 for camera settings, it enters the camera settings interface, such as the interface shown as b in fig. 7. In that interface, the electronic device may display a switch 703 for portrait viewpoint conversion, a button for setting the portrait viewpoint, and a button 705 for setting the user portrait. The state of the switch 703 includes an on state and an off state. When the switch 703 is on, the electronic device converts the portrait of the user at a first viewpoint into a portrait at the preset viewpoint and displays the latter on the display screen, the first viewpoint being different from the preset viewpoint. When the switch 703 is off, the electronic device displays the portrait of the user at the first viewpoint as collected by the camera.
After the switch 703 for portrait viewpoint conversion is turned on, the user needs to enroll their facial information in the electronic device; for example, the user face image may be a video frame image containing the user's face. Illustratively, in the interface shown in b in fig. 7, when the electronic device receives a trigger operation on the button 705 for setting the user portrait, it displays the user portrait collection interface shown in c in fig. 7, where the user can have their facial information collected under voice and/or text prompts. For example, the interface shown in c in fig. 7 displays a prompt 706, which may be "please look at the camera below", "please turn your head left/right", "please look at the screen", "please shake your head/nod", "please open your mouth", and the like. As another example, in the interface shown in c in fig. 7, the electronic device may display a viewpoint mark whose position on the display screen keeps changing; by moving the viewpoint mark, the electronic device guides the user to turn the head in each direction, so as to obtain the user's face image at each viewpoint.
The electronic device can collect the user's face images at various angles and convert them into feature vector templates, so that it can later verify the user's identity and perform viewpoint conversion on images.
After the switch 703 for portrait viewpoint conversion is turned on, the user may also customize the preset viewpoint. Illustratively, in the area for setting the portrait viewpoint, the user may select a preferred viewpoint as the preset viewpoint.
In one possible implementation, the electronic device may display schematic diagrams 704 of the portrait's appearance at multiple viewpoints, from which the user may select a suitable viewpoint for outputting the portrait. For example, if the user clicks the schematic diagram corresponding to viewpoint 5 (the viewpoint indicated by the dashed box in the diagrams 704) on interface b in fig. 7, the electronic device sets viewpoint 5 as the preset viewpoint and subsequently displays the face image at viewpoint 5 during a conference.
In another possible implementation, the electronic device may extract frames from the collected video of the user to obtain face pictures at multiple viewpoints; the schematic diagrams 704 may then show the user's own face image at each viewpoint; for example, the picture corresponding to the dashed box in the diagrams 704 may be the user's frontal face image. In this way, the user can browse the effect of viewpoint-converted images more intuitively.
Optionally, after the user selects any viewpoint in the schematic diagrams 704 on interface b in fig. 7, the electronic device may set that viewpoint as the preset viewpoint and provide a plurality of face images at it, from which the user selects one as the face image at the preset viewpoint. It can be understood that, when collecting the user's facial information, the electronic device obtains a video containing the user's face, which may include multiple frames of face images at the preset viewpoint; by providing several of them, the electronic device lets the user pick a better-shot image for subsequent viewpoint conversion.
In the embodiment of the present application, if the user does not set the portrait viewpoint, the electronic device may set the preset viewpoint to a default viewpoint; for example, the default viewpoint may correspond to the image of viewpoint 5 (the frontal face image) shown in diagram b in fig. 1. It can be understood that, in the scenario shown in fig. 4, the main technical problem solved by the embodiment of the present application is how to make the face image shot by a low-angle camera present the effect of a frontal face image, so the default viewpoint may be set to viewpoint 5. In other application scenarios the electronic device may set other viewpoints as the default, which the embodiment of the present application does not limit.
It should be noted that the embodiment of the present application is described with reference to fig. 7 by way of example only. In practical applications, fig. 7 does not limit the interface of the electronic device, and the switch 703 for portrait viewpoint conversion may also be set in other applications, which is not limited by the embodiments of the present application.
In addition, the embodiments of the present application describe the image processing method only by taking a face image as the image acquired by the camera. In the embodiments of the present application, the object in the image may also be another object, such as an animal or a plant, which is not limited by the embodiments of the present application.
Meanwhile, the user information (including but not limited to user equipment information, user personal information, and the like) and the data (including but not limited to data for analysis, stored data, displayed data, and the like) involved in the present application are information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entrances are provided for users to choose to authorize or refuse.
The above embodiments explain the setting interface of the image processing method of the embodiment of the present application. The flow of the image processing method according to the embodiment of the present application will be described with reference to fig. 8 and 11, respectively.
In the setting scenario, fig. 8 shows a flowchart of an image processing method according to an embodiment of the present application, as follows:
S801, the electronic device displays an interface for turning on portrait viewpoint conversion.
The electronic device runs a settings application, e.g., a computer manager application, and displays a camera setting interface. The computer manager application provides an entry for the image processing method according to the embodiment of the present application, and the camera setting interface may correspond to the interface shown in b in fig. 7.
In the embodiment of the present application, the electronic device may also provide the entry for portrait viewpoint conversion in other applications.
S802, when the electronic device receives a trigger operation for turning on the switch for portrait viewpoint conversion, the electronic device sets the switch for portrait viewpoint conversion to an on state.
Upon receiving a trigger operation for the switch 703 for portrait viewpoint conversion, the electronic device may update the switch 703 from an off state to an on state, so that the electronic device may subsequently execute the flow of the image processing method provided by the embodiment of the present application.
S803, when the electronic equipment receives triggering operation for acquiring the user portrait, the electronic equipment enters a user portrait acquisition interface.
The triggering operation for capturing the user portrait may correspond to: in the interface shown in b in fig. 7, a trigger operation of the button 705 for setting a user figure is directed. The user portrait collection interface may correspond to the interface shown as c in fig. 7.
S804, the camera of the electronic equipment collects face images of the user at various viewpoints and reports the face images to the image collection module.
The process of the electronic device capturing an image may be as shown in fig. 9: after the electronic equipment starts the camera, the user can execute corresponding gestures according to the prompt information, so that the camera collects face images under all viewpoints.
It can be appreciated that when the user performs actions such as gazing at the screen, turning left, turning right, nodding, and shaking the head, the electronic device acquires a plurality of face images. The face images may include the user's face at various viewpoints. In the embodiment of the present application, the plurality of face images are shown only by way of example in fig. 9, and the number of face images acquired by the electronic device is not limited in actual applications.
In one possible implementation manner, during the process of collecting face images, the electronic device may store the collected face images in the image collection module; after the user image collection is completed, the image collection module may transmit the plurality of face images to the face recognition module, and step S805 is executed.
In another possible implementation manner, whenever the camera acquires any frame of face image, the image collection module may transmit that frame of face image to the face recognition module in real time, and step S805 is executed.
S805, a face recognition module of the electronic equipment obtains a plurality of face images, and recognizes face feature points in any face image.
The face feature points may include: pixel points corresponding to the facial features and pixel points corresponding to the facial contour, for example, the eyebrows, the eyes, the nose, the mouth, and the outline of the face. Fig. 10 is an exemplary face image with face feature points marked thereon; any one dot in the figure may correspond to a pixel point of a facial feature or of the facial contour. In some embodiments, the number of feature points may be 68 or 78.
After the electronic device obtains a face image, the face feature points in the face image can be marked; any face feature point corresponds to a mark number and a coordinate value, and a face feature point may be represented as mark number: (X coordinate, Y coordinate). For example, the electronic device may obtain a face image with 68 marked face feature points, numbered from 0 to 67; the number of the face feature point corresponding to the outermost point of the left cheek may be 0, the number of the face feature point corresponding to the left mouth corner may be 48, the number of the face feature point corresponding to the right mouth corner may be 54, and a face feature point may be represented, for example, as 1: (100, 69). The embodiments of the present application are not enumerated in detail here.
In some embodiments, the electronic device may perform the task of extracting the facial feature points in the facial image using a facial feature extraction algorithm, e.g., dlib library, deep learning approach, opencv, insightface, etc. The embodiments of the present application are not limited in this regard.
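For illustration only (this sketch is not part of the patent disclosure), the 68-point extraction described above could be performed with the dlib library mentioned in step S805; the pretrained model file name and the OpenCV-based image loading are assumptions:

```python
# Minimal sketch of 68-point landmark extraction with dlib (an assumption;
# the embodiment does not mandate a specific library or model file).
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_landmarks(image_path):
    """Return a list of (number, (x, y)) pairs for the first detected face."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return []
    shape = predictor(gray, faces[0])
    # Point 0 is the outermost point of the left cheek; points 48 and 54 are
    # the left and right mouth corners, matching the numbering described above.
    return [(i, (shape.part(i).x, shape.part(i).y)) for i in range(68)]
```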
S806, the face recognition module of the electronic equipment converts any face feature point in any face image into a corresponding feature vector and generates a face feature template.
The face recognition module can convert any face feature point in the face image into a feature vector expressed in numbers; the feature vector is a representation of the face feature point in a high-dimensional space. Specifically, for example, in a face image, the face feature point numbered 0 may be represented by feature vector 1, the face feature point numbered 1 may be represented by feature vector 2, and so on; the face recognition module can traverse all the face feature points in the face image to obtain the 68 feature vectors corresponding to the face feature points.
For any face image, after the feature vectors corresponding to each face feature point are obtained, the face recognition module can generate a face feature template based on these feature vectors; the face feature template can be understood as a feature vector table of the face feature points in the face image. Any face feature template corresponds to the user at one viewpoint, and at one viewpoint there may be one or more face feature templates.
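As a rough illustration of this step (again, not the patent's own implementation), each landmark can be mapped to a simple numeric vector and the vectors collected into a per-image template; the bounding-box normalization below is an assumption standing in for whatever embedding the device actually uses:

```python
import numpy as np

def build_feature_template(landmarks):
    """Build a face feature template: one feature vector per landmark.

    Here each landmark's (x, y) is normalized by the face bounding box to
    form its vector -- an illustrative stand-in for a learned embedding.
    """
    pts = np.array([xy for _, xy in landmarks], dtype=np.float32)  # (68, 2)
    origin = pts.min(axis=0)
    scale = pts.max(axis=0) - origin + 1e-6
    return (pts - origin) / scale  # feature vector table of the face image
```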
In the embodiment of the present application, as shown in fig. 11, the electronic device may acquire the face information of the user in the scenes shown in a and b in fig. 11, and the electronic device may obtain N face images. After collecting the N face images, the electronic device may convert them into N face feature templates, i.e., face feature template 1 to face feature template N, where any face feature template corresponds to one face image, as shown in c in fig. 11. The electronic device thus obtains a plurality of face feature templates.
Optionally, S807, the electronic device receives an operation for setting a preset viewpoint.
For example, in the scene shown in b in fig. 7, the user inputs a preset viewpoint, and the electronic device receives an operation for setting the preset viewpoint. After receiving the operation of setting the preset viewpoint by the user, the electronic device may perform step S808.
S808, a face recognition module of the electronic equipment adds labels to face feature templates of preset viewpoints in a plurality of face feature templates.
The electronic device may obtain a plurality of face feature templates, and the electronic device may add a label to the face feature template of the face image under the preset viewpoint, for example, as shown in c in fig. 11, the viewpoint corresponding to the face feature template 3 is the preset viewpoint, and the electronic device marks the face feature template 3.
Specifically, in some embodiments, in the scenario shown in b in fig. 7, the user may select an image at any viewpoint in the schematic diagrams 704; the electronic device may then display a plurality of face images at the preset viewpoint; the electronic device then receives a selection operation for any one of the face images; and the electronic device can label the face feature template corresponding to that face image.
It can be understood that when the electronic device collects the facial information of the user, the electronic device obtains a plurality of video frames, and there may be one or more face images at one viewpoint; the electronic device can screen out a plurality of face images at the preset viewpoint for the user to select. The user can select his or her preferred face image as the face image at the preset viewpoint, and the electronic device can label the face feature template corresponding to the face image selected by the user.
The embodiment of the application does not limit the expression form of the label of the preset viewpoint.
S809, the face recognition module of the electronic equipment stores the face feature templates.
For example, the electronic device may store the facial feature templates in memory via the database of the OS layer to facilitate subsequent invocation of the facial feature templates.
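Purely as an illustration of this persistence step, the templates could be stored in a local database; the SQLite schema below is an assumption, since the embodiment only states that the templates are stored via the database of the OS layer:

```python
import sqlite3
import numpy as np

conn = sqlite3.connect("face_templates.db")  # hypothetical store
conn.execute(
    "CREATE TABLE IF NOT EXISTS templates "
    "(id INTEGER PRIMARY KEY, viewpoint TEXT, labeled INTEGER, vectors BLOB)"
)

def save_template(viewpoint, labeled, template):
    """Persist one feature template; 'labeled' marks the preset viewpoint."""
    conn.execute(
        "INSERT INTO templates (viewpoint, labeled, vectors) VALUES (?, ?, ?)",
        (viewpoint, int(labeled), template.astype(np.float32).tobytes()),
    )
    conn.commit()
```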
Optionally, after step S806, the method may further include:
s810, the electronic device does not receive an operation for setting a preset viewpoint, and the electronic device sets the preset viewpoint as a default viewpoint.
It may be appreciated that in the scenario shown in fig. 7, in the case where the user does not set the preset viewpoint, the electronic device may set the preset viewpoint as a default viewpoint, for example, the default viewpoint may be viewpoint 5.
After step S810, the electronic device may continue to perform step S808 and step S809, which may be referred to the above description and will not be repeated here.
In one possible implementation manner, in step S808, after setting the default viewpoint as the preset viewpoint, the electronic device may display a plurality of face images at the preset viewpoint so that the user may select the best face image; the electronic device then labels the face feature template corresponding to the selected face image.
In another possible implementation manner, after the electronic device sets the default viewpoint as the preset viewpoint, the electronic device does not display a plurality of face images at the preset viewpoint. Instead, the electronic device may prestore the face feature points and face feature vectors corresponding to the schematic diagram. The face recognition module can compare the feature vector of the schematic image with the feature vectors in each face feature template to obtain the similarity between each face feature template and the feature vector of the schematic image, determine the face feature template corresponding to the preset viewpoint according to the similarities, and label that face feature template. Determining the face feature template according to the similarities may mean, for example, that the electronic device labels the face feature template with the maximum similarity.
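A one-line rule for this labeling step might look as follows (illustrative only; the cosine similarity measure and argmax selection mirror the description above):

```python
import numpy as np

def label_preset_template(schematic_vec, templates):
    """Pick the stored template with maximum similarity to the prestored
    schematic-image feature vector and return its index for labeling."""
    v = schematic_vec.ravel()
    sims = [
        float(np.dot(v, t.ravel()) /
              (np.linalg.norm(v) * np.linalg.norm(t) + 1e-12))
        for t in templates
    ]
    return int(np.argmax(sims))
```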
In the embodiment of the application, the user can update the preset viewpoint, and when the preset viewpoint is updated, the electronic device can remove the label of the face feature template with the label and label the face feature template corresponding to the new preset viewpoint.
Thus, the electronic equipment can obtain a plurality of face feature templates of the user at various viewpoints, and the follow-up processes such as user identity verification, face image viewpoint conversion and the like can be realized based on the face feature templates.
The following describes the procedure of using the image processing method in the embodiment of the present application. The image processing method provided by the embodiment of the present application can be applied to a scenario of recording video with the electronic device; for example, the user can use the electronic device to record lessons. Fig. 12A is an interface schematic diagram of an image processing method according to an embodiment of the present application, as follows:
Taking a preset viewpoint capable of displaying the frontal face image as an example, the user records a video with the electronic device; the posture of the user is shown in a in fig. 12A, where the user's face is oriented toward the display screen rather than toward the camera. The electronic device may display the user's face image at the preset viewpoint. For example, if the preset viewpoint is viewpoint 5, the electronic device may display a frontal face image of the user, as in the interface shown in b in fig. 12A.
It can be understood that, in this scenario, the image collected by the camera of the electronic device is at a first viewpoint, and the first viewpoint may correspond to the image of viewpoint 8; the electronic device does not output the image of viewpoint 8 but displays the image at the preset viewpoint on the display screen.
When the preset viewpoint is set as another viewpoint, the electronic device may display the face image under the other viewpoint, which is not limited in the embodiment of the present application.
The image processing method provided by the embodiment of the application can also be applied to interaction scenes such as video conferences, video calls and the like. For example, fig. 12B shows an interface schematic diagram of another image processing method according to an embodiment of the present application, as shown in fig. 12B:
in the scenario shown in a in fig. 12B, a user a is performing video conference communication with an electronic device 101 of a user B using an electronic device 100, where a camera of the electronic device 100 is a low-angle camera and a camera of the electronic device 101 is a normal-angle camera; the electronic device 100 may perform video using the image processing method provided by the embodiment of the present application.
For example, taking a preset viewpoint as viewpoint 5 as an example, in the interface shown in B in fig. 12B, the electronic device 100 may display an image of the user B, and the electronic device 100 may also display an image of the user a in a window; in the image of the user a, the viewpoint of the user a is a preset viewpoint in the electronic device 100. Meanwhile, the electronic device 101 may receive the image of the user a sent by the electronic device 100 in real time, and display the image of the user a on the display screen of the electronic device 101; in the image of the user a, the viewpoint of the user a is still a preset viewpoint in the electronic device 100.
It can be seen that the electronic device 100 processes the acquired image to obtain a face image of user A at the preset viewpoint, and user A can see his or her own frontal face image through the small window during the conference. The electronic device 100 may also send the face image of user A at the preset viewpoint to the electronic device 101, so that the electronic device 101 displays that face image; in this way, user B sees the frontal face of user A on the electronic device 101, which improves the effectiveness of facial communication and the aesthetics of the face image.
The above embodiment describes a use scenario of the image processing method; the execution flow of the image processing method is described below with reference to the use scenario shown in fig. 12B and to fig. 13. Fig. 13 is a schematic flowchart of an image processing method according to an embodiment of the present application, as follows:
S1301, the electronic equipment runs a video communication application.
In an embodiment of the present application, the electronic device may be the electronic device 100 in fig. 12B. The video communication application may be an application capable of recording video using a camera, for example, the video communication application may be an application supporting video conferencing, an application supporting video telephony, or the like.
S1302, the camera of the electronic equipment shoots an image to be processed, and the image to be processed is uploaded to the portrait acquisition module.
In the image to be processed acquired by the camera, the viewpoint of the face may be a first viewpoint.
It is understood that the image to be processed may be a face image captured in a scene as shown by a in fig. 12B. When the camera of the electronic device is a low angle camera, the image collected by the camera tends to exhibit the effect shown from the image of viewpoint 7 to the image of viewpoint 9 in fig. 1. The first viewpoint may be, for example, viewpoint 7, viewpoint 8, viewpoint 9, etc.
When the electronic device is in a video conference scenario, the images to be processed acquired by the camera are typically real-time; each time the camera acquires one frame of the image to be processed, the frame can be temporarily stored in the portrait acquisition module and transmitted to the face recognition module. In order to reduce the storage pressure of the portrait acquisition module, the electronic device can clear any frame of the image to be processed from the portrait acquisition module after processing it.
It should be noted that, the embodiment of the present application only exemplarily illustrates one possible implementation manner of the image processing method, the viewpoint of the image to be processed in the embodiment of the present application is not limited, and the first viewpoint may also be any viewpoint in fig. 1.
S1303, a face recognition module of the electronic equipment obtains an image to be processed, and extracts face feature points and feature vectors in the image to be processed.
Step S1303 may refer to the related descriptions in steps S805 and S806, and will not be described here.
S1304, the feature comparison module of the electronic device obtains the facial feature points and the feature vectors of the image to be processed.
S1305, the feature comparison module of the electronic device obtains the similarity between the image to be processed and each face feature template.
This step may be used to verify whether the image to be processed is a face image of a target user, which may be a user who enters face information of the user in the scene shown in fig. 7.
Specifically, the determination process may be as follows:
1) The feature comparison module compares the feature vector of the image to be processed with the feature vectors of the plurality of portrait feature templates, respectively, to obtain a plurality of similarities.
The similarity can be a score assigned by the electronic device to the resemblance between the image to be processed and a portrait feature template, and it can be understood as the degree of matching between the two. Face images of the same user have higher similarity: in the feature vector space, face images of the same user are closer together, while face images of different users are farther apart.
In the embodiment of the application, the feature comparison module can calculate the similarity between the image to be processed and the face feature template by adopting calculation modes such as Euclidean distance, cosine similarity and the like. The range of similarity may be, for example, between 0 and 1, wherein the closer the similarity is to 1, the higher the degree of matching of the image to be processed with the face feature template.
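As a minimal sketch of the comparison in steps 1) to 3) below (an illustration, not the patent's implementation), the cosine-similarity computation and the threshold check could look as follows; the 0.6 threshold comes from the description below:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity; values close to 1 indicate a strong match."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def verify_identity(query_template, stored_templates, threshold=0.6):
    """Compare against every stored template (step S1305); viewpoint
    conversion proceeds only if some similarity clears the threshold."""
    sims = [cosine_similarity(query_template, t) for t in stored_templates]
    return max(sims) >= threshold, sims
```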
2) When one or more of the similarity values are greater than or equal to the similarity threshold, the electronic device executes step S1306.
A similarity threshold is preset in the electronic device; the similarity threshold may be, for example, 0.6. When a similarity is higher than the similarity threshold, it can be determined that the face information in the image to be processed matches the face information of the target user in the setting scenario, i.e., they are the same person; the electronic device may then continue to perform the subsequent steps.
Or, 3) when all the similarity values are smaller than the similarity threshold, the electronic device does not execute step S1306.
S1306, when one or more similarity values are greater than or equal to a similarity threshold, outputting a target face image by a viewpoint conversion module of the electronic device, wherein the viewpoint of the target face image is a preset viewpoint.
The preset viewpoint may be a user-defined portrait viewpoint (e.g., a user-entered portrait viewpoint in the interface shown in b in fig. 7), or a default viewpoint.
When the object in the image to be processed is a user set in the scene shown in fig. 7, the image to be processed may be converted into a target face image at a preset viewpoint.
By way of example, fig. 14 shows a scene of two viewpoint conversion.
In one scenario, for example, among the face images of the respective viewpoints shown in a in fig. 14, the user wants to show the frontal face to other conference participants during a video conference, so when setting the portrait viewpoint, the preset viewpoint is set as the viewpoint corresponding to image (8). When the electronic device captures an image to be processed, such as image (3), the electronic device converts image (3) into image (8) and outputs image (8) instead of image (3).
In another scenario, for example, among the face images of the respective viewpoints shown in b in fig. 14, the user wants to show a side-face image at a certain viewpoint to other conference participants during a video conference, so when setting the portrait viewpoint, the preset viewpoint is set, for example, as the viewpoint corresponding to image (10). When the electronic device captures an image to be processed, such as image (3), the electronic device converts image (3) into image (10) and outputs image (10) instead of image (3).
Specifically, the following describes an operation flow for realizing the above viewpoint conversion effect in combination with the above scene, as follows:
1) When the similarity is greater than or equal to the similarity threshold, the feature comparison module instructs the viewpoint conversion module to perform viewpoint conversion.
2) The viewpoint conversion module obtains a face feature template with a label.
The face feature templates are stored in the database of the electronic equipment, and the electronic equipment can find the face feature templates added with the labels from the database. The face feature templates with the labels correspond to face feature templates of preset viewpoints.
3) The viewpoint conversion module converts the image to be processed into the target face image in combination with the labeled face feature template.
The viewpoint conversion module obtains the feature vectors and face feature points in the labeled face feature template, and obtains the face feature points of the image to be processed. The viewpoint conversion module can fuse, or partially replace, the face feature points in the labeled face feature template with the face feature points of the image to be processed to obtain processed face feature points; it then performs mapping processing based on the processed face feature points to obtain the target face image. In the target face image, the viewpoint of the face is the preset viewpoint.
The mapping process may be, for example, normal mapping, diffuse reflection mapping, specular (highlight) mapping, or the like.
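A heavily simplified sketch of the fusion in step 3) follows; the linear blend and the blending weight are assumptions, since the embodiment only states that some feature points of the labeled template and of the image to be processed are fused or replaced before the mapping step:

```python
import numpy as np

def fuse_landmarks(template_pts, frame_pts, alpha=0.8):
    """Blend the preset-viewpoint template landmarks with the current
    frame's landmarks to obtain the processed face feature points.
    alpha is an illustrative weight; the fused points would then drive
    the mapping step (e.g., a piecewise warp plus texture mapping).
    """
    return alpha * np.asarray(template_pts) + (1.0 - alpha) * np.asarray(frame_pts)
```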
It can be understood that the electronic device stores the labeled face feature template in advance; however, during use, the user's expression, makeup, and the like may differ from those in the images captured in the setting scenario. The electronic device can therefore process the image to be processed using the labeled face feature template to obtain portrait feature points at the preset viewpoint for the image to be processed; the electronic device then performs mapping using the processed portrait feature points and the contour pixel points of the face in the image to be processed, obtaining an image in which the image to be processed is fused with the face at the preset viewpoint.
4) And the viewpoint conversion module reports the target face image under the preset viewpoint to the video communication application.
S1307, the video communication application displays the target face image at the preset viewpoint, and/or sends the target face image to other electronic devices.
Illustratively, in the scenario shown in a in fig. 15, a user is using an electronic device for a video conference, and the image to be processed acquired by the camera may be as shown in b in fig. 15. The preset viewpoint may be the viewpoint of the target face image shown in c in fig. 15. The video communication application of the electronic device may display the target face image shown in c in fig. 15, and the electronic device may also transmit the target face image to other electronic devices in communication with it, so that the other electronic devices display the target face image in real time.
Therefore, the user can set the preset viewpoint according to the preference of the user, the electronic equipment displays the face image under the preset viewpoint, the aesthetic property of the image is improved, and the use experience of the user is improved.
Optionally, the electronic device may select the resolution of the target face image based on network quality and/or computing capability.
It can be appreciated that when a user holds a video conference, the network quality and/or the computing capability of the electronic device may affect the smoothness of the conference; when the network quality and/or the computing capability are weak, outputting a target face image with a higher resolution may cause video stuttering. Therefore, in the embodiment of the present application, the electronic device can select an appropriate resolution according to its network quality and/or computing capability. Network quality may include bandwidth, network speed, and the like; the computing capability may relate to the processor and memory of the electronic device.
By way of example, fig. 16 shows face images at three resolutions: the resolution of the face image shown in a in fig. 16 is lower than that shown in b in fig. 16, which in turn is lower than that shown in c in fig. 16. The grid in each face image corresponds to pixels, and the resolution of a face image may be the number of pixels contained in each row and each column of the image.
In one possible implementation, the electronic device may set the resolution of the target face image according to its computing capability. For example, in the above setting scenario, the electronic device may collect face images whose resolution is less than or equal to the image resolution that the computing capability of the electronic device can support. The electronic device can compute on the low-resolution face images to obtain the face feature templates; in the use scenario, the electronic device can then output the viewpoint-converted target face image.
It can be appreciated that when the computing capability of the electronic device is low, the efficiency with which it processes the image to be processed is low, which may prevent the electronic device from displaying the target face image in real time. Therefore, in the embodiment of the present application, the electronic device can improve the efficiency of image processing by using low-resolution face images, compensating for its limited computing capability.
In another possible implementation manner, in a usage scenario, the electronic device may acquire the current network quality in real time; a plurality of network-quality intervals are configured in the electronic device, and the electronic device may output the target face image using the resolution corresponding to the interval in which the current network quality falls.
In the embodiment of the present application, the electronic device can directly acquire low-resolution face images for computation and output; the electronic device can also compress collected high-resolution face images using image processors such as the GPU and the DSP to obtain low-resolution face images, and perform computation and output on those. The embodiments of the present application are not limited in this regard.
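For illustration (the interval boundaries and resolution tiers below are assumptions; the embodiment only states that each network-quality interval corresponds to a resolution), the interval-based selection could be sketched as:

```python
# Hypothetical mapping from measured bandwidth to output resolution.
RESOLUTION_TIERS = [
    (5.0, (1280, 720)),  # bandwidth >= 5 Mbps -> 720p
    (2.0, (854, 480)),   # bandwidth >= 2 Mbps -> 480p
    (0.0, (640, 360)),   # otherwise           -> 360p
]

def pick_resolution(bandwidth_mbps):
    """Return the output resolution for the interval containing the
    current network quality."""
    for floor, resolution in RESOLUTION_TIERS:
        if bandwidth_mbps >= floor:
            return resolution
    return RESOLUTION_TIERS[-1][1]
```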
Optionally, fig. 17 shows an applicable scenario of an image processing method provided by an embodiment of the present application, as shown in fig. 17:
The user is using the electronic device for a video conference; the user may face the screen and shake his or her head. The electronic device may convert the face images captured from an upward (low-angle) viewpoint into face images at a level viewpoint.
Specifically, in the scenario shown in a of fig. 17, the user faces the screen, and the electronic device may display a frontal face image of the user, for example, the viewpoint of the face image displayed by the electronic device is viewpoint 5. The face image displayed by the electronic device is different from the viewpoint of the face image collected by the camera, for example, the viewpoint of the face image collected by the camera is viewpoint 8.
In the scenario shown in b of fig. 17, the user turns his head left, and the electronic device may display a face image of the user, for example, the viewpoint of the face image displayed by the electronic device is viewpoint 4. The face image displayed by the electronic device is different from the viewpoint of the face image collected by the camera, for example, the viewpoint of the face image collected by the camera is viewpoint 7.
In the scenario shown in c of fig. 17, the user turns his head to the right, and the electronic device may display a face image of the user, for example, the viewpoint of the face image displayed by the electronic device is viewpoint 6. The face image displayed by the electronic device is different from the viewpoint of the face image collected by the camera, for example, the viewpoint of the face image collected by the camera is viewpoint 9.
The implementation of the application scenario in fig. 17 is described below, as follows:
1) The electronic device can perform head-shake detection while collecting the images to be processed.
The head-shake detection may, for example, rely on the observation that during a head shake the cheek width becomes narrower while the nose length remains almost unchanged. The electronic device can calculate the distances between specific face feature points in the face image at the preset viewpoint as a baseline, and judge the viewpoints of other face images from how the distances between the same feature points change in those images. Specifically, the cheek width variation can be reflected by the distance between face feature point 3 and face feature point 13 on the cheeks, and the nose length variation by the distance between face feature point 27 and face feature point 30 on the nose; when the cheek width varies significantly across consecutive frames of images to be processed, the electronic device can determine that the user is performing a head-shaking operation.
Alternatively, the electronic device may calculate the viewpoints of consecutive frames of images to be processed; when the horizontal angles of these viewpoints differ significantly, the electronic device may determine that the user is performing a head-shaking operation.
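A minimal sketch of the first detection method follows (the 0.15 change ratio and the array layout are assumptions; the embodiment specifies only the feature-point pairs involved):

```python
import numpy as np

def detect_head_shake(landmark_seq, ratio_change=0.15):
    """Heuristic head-shake detector over consecutive frames: the cheek
    width (points 3-13) varies while the nose length (points 27-30)
    stays nearly constant. Each element of landmark_seq is a (68, 2)
    array of feature-point coordinates for one frame."""
    widths = np.array([np.linalg.norm(p[3] - p[13]) for p in landmark_seq])
    noses = np.array([np.linalg.norm(p[27] - p[30]) for p in landmark_seq])
    width_var = (widths.max() - widths.min()) / widths.mean()
    nose_var = (noses.max() - noses.min()) / noses.mean()
    return width_var > ratio_change and nose_var < ratio_change / 3
```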
The embodiments of the present application merely describe two exemplary ways of detecting a head shake, and the embodiments of the present application are not limited in this respect.
When it is recognized that the user performs a head-shaking operation, the viewpoint of the current frame of the image to be processed may be the first viewpoint.
2) When the electronic equipment identifies that the user executes the head shaking operation, the electronic equipment can output a target face image; the viewpoint of the user face in the target face image is different from the first viewpoint, and the viewpoint of the user face is different from the preset viewpoint. The horizontal angle of the viewpoint of the user face is the same as the horizontal angle of the first viewpoint, and the vertical angle of the viewpoint of the user face is the same as the vertical angle of the preset viewpoint.
For example, the first viewpoint is viewpoint 9, the viewpoint of the user face is viewpoint 6, and the preset viewpoint is viewpoint 5. The horizontal angle is understood to be the angle by which the head turns around the central axis of the body, and may also be referred to as the yaw angle (yaw); the vertical angle is understood to be the angle by which the head is turned about the axis of the binaural line, which may also be referred to as pitch angle (pitch).
When the electronic equipment detects that the user shakes the head, the vertical angle in the image to be processed can be adjusted to the vertical angle of the preset viewpoint, the horizontal angle is not processed, and the horizontal angle in the image to be processed is continuously used. And the electronic device may perform the steps shown in fig. 13 when no user shaking is detected.
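The angle rule above can be sketched as follows (an illustrative helper, not from the embodiment):

```python
def output_viewpoint(frame_yaw, frame_pitch, preset_yaw, preset_pitch, shaking):
    """When a head shake is detected, keep the frame's horizontal angle
    (yaw) and replace only the vertical angle (pitch) with that of the
    preset viewpoint; otherwise convert fully to the preset viewpoint."""
    if shaking:
        return frame_yaw, preset_pitch  # e.g. viewpoint 9 -> viewpoint 6
    return preset_yaw, preset_pitch     # e.g. any viewpoint -> viewpoint 5
```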
Specifically, in the embodiment of the present application, during use as shown in fig. 18, when the camera acquires image (1), the electronic device may output image (4); when the camera acquires image (2), the electronic device may output image (5); and when the camera acquires image (3), the electronic device may output image (6). In this way, a head shake captured with a low-angle (upward-viewpoint) effect is displayed as a head shake with a level-viewpoint effect, improving the aesthetics of the image.
In the embodiment of the present application, head shaking is taken as an example for explanation. The embodiment of the present application is also applicable to displaying, in real time after viewpoint conversion, other head movements such as the user nodding; the principle by which the electronic device displays a nodding movement is similar to the above embodiment and is not described in detail here.
Optionally, fig. 19 shows an applicable scenario of an image processing method provided by an embodiment of the present application, as shown in fig. 19:
In the scenario shown in a of fig. 19, the user is using the electronic device for video conferencing, the user may face the screen, and perform gestures.
The electronic device may acquire an image as shown in b in fig. 19, in which both the face and the hand of the user assume a state of elevation.
Three possible implementations are presented below:
a first possible implementation: in the scene shown in c in fig. 19, the electronic device may display a face image and a hand image of a preset viewpoint. The viewpoints of the face image and the hand image in the user image acquired by the camera are different from the preset viewpoints. For example, the viewpoint of the face image in the user image collected by the camera is viewpoint 8, the viewpoint of the hand image in the user image is viewpoint 8, and the preset viewpoint is viewpoint 5.
In the embodiment of the application, in a setting scene, the electronic equipment can acquire the face information of the user, and the electronic equipment can also acquire the hand information of the user. For example, the electronic device captures a plurality of user images, one of which may include a user's face and a user's hand. The electronic equipment respectively identifies characteristic points in a face image and a hand image in any user image, converts the characteristic points into characteristic vectors, and generates face and hand characteristic templates to obtain a face and hand characteristic template set. The electronic equipment marks the face and hand feature templates under the preset view point. In the use process, after the electronic equipment determines that the similarity of the feature vector of the image to be processed and the feature vector of the face and hand feature templates is larger than a threshold value, the electronic equipment outputs target face and hand images based on the face and hand feature templates under the preset view point.
Therefore, the electronic equipment can perform viewpoint conversion through the stored images, the hands do not need to be cut and spliced, and the effect of the images is more natural.
A second possible implementation: in the scene shown in d in fig. 19, the electronic device may display a hand image and a face image at the preset viewpoint. The viewpoint of the face image in the user image collected by the camera is different from the preset viewpoint, while the hand image displayed by the electronic device is identical to the hand image collected by the camera. For example, the viewpoint of the face image acquired by the camera is viewpoint 8, the viewpoint of the face image displayed by the electronic device is viewpoint 5, and the viewpoint of the hand image displayed by the electronic device is the same as that of the collected hand image.
In the embodiment of the present application, during use, the electronic device can obtain the image to be processed and divide it into a face image to be processed, a hand image to be processed, and a background. The electronic device converts the face image to be processed into a feature vector and compares its similarity with the face feature templates; when the similarity is greater than the similarity threshold, the electronic device obtains a target face image, where the target face image corresponds to the labeled face feature template of the preset viewpoint. These steps are described in the embodiment shown in fig. 13 and are not repeated here.
The electronic device then performs image fusion on the target face image, the hand image to be processed, and the background to generate a target user image. The image fusion process may be, for example: the electronic device blurs the background, where the blurring may be, for example, Gaussian blurring or median blurring; the electronic device fuses the target face image, the hand image to be processed, and the blurred background, where the fusion method may be, for example, weighted fusion; and the electronic device outputs the target user image.
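Purely as an illustration of this fusion step (the binary masks and the paste-style composition are assumptions; a weighted blend such as OpenCV's addWeighted would also fit the description):

```python
import cv2

def compose_target_user_image(target_face, hand, background, face_mask, hand_mask):
    """Blur the background, then combine the viewpoint-converted face,
    the unmodified hand region, and the blurred background."""
    blurred = cv2.GaussianBlur(background, (21, 21), 0)  # Gaussian blurring
    out = blurred.copy()
    out[face_mask > 0] = target_face[face_mask > 0]  # place converted face
    out[hand_mask > 0] = hand[hand_mask > 0]         # keep original hand
    return out
```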
Alternatively, in the scene, the background may be a background extracted from the image to be processed, and the background may be replaced by another image, which is not limited in the embodiment of the present application.
In a third possible implementation manner, in the scene shown in e in fig. 19, the electronic device may display two areas, in which a hand image is displayed in one area and a face image of a preset viewpoint is displayed in the other area. The viewpoint of the hand image is the same as the viewpoint of the hand image in the user image acquired by the camera; the viewpoint of the face image is different from the viewpoint of the face image in the user image acquired by the camera.
In the embodiment of the present application, during use, the electronic device can obtain the image to be processed and cut it into a face image to be processed and a hand image to be processed. The electronic device converts the face image to be processed into a feature vector and compares its similarity with the face feature templates; when the similarity is greater than the similarity threshold, the electronic device obtains a target face image, where the target face image corresponds to the labeled face feature template of the preset viewpoint. These steps are described in the embodiment shown in fig. 13 and are not repeated here.
The electronic device may stitch the hand image with the target face image. In one possible implementation, the hand image may be pasted onto the target face image, displaying the effect shown in e in fig. 19: part of the area displays the face after viewpoint conversion, and part of the area displays the hand.
Therefore, the workload of algorithm operation can be reduced, and the efficiency of the image processing method is improved.
On the basis of the above embodiments, an embodiment of the present application provides an image processing method. Fig. 20 is a schematic flow chart of an image processing method according to an embodiment of the present application.
As shown in fig. 20, the image processing method may be applied to a first electronic device including a camera, the method including the steps of:
s2001, the first electronic device establishes video communication with the second electronic device.
The first electronic device may be, for example, the electronic device 100 shown in fig. 12B, and the second electronic device may be, for example, the electronic device 101 shown in fig. 12B. The first electronic device establishes video communication with the second electronic device, which may correspond to the scenario a shown in fig. 12B; the video communication may be a video conference, a video call, etc.
It should be noted that, in the embodiment of the present application, the camera of the first electronic device is taken as a low-angle camera as an example. In the embodiment of the application, the camera in the first electronic device can be in a conventional position, and the first electronic device can also realize the image processing method provided by the embodiment of the application, which is not limited.
S2002, acquiring a first frame image by the first electronic equipment through a camera; the first frame image includes a first object, and a viewpoint of the first object in the first frame image is a first viewpoint.
The first frame image can be a video frame image acquired by a camera of the first electronic device in real time; the first object may be, for example, a user face; the first object may also be other objects, e.g. plants, animals, etc. The first viewpoint may be a real viewpoint at which the camera photographs the first object.
For example, the first frame image may correspond to an image to be processed shown in b-chart in fig. 15; the first object may be a face in the image to be processed and the first viewpoint may be, for example, viewpoint 8. Step S2002 refers to the description of step S1302, and is not described herein.
S2003, the first electronic device sends a second frame image to the second electronic device; the second frame image is obtained by converting the viewpoint of the first object in the first frame image by the first electronic equipment; in the second frame image, the viewpoint of the first object is a second viewpoint; the first electronic device is preset with a second viewpoint, and the second viewpoint is different from the first viewpoint.
The second frame image may be the first frame image after viewpoint conversion, and the processing procedure may refer to the related descriptions in steps S1303-S1307, which are not described herein.
For example, the second viewpoint may be a preset viewpoint set by the user in the scene shown in fig. 7, the second viewpoint may be, for example, viewpoint 5, and the second frame image may be a target face image as shown in c-chart in fig. 15, and the viewpoint of the face in the second frame image is viewpoint 5.
In this way, the first electronic device can convert an image at the real viewpoint into an image at the preset viewpoint, thereby obtaining an image that meets the viewpoint requirement; for example, the first electronic device may convert a low-angle face image showing the underside of the chin into a frontal face image, improving the effectiveness of the user's facial communication as well as the aesthetics of the video.
Optionally, the method further comprises:
the first electronic equipment acquires a third frame image by adopting a camera; the third frame image comprises a first object, wherein the viewpoint of the first object is a third viewpoint which is different from the first viewpoint; the first electronic device sends a fourth frame image to the second electronic device; the fourth frame image is obtained by converting the viewpoint of the first object in the third frame image by the first electronic equipment; in the fourth frame image, the viewpoint of the first object is the second viewpoint.
The third frame image may be a video frame image acquired by the camera of the first electronic device in real time; the third viewpoint may be the real viewpoint at which the camera photographs the first object, and the third viewpoint is different from the first viewpoint. The fourth frame image may be the third frame image after viewpoint conversion.
It can be appreciated that the first electronic device may convert the frame image at any viewpoint into a frame image at a preset viewpoint. For example, in figure a of fig. 14, the first electronic device may convert the image (3) of viewpoint 8 into the image (8) of viewpoint 5, the first electronic device may also convert the image (13) of viewpoint 2 into the image (8) of viewpoint 5, and the first electronic device may also convert the image (9) of viewpoint 6 into the image (8) of viewpoint 5. Embodiments of the present application are not specifically recited herein.
In this way, when the user uses the first electronic device to perform video communication, the first electronic device can convert any image with a non-preset viewpoint into an image with a preset viewpoint, so as to obtain an image meeting the viewpoint requirement; for example, the first electronic device may convert the non-frontal face image into a frontal face image, so as to improve the facial expression communication benefit of the user, and also improve the aesthetic property of the video.
Optionally, a plurality of object feature templates of the first object are stored in the first electronic device, and any one of the object feature templates is a feature template corresponding to the first object at a viewpoint; before the first electronic device sends the second frame image to the second electronic device, the method further comprises: the first electronic equipment identifies characteristic points of a first object in a first frame image, converts the characteristic points of the first object into characteristic vectors of the first object, and generates a first object characteristic template; when the similarity between the first object feature template and any one of the object feature templates is larger than or equal to a similarity threshold, the first electronic device combines the object feature templates of the second viewpoints in the object feature templates to convert the first frame image into the second frame image.
When the first object is a user, the object feature templates can correspond to face feature templates, and any face feature template is a feature template corresponding to the first object at one viewpoint; the first frame image is an image to be processed, and the first object feature template may be a face feature template of the image to be processed. The object feature templates of the second viewpoint may have labeled face feature templates. This step can be described with reference to the correlation in steps S1304-S1306.
Therefore, after the first electronic equipment verifies the identity of the first object in the first frame image, the first electronic equipment can perform viewpoint conversion on the first frame image to obtain an image meeting the viewpoint requirements, and therefore accuracy and safety of the image processing method in the embodiment of the application are improved.
Optionally, the first electronic device converts the first frame image into the second frame image in combination with the object feature templates of the second view point in the plurality of object feature templates, including: and the first electronic equipment fuses part of characteristic points in the object characteristic template of the second viewpoint with characteristic points of the first object in the first frame image to obtain a second frame image.
Thus, the electronic device can obtain the second frame image under the preset viewpoint, thereby obtaining the image meeting the viewpoint requirement.
Optionally, before the first electronic device establishes video communication with the second electronic device, the method further includes: the first electronic equipment displays a first interface, wherein the first interface comprises a first button which is in a closed state; wherein the state of the first button comprises: an on state and an off state; when the state of the first button is an on state, the first electronic equipment outputs a frame image of a preset viewpoint when the first object performs video communication; when the state of the first button is in a closed state, the first electronic equipment outputs a frame image acquired by the camera when the first object performs video communication; the first electronic device receives a triggering operation for a first button; in response to a trigger operation for the first button, the first electronic device sets the first button in the off state to the first button in the on state.
The first interface may be an interface shown in b in fig. 7, and the first button may be a switch 703 for converting a human visual point.
Therefore, the user can select whether to use the viewpoint conversion function in the first electronic equipment according to the self requirement, and the user experience is improved.
Optionally, the first interface further includes a second button, where the second button is used to instruct the first electronic device to use the camera to collect the image of the first object; after the first electronic device displays the first interface, further comprising: the first electronic device receives a triggering operation for the second button; responding to triggering operation for a second button, and displaying a second interface by the first electronic equipment, wherein the second interface comprises a first object and prompt information; the prompt information comprises prompt words and/or prompt voice; the prompt information is used for indicating the first object to rotate the head; the electronic equipment adopts a camera to collect a plurality of images of the first object.
The first interface may correspond to the interface shown in b in fig. 7, and the second button may be the button 705 for setting a user portrait; the triggering operation for the second button may correspond to an operation of clicking the button 705; the second interface may correspond to the interface shown in c in fig. 7; the first object may be the user's face acquired by the camera in real time; and the prompt information may be the prompt 706 in the interface shown in c in fig. 7.
In this way, the first electronic device may collect information of the first object in advance. The information of the first object in the setting scene can be used as the basis of viewpoint conversion and identity verification of the first object in the using scene, so that the first electronic equipment can accurately obtain the target face image.
Optionally, after the electronic device collects the plurality of images of the first object with the camera, the method further includes: the electronic device identifies the feature points of the first object in any image of the first object, converts the feature points of the first object into feature vectors of the first object, and generates object feature templates at multiple viewpoints.
In this step, reference may be made to step S806, which is not described in detail in the embodiment of the present application.
Thus, the first electronic device can obtain the object feature templates of the first object under each viewpoint, so that the first object can be accurately checked and the target face image can be output later.
Optionally, the first interface further includes a first area, and a plurality of third buttons are displayed in the first area; any third button corresponds to a schematic diagram, objects with one view point are displayed in the schematic diagram, and the view points of the objects are different in any schematic diagram; after the first electronic device displays the first interface, further comprising: the first electronic equipment receives triggering operation of a third button corresponding to the second viewpoint; responding to triggering operation of a third button corresponding to the second viewpoint, and setting a preset viewpoint by the first electronic equipment; the preset viewpoint is a second viewpoint; or the first electronic device does not receive a triggering operation for any third button; the first electronic equipment sets a preset viewpoint; the default view point is a default view point, and the default view point comprises a second view point.
The first interface may correspond to the interface shown in b in fig. 7, and the first area may be an area including a plurality of schematic diagrams 704, where the viewpoints of the objects are different in any of the schematic diagrams 704; any diagram corresponds to a third button at one viewpoint; the triggering operation for the third button corresponding to the second viewpoint may be an operation of clicking the third button corresponding to the viewpoint 5 in the schematic diagram 704 (the viewpoint indicated by the dashed box in the schematic diagram 704); the first electronic device sets the viewpoint 5 as a preset viewpoint.
Therefore, the user can set the preset viewpoint according to the self demand, the follow-up first electronic equipment can output images meeting the preset viewpoint, and the use experience of the user is improved.
Optionally, the method further comprises:
(1) The first electronic device captures a fifth frame image with the camera; the fifth frame image includes the first object, and in the fifth frame image, the viewpoint of the first object is a fourth viewpoint.
The fifth frame image may be a video frame image acquired by a camera of the first electronic device in real time; the fourth viewpoint may be a real viewpoint at which the camera photographs the first object. For example, the fifth frame image may correspond to image (3) in fig. 18, and the fourth viewpoint may be viewpoint 7.
(2) Based on the fifth frame image, the first electronic device recognizes that the head of the first object has rotated to the side, and the first electronic device sends a sixth frame image to the second electronic device; the sixth frame image is obtained by the first electronic device converting the viewpoint of the first object in the fifth frame image; in the sixth frame image, the viewpoint of the first object is a fifth viewpoint; the fifth viewpoint is different from the fourth viewpoint.
Based on the fifth frame image, the first electronic device recognizes that the user is performing a head-turning motion. The sixth frame image may correspond to image (6) in fig. 18, and the fifth viewpoint may be viewpoint 4.
(3) The first electronic device captures a seventh frame image with the camera; the seventh frame image includes the first object, and in the seventh frame image, the viewpoint of the first object is a sixth viewpoint, which is different from the fourth viewpoint.
The seventh frame image may be a video frame image captured by the camera of the first electronic device in real time; the sixth viewpoint may be the real viewpoint at which the camera photographs the first object. For example, the seventh frame image may correspond to image (1) in fig. 18, and the sixth viewpoint may be viewpoint 9.
(4) Based on the seventh frame image, the first electronic device recognizes that the head of the first object has rotated to the side, and the first electronic device sends an eighth frame image to the second electronic device; the eighth frame image is obtained by the first electronic device converting the viewpoint of the first object in the seventh frame image; in the eighth frame image, the viewpoint of the first object is a seventh viewpoint; the seventh viewpoint is different from the sixth viewpoint.
Based on the seventh frame image, the first electronic device can still recognize that the user is performing a head-turning motion. The eighth frame image may correspond to image (4) in fig. 18, and the seventh viewpoint may be viewpoint 6.
The vertical angle of the fourth viewpoint differs from that of the fifth viewpoint, while their horizontal angles are the same; the vertical angle of the sixth viewpoint differs from that of the seventh viewpoint, while their horizontal angles are the same; and the vertical angles of the fifth viewpoint, the seventh viewpoint, and the second viewpoint are all the same.
It can be seen that in step (1) and step (2), the fourth viewpoint differs from the fifth viewpoint, and the fifth viewpoint is not the preset viewpoint. In fig. 18, the first electronic device converts image (3), taken from an upward viewpoint, into image (6), at a downward viewpoint: the conversion is applied to the vertical angle but not to the horizontal angle. Likewise, in step (3) and step (4), the sixth viewpoint differs from the seventh viewpoint, and the seventh viewpoint is not the preset viewpoint. In fig. 18, the first electronic device converts image (1), taken from an upward viewpoint, into image (4), at a downward viewpoint: again the conversion is applied to the vertical angle but not to the horizontal angle. The viewpoint conversion at the vertical angle may be based on the vertical angle of the preset viewpoint; for example, the vertical angles in image (4), image (5), and image (6) are all the same, and image (5) is the image at the preset viewpoint.
In this way, the first electronic device can display the user image more flexibly and vividly, improving the visual appeal of the video.
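If the viewpoints are numbered on a 3x3 grid as fig. 18 suggests, with rows encoding the vertical angle and columns the horizontal angle (this numbering is an assumption), the vertical-only conversion target can be computed as in the following sketch:

```python
# Assumed numbering: viewpoints 1..9 on a 3x3 grid, where the row
# encodes the vertical angle and the column the horizontal angle.
GRID_WIDTH = 3

def vertical_only_target(current: int, preset: int) -> int:
    """Keep the column (horizontal angle) of the current viewpoint,
    but take the row (vertical angle) from the preset viewpoint."""
    row_from_preset = (preset - 1) // GRID_WIDTH
    col_from_current = (current - 1) % GRID_WIDTH
    return row_from_preset * GRID_WIDTH + col_from_current + 1

# Consistent with the fig. 18 examples above, with preset viewpoint 5:
assert vertical_only_target(7, 5) == 4  # viewpoint 7 -> viewpoint 4
assert vertical_only_target(9, 5) == 6  # viewpoint 9 -> viewpoint 6
```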
Optionally, the first frame image further includes a second object, and in the first frame image, a viewpoint of the second object is a first viewpoint; the first object is different from the second object; the second frame image further includes a second object, and in the second frame image, a viewpoint of the second object is a second viewpoint.
The second object may be the user's hand; the first frame image may be the image shown in b in fig. 19, where the viewpoint of the user's hand and the viewpoint of the face are the same, and the first viewpoint may be, for example, viewpoint 8; the second frame image may be the image shown in c in fig. 19, where the viewpoint of the user's hand and the viewpoint of the face are also the same, and the second viewpoint may be, for example, viewpoint 5.
In this way, when the user's hand appears in the frame, the first electronic device can perform viewpoint conversion on the face and the hand simultaneously, outputting a more natural image that meets the viewpoint requirement and improving the user experience.
Optionally, the first frame image further includes a second object, and in the first frame image, the viewpoint of the second object is the first viewpoint; the first object is different from the second object. The second frame image also includes the second object, and the second frame image is obtained by the first electronic device through image fusion of the image of the second object with the converted image of the first object; the converted image of the first object is obtained by the first electronic device converting the viewpoint of the first object in the first frame image; the image of the second object is obtained by the first electronic device cropping the second object out of the first frame image. In the second frame image, the viewpoint of the second object is the first viewpoint.
The second object may be the user's hand; the first frame image may be the image shown in b in fig. 19, where the viewpoint of the user's hand and the viewpoint of the face are the same, and the first viewpoint may be, for example, viewpoint 8; the second frame image may be the image shown in d in fig. 19, where the viewpoint of the user's hand differs from that of the face: the viewpoint of the hand in the second frame image is the viewpoint of the hand in the first frame image, for example viewpoint 8, while the viewpoint of the face in the second frame image is the preset viewpoint, for example viewpoint 5.
In this way, the first electronic device can fuse the viewpoint-converted face with the unconverted hand to obtain a more natural second frame image; to a certain extent, this can also reduce the computational load of the first electronic device.
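One plausible realization of this fusion, sketched below with OpenCV and NumPy, alpha-blends the unconverted hand region over the viewpoint-converted face frame. The mask input and the feathering choice are assumptions, not the embodiment's prescribed method:

```python
import cv2
import numpy as np

def fuse_hand_onto_face(converted_face: np.ndarray,
                        original_frame: np.ndarray,
                        hand_mask: np.ndarray) -> np.ndarray:
    """Composite the unconverted hand pixels over the viewpoint-
    converted face frame. hand_mask is a 2-D float array in [0, 1],
    with 1 inside the hand region; its edge is feathered so the
    seam between the two sources looks natural."""
    alpha = cv2.GaussianBlur(hand_mask.astype(np.float32), (21, 21), 0)
    alpha = np.clip(alpha, 0.0, 1.0)[..., None]  # add a channel axis
    blended = alpha * original_frame + (1.0 - alpha) * converted_face
    return blended.astype(np.uint8)
```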
Optionally, the first frame image further includes a second object, and in the first frame image, the viewpoint of the second object is the first viewpoint; the first object is different from the second object. The second frame image also includes the second object, and the second frame image is obtained by the first electronic device splicing the image of the second object with the converted image of the first object; the converted image of the first object is obtained by the first electronic device converting the viewpoint of the first object in the first frame image; the image of the second object is obtained by the first electronic device cropping the second object out of the first frame image. In the second frame image, the viewpoint of the second object is the first viewpoint.
The second object may be the user's hand; the first frame image may be the image shown in b in fig. 19, where the viewpoint of the user's hand and the viewpoint of the face are the same, and the first viewpoint may be, for example, viewpoint 8; the second frame image may be the image shown in e in fig. 19, where the viewpoint of the user's hand differs from that of the face: the viewpoint of the hand in the second frame image is the viewpoint of the hand in the first frame image, for example viewpoint 8, while the viewpoint of the face in the second frame image is the preset viewpoint, for example viewpoint 5.
In this way, the first electronic device can splice the viewpoint-converted face with the unconverted hand without fusing them; in the second frame image, the face is displayed in one area and the hand in another, in a form similar to a split screen. This can reduce the computational load of the first electronic device.
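The splicing variant can be as simple as placing the two crops side by side, as in the following sketch (the split-screen layout and the resizing policy are assumptions):

```python
import cv2
import numpy as np

def splice_side_by_side(converted_face: np.ndarray,
                        hand_crop: np.ndarray) -> np.ndarray:
    """Place the viewpoint-converted face and the unconverted hand
    crop next to each other, split-screen style, resizing the hand
    crop to match the height of the face panel."""
    height = converted_face.shape[0]
    scale = height / hand_crop.shape[0]
    new_width = max(1, int(hand_crop.shape[1] * scale))
    hand_resized = cv2.resize(hand_crop, (new_width, height))
    return np.hstack([converted_face, hand_resized])
```

Compared with the fusion variant, this avoids any blending computation, which is why it places less operational pressure on the first electronic device.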
Optionally, the resolution of the second frame image is related to the network quality of the first electronic device, the network quality of the second electronic device, the operational capability of the first electronic device and/or the operational capability of the second electronic device; the resolution of the second frame image is positively correlated with the network quality of the first electronic device; the resolution of the second frame image is positively correlated with the network quality of the second electronic device; the resolution of the second frame image is positively correlated with the operational capabilities of the first electronic device; the resolution of the second frame image is positively correlated with the operational capabilities of the second electronic device.
When displaying the second frame image, the first electronic device and/or the second electronic device may adjust its resolution according to network quality and/or operational capability: the stronger the operational capability of the electronic device, the higher the resolution of the second frame image; the better the network quality of the electronic device, the higher the resolution of the second frame image; and conversely, the lower the resolution of the second frame image. In this way, the first electronic device and/or the second electronic device can adjust the image resolution according to network quality and/or operational capability, so that the image is displayed smoothly.
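A minimal sketch of such an adaptation policy is shown below; the tiers, thresholds, and metric names are invented for illustration, since the embodiment only states the positive correlations:

```python
# Resolution tiers from lowest to highest (width, height).
TIERS = [(640, 360), (1280, 720), (1920, 1080)]

def pick_resolution(bandwidth_mbps: float, compute_score: float) -> tuple:
    """Gate the output resolution on the weaker of the two signals,
    so both network quality and operational capability must be high
    for a high-resolution second frame image. compute_score is an
    assumed device-capability metric normalized to [0, 1]."""
    quality = min(bandwidth_mbps / 10.0, compute_score)
    if quality >= 0.8:
        return TIERS[2]
    if quality >= 0.4:
        return TIERS[1]
    return TIERS[0]
```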
The image processing method of the embodiments of the present application has been described above; an apparatus for performing that method is described below. Those skilled in the art will appreciate that the method and the apparatus may be combined with and cross-referenced to each other, and the related apparatus provided in the embodiments of the present application can perform the steps of the image processing method described above.
As shown in fig. 21, the image processing apparatus 2100 may be used in a communication device, a circuit, a hardware component, or a chip, and includes a display unit 2101 and a processing unit 2102. The display unit 2101 is configured to support the displaying steps performed by the image processing apparatus 2100, and the processing unit 2102 is configured to support the information processing steps performed by the image processing apparatus 2100.
In a possible implementation, the image processing apparatus 2100 may further include a communication unit 2103. Specifically, the communication unit is configured to support the image processing apparatus 2100 in performing the steps of sending and receiving data. The communication unit 2103 may be an input or output interface, a pin, a circuit, or the like.
In a possible embodiment, the image processing apparatus may further include: a storage unit 2104. The processing unit 2102 and the storage unit 2104 are connected by a line. The memory unit 2104 may include one or more memories, which may be one or more devices, circuits, or means for storing programs or data. The storage unit 2104 may exist independently and be connected to the processing unit 2102 provided in the image processing apparatus via a communication line. The memory unit 2104 may also be integrated with the processing unit 2102.
The storage unit 2104 may store computer-executable instructions of the method of the terminal device, so that the processing unit 2102 performs the method in the above embodiments. The storage unit 2104 may be a register, a cache, a RAM, or the like, and may be integrated with the processing unit 2102; alternatively, the storage unit 2104 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, and may be independent of the processing unit 2102.
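For illustration only, the unit decomposition of apparatus 2100 could be mirrored in code roughly as follows; this is a sketch under the assumption that each unit is a pluggable callable, and none of the names come from the embodiment:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class ImageProcessingApparatus:
    """Mirrors the unit split of apparatus 2100: display and
    processing units are mandatory; communication and storage
    units are optional, as in the embodiments above."""
    display_unit: Callable[[bytes], None]            # shows a frame
    processing_unit: Callable[[bytes], bytes]        # e.g. viewpoint conversion
    communication_unit: Optional[Callable[[bytes], None]] = None  # sends a frame
    storage_unit: dict = field(default_factory=dict)  # templates, settings

    def handle_frame(self, frame: bytes) -> None:
        processed = self.processing_unit(frame)
        self.display_unit(processed)
        if self.communication_unit is not None:
            self.communication_unit(processed)
```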
The image processing method provided by the embodiment of the application can be applied to the electronic equipment with the communication function. The electronic device includes a terminal device, and specific device forms and the like of the terminal device may refer to the above related descriptions, which are not repeated herein.
An embodiment of the application provides a terminal device, including: a processor and a memory; the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory to cause the terminal device to perform the method described above.
The embodiment of the application provides a chip. The chip comprises a processor for invoking a computer program in a memory to perform the technical solutions in the above embodiments. The principle and technical effects of the present application are similar to those of the above-described related embodiments, and will not be described in detail herein.
An embodiment of the application also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the above method. The methods described in the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media can include computer storage media and communication media, and can include any medium that can transfer a computer program from one place to another. The storage media may be any target media accessible by a computer.
In one possible implementation, the computer-readable medium may include RAM, ROM, compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Embodiments of the present application provide a computer program product comprising a computer program which, when executed, causes a computer to perform the above-described method.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing detailed description of the application has been presented for purposes of illustration and description, and it should be understood that the foregoing is by way of illustration and description only, and is not intended to limit the scope of the application.

Claims (15)

1. An image processing method, applied to a first electronic device including a camera, comprising:
the first electronic device establishes video communication with the second electronic device;
the first electronic device acquires a first frame image with the camera; the first frame image comprises a first object, and a viewpoint of the first object in the first frame image is a first viewpoint;
the first electronic device sends a second frame image to the second electronic device; the second frame image is obtained by converting the viewpoint of the first object in the first frame image by the first electronic device; in the second frame image, the viewpoint of the first object is a second viewpoint; the second viewpoint is preset in the first electronic device, and the second viewpoint is different from the first viewpoint.
2. The method as recited in claim 1, further comprising:
the first electronic device acquires a third frame image with the camera; the third frame image comprises the first object, and in the third frame image, the viewpoint of the first object is a third viewpoint, and the third viewpoint is different from the first viewpoint;
The first electronic device sends a fourth frame image to the second electronic device; the fourth frame image is obtained by converting the viewpoint of the first object in the third frame image by the first electronic device; in the fourth frame image, the viewpoint of the first object is the second viewpoint.
3. The method according to claim 1 or 2, wherein a plurality of object feature templates of the first object are stored in the first electronic device, and any one of the object feature templates is a feature template corresponding to the first object at a viewpoint; before the first electronic device sends the second frame image to the second electronic device, the method further comprises:
the first electronic device identifies feature points of the first object in the first frame image, converts the feature points of the first object into feature vectors of the first object, and generates a first object feature template;
when the similarity between the first object feature template and any one of the object feature templates is greater than or equal to a similarity threshold, the first electronic device converts the first frame image into the second frame image by combining the object feature templates of the second view points in the plurality of object feature templates.
4. The method of claim 3, wherein the first electronic device converting the first frame image into the second frame image in combination with an object feature template of the second viewpoint of the plurality of object feature templates comprises:
and the first electronic device fuses some of the feature points in the object feature template of the second viewpoint with the feature points of the first object in the first frame image to obtain the second frame image.
5. The method of claim 1, further comprising, prior to the first electronic device establishing video communication with a second electronic device:
the first electronic device displays a first interface, wherein the first interface comprises a first button, and the first button is in an off state; the state of the first button comprises: an on state and an off state; when the state of the first button is the on state, the first electronic device outputs a frame image of a preset viewpoint when the first object performs the video communication; when the state of the first button is the off state, the first electronic device outputs a frame image acquired by the camera when the first object performs the video communication;

the first electronic device receives a triggering operation for the first button;

in response to the triggering operation for the first button, the first electronic device switches the first button from the off state to the on state.
6. The method of claim 5, further comprising a second button in the first interface for instructing the first electronic device to capture an image of the first object using the camera; after the first electronic device displays the first interface, the method further includes:
the first electronic device receives a triggering operation for the second button;
in response to the triggering operation of the second button, the first electronic device displays a second interface, wherein the second interface comprises the first object and prompt information; the prompt information comprises prompt text and/or prompt voice; the prompt information is used for instructing the first object to rotate the head;

the first electronic device acquires a plurality of images of the first object with the camera.
7. The method of claim 6, further comprising, after the first electronic device acquires the plurality of images of the first object with the camera:

the first electronic device identifies the feature points of the first object in any one of the images of the first object, converts the feature points of the first object into feature vectors of the first object, and generates object feature templates at multiple viewpoints.
8. The method of any of claims 5-7, further comprising a first area in the first interface, the first area having a plurality of third buttons displayed therein; each of the third buttons corresponds to a schematic diagram, each schematic diagram displays the object at one viewpoint, and the viewpoints of the object differ across the schematic diagrams; after the first electronic device displays the first interface, the method further includes:

the first electronic device receives a triggering operation of the third button corresponding to the second viewpoint;

in response to the triggering operation of the third button corresponding to the second viewpoint, the first electronic device sets the preset viewpoint; wherein the preset viewpoint is the second viewpoint;

or the first electronic device does not receive a triggering operation for any third button;

the first electronic device sets the preset viewpoint; the preset viewpoint is a default viewpoint, and the default viewpoint comprises the second viewpoint.
9. The method according to claim 1, wherein the method further comprises:
the first electronic device acquires a fifth frame image with the camera; the fifth frame image comprises the first object, and in the fifth frame image, the viewpoint of the first object is a fourth viewpoint;

based on the fifth frame image, the first electronic device recognizes that the head of the first object rotates to the side, and the first electronic device sends a sixth frame image to the second electronic device; the sixth frame image is obtained by converting the viewpoint of the first object in the fifth frame image by the first electronic device; in the sixth frame image, the viewpoint of the first object is a fifth viewpoint; the fifth viewpoint is different from the fourth viewpoint;

the first electronic device acquires a seventh frame image with the camera; the seventh frame image includes the first object, and in the seventh frame image, a viewpoint of the first object is a sixth viewpoint, and the sixth viewpoint is different from the fourth viewpoint;

based on the seventh frame image, the first electronic device recognizes that the head of the first object rotates to the side, and the first electronic device sends an eighth frame image to the second electronic device; the eighth frame image is obtained by converting the viewpoint of the first object in the seventh frame image by the first electronic device; in the eighth frame image, the viewpoint of the first object is a seventh viewpoint; the seventh viewpoint is different from the sixth viewpoint;

wherein the vertical angle of the fourth viewpoint differs from that of the fifth viewpoint, while their horizontal angles are the same; the vertical angle of the sixth viewpoint differs from that of the seventh viewpoint, while their horizontal angles are the same; and the vertical angles of the fifth viewpoint, the seventh viewpoint, and the second viewpoint are the same.
10. The method according to claim 1, wherein the first frame image further includes a second object, and wherein a viewpoint of the second object in the first frame image is the first viewpoint; the first object is different from the second object; the second frame image further includes the second object, and in the second frame image, a viewpoint of the second object is the second viewpoint.
11. The method according to claim 1, wherein the first frame image further includes a second object, and wherein a viewpoint of the second object in the first frame image is the first viewpoint; the first object is different from the second object;
the second frame image also comprises the second object, and the second frame image is obtained by the first electronic device through image fusion of the image of the second object with the converted image of the first object; the converted image of the first object is obtained by the first electronic device converting the viewpoint of the first object in the first frame image; the image of the second object is obtained by the first electronic device cropping the second object out of the first frame image; in the second frame image, the viewpoint of the second object is the first viewpoint.
12. The method according to claim 1, wherein the first frame image further includes a second object, and wherein a viewpoint of the second object in the first frame image is the first viewpoint; the first object is different from the second object;
the second frame image also comprises the second object, and the second frame image is obtained by the first electronic device splicing the image of the second object with the converted image of the first object; the converted image of the first object is obtained by the first electronic device converting the viewpoint of the first object in the first frame image; the image of the second object is obtained by the first electronic device cropping the second object out of the first frame image; in the second frame image, the viewpoint of the second object is the first viewpoint.
13. The method according to claim 1 or 2, characterized in that the resolution of the second frame image is related to the network quality of the first electronic device, the network quality of the second electronic device, the operational capability of the first electronic device and/or the operational capability of the second electronic device; the resolution of the second frame image is positively correlated with the network quality of the first electronic device; the resolution of the second frame image is positively correlated with the network quality of the second electronic device; the resolution of the second frame image is positively correlated with the operational capability of the first electronic device; the resolution of the second frame image is positively correlated with the operational capabilities of the second electronic device.
14. An electronic device, comprising: a processor and a memory;
the memory stores computer-executable instructions;
the processor executing computer-executable instructions stored in the memory to cause the electronic device to perform the method performed by the first electronic device of any one of claims 1-13 or to perform the method performed by the second electronic device of any one of claims 1-13.
15. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements a method performed by a first electronic device according to any of claims 1-13 or a method performed by a second electronic device according to any of claims 1-13.
CN202311286705.XA 2023-10-08 2023-10-08 Image processing method and related equipment Active CN117041670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311286705.XA CN117041670B (en) 2023-10-08 2023-10-08 Image processing method and related equipment

Publications (2)

Publication Number Publication Date
CN117041670A true CN117041670A (en) 2023-11-10
CN117041670B CN117041670B (en) 2024-04-02

Family

ID=88630312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311286705.XA Active CN117041670B (en) 2023-10-08 2023-10-08 Image processing method and related equipment

Country Status (1)

Country Link
CN (1) CN117041670B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004193962A (en) * 2002-12-11 2004-07-08 Sony Corp Image communication equipment, image communication method, and computer program
CN105550671A (en) * 2016-01-28 2016-05-04 北京麦芯科技有限公司 Face recognition method and device
CN106909821A (en) * 2017-01-20 2017-06-30 奇酷互联网络科技(深圳)有限公司 The control method of terminal, device and terminal device
US20190026449A1 (en) * 2017-07-19 2019-01-24 Sony Corporation Authentication using multiple images of user from different angles
US20200257888A1 (en) * 2017-10-20 2020-08-13 Nec Corporation Three-dimensional facial shape estimating device, three-dimensional facial shape estimating method, and non-transitory computer-readable medium
CN109684951A (en) * 2018-12-12 2019-04-26 北京旷视科技有限公司 Face identification method, bottom library input method, device and electronic equipment
CN113014846A (en) * 2019-12-19 2021-06-22 华为技术有限公司 Video acquisition control method, electronic equipment and computer readable storage medium
CN113741681A (en) * 2020-05-29 2021-12-03 华为技术有限公司 Image correction method and electronic equipment

Also Published As

Publication number Publication date
CN117041670B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
TWI751161B (en) Terminal equipment, smart phone, authentication method and system based on face recognition
KR102292537B1 (en) Image processing method and apparatus, and storage medium
US11736756B2 (en) Producing realistic body movement using body images
WO2020203999A1 (en) Communication assistance system, communication assistance method, and image control program
US11450051B2 (en) Personalized avatar real-time motion capture
JP2019145108A (en) Electronic device for generating image including 3d avatar with facial movements reflected thereon, using 3d avatar for face
CN111726536A (en) Video generation method and device, storage medium and computer equipment
CN112199016B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
KR20230107655A (en) Body animation sharing and remixing
EP4248413A1 (en) Multiple device sensor input based avatar
CN111242090A (en) Human face recognition method, device, equipment and medium based on artificial intelligence
US11989348B2 (en) Media content items with haptic feedback augmentations
US20220317775A1 (en) Virtual reality communication interface with haptic feedback response
WO2022212177A1 (en) Virtual reality interface with haptic feedback response
EP4272060A1 (en) Real-time video communication interface with haptic feedback
KR101189043B1 (en) Service and method for video call, server and terminal thereof
CN117041670B (en) Image processing method and related equipment
US20220319061A1 (en) Transmitting metadata via invisible light
US20220318303A1 (en) Transmitting metadata via inaudible frequencies
CN114339393A (en) Display processing method, server, device, system and medium for live broadcast picture
US11997422B2 (en) Real-time video communication interface with haptic feedback response
US11874960B2 (en) Pausing device operation based on facial movement
US20220377309A1 (en) Hardware encoder for stereo stitching
US20240073402A1 (en) Multi-perspective augmented reality experience
WO2024021250A1 (en) Identity information acquisition method and apparatus, and electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant