CN107528938B

CN107528938B - Video call method, terminal and computer readable storage medium

Info

Publication number: CN107528938B
Application number: CN201710618360.1A
Authority: CN
Inventors: 吴再稳
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2017-07-26
Filing date: 2017-07-26
Publication date: 2020-09-29
Anticipated expiration: 2037-07-26
Also published as: CN107528938A

Abstract

The embodiment of the invention provides a video call method, a terminal and a computer-readable storage medium. The method is applied to a first terminal that comprises a first display screen and a second display screen arranged opposite each other, a front camera on the same side as the first display screen, and a rear camera on the same side as the second display screen. The method comprises the following steps: in a state of establishing a video call connection with a second terminal, starting the front camera and the rear camera to acquire video images; sending a first video image acquired by the front camera and a second video image acquired by the rear camera to the second terminal; receiving a third video image sent by the second terminal; and displaying the third video image on the first display screen and the second display screen respectively, wherein the third video image is a video image acquired by a first camera of the second terminal. The invention solves the problems that, during a video call, the front camera and the rear camera need to be switched back and forth, which is cumbersome and causes a brief black screen during switching.

Description

Video call method, terminal and computer readable storage medium

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a video call method, a terminal, and a computer-readable storage medium.

Background

At present, mobile terminals such as mobile phones are developing rapidly, and users can use them to make video calls, including over a network. During a video call on a mobile terminal, if there are several scenes to be shot on one terminal's side, the user may need to switch between the front camera and the rear camera so that the user of the other terminal can receive the two different video images within the shooting ranges of the front camera and the rear camera. As a result, the cameras have to be switched repeatedly during the video call, which is cumbersome, and the video briefly shows a black screen while switching between the front camera and the rear camera.

Disclosure of Invention

The embodiment of the invention provides a video call method, a terminal and a computer-readable storage medium, which are used for solving the problems that, during a video call, the front camera and the rear camera need to be switched back and forth, which is cumbersome and causes a black screen during switching.

In a first aspect, an embodiment of the present invention provides a video call method, which is applied to a first terminal, where the first terminal includes a first display screen and a second display screen that are arranged oppositely, the first terminal further includes a front camera on the same side as the first display screen and a rear camera on the same side as the second display screen, and the method includes:

starting a front camera and a rear camera to acquire video images in a state of establishing video call connection with a second terminal;

sending the first video image collected by the front camera and the second video image collected by the rear camera to the second terminal;

receiving a third video image sent by the second terminal;

displaying the third video image on the first display screen and the second display screen respectively;

and the third video image is a video image acquired by a first camera of the second terminal.

In a second aspect, an embodiment of the present invention provides a video call method, which is applied to a second terminal, where the second terminal includes a third display screen, the second terminal further includes a first camera on the same side as the third display screen, and the method includes:

starting a first camera to collect video images in a state of establishing video call connection with a first terminal;

sending a third video image acquired by the first camera to the first terminal;

receiving a first video image and a second video image sent by the first terminal;

displaying the first video image and the second video image;

the first video image is a video image collected by a front camera of the first terminal, and the second video image is a video image collected by a rear camera of the first terminal.

In a third aspect, an embodiment of the present invention provides a first terminal, including a first display screen and a second display screen that are arranged oppositely, a front camera that is on the same side as the first display screen, and a rear camera that is on the same side as the second display screen, and:

a first acquisition module, configured to start the front camera and the rear camera to acquire video images in a state of establishing a video call connection with a second terminal;

a first sending module, configured to send the first video image acquired by the front camera and the second video image acquired by the rear camera to the second terminal;

a first receiving module, configured to receive a third video image sent by the second terminal;

a first display module, configured to display the third video image on the first display screen and the second display screen respectively.

In a fourth aspect, an embodiment of the present invention provides a second terminal, where the second terminal includes a third display screen, the second terminal further includes a first camera on the same side as the third display screen, and the second terminal further includes:

a second acquisition module, configured to start the first camera to collect video images in a state of establishing a video call connection with the first terminal;

a second sending module, configured to send the third video image acquired by the first camera to the first terminal;

a second receiving module, configured to receive the first video image and the second video image sent by the first terminal;

a third display module, configured to display the first video image and the second video image.

In a fifth aspect, an embodiment of the present invention provides a first terminal, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the video call method applied to the first terminal as provided in any one of the embodiments of the present invention.

In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the video call method applied to the first terminal, as provided in any one embodiment of the present invention.

In a seventh aspect, an embodiment of the present invention provides a second terminal, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the video call method applied to the second terminal, as provided in any one embodiment of the present invention.

In an eighth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the video call method applied to the second terminal, as provided in any one embodiment of the present invention.

Thus, in the embodiment of the invention, when the first terminal and the second terminal carry out a video call, the front camera and the rear camera respectively acquire the first video image and the second video image, and both images are transmitted to the second terminal at the same time for display, so that a user participating in the video call on the second terminal side can see the scenes within the shooting ranges of both the front camera and the rear camera of the first terminal. When users participating in the video call are present within the shooting ranges of both cameras of the first terminal, or when scenes that the user on the second terminal side wants to see are present within the shooting ranges of both cameras, the video images shot by the two cameras can be transmitted to the second terminal simultaneously. This avoids switching back and forth between the front camera and the rear camera, expands the video image shooting range of the video call, and simplifies the user's operation.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart of a video call method applied to a first terminal according to an embodiment of the present invention;

Fig. 2A is a second flowchart of a video call method applied to a first terminal according to an embodiment of the present invention;

Fig. 2B is a schematic view of a display interface of a first display screen during execution of the method of Fig. 2A;

Fig. 2C is a schematic view illustrating an interface change of the first floating window in Fig. 2B after a touch operation is detected;

Fig. 2D is one of schematic display interface diagrams of display interfaces of a first display screen and a second display screen of a first terminal according to an embodiment of the present invention;

Fig. 2E is a second schematic display interface diagram of display interfaces of the first display screen and the second display screen of the first terminal according to the embodiment of the present invention;

Fig. 3 is a third flowchart of a video call method applied to a first terminal according to an embodiment of the present invention;

Fig. 4 is a fourth flowchart of a video call method applied to a first terminal according to an embodiment of the present invention;

Fig. 5 is a fifth flowchart of a video call method applied to a first terminal according to an embodiment of the present invention;

Fig. 6 is a flowchart of a video call method applied to a second terminal according to an embodiment of the present invention;

Fig. 7A is a second flowchart of a video call method applied to a second terminal according to an embodiment of the present invention;

Fig. 7B is a diagram illustrating an exemplary display interface of a second terminal;

Fig. 7C is a second display interface of the second terminal according to the embodiment;

Fig. 8 is a third flowchart of a video call method applied to a second terminal according to an embodiment of the present invention;

Fig. 9 is a fourth flowchart of a video call method applied to a second terminal according to an embodiment of the present invention;

Fig. 10 is a schematic structural diagram of a first terminal according to an embodiment of the present invention;

Fig. 11 is a schematic structural diagram of a second terminal according to an embodiment of the present invention;

Fig. 12 is a second schematic structural diagram of the first terminal according to the embodiment of the present invention;

Fig. 13 is a second schematic structural diagram of a second terminal according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The video call method provided by the embodiment of the invention is applied to a first terminal, the first terminal comprises a first display screen and a second display screen which are arranged oppositely, the first terminal also comprises a front camera which is arranged at the same side of the first display screen and a rear camera which is arranged at the same side of the second display screen, and the method is shown in figure 1 and comprises the following steps:

Step 101: in a state of establishing a video call connection with the second terminal, starting the front camera and the rear camera to acquire video images.

In a specific embodiment, the state of establishing the video call connection with the second terminal may be a state in which the video call connection with the second terminal is about to be established, or a state in which it has already been established.

For example, user A of the first terminal is at a first location, where user C is also present; user B of the second terminal is at a second location, which is different from the first location; user A is the father of user C, and user B is the mother of user C. After user A sends a video call request to user B, or user B sends a video call request to user A, the first terminal and the second terminal establish a video call connection.

With the video call connection established between the first terminal and the second terminal, the rear camera of the first terminal faces user C and the front camera faces user A, and the front camera and the rear camera are started to collect video images of user A and user C respectively.

Step 102: sending the first video image collected by the front camera and the second video image collected by the rear camera to the second terminal.

For example, during the video call, user A is located on the front camera side of the first terminal, and user C is located on the rear camera side. The front camera of the first terminal collects a first video image containing user A, and the rear camera collects a second video image containing user C. The first terminal sends both video images to the second terminal, so that user B of the second terminal can see the first video image and the second video image at the second location at the same time.

Step 103: receiving a third video image sent by the second terminal.

In the embodiment of the invention, receiving the third video image sent by the second terminal and sending the first video image and the second video image to the second terminal may be executed simultaneously or sequentially in any order.

For example, user B of the second terminal is at the second location; the first camera of the second terminal captures a third video image containing user B, and the second terminal sends it to the first terminal; the first terminal receives the third video image, so that user A and user C at the first location can see the third video image containing user B.

Step 104: displaying the third video image on the first display screen and the second display screen respectively.

For example, user B is at the second location, and the third video image is a video image of user B captured by the camera of the second terminal. After receiving the third video image, the first terminal displays it on both the first display screen and the second display screen, so that user A and user C at the first location can see the video image of user B on the first display screen and the second display screen respectively.

In a specific embodiment of the present invention, at least one of the first video image and the second video image is also displayed on the first display screen and the second display screen. For example, a first video image is displayed on a first display screen and a second video image is simultaneously displayed on a second display screen, or the second video image is displayed on the first display screen and the first video image is displayed on the second display screen, or only the first video image or only the second video image is displayed on both the first display screen and the second display screen.

The third video image is a video image collected by the first camera of the second terminal, that is, a video image shot by the camera of the second terminal and then transmitted to the first terminal.

Those skilled in the art will understand that, provided display and shooting do not conflict, step 103 may be performed before, simultaneously with, or after step 101 or step 102. Step 104 can be executed once the third video image sent by the second terminal has been received in step 103.

As can be seen from the above, when the video call method provided by the present invention is applied to the first terminal, both the front camera and the rear camera shoot video images, so that the second terminal in the video call can present the video images shot by both cameras of the first terminal. This expands the video image shooting range of the first terminal: the user of the first terminal can capture the scene within the shooting range of the front camera and the scene within the shooting range of the rear camera at the same time, so the shooting scene does not need to be changed by switching cameras during the video call, and the shooting range of the video images received by the second terminal is expanded as well.
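Steps 101 to 104 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Camera`, `Screen` and `Link` classes and the `first_terminal_call_step` function are hypothetical stand-ins for real camera, display and network interfaces, and video images are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Camera:
    """Hypothetical camera that returns a labelled frame once started."""
    name: str
    started: bool = False

    def start(self) -> None:
        self.started = True

    def capture(self) -> str:
        if not self.started:
            raise RuntimeError(f"{self.name} camera is not started")
        return f"frame-from-{self.name}"


@dataclass
class Screen:
    """Hypothetical display screen that remembers its full-screen content."""
    full_screen: Optional[str] = None


@dataclass
class Link:
    """Hypothetical video call connection to the second terminal."""
    incoming: List[str] = field(default_factory=list)  # sent by the second terminal
    outgoing: List[str] = field(default_factory=list)  # sent to the second terminal

    def send(self, *frames: str) -> None:
        self.outgoing.extend(frames)

    def receive(self) -> str:
        return self.incoming.pop(0)


def first_terminal_call_step(front, rear, first_screen, second_screen, link):
    # Step 101: start both cameras once the call connection exists.
    front.start()
    rear.start()
    # Step 102: send the first and second video images to the second terminal.
    first_image = front.capture()
    second_image = rear.capture()
    link.send(first_image, second_image)
    # Step 103: receive the third video image from the second terminal.
    third_image = link.receive()
    # Step 104: display the third video image on both display screens.
    first_screen.full_screen = third_image
    second_screen.full_screen = third_image
    return first_image, second_image, third_image
```

The sketch runs the steps strictly in sequence for readability; as noted above, sending and receiving may equally happen simultaneously or in another order.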

In a specific embodiment of the present invention, the step of displaying the third video image on the first display screen and the second display screen respectively includes:

and displaying the third video image on the first display screen in a full screen mode, and displaying the third video image on the second display screen in a full screen mode.

Through this embodiment, if users within the shooting ranges of both the front camera and the rear camera of the first terminal are in a video call with the user of the second terminal, the users facing the first display screen and the users facing the second display screen can all see the user of the second terminal, and the viewing angle during the video call is enlarged. Thus, even if two or more users participating in the video call are at different positions on the first terminal side, the camera used for shooting does not need to be switched continually, which reduces the user's operations on the first terminal.

In some embodiments of the present invention, referring to Fig. 2A, on the basis of Fig. 1, after the step of displaying the third video image on the first display screen in full screen and displaying the third video image on the second display screen in full screen, the method further includes:

Step 105: displaying the first video image in a first floating window of a first display area in the first display screen.

For example, the whole display area of the first display screen is used as the first display area, and the first floating window is located within the first display area and covers part of it. The first video image containing user A is displayed in the first floating window, so that user A can also see his own video image during the video call with user B.

Step 106: displaying the second video image in a second floating window of a second display area in the first display screen.

For example, referring to fig. 2B, the third video image is displayed on the full screen of the first display screen 201, and the first floating window 202 and the second floating window 203 are located in the display area of the first display screen 201 and cover a part of the first display screen 201. It can be seen that the video images of all users participating in the video call can be displayed on the first display screen 201 at the same time, and frequent camera switching operation is not required.

In the above embodiment, the first video image and the second video image are displayed in floating windows on the first display screen, so that a user on the first display screen side can see all the video images shot by the first terminal and can adjust the shooting angle and the like according to the first video image and the second video image. This makes it convenient for the user of the first terminal to control the shooting angles and positions of the front camera and the rear camera.

Fig. 2A shows one execution order; in theory, steps 105 and 106 can be executed at any time after the first video image and the second video image are formed.

Specifically, the first floating window and the second floating window are non-full-screen windows. The first video image and the second video image are displayed on the first display screen through the first floating window and the second floating window respectively, so that the user of the first terminal can view the video images shot by the front camera and the rear camera and adjust the shooting angles of the two cameras as required to obtain the best shooting effect. This expands the range of viewing angles shot during the video call: user B of the second terminal can obtain images of the first terminal user's location from multiple different viewing angles, and users participating in the video call on the first terminal side can be captured in the video images even if they are not all located on the same side of the first terminal. It should be understood by those skilled in the art that steps 105 and 106 may be executed sequentially in any order, or simultaneously.

Considering that two or more users on the first terminal side may be talking to a user on the second terminal side, the first video image and the second video image may be displayed in small windows on the first display screen, and the second video image may also be displayed in a small window on the second display screen. In this way, a user within the shooting range of the rear camera of the first terminal can also see his or her own appearance in the video window on the second display screen.

In some embodiments of the present invention, referring to fig. 3, on the basis of fig. 2A, after the step of displaying the second video image in the second floating window of the second display area in the first display screen, the method further includes:

Step 107: detecting a touch operation in either the first floating window or the second floating window.

The user can click either the first floating window or the second floating window; the video image in the clicked floating window is then enlarged to full-screen display, and the video image that was displayed in full screen before the click is moved into that floating window.

Step 108: if the touch operation is detected, acquiring a first target video image currently displayed in full screen on the first display screen.

For example, as shown in fig. 2B, suppose that user A clicks the first floating window 202; the third video image currently displayed in full screen is acquired and taken as the first target video image.

Step 109: displaying the video image in the floating window corresponding to the touch operation on the first display screen in full screen.

As shown in fig. 2B, the first video image in the first floating window 202 corresponding to the touch operation is displayed on the first display screen in full screen.

Step 110: displaying the first target video image in the floating window corresponding to the touch operation.

As shown in fig. 2B, the third video image that was displayed in full screen before the click operation is displayed in the first floating window 202 corresponding to the click operation.

In theory, step 107 can be performed after either of steps 105 and 106 is completed. However, it should be understood by those skilled in the art that steps 108 and 109 may be performed sequentially in any order, or simultaneously.

Specifically, the detecting of the touch operation in any one of the first floating window and the second floating window is to detect the touch operation in a display range of any one of the first floating window and the second floating window. Through the embodiment, the user of the first terminal can perform switching control on the display of the video picture on the first display screen, and the video picture which the user wants to see is enlarged by full-screen display when needed.
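The swap performed in steps 107 to 110 can be sketched as follows. This is a minimal sketch under assumptions: the `FirstScreenLayout` class, the window names and the `on_floating_window_touch` function are hypothetical, and hit-testing of the touch position is reduced to passing in the name of the touched window.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FirstScreenLayout:
    """Hypothetical layout state of the first display screen."""
    full_screen: str                                          # image shown full screen
    floating: Dict[str, str] = field(default_factory=dict)    # window name -> image


def on_floating_window_touch(layout: FirstScreenLayout, window: str) -> None:
    """Swap the touched floating window's image with the full-screen image."""
    # Step 107: ignore touches that fall outside every floating window.
    if window not in layout.floating:
        return
    # Step 108: remember the image currently displayed in full screen.
    first_target = layout.full_screen
    # Step 109: show the touched window's image in full screen.
    layout.full_screen = layout.floating[window]
    # Step 110: show the former full-screen image in the touched window.
    layout.floating[window] = first_target
```

For instance, with the third video image in full screen and the first video image in the first floating window, touching that window leaves the first video image in full screen and the third video image in the window.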

In some embodiments of the present invention, after the step of displaying the second video image in the second floating window of the second display area in the first display screen, the method further includes:

detecting gesture characteristics of the video image displayed in the first display screen;

if the gesture features are detected and matched with gesture features which are stored in advance and used for switching display modes, adjusting the display states of the first floating window and/or the second floating window according to the detected gesture features;

wherein the display state comprises a hidden state and a display state.

In the above embodiment, adjusting the display state of the first floating window and/or the second floating window includes: and only adjusting the display state of the first floating window, or only adjusting the display state of the second floating window, or adjusting the display states of the first floating window and the second floating window. Specifically, when only the display state of the first floating window is adjusted, if the first floating window is in the hidden state, the first floating window is adjusted to the display state, and the original display state of the second floating window is not changed. When only the display state of the second floating window is adjusted, if the second floating window is in the display state, the second floating window is adjusted to be in the hidden state, and the original display state of the first floating window is not changed. When the display states of the first floating window and the second floating window are adjusted, the floating window in the hidden state is adjusted to be in the display state, and the floating window in the display state is adjusted to be in the hidden state.

For example, before adjusting the states of the first floating window and the second floating window, as shown in fig. 2E, both the first display screen 201 and the second display screen 204 display the third video image in a full screen, and the first floating window and the second floating window are in a hidden state; if a gesture matched with a gesture feature which is stored in advance and used for switching the display mode is received, displaying the first floating window and the second floating window which are hidden on the first display screen as shown in fig. 2D.
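The display-state adjustment described above can be sketched as follows. The gesture names and the mapping from each gesture to the floating windows it affects are assumptions made for illustration; the embodiment only requires that a detected gesture match a pre-stored gesture feature.

```python
from typing import Dict

HIDDEN, SHOWN = "hidden", "shown"

# Hypothetical pre-stored gestures and the floating windows each one adjusts.
STORED_GESTURES = {
    "swipe_up": ("first",),
    "swipe_down": ("second",),
    "pinch": ("first", "second"),
}


def adjust_floating_windows(states: Dict[str, str], gesture: str) -> Dict[str, str]:
    """Return the new display states after a detected gesture.

    A window targeted by the gesture flips between hidden and shown;
    windows not targeted keep their original state.
    """
    targets = STORED_GESTURES.get(gesture)
    if targets is None:
        return dict(states)  # the gesture matches no pre-stored feature
    new_states = dict(states)
    for name in targets:
        new_states[name] = SHOWN if states[name] == HIDDEN else HIDDEN
    return new_states
```

Starting from both windows hidden, as in fig. 2E, the hypothetical `"pinch"` gesture shows both windows, as in fig. 2D.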

In the embodiment of the invention, the purpose of the video call is to hold a face-to-face conversation with the user of the second terminal, so some users may only need to see the video image of the remote party during the call. The first floating window and the second floating window can therefore be hidden: the user can switch the display mode as needed, and when the user wants the first display screen to show only the remote user, both floating windows are hidden.

In some embodiments of the present invention, after the step of displaying the third video image on the first display screen and the second display screen respectively, the method further includes:

receiving a fourth video image sent by the second terminal;

dividing the first display screen into a first display area, a second display area, a third display area and a fourth display area;

displaying the first video image, the second video image, the third video image and the fourth video image in the first display area, the second display area, the third display area and the fourth display area, respectively.

The fourth video image is a video image captured by the other camera of the second terminal when the second terminal has two cameras. Under the condition that the second terminal is provided with the two cameras, more video images can be generated in the process of video call, the video images shot by the first terminal and the second terminal are all displayed on the first display screen, so that a user positioned on one side of the first display screen can see all the video images, and the shooting angle of the camera of the first terminal can be adjusted according to the video images, and a more ideal shooting effect is achieved.
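One way to divide the first display screen into four display areas is sketched below. The description does not specify the layout, so the 2 x 2 grid, the `Rect` type and the function names are assumptions for illustration only.

```python
from typing import Dict, List, NamedTuple


class Rect(NamedTuple):
    """Hypothetical rectangle in screen pixels."""
    x: int
    y: int
    width: int
    height: int


def split_into_four(width: int, height: int) -> List[Rect]:
    """Divide a screen into four display areas arranged as a 2 x 2 grid."""
    half_w, half_h = width // 2, height // 2
    return [
        Rect(0, 0, half_w, half_h),                             # first display area
        Rect(half_w, 0, width - half_w, half_h),                # second display area
        Rect(0, half_h, half_w, height - half_h),               # third display area
        Rect(half_w, half_h, width - half_w, height - half_h),  # fourth display area
    ]


def assign_images(areas: List[Rect], images: List[str]) -> Dict[str, Rect]:
    """Pair the first to fourth video images with the four display areas in order."""
    return dict(zip(images, areas))
```

Using the remainder (`width - half_w`, `height - half_h`) for the right column and bottom row keeps the four areas covering the whole screen even when a dimension is odd.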

In some embodiments of the present invention, after the step of displaying the first video image, the second video image, the third video image and the fourth video image in the first display area, the second display area, the third display area and the fourth display area respectively, further comprising:

detecting touch operation in any one of the first display area, the second display area, the third display area and the fourth display area;

if the touch operation is detected, displaying a video image in a display area corresponding to the touch operation in a full screen mode;

detecting a sliding operation on the first display screen;

if the sliding operation is detected, acquiring the direction of the sliding operation;

and switching the video image content displayed in the full screen currently on the first display screen according to the direction of the sliding operation.

For example, the first video image, the second video image, the third video image and the fourth video image are displayed in the first display area, the second display area, the third display area and the fourth display area respectively. If a touch operation in the first display area is detected, the first video image in the first display area is displayed in full screen and the video images in the second, third and fourth display areas are hidden. Then, when a leftward sliding operation is detected, the first video image currently displayed in full screen is switched to the second video image; or, when a rightward sliding operation is detected, it is switched to the fourth video image.
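The slide-based switching can be sketched as a cyclic index over the four video images. The description only states the behaviour starting from the first video image (left gives the second, right gives the fourth); extending this cyclically to the other images is an assumption of this sketch, as are the names used.

```python
IMAGES = ["first", "second", "third", "fourth"]


def switch_full_screen(current: str, direction: str) -> str:
    """Return the image to display in full screen after a sliding operation.

    A leftward slide moves to the next image and a rightward slide to the
    previous one, wrapping around at either end (assumed behaviour).
    Any other direction leaves the current image unchanged.
    """
    index = IMAGES.index(current)
    if direction == "left":
        index = (index + 1) % len(IMAGES)
    elif direction == "right":
        index = (index - 1) % len(IMAGES)
    return IMAGES[index]
```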

Through the embodiment, a user can switch the video images displayed on the full screen and the floating window on the display screen through one sliding operation, so that the switching steps are reduced, the switching identifier does not need to be arranged in the display area, the area of the display area is saved, and the overlapping of the display area and the identifier is reduced.
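The touch and slide handling described above can be sketched as a small state holder. The following Python sketch is illustrative only: the class `FirstScreen`, the image labels and the cyclic ordering of the four images are assumptions made for the example. The embodiment itself only specifies that, starting from the first video image, a leftward slide leads to the second video image and a rightward slide leads to the fourth.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the four video images, in display-area order.
IMAGES = ["first", "second", "third", "fourth"]


@dataclass
class FirstScreen:
    # None means the four display areas are shown side by side.
    full_screen: Optional[str] = None

    def on_touch(self, area_index: int) -> None:
        """Show the image of the touched display area in full screen."""
        self.full_screen = IMAGES[area_index]

    def on_swipe(self, direction: str) -> None:
        """Switch the full-screen image according to the slide direction.

        Assumption: the images form a cycle, so left moves forward and
        right moves backward through IMAGES.
        """
        if self.full_screen is None:
            return  # nothing is displayed in full screen yet
        step = {"left": 1, "right": -1}.get(direction, 0)
        index = IMAGES.index(self.full_screen)
        self.full_screen = IMAGES[(index + step) % len(IMAGES)]
```

With this ordering, a single slide is enough to move between images, which is why no separate switching identifier has to be drawn in the display area.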

In some embodiments of the present invention, the method further includes:

detecting a shooting instruction;

if a shooting instruction is detected, controlling the front camera and/or the rear camera to execute shooting operation;

the shooting instruction comprises at least one of a voice instruction, a key instruction, a screen gesture instruction and an air gesture instruction.

The step of detecting the shooting instruction specifically includes:

Step 401: detecting voice data collected by a microphone of the first terminal.

For example, it is detected on the first terminal side that the microphone has collected voice data such as "photograph" or "shooting".

Step 402: determining, according to the voice data, whether a shooting instruction is received.

In some embodiments of the present invention, the step of determining whether a shooting instruction is received according to the voice data includes: detecting a source of the voice data; if the source of the voice data is detected to be the second terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting; and if the voice data is detected to comprise the pre-stored key words for triggering shooting, determining that a shooting instruction is received.

For example, voice data collected by the microphone is received and compared with pre-stored voice data of user B; if the two match, the source of the voice data is determined to be the second terminal. It is then detected whether the voice data includes a pre-stored keyword for triggering shooting. When a word in the voice data, such as "shooting" or "photograph", is consistent with the voice data corresponding to a pre-stored shooting instruction, it is determined that a shooting instruction is received.

In the embodiment of the invention, the correspondence between voice data and the shooting instruction is stored in advance, and correspondences between several different pieces of voice data and the shooting instruction can be established. For example, according to the user's shooting habits, when the user is detected saying "shooting", "eggplant" or the like, it is determined that a shooting instruction is received and the shooting operation can be performed.
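The two checks in Step 402 (source of the voice data, then keyword match) can be illustrated with a minimal Python sketch. It assumes that speaker matching and speech recognition have already been performed elsewhere and are passed in as a source label and a text transcript; the function name, the labels and the keyword set are hypothetical and not part of the embodiment.

```python
from typing import Iterable

# Assumed pre-stored keywords that trigger shooting.
SHOOT_KEYWORDS = ("shoot", "photograph", "eggplant")


def is_shooting_instruction(
    source: str,
    transcript: str,
    expected_source: str = "second_terminal",
    keywords: Iterable[str] = SHOOT_KEYWORDS,
) -> bool:
    """Return True only if the voice comes from the expected terminal
    and contains at least one pre-stored trigger keyword."""
    if source != expected_source:
        return False  # voice from other sources never triggers shooting
    text = transcript.lower()
    return any(keyword in text for keyword in keywords)
```

Checking the source first is what keeps ordinary conversation on the first terminal side from triggering a shot by accident.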

The step of controlling the front camera and/or the rear camera to perform a shooting operation if a shooting instruction is detected specifically includes:

Step 501: if a shooting instruction is received, determining a target shooting object corresponding to the shooting instruction.

User B can use a voice instruction to have an image of user A and/or user C captured. The step of determining the target shooting object corresponding to the shooting instruction includes: extracting keywords from the voice data collected by the microphone; and determining the target shooting object according to the extracted keywords; wherein the target shooting object includes a first target object in the first video image and/or a second target object in the second video image.

Before the step of determining the target shooting object corresponding to the shooting instruction, the method further includes: acquiring one frame of the video image captured by the front camera and one frame of the video image captured by the rear camera; performing face recognition on the two frames and determining the name of the person corresponding to each face, for example, identifying the first person name corresponding to the face in the video image captured by the front camera as "dad" and the second person name corresponding to the face in the video image captured by the rear camera as "baby"; and then establishing a first correspondence between the front camera and the first target object and a second correspondence between the rear camera and the second target object.

Keywords are extracted from the voice data collected by the microphone, and it is determined from the extracted keywords whether the person named is the first person name or the second person name. For example, if the keywords are detected to include "baby" and "one shot", the target shooting object is determined to be the second target object.

Step 502: controlling the front camera and/or the rear camera to perform the shooting operation according to the target shooting object.

The target shooting object comprises a first target object in the first video image and/or a second target object in the second video image.

In some embodiments of the present invention, the step of controlling the front camera and/or the rear camera to perform a shooting operation according to the target shooting object includes: if the target shooting object is the first target object, controlling the front-facing camera to execute shooting operation; if the target shooting object is the second target object, controlling the rear camera to execute shooting operation; and if the target shooting objects are the first target object and the second target object, controlling the front camera and the rear camera to execute shooting operation.
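Steps 501 and 502 can be sketched together as a lookup from recognised person names to cameras. The Python below is a minimal illustration under stated assumptions: face recognition and keyword extraction are treated as already done, and the function names and the "front"/"rear" labels are hypothetical.

```python
from typing import Dict, Iterable, Set


def build_name_to_camera(front_name: str, rear_name: str) -> Dict[str, str]:
    """Record which camera each recognised person was seen by.

    The names are assumed to come from face recognition on one frame
    from each camera.
    """
    return {front_name: "front", rear_name: "rear"}


def cameras_to_trigger(
    keywords: Iterable[str], name_to_camera: Dict[str, str]
) -> Set[str]:
    """Return the cameras whose target object is named in the keywords.

    One name selects one camera, both names select both cameras, and
    no matching name selects none.
    """
    return {name_to_camera[k] for k in keywords if k in name_to_camera}
```

For the example in the text, the keywords "baby" and "one shot" select only the rear camera, because "baby" was recognised in the rear camera's video image.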

In the above embodiment, the camera used for the shooting operation is determined according to the target shooting object, which improves the controllability of the shooting operation. Not only can shooting be remotely controlled, but the user can also choose the target shooting object from among the participants of the current video call and shoot selectively. The user of the second terminal can remotely control the first terminal to take a photograph; the accuracy of shooting instruction recognition is improved, so that ordinary conversation during the video call does not falsely trigger a shooting operation; and the interactivity between the users participating in the video call on the first terminal side and those on the second terminal side is enhanced.

In the prior art, a picture of a video call can only be obtained by taking a screenshot of what is currently displayed on the screen. Although a picture of the video call can be obtained in this way, taking a screenshot requires several steps, so the capture is delayed. In addition, a screenshot turns everything displayed on the screen into the picture, so the area around the video window is captured as well, which degrades the result.

Through the above embodiment, the user can at any time during the video call capture the picture of the other party that the user wants and form a photo, which solves the prior-art problem that the camera cannot be opened for taking a picture because it is occupied by the video call. Meanwhile, after the shooting instruction is detected, the front camera and/or the rear camera can be directly controlled to perform the shooting operation. The resulting picture contains only what the camera captures and none of the other content of the current screen display area, so its image quality is better than that of a screenshot.

In this embodiment, the user can issue the shooting instruction by voice, so triggering a shot requires no manual operation, and the camera can be controlled to shoot even when the user's hands are occupied during the video call. In addition, the target shooting object can be determined according to the voice source, and the shooting instruction can be issued according to the shooting requirement for the target shooting object, so that the user can select which of the users participating in the video call to shoot, objects that are not meant to be shot are not shot, and shooting controllability is improved. In other embodiments of the present invention, the shooting instruction may also be issued by means of a key, a button, or the like.

In some embodiments of the present invention, after the step of controlling the front camera and/or the rear camera to perform a shooting operation, the method further includes: acquiring an image generated by the shooting operation; and sending the image to the second terminal.

For example, on the first terminal side, the acquired image is an image captured by a front camera or a rear camera. The image shot by the front camera or the rear camera is sent to the second terminal, so that the user B at one side of the second terminal can automatically receive the image, the image shot in the video call process can be shared at any time, and the user at one side of the first terminal does not need to manually send the image.

Through the above embodiment, captured images can be shared between the first terminal and the second terminal. The user of the first terminal does not need to send the captured images manually, which reduces the operations required of that user, and the user of the second terminal can conveniently keep the images captured during the video call. In a specific embodiment, after the image generated by the shooting operation is acquired, a prompt message pops up that includes selection buttons for choosing whether to send the image; the image is sent after it is detected that the user has clicked the button for sending it.

In an embodiment of the present invention, the step of detecting the shooting instruction includes:

performing gesture feature detection on the video image displayed in the first display screen and/or the second display screen;

the step of controlling the front camera and/or the rear camera to perform a shooting operation if the shooting instruction is detected includes:

if the gesture features are detected and matched with gesture features which are stored in advance and used for triggering shooting, determining the source of the detected gesture features;

determining a target shooting object corresponding to the shooting instruction according to the source of the detected gesture feature;

and controlling the front camera and/or the rear camera to execute shooting operation according to the target shooting object.

In an embodiment of the present invention, a gesture may be used to indicate a shooting object or a shooting mode, or may be used by a user to issue a shooting instruction that triggers a shooting operation.

For example, gesture feature detection is performed on a third video image displayed in the second display screen, if a gesture feature is detected and the gesture feature is matched with a gesture feature which is stored in advance and used for triggering shooting, whether a person corresponding to the gesture feature is a user B is detected, and if the person corresponding to the gesture feature is the user B, the source of the detected gesture feature is determined to be the second terminal.

Wherein, according to the detected source of the gesture feature, determining the target shooting object corresponding to the shooting instruction specifically includes: and if the source of the detected gesture feature is the second terminal, determining a target shooting object corresponding to the detected gesture feature according to the detected gesture feature and the pre-stored gesture feature.

For example, the target shooting object corresponding to the scissor-hand (V-sign) gesture feature may be preset as the second target object, and the target shooting object corresponding to the OK gesture feature may be preset as the first target object. If the source of the detected gesture feature is the second terminal, the detected gesture feature is matched against the pre-stored gesture features. If the detected gesture feature is the OK gesture, the target shooting object is determined to be the first target object, and the front camera is then controlled to perform the shooting operation.
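The gesture-based selection in this example can be sketched as two lookups. In the Python below, gesture recognition and source detection are assumed to happen elsewhere and to supply plain labels; the table contents mirror the example above, while the names `GESTURE_TARGETS`, `TARGET_CAMERA` and `camera_for_gesture` are hypothetical.

```python
from typing import Optional

# Assumed pre-stored gesture features and their target shooting objects.
GESTURE_TARGETS = {"ok": "first_target", "scissor_hand": "second_target"}

# The first target object is seen by the front camera, the second by the rear.
TARGET_CAMERA = {"first_target": "front", "second_target": "rear"}


def camera_for_gesture(gesture: str, source: str) -> Optional[str]:
    """Return the camera to trigger, or None if nothing should be shot.

    Only gestures whose source is the second terminal are honoured,
    and only if they match a pre-stored gesture feature.
    """
    if source != "second_terminal":
        return None
    target = GESTURE_TARGETS.get(gesture)
    if target is None:
        return None  # gesture does not match any pre-stored feature
    return TARGET_CAMERA[target]
```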

In the above embodiment, user B can trigger the shooting operation with a gesture, so that a shooting instruction can still be issued when touch operation on the terminal is inconvenient for user B (for example, when water or another substance that interferes with touch operation is on the hand). This adds a further way of issuing the shooting instruction, and user B can choose whichever way is convenient.

In another embodiment of the present invention, a video call method applied to a second terminal is provided. The second terminal includes a third display screen and a first camera on the same side as the third display screen. As shown in fig. 6, the method includes the following steps:

Step 601: starting the first camera to capture video images in a state where a video call connection is established with the first terminal.

For example, user A of the first terminal is at a first location, where user C is also present; user B of the second terminal is at a second location; user A is the dad of user C, and user B is the mom of user C. After user A sends a video call request to user B, or after user B sends a video call request to user A, the first terminal and the second terminal establish a video call connection.

With the video call connection established between the first terminal and the second terminal, the first camera of the second terminal is started to capture video images of user B.

Step 602: sending the third video image captured by the first camera to the first terminal.

For example, the user B is photographed by the first camera of the second terminal, a third video image is formed, and is transmitted to the first terminal.

Step 603: receiving a first video image and a second video image sent by the first terminal.

For example, the first video image is a video image including user A captured by the front camera of the first terminal, and the second video image is a video image including user C captured by the rear camera of the first terminal. After capturing the first video image and the second video image, the first terminal sends them to the second terminal, and the second terminal receives them.

Step 604: displaying the first video image and the second video image.

When the second terminal has one display screen, the first video image and the second video image can be displayed simultaneously on that display screen. When the second terminal has two display screens, the first video image and the second video image are displayed simultaneously on one of the display screens, or are displayed separately on the two display screens, or are both displayed on each of the two display screens.

As can be seen from the above, the video call method provided in the embodiment of the present invention enables the second terminal to receive the video images captured by the front camera and the rear camera of the first terminal and to display them on the display screen of the second terminal. The video images captured by both cameras of the first terminal can thus be presented to the user of the second terminal during the video call, which expands the range of viewing angles on the first terminal side that the user of the second terminal can see. When the user of the second terminal needs to see the scenes within the shooting ranges of both the front camera and the rear camera of the first terminal, the user of the first terminal does not need to switch between the front camera and the rear camera, which makes multi-scene video calls more convenient.

In some embodiments of the present invention, the second terminal includes a fourth display screen disposed opposite the third display screen;

the step of displaying the first video image and the second video image comprises:

displaying the first video image on the third display screen in a full screen manner, and displaying the second video image on the fourth display screen in a full screen manner;

or displaying the second video image on the third display screen in a full screen mode, and displaying the first video image on the fourth display screen in a full screen mode.

When at least two video call users exist at the second terminal side, and the two video call users are respectively located at the first camera side and the second camera side of the second terminal, the display mode provided by the embodiment can enable the users located at different positions to see video images, so that multi-person or multi-view video call activities can be carried out at the second terminal side.

In some embodiments of the present invention, on the basis of fig. 6, as shown in fig. 7A, the third display screen includes a swap identifier for swapping display contents of the third display screen and the fourth display screen;

after the step of displaying the first video image and the second video image, the method further comprises:

Step 605: detecting a touch operation on the swap identifier.

For example, the swap identifier is an arrow icon displayed in the upper left or right corner of the third display screen and an arrow icon displayed in the upper left or right corner of the fourth display screen. And when the user B clicks the swap identifier, detecting the touch operation of the swap identifier.

Step 606: if the touch operation is detected, swapping the video images displayed on the third display screen and the fourth display screen.

For example, in fig. 7B, users A and C participate in the video call on the first terminal side and user B participates on the second terminal side; the video image of user A is displayed on the third display screen 701, and the video image of user C is displayed on the fourth display screen 702.

The swap identifier may specifically be an arrow-shaped icon, such as the swap identifier 703 shown in fig. 7B. After the user clicks the swap identifier 703, the video image of user A displayed on the third display screen 701 is switched to the fourth display screen 702 for display, as shown in the left half of fig. 7C, and the video image of user C displayed on the fourth display screen 702 is switched to the third display screen 701 for display, as shown in the right half of fig. 7C.

In the above embodiment, when the first video image is displayed on the third display screen and the second video image is displayed on the fourth display screen, detecting a touch operation on the swap identifier causes the second video image to be displayed on the third display screen and the first video image to be displayed on the fourth display screen, so that the display positions of the video images are switched. As a result, when a user participating in the video call on the second terminal side wants to watch the video image of another user, there is no need to turn the mobile phone over; clicking the swap identifier is enough to switch the display of the first video image and the second video image, which is convenient.
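Steps 605 and 606 amount to exchanging the content of the two display screens when the swap identifier is touched. A minimal Python sketch follows; the class `DualScreen` and the string labels standing in for video images are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DualScreen:
    third: str   # label of the video image on the third display screen
    fourth: str  # label of the video image on the fourth display screen

    def on_swap_touched(self) -> None:
        """Exchange the video images shown on the two display screens."""
        self.third, self.fourth = self.fourth, self.third
```

Touching the identifier a second time restores the original arrangement, since the operation is its own inverse.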

In a specific embodiment of the present invention, the swap identifier may also be disposed on the fourth display screen, or disposed on the third display screen and the fourth display screen simultaneously. By switching the video images of the third display screen and the fourth display screen of the second terminal, the watching requirements of a user on one side of the third display screen or one side of the fourth display screen of the second terminal can be flexibly met. By setting the swapping identifier, the content displayed in the display area can be switched, so that the user can determine the content to be displayed in each display area according to the needs of the user.

In some embodiments of the present invention, the step of displaying the first video image and the second video image comprises:

determining a second target video image to be displayed in a full screen mode in the first video image and the second video image;

determining a third target video image to be displayed in a window in the first video image and the second video image;

displaying the second target video image on the third display screen in a full screen manner;

displaying the third target video image in a third floating window of a third display area in the third display screen;

displaying the third video image in a fourth floating window of a fourth display area in the third display screen.

For example, the first video image may be set as a second target video image to be displayed in a full screen by default, the second video image is determined as a third target video image to be displayed in a window, and then the second target video image is displayed in the full screen on the third display screen; displaying the third target video image in a third floating window of a third display area in the third display screen; displaying the third video image in a fourth floating window of a fourth display area in the third display screen.

In the above embodiment of the present invention, the second target video image is one of the first video image and the second video image, and the third target video image is the other. When the second terminal has one display screen, the first video image and the second video image are displayed in full screen and in a floating window respectively, so that the user of the second terminal can see the video images captured by the front camera and the rear camera of the first terminal at the same time. When the second terminal has two display screens but only one user on the second terminal side participates in the video call, the first video image and the second video image are both displayed on the third display screen in the above embodiment, so that this user can see all the video images sent from the first terminal side.

In addition, in the case where the second terminal has two display screens but only one user on the side of the second terminal participates in the video call, it is possible to select to display the video image transmitted by the first terminal through one display screen of the second terminal.

In some embodiments of the present invention, after the step of displaying the third video image in the fourth floating window of the fourth display area in the third display screen, the method further includes:

detecting a touch operation in any one of the third floating window and the fourth floating window;

if the touch operation is detected, acquiring a fourth target video image currently displayed in a full screen mode on the third display screen;

displaying the video image in the floating window corresponding to the touch operation on the third display screen in a full screen manner;

and displaying the fourth target video image in a floating window corresponding to the touch operation.

For example, if the third display screen currently displays the first video image in a full screen mode, the third floating window displays the second video image, the fourth floating window displays the third video image, and a touch operation in the third floating window is detected, the first video image currently displayed in the full screen mode by the third display screen is acquired, the second video image in the third floating window is displayed in the full screen mode on the third display screen, and the first video image displayed in the full screen mode before is displayed in the third floating window.

Through the above embodiment, the video image displayed in full screen and a video image displayed in a small floating window can be exchanged, so that the user can choose which video image the third display screen shows in full screen. It should be understood by those skilled in the art that, in the above embodiment, the step of acquiring the fourth target video image currently displayed in full screen on the third display screen and the step of displaying, in full screen on the third display screen, the video image in the floating window corresponding to the touch operation may be performed in any order, or may be performed simultaneously.
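The exchange between the full-screen image and a touched floating window can be sketched as a single swap. The Python below is illustrative only: the class `WindowedScreen`, the window keys and the image labels are hypothetical stand-ins for the third display screen, its floating windows and the video images.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class WindowedScreen:
    full_screen: str          # image currently displayed in full screen
    floating: Dict[str, str]  # floating window name -> image it displays

    def on_window_touched(self, window: str) -> None:
        """Swap the touched window's image with the full-screen image."""
        self.full_screen, self.floating[window] = (
            self.floating[window],
            self.full_screen,
        )
```

Because both values are read before either is written, the sketch is consistent with the remark that the two steps may be performed in any order or simultaneously.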

In some embodiments of the present invention, after the step of displaying the fourth target video image in the floating window corresponding to the touch operation, the method further includes:

detecting gesture features of the video image displayed in the third display screen;

if the gesture features are detected and matched with gesture features which are stored in advance and used for switching display modes, adjusting the display state of the third floating window and/or the fourth floating window according to the detected gesture features;

wherein the display state is either a hidden state or a shown state.

The process of adjusting the hidden and shown states of the third floating window and the fourth floating window is the same as in the method embodiment of the first terminal and is not described in detail here. In the embodiment of the invention, the purpose of the video call is to hold a face-to-face conversation with the user of the first terminal, so some users may only need to see the video image of the remote party during the call. The third floating window and the fourth floating window can therefore be hidden: the user can switch the display mode as needed, and when the user wants the third display screen to show only the remote user, both floating windows are hidden.

In some embodiments of the present invention, the second terminal further includes a second camera on the same side as the fourth display screen, and the method further includes:

if receiving a starting instruction of the second camera, starting the second camera to collect a video image;

sending a fourth video image acquired by the second camera to the first terminal;

dividing the third display screen into a fifth display area, a sixth display area, a seventh display area and an eighth display area;

displaying the first video image, the second video image, the third video image and the fourth video image in the fifth display area, the sixth display area, the seventh display area and the eighth display area, respectively.

If both the first terminal and the second terminal participating in the video call have two cameras, all video pictures are displayed on the second terminal. When several users participate in the video call, each user can appear in a video image through the corresponding camera and can see the other participants on a display screen. Four display areas need to be divided on each display screen of the terminal. The four display areas may all take the form of floating windows; alternatively, one may be designated as a full-screen display area while the other three are displayed in floating windows.

In the above embodiment, the second camera can be started according to the instruction of the user, so that when a plurality of users participate in the video call on one side of the second terminal, the camera of the second terminal does not need to be frequently switched, the operations of the users on the terminal in the video call process are reduced, the first terminal in video connection with the second terminal can receive the video images of the users on one side of the second terminal at the same time, and the reduction of user experience caused by short-time black screen in the video switching process can be avoided.

In some embodiments of the present invention, after the step of displaying the first video image, the second video image, the third video image and the fourth video image in the fifth display area, the sixth display area, the seventh display area and the eighth display area respectively, the method further includes:

detecting touch operation in any one of the fifth display area, the sixth display area, the seventh display area and the eighth display area;

detecting a sliding operation on the third display screen;

and switching the video image content displayed in the full screen currently on the third display screen according to the direction of the sliding operation.

When a plurality of display areas exist on the second terminal, the display areas for displaying the images can be switched through a sliding operation, for example, through a left-right sliding operation, switching back and forth among the first video image, the second video image, the third video image and the fourth video image can be performed, and the specific process is the same as that of the method embodiment of the first terminal, and is not described in detail herein.

Through the embodiment, a user can switch the video images displayed on the full screen and the floating window on the display screen through one sliding operation, so that the steps of switching the video images are reduced, a switching identifier does not need to be arranged in the display area, the area of the display area is saved, and the overlapping of the display area and the identifier is reduced.

In some embodiments of the present invention, after the step of displaying the first video image and the second video image, the method further includes:

detecting a shooting instruction;

if a shooting instruction is detected, controlling the first camera to perform a shooting operation.

Through the above embodiment, a photograph can be taken during the video call, so that a user who wants to keep a picture of the other party does not need to take a screenshot. A picture obtained through the shooting instruction contains nothing other than what the camera captures, and the prior-art problem that the camera application cannot be started to take a picture because the camera is occupied by the video call is solved.

In some embodiments of the present invention, referring to fig. 8, on the basis of fig. 6, after the step of displaying the first video image and the second video image, the method further includes:

Step 607: detecting voice data collected by the microphone.

For example, it is detected on the second terminal side that the microphone has collected voice data such as "photograph" or "shooting".

Step 608: and judging whether a shooting instruction is received or not according to the voice data.

The step of judging whether a shooting instruction is received or not according to the voice data comprises the following steps: detecting a source of the voice data; if the source of the voice data is detected to be the first terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting; and if the voice data is detected to comprise the pre-stored key words for triggering shooting, determining that a shooting instruction is received.

The method comprises the steps of receiving voice data collected by a microphone, comparing the collected voice data with pre-stored voice data of a user A or a user C, and determining that the source of the voice data is the first terminal if the collected voice data is matched with the pre-stored voice data of the user A or the user C. And then detecting whether the voice data comprises pre-stored keywords for triggering shooting, and determining that a shooting instruction is received when determining that the shooting, the photographing or the shooting in the voice data is consistent with the voice data corresponding to the pre-stored shooting instruction.

Step 609: and if a shooting instruction is received, controlling the first camera to execute shooting operation.

For example, after receiving the shooting instruction, the first camera is controlled to shoot the image of the user B who is using the second terminal to carry out video call.

Through the embodiment, the user A or the user C of the first terminal can issue the shooting instruction to the second terminal through voice, so that the remote control of shooting is realized. In other embodiments of the present invention, the shooting instruction may also be sent in the form of a key, a button, or the like.
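Steps 607 to 609 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `VoiceData` structure, the speaker identifiers, the keyword list and the `Camera` stand-in are all hypothetical, and the voiceprint matching and speech recognition that would produce `speaker_id` and `transcript` are assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical pre-stored keywords for triggering shooting.
TRIGGER_KEYWORDS = ("photograph", "shoot")
# Hypothetical identifiers of users whose pre-stored voice data belongs
# to the first terminal side (user A and user C in the example).
FIRST_TERMINAL_SPEAKERS = {"user_A", "user_C"}


@dataclass
class VoiceData:
    speaker_id: str   # assumed output of a voiceprint comparison
    transcript: str   # assumed output of a speech recognizer


class Camera:
    """Stand-in for the first camera of the second terminal."""

    def __init__(self) -> None:
        self.shots = 0

    def shoot(self) -> str:
        self.shots += 1
        return f"image-{self.shots}"


def is_shooting_instruction(voice: VoiceData) -> bool:
    # Step 608, part 1: the source must be the first terminal.
    if voice.speaker_id not in FIRST_TERMINAL_SPEAKERS:
        return False
    # Step 608, part 2: the voice data must contain a trigger keyword.
    text = voice.transcript.lower()
    return any(keyword in text for keyword in TRIGGER_KEYWORDS)


def handle_voice(voice: VoiceData, camera: Camera) -> Optional[str]:
    # Step 609: shoot only when a shooting instruction is received.
    if is_shooting_instruction(voice):
        return camera.shoot()
    return None
```

Checking the source before the keyword means that the same word spoken by user B on the second terminal side does not trigger the camera in this sketch.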

In some embodiments of the present invention, referring to fig. 9, on the basis of fig. 8, after the step of controlling the first camera to perform the shooting operation, the method further includes:

Step 610: acquiring an image generated by the shooting operation.

For example, on the side of the second terminal, the acquired image is an image captured by a first camera, and the first camera is a front camera or a rear camera of the second terminal.

Step 611: sending the image to the first terminal.

For example, an image captured by the first camera is transmitted to the first terminal, so that user A and user C on the first terminal side receive the image automatically. Through this embodiment, images shot during the video call can be shared at any time; user B on the second terminal side does not need to send the images manually, the time taken by sharing images in batches is saved, and omissions caused by manual sending are avoided.

In other specific embodiments, after the step of acquiring the image generated by the shooting operation and before the step of sending the image to the first terminal, the method further includes: popping up prompt information that includes selection buttons for choosing whether to send the picture; after it is detected that user B clicks the button corresponding to sending the picture, the step of sending the picture to the first terminal is executed.
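The two sharing variants (automatic sending in steps 610 and 611, and sending after a confirmation prompt) can be sketched with one function. The function name and the `send` and `confirm` callables are hypothetical; `confirm` stands in for the pop-up prompt and returns whether the user chose to send.

```python
from typing import Callable, Optional


def share_captured_image(
    image: bytes,
    send: Callable[[bytes], None],
    confirm: Optional[Callable[[], bool]] = None,
) -> bool:
    """Send a captured image to the peer terminal.

    With no `confirm` callable the image is sent automatically; with one,
    the image is sent only if the user accepts the prompt.
    Returns True if the image was sent.
    """
    if confirm is not None and not confirm():
        return False
    send(image)
    return True
```

For example, passing `confirm=lambda: False` models a user who declines the prompt, in which case nothing is sent.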

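In some embodiments of the present invention, the step of detecting a shooting instruction includes:

performing gesture feature detection on the video image displayed in the third display screen;

the step of controlling the first camera to execute the shooting operation if the shooting instruction is detected includes:

if a gesture feature is detected and matches a pre-stored gesture feature for triggering shooting, determining the source of the detected gesture feature;

and if the source of the gesture feature is the first terminal, controlling the first camera to execute the shooting operation.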

For example, gesture feature detection is performed on the first video image displayed in the third display screen. If a gesture feature is detected and matches a pre-stored gesture feature for triggering shooting, it is detected whether the person making the gesture is user A; if so, the source of the detected gesture feature is determined to be the first terminal, and the first camera is then controlled to execute the shooting operation.

Through this embodiment, the user can capture a desired picture at any time during the video call and save it as a photo, which solves the prior-art problem that the camera cannot be opened to take a photo because it is occupied by the video call. Meanwhile, after the shooting instruction is detected, the front camera and/or the rear camera can be directly controlled to execute the shooting operation; the resulting photo contains only the picture captured by the camera and none of the other content of the current screen display area, so the image quality is better than that of a screenshot.

The embodiment of the present invention further provides a first terminal 1000, whose structure is shown in fig. 10. The first terminal 1000 includes a first display screen 1011 and a second display screen 1012 that are arranged oppositely, a front camera 1013 on the same side as the first display screen 1011, a rear camera 1014 on the same side as the second display screen 1012, and:

a first acquisition module 1015: configured to start the front camera and the rear camera to acquire video images in a state where a video call connection is established with the second terminal;

a first sending module 1016: configured to send the first video image acquired by the front camera and the second video image acquired by the rear camera to the second terminal;

a first receiving module 1017: configured to receive a third video image sent by the second terminal;

a first display module 1018: configured to display the third video image on the first display screen and the second display screen respectively.

Optionally, the first display module 1018 includes:

a first full screen display unit: configured to display the third video image on the first display screen in full screen;

a second full screen display unit: configured to display the third video image on the second display screen in full screen.

Optionally, the first display module 1018 further comprises:

a first floating display unit: configured to display the first video image in a first floating window of a first display area in the first display screen, and to display the second video image in a second floating window of a second display area in the first display screen.

Optionally, the first display module 1018 further comprises:

a first touch operation detection unit: configured to detect a touch operation in any one of the first floating window and the second floating window;

a first target video image acquisition unit: configured to acquire the first target video image currently displayed in full screen on the first display screen if the touch operation is detected;

a first full screen display switching unit: configured to display, in full screen on the first display screen, the video image in the floating window corresponding to the touch operation;

a first floating display switching unit: configured to display the first target video image in the floating window corresponding to the touch operation.
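The exchange performed by these four units can be illustrated with a minimal sketch: touching a floating window swaps its video image with the one currently displayed in full screen. The `ScreenLayout` class, the window names and the image labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ScreenLayout:
    full_screen: str            # label of the image shown in full screen
    floating: Dict[str, str]    # floating window name -> image label

    def on_touch(self, window: str) -> None:
        # Acquire the target image currently shown in full screen.
        target = self.full_screen
        # Promote the touched window's image to full screen.
        self.full_screen = self.floating[window]
        # Show the former full-screen image in the touched window.
        self.floating[window] = target
```

A single touch therefore changes two display positions at once, and the window that was not touched keeps its image.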

In some embodiments of the present invention, the first display module 1018 further comprises:

a first gesture feature detection unit: configured to perform gesture feature detection on the video image displayed in the first display screen;

a first mode switching unit: configured to adjust the display state of the first floating window and/or the second floating window according to the detected gesture feature if a gesture feature is detected and matches a pre-stored gesture feature for switching the display mode;

wherein the display state comprises a hidden state and a display state.
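As a rough sketch of the mode switching unit, a detected gesture can be looked up in a table of pre-stored gestures and the resulting display state applied to one or both floating windows. The gesture names ("fist", "open_palm"), the window names and the state labels are assumptions for illustration; the embodiment does not specify which gestures are stored.

```python
from typing import Dict, Iterable, Optional

HIDDEN, DISPLAYED = "hidden", "displayed"

# Hypothetical pre-stored gestures for switching the display mode.
MODE_GESTURES = {"fist": HIDDEN, "open_palm": DISPLAYED}


def apply_mode_gesture(
    states: Dict[str, str],
    gesture: str,
    windows: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Return the floating-window states after a detected gesture.

    A gesture that matches no pre-stored gesture leaves the states
    unchanged. `windows` limits the change to some windows; by default
    every floating window is adjusted.
    """
    updated = dict(states)
    new_state = MODE_GESTURES.get(gesture)
    if new_state is None:
        return updated
    for window in (windows if windows is not None else states):
        if window in updated:
            updated[window] = new_state
    return updated
```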

In some embodiments of the present invention, said first terminal 1000 further comprises:

a fourth video image receiving module: configured to receive a fourth video image sent by the second terminal;

a first display area division module: configured to divide the first display screen into four display areas, namely a first display area, a second display area, a third display area and a fourth display area;

a second display module: configured to display the first video image, the second video image, the third video image and the fourth video image in the first display area, the second display area, the third display area and the fourth display area respectively;

a first touch operation detection module: configured to detect a touch operation in any one of the first display area, the second display area, the third display area and the fourth display area;

a first full screen switching module: configured to display, in full screen, the video image in the display area corresponding to the touch operation if the touch operation is detected;

a first sliding operation detection module: configured to detect a sliding operation on the first display screen;

a first sliding direction acquisition module: configured to acquire the direction of the sliding operation if the sliding operation is detected;

a first display switching processing module: configured to switch the video image content currently displayed in full screen on the first display screen according to the direction of the sliding operation;

a first shooting instruction detection module: configured to detect a shooting instruction;

a first shooting module: configured to control the front camera and/or the rear camera to execute a shooting operation if a shooting instruction is detected.

The first shooting instruction detection module specifically includes:

a first voice data detection unit: configured to detect voice data collected by a microphone of the first terminal 1000;

a first shooting instruction determination unit: configured to judge whether a shooting instruction is received according to the voice data;

the first shooting module is configured to: if a shooting instruction is received, determine a target shooting object corresponding to the shooting instruction; and control the front camera and/or the rear camera to execute a shooting operation according to the target shooting object;

wherein the target shooting object comprises a first target object in the first video image and/or a second target object in the second video image.

Optionally, the first shooting instruction determining unit includes:

a first voice data source detection subunit: configured to detect a source of the voice data;

a first trigger keyword detection subunit: configured to detect whether the voice data includes a pre-stored keyword for triggering shooting if the source of the voice data is detected to be the second terminal;

a first shooting triggering subunit: configured to determine that a shooting instruction is received if the voice data is detected to include a pre-stored keyword for triggering shooting.

Wherein the first terminal 1000 further comprises:

a first image generation module: configured to acquire an image generated by the shooting operation;

a first image sending module: configured to send the image to the second terminal.

Optionally, the first photographing module includes:

an object keyword extraction unit: configured to extract keywords from the voice data collected by the microphone;

a target shooting object determination unit: configured to determine the target shooting object according to the extracted keywords.

Optionally, the first shooting instruction detecting module includes:

a first gesture detection unit: configured to perform gesture feature detection on the video images displayed in the first display screen and/or the second display screen;

the first shooting module is further configured to: if the gesture features are detected and matched with gesture features which are stored in advance and used for triggering shooting, determining the source of the detected gesture features; determining a target shooting object corresponding to the shooting instruction according to the source of the detected gesture feature; and controlling the front camera and/or the rear camera to execute shooting operation according to the target shooting object.

Optionally, the first shooting module is further configured to:

if the target shooting object is the first target object, controlling the front-facing camera to execute shooting operation;

if the target shooting object is the second target object, controlling the rear camera to execute shooting operation;

and if the target shooting objects are the first target object and the second target object, controlling the front camera and the rear camera to execute shooting operation.
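The three cases above amount to a mapping from target shooting objects to cameras, which can be sketched as follows. The labels "first_target", "second_target", "front" and "rear" and the function name are hypothetical.

```python
from typing import Iterable, Set

# First target object (in the first video image) -> front camera;
# second target object (in the second video image) -> rear camera.
CAMERA_FOR_TARGET = {"first_target": "front", "second_target": "rear"}


def cameras_to_trigger(targets: Iterable[str]) -> Set[str]:
    """Return the set of cameras that should execute the shooting operation."""
    targets = set(targets)
    unknown = targets - set(CAMERA_FOR_TARGET)
    if unknown:
        raise ValueError(f"unknown target shooting objects: {sorted(unknown)}")
    return {CAMERA_FOR_TARGET[target] for target in targets}
```

When both target objects are named, both cameras are returned, matching the third case.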

The first terminal 1000 can implement each process implemented in the above method embodiment applied to the first terminal, and achieve the same technical effect, and for avoiding repetition, details are not described here again.

In the embodiment of the invention, when the first terminal and the second terminal carry out a video call, the front camera and the rear camera respectively shoot a first video image and a second video image, and both images are sent to the second terminal at the same time for display, so that a user participating in the video call on the second terminal side can see the scenes within the shooting ranges of both the front camera and the rear camera of the first terminal. When users participating in the video call are present within the shooting ranges of both cameras of the first terminal, or when scenes that the user on the second terminal side wants to see exist within both shooting ranges, the video images shot by the two cameras are transmitted to the second terminal simultaneously. The user on the first terminal side therefore does not need to keep switching cameras, the shooting range of the video call is expanded, and the video call experience is improved.

The embodiment of the present invention further provides a first terminal, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the video call method embodiment, and can achieve the same technical effect, and is not described herein again to avoid repetition.

The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the video call method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

An embodiment of the present invention further provides a second terminal 1300, as shown in fig. 11. The second terminal 1300 includes a third display screen 1301 and a first camera 1302 on the same side as the third display screen 1301, and further includes:

a second acquisition module 1303: configured to start the first camera 1302 to acquire video images in a state where a video call connection is established with the first terminal;

a second sending module 1304: configured to send the third video image acquired by the first camera to the first terminal;

a second receiving module 1305: configured to receive the first video image and the second video image sent by the first terminal;

a third display module 1306: configured to display the first video image and the second video image.

Optionally, the second terminal includes a fourth display screen arranged opposite to the third display screen;

the third display module 1306 includes:

a third full screen display unit: configured to display the first video image on the third display screen in full screen and the second video image on the fourth display screen in full screen;

or, a fourth full screen display unit: configured to display the second video image on the third display screen in full screen and the first video image on the fourth display screen in full screen.

Optionally, the third display screen includes a swap identifier for swapping display contents of the third display screen and the fourth display screen;

the second terminal 1300 further includes:

a second touch operation detection module: configured to detect a touch operation on the swap identifier;

a first switching module: configured to exchange the video images displayed by the third display screen and the fourth display screen if the touch operation is detected.

Optionally, the third display module 1306 includes:

a second target video image acquisition unit: configured to determine, from the first video image and the second video image, a second target video image to be displayed in full screen;

a third target video image acquisition unit: configured to determine, from the first video image and the second video image, a third target video image to be displayed in a window;

a second full screen display switching unit: configured to display the second target video image on the third display screen in full screen;

a second floating display switching unit: configured to display the third target video image in a third floating window of a third display area in the third display screen;

a third floating display switching unit: configured to display the third video image in a fourth floating window of a fourth display area in the third display screen.

Optionally, the third display module 1306 further includes:

a third touch operation detection unit: configured to detect a touch operation in any one of the third floating window and the fourth floating window;

a fourth target video image acquisition unit: configured to acquire the fourth target video image currently displayed in full screen on the third display screen if the touch operation is detected;

a third full screen display switching unit: configured to display, in full screen on the third display screen, the video image in the floating window corresponding to the touch operation;

a fourth floating display switching unit: configured to display the fourth target video image in the floating window corresponding to the touch operation.

The second terminal 1300 further comprises a second camera on the same side as the fourth display screen; and

a third acquisition module: configured to start the second camera to acquire video images if a start instruction for the second camera is received;

a third sending module: configured to send the fourth video image acquired by the second camera to the first terminal;

a second display area division module: configured to divide the third display screen into a fifth display area, a sixth display area, a seventh display area and an eighth display area;

a fourth display module: configured to display the first video image, the second video image, the third video image and the fourth video image in the fifth display area, the sixth display area, the seventh display area and the eighth display area respectively.

Optionally, the second terminal 1300 further includes:

a second gesture feature detection unit: configured to perform gesture feature detection on the video image displayed in the third display screen;

a second mode switching unit: configured to adjust the display state of the third floating window and/or the fourth floating window according to the detected gesture feature if a gesture feature is detected and matches a pre-stored gesture feature for switching the display mode;

wherein the display state comprises a hidden state and a display state.

In some embodiments of the present invention, the second terminal 1300 further includes:

a second touch operation detection module: configured to detect a touch operation in any one of the fifth display area, the sixth display area, the seventh display area and the eighth display area;

a second full screen switching module: configured to display, in full screen, the video image in the display area corresponding to the touch operation if the touch operation is detected;

a second sliding operation detection module: configured to detect a sliding operation on the third display screen;

a second sliding direction acquisition module: configured to acquire the direction of the sliding operation if the sliding operation is detected;

a second display switching module: configured to switch the video image content currently displayed in full screen on the third display screen according to the direction of the sliding operation;

a second shooting instruction detection module: configured to detect a shooting instruction;

a second shooting module: configured to control the first camera to execute a shooting operation if a shooting instruction is detected.

The second shooting instruction detection module further includes:

a second voice data detection unit: configured to detect voice data collected by a microphone of the second terminal 1300;

a second shooting instruction determination unit: configured to judge whether a shooting instruction is received according to the voice data;

a second shooting control unit: configured to control the first camera 1302 to execute a shooting operation if a shooting instruction is received.

The second terminal 1300 further includes:

a second image generation module: configured to acquire an image generated by the shooting operation;

a second image sending module: configured to send the image to the first terminal.

In other specific embodiments, after the step of acquiring the image generated by the shooting operation and before the step of sending the image to the first terminal, the method further includes: popping up prompt information that includes selection buttons for choosing whether to send the picture; after it is detected that a user clicks the button corresponding to sending the picture, the step of sending the picture to the first terminal is executed.

Optionally, the second shooting instruction determining unit includes:

a second voice data source detection subunit: configured to detect a source of the voice data;

a second trigger keyword detection subunit: configured to detect whether the voice data includes a pre-stored keyword for triggering shooting if the source of the voice data is detected to be the first terminal;

a second shooting trigger subunit: configured to determine that a shooting instruction is received if the voice data is detected to include a pre-stored keyword for triggering shooting.

Optionally, the second shooting instruction detecting module includes:

a third gesture detection unit: configured to perform gesture feature detection on the video image displayed in the third display screen;

the second shooting module includes:

a second gesture source detection unit: configured to determine the source of the detected gesture feature if a gesture feature is detected and matches a pre-stored gesture feature for triggering shooting;

a fourth target shooting object determination unit: configured to control the first camera to execute a shooting operation if the source of the gesture feature is the first terminal.

The second terminal 1300 can implement each process implemented in the above method embodiment applied to the second terminal, and achieve the same technical effect, and for avoiding repetition, details are not described here again.

The embodiment of the present invention further provides a second terminal, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the video call method embodiment, and can achieve the same technical effect, and is not described herein again to avoid repetition.

Referring to fig. 12, fig. 12 is a structural diagram of a first terminal according to an embodiment of the present invention, which is capable of implementing the details of the video call method applied to the first terminal in the foregoing embodiment and achieving the same effect. As shown in fig. 12, the first terminal 1600 includes: at least one processor 1601, a memory 1602, at least one network interface 1604, a user interface 1603, a front camera 1606 and a rear camera 1607, wherein the front camera 1606 is located on the same side as the first display screen and the rear camera 1607 is located on the same side as the second display screen. The various components in the first terminal 1600 are coupled together by a bus system 1605. It is understood that the bus system 1605 is used to enable connection and communication between these components. The bus system 1605 includes a power bus, a control bus and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as bus system 1605 in fig. 12.

The user interface 1603 may include, among other things, a display, a keyboard or a pointing device (e.g., a mouse, track ball, touch pad or touch screen, etc.).

It is to be understood that the memory 1602 in embodiments of the present invention may be either volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of example, but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 1602 of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.

In some embodiments, memory 1602 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof: an operating system 16021 and application programs 16022.

The operating system 16021 includes various system programs, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks. The application 16022 includes various applications, such as a Media Player (Media Player), a Browser (Browser), and the like, for implementing various application services. Programs that implement methods in accordance with embodiments of the present invention may be included within application 16022.

In this embodiment of the present invention, the first terminal 1600 further includes: a computer program stored on the memory 1602 and executable on the processor 1601, the computer program when executed by the processor 1601 performing the steps of: in the state of establishing video call connection with the second terminal, the front camera 1606 and the rear camera 1607 are started to acquire video images; sending the first video image collected by the front camera 1606 and the second video image collected by the rear camera 1607 to the second terminal; receiving a third video image sent by the second terminal; displaying the third video image on the first display screen and the second display screen respectively; and the third video image is a video image acquired by a first camera of the second terminal.
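The sequence of steps executed by the processor 1601 can be sketched for a single frame exchange as follows. The `FakeCamera`, `FakeScreen` and `FakeLink` classes are in-memory stand-ins invented for this illustration; they do not correspond to any real camera, display or network API, and a real terminal would repeat the exchange continuously for the duration of the call.

```python
class FakeCamera:
    """Stand-in for the front camera 1606 or the rear camera 1607."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.started = False

    def start(self) -> None:
        self.started = True

    def capture(self) -> str:
        if not self.started:
            raise RuntimeError(f"{self.name} camera has not been started")
        return f"{self.name}-frame"


class FakeScreen:
    """Stand-in for the first or the second display screen."""

    def __init__(self) -> None:
        self.shown = None

    def show(self, frame: str) -> None:
        self.shown = frame


class FakeLink:
    """Stand-in for the established video call connection."""

    def __init__(self, incoming: str) -> None:
        self.incoming = incoming
        self.sent = []

    def send(self, first_image: str, second_image: str) -> None:
        self.sent.append((first_image, second_image))

    def receive(self) -> str:
        return self.incoming


def exchange_one_frame(front, rear, first_screen, second_screen, link) -> str:
    # Start both cameras once the video call connection is established.
    front.start()
    rear.start()
    # Send the first and second video images to the second terminal.
    link.send(front.capture(), rear.capture())
    # Receive the third video image and display it on both screens.
    third_image = link.receive()
    first_screen.show(third_image)
    second_screen.show(third_image)
    return third_image
```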

The method disclosed by the above-mentioned embodiments of the present invention may be applied to the processor 1601 or implemented by the processor 1601. The processor 1601 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the method may be performed by integrated logic circuits of hardware in the processor 1601 or by instructions in the form of software. The processor 1601 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may reside in RAM, flash memory, ROM, PROM, EPROM, registers, or other computer-readable storage media known in the art. The computer-readable storage medium is located in the memory 1602; the processor 1601 reads the information in the memory 1602 and performs the steps of the method in combination with its hardware. Specifically, the computer-readable storage medium has a computer program stored thereon, and the computer program realizes the steps of the above-described video call method embodiments when executed by the processor 1601.

It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.

For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: displaying the third video image on the first display screen in full screen, and displaying the third video image on the second display screen in full screen.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: displaying the first video image in a first floating window of a first display area in the first display screen; displaying the second video image in a second floating window of a second display area in the first display screen.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting a touch operation in any one of the first floating window and the second floating window; if the touch operation is detected, acquiring a first target video image displayed on the first display screen in a full screen mode at present; displaying the video image in the floating window corresponding to the touch operation on the first display screen in a full screen manner; and displaying the first target video image in a floating window corresponding to the touch operation.
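The exchange between a floating window and the full-screen image described above can be illustrated with a minimal sketch. The class name, the window names and the stream labels below are hypothetical and are used only for illustration; the embodiment does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FirstScreenLayout:
    """Hypothetical model of the first display screen: one full-screen
    video image plus floating windows keyed by an illustrative name."""
    full_screen: str
    floating: Dict[str, str] = field(default_factory=dict)

    def on_touch(self, window: str) -> None:
        """Exchange the touched floating window's image with the full-screen image."""
        if window not in self.floating:
            return  # the touch did not land in a floating window
        first_target = self.full_screen           # image currently displayed full screen
        self.full_screen = self.floating[window]  # touched window's image goes full screen
        self.floating[window] = first_target      # former full-screen image moves into the window


layout = FirstScreenLayout(
    full_screen="third_video",
    floating={"first_window": "first_video", "second_window": "second_video"},
)
layout.on_touch("second_window")
print(layout.full_screen, layout.floating)
```

In this sketch a touch outside both floating windows leaves the layout unchanged, which is one possible reading of the condition "if the touch operation is detected".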

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting gesture characteristics of the video image displayed in the first display screen; if the gesture features are detected and matched with gesture features which are stored in advance and used for switching display modes, adjusting the display states of the first floating window and/or the second floating window according to the detected gesture features; wherein the display state comprises a hidden state and a display state.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: receiving a fourth video image sent by the second terminal; dividing the first display screen into a first display area, a second display area, a third display area and a fourth display area; displaying the first video image, the second video image, the third video image and the fourth video image in the first display area, the second display area, the third display area and the fourth display area, respectively.
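As an illustration of dividing the first display screen into four display areas, the following sketch assumes an equal two-by-two grid measured in pixels. The grid arrangement, the function names and the stream labels are assumptions; the embodiment only requires that four display areas exist and that each shows one of the four video images.

```python
from typing import Dict, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


def split_into_quadrants(width: int, height: int) -> Dict[str, Rect]:
    """Divide a screen into four areas, assuming a 2x2 grid.
    Odd dimensions give the extra pixel to the right/bottom areas."""
    half_w, half_h = width // 2, height // 2
    return {
        "first_area": (0, 0, half_w, half_h),
        "second_area": (half_w, 0, width - half_w, half_h),
        "third_area": (0, half_h, half_w, height - half_h),
        "fourth_area": (half_w, half_h, width - half_w, height - half_h),
    }


def assign_streams(areas: Dict[str, Rect]) -> Dict[str, str]:
    """Map the first to fourth video images onto the first to fourth areas in order."""
    streams = ["first_video", "second_video", "third_video", "fourth_video"]
    return dict(zip(areas, streams))


areas = split_into_quadrants(1080, 1920)
print(assign_streams(areas))
```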

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting touch operation in any one of the first display area, the second display area, the third display area and the fourth display area; if the touch operation is detected, displaying a video image in a display area corresponding to the touch operation in a full screen mode; detecting a sliding operation on the first display screen; if the sliding operation is detected, acquiring the direction of the sliding operation; and switching the video image content displayed in the full screen currently on the first display screen according to the direction of the sliding operation.
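The switching of the full-screen content according to the sliding direction can be sketched as follows. The mapping of a leftward slide to the next video image and a rightward slide to the previous one, the cyclic ordering, and all names are assumptions made for illustration; the embodiment states only that the content is switched according to the direction.

```python
# Hypothetical ordering of the four video images on the first display screen.
STREAMS = ["first_video", "second_video", "third_video", "fourth_video"]


def switch_on_swipe(current: str, direction: str) -> str:
    """Return the video image to display full screen after a sliding operation.

    Assumed mapping: "left" advances to the next image, "right" goes back
    to the previous one, and any other direction leaves the display unchanged.
    """
    index = STREAMS.index(current)
    if direction == "left":
        return STREAMS[(index + 1) % len(STREAMS)]
    if direction == "right":
        return STREAMS[(index - 1) % len(STREAMS)]
    return current


print(switch_on_swipe("second_video", "left"))
```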

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting a shooting instruction; if a shooting instruction is detected, controlling the front camera 1606 and/or the rear camera 1607 to execute shooting operation; the shooting instruction comprises at least one of a voice instruction, a key instruction, a screen gesture instruction and an air gesture instruction.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting voice data collected by a microphone of the first terminal; judging whether a shooting instruction is received or not according to the voice data; if a shooting instruction is received, determining a target shooting object corresponding to the shooting instruction; and controlling the front camera 1606 and/or the rear camera 1607 to execute shooting operation according to the target shooting object; wherein the target shooting object comprises a first target object in the first video image and/or a second target object in the second video image.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: detecting a source of the voice data; if the source of the voice data is detected to be the second terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting; and if the voice data is detected to comprise the pre-stored keyword for triggering shooting, determining that a shooting instruction is received.
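The two-stage check described above (first the source of the voice data, then the trigger keyword) can be sketched as follows. The keyword set, the `VoiceData` type and the assumption that source detection and speech-to-text conversion have already been performed upstream are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass

# Hypothetical pre-stored keywords for triggering shooting.
TRIGGER_KEYWORDS = {"take a photo", "shoot"}


@dataclass
class VoiceData:
    source: str      # "first_terminal" or "second_terminal"; assumed already detected
    transcript: str  # assumed output of a speech-to-text step


def is_shooting_instruction(voice: VoiceData) -> bool:
    """Return True only if the voice data comes from the second terminal
    and contains a pre-stored trigger keyword."""
    if voice.source != "second_terminal":
        return False  # only the remote user may trigger shooting in this sketch
    text = voice.transcript.lower()
    return any(keyword in text for keyword in TRIGGER_KEYWORDS)


print(is_shooting_instruction(VoiceData("second_terminal", "Please take a photo now")))
```

Checking the source before the keyword keeps speech from the local user from triggering the shooting operation even when it contains the same words.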

Optionally, the computer program may further implement the following steps when executed by the processor 1601: acquiring an image generated by the shooting operation; and sending the image to the second terminal.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: extracting keywords from the voice data collected by the microphone; determining the target shooting object according to the extracted keywords; wherein the target shooting object comprises a first target object in the first video image and/or a second target object in the second video image.
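A minimal sketch of determining the target shooting object from extracted keywords is given below. The keyword table is purely hypothetical (the embodiment does not list concrete keywords), and a simple whitespace split stands in for whatever keyword extraction the terminal actually uses.

```python
from typing import Set

# Hypothetical mapping from spoken keywords to target objects.
KEYWORD_TO_TARGET = {
    "front": "first_target",   # object in the first video image (front camera)
    "rear": "second_target",   # object in the second video image (rear camera)
}


def determine_targets(transcript: str) -> Set[str]:
    """Extract known keywords from the transcript and return the matching targets."""
    words = transcript.lower().split()
    return {target for keyword, target in KEYWORD_TO_TARGET.items() if keyword in words}


print(determine_targets("shoot the front and rear"))
```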

Optionally, the computer program may further implement the following steps when executed by the processor 1601: performing gesture feature detection on the video image displayed in the first display screen and/or the second display screen; if the gesture features are detected and matched with gesture features which are stored in advance and used for triggering shooting, determining the source of the detected gesture features; determining a target shooting object corresponding to the shooting instruction according to the source of the detected gesture feature; and controlling the front camera 1606 and/or the rear camera 1607 to execute shooting operation according to the target shooting object.

Optionally, the computer program may further implement the following steps when executed by the processor 1601: if the target shooting object is the first target object, controlling the front camera 1606 to execute a shooting operation; if the target shooting object is the second target object, controlling the rear camera 1607 to execute a shooting operation; if the target shooting objects are the first target object and the second target object, controlling the front camera 1606 and the rear camera 1607 to execute shooting operations.
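The selection of cameras according to the target shooting object can be expressed compactly, as in the following sketch. The string labels for the targets and cameras are illustrative assumptions; an actual terminal would invoke its camera interfaces instead of returning names.

```python
from typing import List, Set


def cameras_for_targets(targets: Set[str]) -> List[str]:
    """Return the cameras that should execute a shooting operation.

    first target  -> front camera (1606 in the description)
    second target -> rear camera (1607 in the description)
    both targets  -> both cameras
    """
    cameras = []
    if "first_target" in targets:
        cameras.append("front_camera")
    if "second_target" in targets:
        cameras.append("rear_camera")
    return cameras


print(cameras_for_targets({"first_target", "second_target"}))
```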

Referring to fig. 13, fig. 13 is a structural diagram of a second terminal according to another embodiment of the present invention, which can implement details of the video call method applied to the second terminal in the foregoing embodiment and achieve the same effect. As shown in fig. 13, the second terminal 1700 includes a Radio Frequency (RF) circuit 1710, a memory 1720, an input unit 1730, a display unit 1740, a processor 1750, an audio circuit 1760, a communication module 1770, and a power supply 1780.

The input unit 1730 may be used to receive numeric or character information input by a user and to generate signal inputs related to user settings and function control of the second terminal 1700. Specifically, in the embodiment of the present invention, the input unit 1730 may include a touch panel 1731. The touch panel 1731, also referred to as a touch screen, may collect touch operations of a user on or near the touch panel 1731 (e.g., operations performed by the user on the touch panel 1731 with a finger, a stylus pen, or any other suitable object or accessory) and drive a corresponding connection device according to a preset program. Optionally, the touch panel 1731 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch orientation of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, provides the touch point coordinates to the processor 1750, and can receive and execute commands from the processor 1750. In addition, the touch panel 1731 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types. In addition to the touch panel 1731, the input unit 1730 may also include other input devices 1732, and the other input devices 1732 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

The display unit 1740 may be used to display information input by a user or information provided to the user, as well as the various menu interfaces of the second terminal 1700. The display unit 1740 may include a display panel 1741; optionally, the display panel 1741 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.

It should be noted that the touch panel 1731 may overlay the display panel 1741 to form a touch display screen. When the touch display screen detects a touch operation on or near it, the operation is transmitted to the processor 1750 to determine the type of touch event, and the processor 1750 then provides a corresponding visual output on the touch display screen according to the type of touch event. Specifically, the touch panel 1731 includes a third display screen, and the second terminal 1700 further includes a first camera 1790 on the same side as the third display screen.

The processor 1750 is the control center of the second terminal 1700. It connects the various parts of the entire mobile phone by using various interfaces and lines, and performs the various functions of the second terminal 1700 and processes data by running or executing software programs and/or modules stored in the first memory 1721 and calling data stored in the second memory 1722, thereby monitoring the second terminal 1700 as a whole. Optionally, the processor 1750 may include one or more processing units.

In this embodiment of the present invention, the second terminal 1700 further includes a computer program that is stored on the memory 1722 and executable on the processor 1750. By calling the software program and/or module stored in the first memory 1721 and/or the data stored in the second memory 1722, the computer program, when executed by the processor 1750, implements the following steps: starting the first camera 1790 to collect a video image in a state of establishing video call connection with a first terminal; sending a third video image acquired by the first camera 1790 to the first terminal; receiving a first video image and a second video image sent by the first terminal; displaying the first video image and the second video image; wherein the first video image is a video image collected by a front camera of the first terminal, and the second video image is a video image collected by a rear camera of the first terminal.

Optionally, the second terminal includes a fourth display screen arranged opposite to the third display screen; the computer program when executed by the processor 1750 may also implement the steps of: displaying the first video image on the third display screen in a full screen manner, and displaying the second video image on the fourth display screen in a full screen manner; or displaying the second video image on the third display screen in a full screen mode, and displaying the first video image on the fourth display screen in a full screen mode.

Optionally, the third display screen includes a swap identifier for swapping display contents of the third display screen and the fourth display screen; the computer program when executed by the processor 1750 may also implement the steps of: detecting touch operation on the swap identifier; and if the touch operation is detected, the video images displayed by the third display screen and the fourth display screen are exchanged.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: determining a second target video image to be displayed in a full screen mode in the first video image and the second video image; determining a third target video image to be displayed in a window in the first video image and the second video image; displaying the second target video image on the third display screen in a full screen manner; displaying the third target video image in a third floating window of a third display area in the third display screen; displaying the third video image in a fourth floating window of a fourth display area in the third display screen.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting a touch operation in any one of the third floating window and the fourth floating window; if the touch operation is detected, acquiring a fourth target video image currently displayed in a full screen mode on the third display screen; displaying the video image in the floating window corresponding to the touch operation on the third display screen in a full screen manner; and displaying the fourth target video image in a floating window corresponding to the touch operation.

Optionally, the second terminal further includes a second camera 1791 on the same side as the fourth display screen, and the computer program, when executed by the processor 1750, may further implement the following steps after the step of displaying the first video image and the second video image: if an opening instruction of the second camera 1791 is received, starting the second camera 1791 to collect a video image; sending a fourth video image acquired by the second camera 1791 to the first terminal; dividing the third display screen into a fifth display area, a sixth display area, a seventh display area and an eighth display area; displaying the first video image, the second video image, the third video image and the fourth video image in the fifth display area, the sixth display area, the seventh display area and the eighth display area, respectively.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting gesture features of the video image displayed in the third display screen; if the gesture features are detected and matched with gesture features which are stored in advance and used for switching display modes, adjusting the display state of the third floating window and/or the fourth floating window according to the detected gesture features; wherein the display state comprises a hidden state and a display state.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting touch operation in any one of the fifth display area, the sixth display area, the seventh display area and the eighth display area; if the touch operation is detected, displaying a video image in a display area corresponding to the touch operation in a full screen mode; detecting a sliding operation on the third display screen; if the sliding operation is detected, acquiring the direction of the sliding operation; and switching the video image content displayed in the full screen currently on the third display screen according to the direction of the sliding operation.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting a shooting instruction; if a shooting instruction is detected, controlling the first camera 1790 to execute shooting operation; the shooting instruction comprises at least one of a voice instruction, a key instruction, a screen gesture instruction and an air gesture instruction.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting voice data collected by a microphone; judging whether a shooting instruction is received or not according to the voice data; and if a shooting instruction is received, controlling the first camera 1790 to execute shooting operation.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: acquiring an image generated by the shooting operation; and sending the image to the first terminal.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting a source of the voice data; if the source of the voice data is detected to be the first terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting; and if the voice data is detected to comprise the pre-stored keyword for triggering shooting, determining that a shooting instruction is received.

Optionally, the computer program when executed by the processor 1750 may further implement the steps of: detecting gesture features of the video image displayed in the third display screen; if the gesture features are detected and matched with gesture features which are stored in advance and used for triggering shooting, determining the source of the detected gesture features; and if the source of the gesture feature is the first terminal, controlling the first camera 1790 to execute shooting operation.

In the embodiment of the present invention, when the first terminal and the second terminal carry out a video call, the front camera and the rear camera of the first terminal can respectively capture a first video image and a second video image, and both video images are sent to the second terminal simultaneously for display, so that a user taking part in the video call on the second terminal side can see the scenes within the shooting ranges of both cameras of the first terminal. When users taking part in the video call are present within the shooting ranges of both the front camera and the rear camera of the first terminal, or when scenes that the user on the second terminal side wants to see lie within both shooting ranges, the video images captured by the two cameras can be transmitted to the second terminal simultaneously. The user on the first terminal side therefore does not need to keep switching cameras, the shooting range of the video call is expanded, and the video call experience is improved.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only one logical division, and other divisions are possible in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims

1. A video call method applied to a first terminal, characterized in that the first terminal comprises a first display screen and a second display screen which are arranged oppositely, the first terminal further comprises a front camera on the same side as the first display screen and a rear camera on the same side as the second display screen, and the method comprises the following steps:

receiving a third video image sent by the second terminal;

the third video image is a video image collected by a first camera of the second terminal;

after the step of displaying the third video image on the first display screen and the second display screen respectively, the method further includes:

detecting a shooting instruction;

the shooting instruction comprises at least one of a voice instruction, a key instruction, a screen gesture instruction and an air gesture instruction;

the step of detecting a shooting instruction includes:

detecting voice data collected by a microphone of the first terminal;

judging whether a shooting instruction is received or not according to the voice data;

the step of judging whether a shooting instruction is received or not according to the voice data comprises the following steps:

detecting a source of the voice data;

if the source of the voice data is detected to be the second terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting;

if the voice data is detected to comprise a pre-stored keyword for triggering shooting, determining that a shooting instruction is received;

the detecting the source of the voice data comprises:

and comparing the voice data collected by the microphone with the pre-stored voice data of the user of the second terminal, and if the voice data is matched with the pre-stored voice data of the user of the second terminal, determining that the source of the voice data is the second terminal.

2. The method of claim 1, wherein the step of displaying the third video image on the first display screen and the second display screen respectively comprises:

3. The method of claim 2, wherein after the step of displaying the third video image full screen on the first display screen and the third video image full screen on the second display screen, further comprising:

displaying the first video image in a first floating window of a first display area in the first display screen;

displaying the second video image in a second floating window of a second display area in the first display screen.

4. The method of claim 3, wherein after the step of displaying the second video image in the second floating window of the second display area in the first display screen, further comprising:

detecting a touch operation in any one of the first floating window and the second floating window;

if the touch operation is detected, acquiring a first target video image displayed on the first display screen in a full screen mode at present;

displaying the video image in the floating window corresponding to the touch operation on the first display screen in a full screen manner;

and displaying the first target video image in a floating window corresponding to the touch operation.

5. The method of claim 3, wherein after the step of displaying the second video image in the second floating window of the second display area in the first display screen, further comprising:

wherein the display state comprises a hidden state and a display state.

6. The method of claim 1, wherein after the step of displaying the third video image on the first display screen and the second display screen, respectively, further comprising:

receiving a fourth video image sent by the second terminal;

7. The method of claim 6, wherein after the step of displaying the first video image, the second video image, the third video image, and the fourth video image in the first display area, the second display area, the third display area, and the fourth display area, respectively, further comprises:

detecting a sliding operation on the first display screen;

8. The method of claim 1, wherein the step of detecting a shooting instruction further comprises:

if the shooting instruction is detected, controlling the front camera and/or the rear camera to execute shooting operation, including:

if a shooting instruction is received, determining a target shooting object corresponding to the shooting instruction;

controlling the front camera and/or the rear camera to execute shooting operation according to the target shooting object;

9. The method according to claim 1, wherein after the step of controlling the front camera and/or the rear camera to execute the shooting operation, the method further comprises:

acquiring an image generated by the shooting operation;

and sending the image to the second terminal.

10. The method according to claim 8, wherein the step of determining the target shooting object corresponding to the shooting instruction comprises:

extracting key words in the voice data collected by the microphone;

determining the target shooting object according to the extracted keywords;

11. The method of claim 8, wherein the step of detecting a shooting instruction comprises:

12. The method according to claim 8 or 11, wherein the step of controlling the front camera and/or the rear camera to execute a shooting operation according to the target shooting object comprises:

13. A video call method applied to a second terminal, characterized in that the second terminal comprises a third display screen and a first camera on the same side as the third display screen, and the method comprises the following steps:

sending a third video image acquired by the first camera to the first terminal;

displaying the first video image and the second video image;

the first video image is a video image acquired by a front camera of the first terminal, and the second video image is a video image acquired by a rear camera of the first terminal;

detecting a shooting instruction;

the step of detecting a shooting instruction includes:

detecting voice data collected by a microphone;

detecting a source of the voice data;

if the source of the voice data is detected to be the first terminal, detecting whether the voice data comprises a pre-stored keyword for triggering shooting;

the detecting the source of the voice data comprises:

14. The method of claim 13, wherein the second terminal comprises a fourth display screen disposed opposite the third display screen;

15. The method of claim 14, wherein the third display screen comprises a swap identifier for swapping display content of the third display screen and the fourth display screen;

detecting touch operation on the swap identifier;

and if the touch operation is detected, the video images displayed by the third display screen and the fourth display screen are exchanged.

16. The method of claim 13, wherein the step of displaying the first video image and the second video image comprises:

17. The method of claim 16, wherein after the step of displaying the third video image in a fourth floating window of a fourth display area in the third display screen, further comprising:

18. The method of claim 14, wherein the second terminal further comprises a second camera on the same side as the fourth display screen;

19. The method of claim 17, wherein after the step of displaying the fourth target video image in the floating window corresponding to the touch operation, the method further comprises:

wherein the display state comprises a hidden state and a display state.

20. The method of claim 18, wherein after the step of displaying the first video image, the second video image, the third video image, and the fourth video image in the fifth display area, the sixth display area, the seventh display area, and the eighth display area, respectively, further comprises:

detecting a sliding operation on the third display screen;

21. The method of claim 13, wherein the step of detecting a shooting instruction further comprises:

and if a shooting instruction is received, controlling the first camera to execute shooting operation.

22. The method of claim 13, wherein after the step of controlling the first camera to execute the shooting operation, further comprising:

acquiring an image generated by the shooting operation;

and sending the image to the first terminal.

23. The method of claim 13, wherein the step of detecting a shooting instruction comprises:

24. A first terminal, characterized by comprising a first display screen and a second display screen which are arranged oppositely, and further comprising a front camera on the same side as the first display screen and a rear camera on the same side as the second display screen, and:

the first terminal further comprises:

the first photographing instruction detecting module includes:

a first voice data detection unit, configured to detect voice data collected by a microphone of the first terminal;

the first shooting instruction determination unit includes:

a first shooting triggering subunit, configured to determine that a shooting instruction is received if the voice data is detected to comprise a pre-stored keyword for triggering shooting;

the first voice data source detection subunit is specifically configured to compare the voice data collected by the microphone with the pre-stored voice data of the user of the second terminal and, if the voice data matches the pre-stored voice data of the user of the second terminal, determine that the source of the voice data is the second terminal.

25. The first terminal of claim 24, wherein the first display module comprises:

26. The first terminal of claim 25, wherein the first display module further comprises:

a first floating display unit, configured to display the first video image in a first floating window of a first display area in the first display screen, and to display the second video image in a second floating window of a second display area in the first display screen.

27. The first terminal of claim 26, wherein the first display module further comprises:

28. The first terminal of claim 26, wherein the first display module further comprises:

wherein the display state comprises a hidden state and a displayed state.

29. The first terminal of claim 24, wherein the first terminal further comprises:

30. The first terminal of claim 29, wherein the second terminal further comprises:

31. The first terminal of claim 24, wherein the first photographing instruction detecting module further comprises:

the first shooting module is further configured to: if a shooting instruction is received, determine a target shooting object corresponding to the shooting instruction;

32. The first terminal of claim 24, wherein the first terminal further comprises:

a first image sending module: configured to send the image to the second terminal.

33. The first terminal of claim 31, wherein the first photographing module comprises:

a first target photographic subject determination unit: configured to determine the target shooting object according to the extracted keywords;
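Determining a target shooting object from extracted keywords can be sketched as a simple lookup. The sketch below assumes that the target is identified by which camera should capture it; the keyword-to-target table and the function name are hypothetical and do not come from the claims.

```python
from typing import Optional

# Hypothetical mapping from extracted keywords to a target; illustrative only.
KEYWORD_TO_TARGET = {
    "front": "front_camera",
    "rear": "rear_camera",
}


def determine_target(transcript: str) -> Optional[str]:
    """Return the target associated with the first matching keyword, or None."""
    lowered = transcript.lower()
    for keyword, target in KEYWORD_TO_TARGET.items():
        if keyword in lowered:
            return target
    return None
```

Returning `None` when no keyword matches leaves the fallback behavior to the caller, since the claim text shown here does not specify one.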

34. The first terminal of claim 31, wherein the first photographing instruction detecting module comprises:

the first shooting module is further configured to:

35. The first terminal of claim 31 or 34, wherein the first photographing module is further configured to:

36. A second terminal, characterized in that the second terminal comprises a third display screen and a first camera on the same side as the third display screen, and the second terminal further comprises:

the second terminal further includes:

the second shooting instruction detection module further includes:

a second voice data detection unit: configured to detect voice data collected by a microphone;

the second shooting instruction determination unit includes:

a second shooting trigger subunit: configured to determine that a shooting instruction is received if the voice data is detected to include a pre-stored keyword for triggering shooting;

a second voice data source detection subunit: specifically configured to compare the voice data collected by the microphone with pre-stored voice data of the user of the second terminal, and, if the two match, to determine that the source of the voice data is the second terminal.

37. The second terminal according to claim 36, wherein the second terminal comprises a fourth display screen disposed opposite the third display screen;

the third display module includes:

38. The second terminal according to claim 37, wherein the third display screen includes a swap identifier for swapping display contents of the third display screen and the fourth display screen;

the second terminal further includes:

39. The second terminal according to claim 36, wherein the third display module comprises:

40. The second terminal of claim 39, wherein the third display module further comprises:

41. The second terminal according to claim 37, further comprising a second camera on the same side as the fourth display screen;

the second terminal further includes:

42. The second terminal as claimed in claim 40, wherein the third display module further comprises:

wherein the display state comprises a hidden state and a displayed state.

43. The second terminal of claim 41, wherein the second terminal further comprises:

44. The second terminal of claim 36, wherein the second capture instruction detection module further comprises:

a second shooting control unit: configured to control the first camera to perform a shooting operation if a shooting instruction is received.

45. The second terminal of claim 36, wherein the second terminal further comprises:

a second image sending module: configured to send the image to the first terminal.

46. The second terminal of claim 36, wherein the second shooting instruction detecting module comprises:

the second photographing module includes:

47. A first terminal, characterized in that it comprises a processor, a memory and a computer program stored on said memory and executable on said processor, said computer program, when executed by said processor, implementing the steps of the video call method according to any one of claims 1 to 12.

48. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the video call method according to any one of claims 1 to 12.

49. A second terminal, characterized in that it comprises a processor, a memory and a computer program stored on said memory and executable on said processor, said computer program, when executed by said processor, implementing the steps of the video call method according to any one of claims 13 to 23.

50. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the video call method according to any one of claims 13 to 23.