CN109639999B - Video call data optimization method, mobile terminal and readable storage medium

Info

Publication number: CN109639999B
Application number: CN201811629297.2A
Authority: CN
Inventors: 张忠男
Original assignee: Zhongke Lianxin Guangzhou Technology Co ltd
Current assignee: Zhongke Lianxin Guangzhou Technology Co ltd
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2022-12-13
Anticipated expiration: 2038-12-28
Also published as: CN109639999A

Abstract

The invention discloses a method for optimizing video call data, a mobile terminal, and a readable storage medium. The method comprises the following steps: when a first terminal detects a call instruction for a video call with a second terminal, acquiring, through a camera of the first terminal, original video call data corresponding to a first user in the video call; shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data; and sending the first video call data to the second terminal so that a second user of the second terminal can carry out the video call with the first user through the first video call data. Because the picture content outside the portrait outline of the first user is shielded during the video call, the video call data sent to the second user does not carry the background picture of the first user's current location, which protects the privacy of the video call user and increases the usage rate of video calls.

Description

Video call data optimization method, mobile terminal and readable storage medium

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a method for optimizing video call data, a mobile terminal, and a readable storage medium.

Background

With the development of communication technology, video calls have come into wide use. At present, during a video call, the opposite-end user can see the background where the home-end user is located. However, the home-end user sometimes does not want to expose the surrounding environment during the call, or it is inappropriate for the opposite-end user to see that environment. As a result, the video call data transmitted between terminals carries data that the home-end user does not want to transmit, the privacy of the video call user cannot be guaranteed, and the usage rate of video calls is reduced.

Disclosure of Invention

The invention mainly aims to provide a video call data optimization method, a mobile terminal, and a readable storage medium, so as to solve the technical problems that, during a video call, the privacy of the video call user cannot be ensured and the usage rate of video calls is reduced.

In order to achieve the above object, the present invention provides a method for optimizing video call data, where the method for optimizing video call data includes:

when a first terminal detects a call instruction for video call with a second terminal, acquiring original video call data corresponding to a first user in the video call through a camera of the first terminal;

shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data;

and sending the first video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the first video call data.

Optionally, the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data includes:

shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data, and acquiring a pre-stored virtual background;

and setting the virtual background as a background picture of the call picture to obtain first video call data.

Optionally, after the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data, the method further includes:

identifying sound data belonging to the first user in the first video call data;

deleting other sound data except the first user sound data in the first video call data to obtain second video call data;

the step of sending the first video call data to the second terminal so that a second user of the second terminal can perform video call with the first user through the first video call data includes:

and sending the second video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the second video call data.

Optionally, the step of deleting other sound data except the first user sound data in the first video call data to obtain second video call data includes:

deleting other sound data except the first user sound data in the first video call data, and acquiring pre-stored virtual sound;

and setting the virtual sound as the background sound of the first video call data to obtain second video call data.

Optionally, the step of identifying the sound data belonging to the first user in the first video call data includes:

determining sound data with the highest decibel in the first video call data;

and determining the sound data with the highest decibel as the sound data belonging to the first user in the first video call data.
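The highest-decibel rule can be illustrated with a minimal Python sketch. It assumes that the sound in the call data has already been separated into individual sources, which is a step the text does not describe. The function names and the sample values are hypothetical and serve only as an illustration.

```python
import math

def rms_decibels(samples):
    """Return the RMS level of a list of samples in dB relative to full scale 1.0."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def pick_first_user_source(sources):
    """Return the name of the loudest source, treated here as the first user's sound."""
    return max(sources, key=lambda name: rms_decibels(sources[name]))

def keep_only_first_user(sources):
    """Delete every source except the one with the highest decibel level."""
    loudest = pick_first_user_source(sources)
    return {loudest: sources[loudest]}

# Hypothetical, already separated sound sources (sample values are made up).
sources = {
    "near_voice": [0.5, -0.6, 0.55, -0.5],
    "tv_in_background": [0.05, -0.04, 0.06, -0.05],
    "street_noise": [0.01, -0.02, 0.01, -0.01],
}
second_call_audio = keep_only_first_user(sources)
```

The sketch relies on the same assumption as the rule itself: the first user is closest to the microphone and therefore produces the loudest sound.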

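Optionally, the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data includes:

identifying, through a portrait recognition technology, the portrait outline of the first user corresponding to the first terminal in each call picture corresponding to the original video call data;

and deleting other pictures except the portrait outline in each call picture, so as to shield the other pictures except the portrait outline of the first user in the call pictures and obtain first video call data.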

Optionally, before the step of sending the first video call data to the second terminal, so that a second user of the second terminal performs a video call with the first user through the first video call data, the method further includes:

encoding the first video call data to obtain encoded first video call data;

and sending the encoded first video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the first video call data.

Optionally, after the step of acquiring, through the camera of the first terminal, original video call data corresponding to the first user in the video call when the first terminal detects a call instruction for a video call with the second terminal, the method further includes:

detecting whether a video call data optimization function is started;

and if the video call data optimization function is started, executing the step of shielding other pictures except the first user portrait outline in each call picture corresponding to the original video call data to obtain first video call data.
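The check described above can be sketched as a small gate in front of the shielding step. This is only an illustration: `prepare_outgoing_frame`, `blank_non_portrait`, and the toy frame layout are hypothetical, and the behavior when the function is off (sending the original frame unchanged) is an assumption, since the text only states what happens when the function is started.

```python
def prepare_outgoing_frame(original_frame, optimization_enabled, shield):
    """Return the frame to send: shielded if the function is started, else unchanged."""
    if optimization_enabled:
        return shield(original_frame)
    # Assumption: with the function off, the original frame is sent as-is.
    return original_frame

def blank_non_portrait(frame):
    """Hypothetical stand-in for the shielding step: keep 'P' (portrait) pixels only."""
    return [[px if px == "P" else "." for px in row] for row in frame]

# Toy frame: "P" marks portrait pixels, "B" marks background pixels.
frame = [["B", "P", "B"],
         ["B", "P", "P"]]

sent_when_on = prepare_outgoing_frame(frame, True, blank_non_portrait)
sent_when_off = prepare_outgoing_frame(frame, False, blank_non_portrait)
```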

In addition, in order to achieve the above object, the present invention also provides a mobile terminal, which includes a memory, a processor and an optimization program of video call data stored in the memory and capable of running on the processor, wherein the optimization program of video call data when executed by the processor implements the steps of the optimization method of video call data as described above.

Furthermore, to achieve the above object, the present invention also provides a computer readable storage medium, on which an optimization program of video call data is stored, the optimization program of video call data implementing the steps of the optimization method of video call data as described above when executed by a processor.

According to the invention, after the first terminal detects a call instruction for a video call with the second terminal, video call data corresponding to the first user is collected through the camera of the first terminal, and other pictures except the portrait outline of the first user in the video call data are shielded, so that only the image of the first user remains in each frame corresponding to the video call data. The processed video call data is then sent to the second terminal so that the second user of the second terminal can carry out the video call with the first user through the processed video call data. Because the video call data is optimized in this way, the video call data sent to the second user does not carry the background picture of the first user's current location, which protects the privacy of the video call user and increases the usage rate of video calls.

Drawings

FIG. 1 is a schematic diagram of a hardware structure of a terminal for implementing various embodiments of the present invention;

FIG. 2 is a diagram of a communication network system architecture according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating a method for optimizing video call data according to a first embodiment of the present invention;

FIG. 4 is a flowchart illustrating a method for optimizing video call data according to a third embodiment of the present invention;

FIG. 5 is a flowchart illustrating a method for optimizing video call data according to a fourth embodiment of the present invention;

FIG. 6 is a diagram illustrating one frame of raw video call data without frame processing according to an embodiment of the present invention;

FIG. 7 is a diagram illustrating a frame obtained after masking other frames than the first user portrait outline in the frame of FIG. 6.

The implementation, functional features and advantages of the present invention will be described with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "component", and "unit" may be used interchangeably.

The terminal may be implemented in various forms. For example, the terminal described in the present invention may include a mobile terminal such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and the like, and a fixed terminal such as a Digital TV, a desktop computer, and the like.

The following description will be given by way of example of a mobile terminal, and it will be understood by those skilled in the art that the construction according to the embodiment of the present invention can be applied to a fixed type terminal, in addition to elements particularly used for mobile purposes.

Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present invention, the mobile terminal 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, A/V (audio/video) input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals, and that a mobile terminal may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.

The following describes each component of the mobile terminal in detail with reference to fig. 1:

The radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call. Specifically, it receives downlink information from a base station and forwards it to the processor 110 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplex Long Term Evolution), and TDD-LTE (Time Division Duplex Long Term Evolution).

WiFi belongs to short-distance wireless transmission technology, and the mobile terminal can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 102, and provides wireless broadband internet access for the user. Although fig. 1 shows the WiFi module 102, it is understood that it does not belong to the essential constitution of the mobile terminal, and may be omitted entirely as needed within the scope not changing the essence of the invention.

The audio output unit 103 may convert voice data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like.

The A/V input unit 104 is used to receive audio or video signals. The A/V input unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106, stored in the memory 109 (or another storage medium), or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound in a phone call mode, a recording mode, a voice recognition mode, or the like, and can process such sound into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting audio signals.

The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor, which can adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor, which can turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one type of motion sensor, the accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when stationary; it can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that can be configured on the mobile phone, such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described further here.

The display unit 106 is used to display information input by a user or information provided to the user. The Display unit 106 may include a Display panel 1061, and the Display panel 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.

The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 1071 (e.g., an operation performed by the user on or near the touch panel 1071 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 107 may include other input devices 1072 in addition to the touch panel 1071. In particular, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

Further, the touch panel 1071 may cover the display panel 1061, and when the touch panel 1071 detects a touch operation thereon or nearby, the touch panel 1071 transmits the touch operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although in fig. 1, the touch panel 1071 and the display panel 1061 are two independent components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the mobile terminal, which is not limited herein.

The interface unit 108 serves as an interface through which at least one external device is connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and external devices.

The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as voice data, a phonebook, etc.) created according to the use of the cellular phone, etc. Further, the memory 109 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.

The processor 110 is a control center of the mobile terminal 100, connects various parts of the entire mobile terminal 100 using various interfaces and lines, performs various functions of the mobile terminal 100 and processes data by operating or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby integrally monitoring the mobile terminal 100. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

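Further, in the mobile terminal 100 shown in fig. 1, the mobile terminal 100 is a first terminal, and the processor 110 is configured to call the optimization program of video call data stored in the memory 109 and perform the following operations:

when a call instruction for a video call with a second terminal is detected, acquiring original video call data corresponding to a first user in the video call through a camera of the first terminal;

shielding other pictures except the portrait outline of the first user in all the call pictures corresponding to the original video call data to obtain first video call data;

and sending the first video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the first video call data.

Further, the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data includes:

shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data, and acquiring a pre-stored virtual background;

and setting the virtual background as a background picture of the call picture to obtain first video call data.

Further, after the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain the first video call data, the processor 110 is further configured to invoke the optimization program of video call data stored in the memory 109, and execute the following operations:

identifying sound data belonging to the first user in the first video call data;

deleting other sound data except the first user sound data in the first video call data to obtain second video call data;

the step of sending the first video call data to the second terminal so that a second user of the second terminal can perform video call with the first user through the first video call data includes:

and sending the second video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the second video call data.

Further, the step of deleting other sound data except the first user sound data in the first video call data to obtain second video call data includes:

deleting other sound data except the first user sound data in the first video call data, and acquiring pre-stored virtual sound;

and setting the virtual sound as the background sound of the first video call data to obtain second video call data.

Further, the step of identifying the sound data belonging to the first user in the first video call data includes:

determining sound data with the highest decibel in the first video call data;

and determining the sound data with the highest decibel as the sound data belonging to the first user in the first video call data.

Further, the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data includes:

identifying, through a portrait recognition technology, the portrait outline of the first user corresponding to the first terminal in each call picture corresponding to the original video call data;

and deleting other pictures except the portrait outline in each call picture, so as to shield the other pictures except the portrait outline of the first user in the call pictures and obtain first video call data.

Further, before the step of sending the first video call data to the second terminal so that a second user of the second terminal makes a video call with the first user through the first video call data, the processor 110 is further configured to invoke the optimization program of video call data stored in the memory 109, and perform the following operations:

encoding the first video call data to obtain encoded first video call data;

and sending the encoded first video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the first video call data.

Further, after the step of acquiring, through the camera of the first terminal, original video call data corresponding to the first user in the video call when the first terminal detects the call instruction for a video call with the second terminal, the processor 110 is further configured to invoke the optimization program of video call data stored in the memory 109, and execute the following operations:

detecting whether a video call data optimization function is started;

and if the video call data optimization function is started, executing the step of shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data to obtain first video call data.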

Based on the above structure, embodiments of the video call data optimization method of the present invention are provided.

The invention provides a method for optimizing video call data.

Referring to fig. 3, fig. 3 is a flowchart illustrating a method for optimizing video call data according to a first embodiment of the present invention.

In the present embodiment, an embodiment of a method for optimizing video call data is provided, and it should be noted that although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order different from that here.

In the embodiments of the method for optimizing video call data, for convenience of description, the first terminal is taken as an execution subject to explain the embodiments.

In this embodiment, the method for optimizing video call data includes:

Step S10: after the first terminal detects a call instruction for a video call with the second terminal, original video call data corresponding to a first user in the video call is collected through a camera of the first terminal.

When the first terminal detects a call instruction for a video call with the second terminal, the first terminal collects, through its camera, original video call data corresponding to the first user in the video call. It should be noted that, in order to distinguish the user of the first terminal from the user of the second terminal, in the embodiments of the present invention the user corresponding to the first terminal is denoted as the first user and the user corresponding to the second terminal is denoted as the second user. The call instruction may be triggered by the first user on the first terminal to initiate the call, or may be triggered by the first user on the first terminal, after the first terminal receives a video call instruction sent by the second terminal, to accept the video call with the second user. Specifically, the first terminal collects the original video call data corresponding to the first user during the video call through its front-facing camera. It can be understood that the original video call data is the video call data collected by the camera of the first terminal that has not undergone picture processing or sound processing.

Step S20: shielding other pictures except the portrait outline of the first user in all the call pictures corresponding to the original video call data to obtain first video call data.

After the first terminal collects the original video call data corresponding to the first user, the first terminal shields other pictures except the portrait outline of the first user in all call pictures corresponding to the original video call data to obtain the first video call data. It should be noted that video call data consists of a sequence of frames and the corresponding sound, and each frame is a single picture. Therefore, by shielding the picture content outside the portrait outline of the first user in each call picture corresponding to the original video call data, video call data without the first user's background picture is obtained, that is, the first video call data. Specifically, referring to fig. 6 and fig. 7, fig. 6 is a call picture before the picture content outside the portrait outline of the first user is shielded, and fig. 7 is the call picture obtained after the picture content outside the portrait outline of the first user in the call picture of fig. 6 is shielded.

Further, step S20 includes:

Step a: identifying, through a portrait recognition technology, the portrait outline of the first user corresponding to the first terminal in each call picture corresponding to the original video call data.

Specifically, after the first terminal collects the original video call data, the first terminal identifies the portrait outline of the first user in each call picture corresponding to the original video call data through a portrait recognition technology. Portrait recognition is a biometric recognition technology, that is, a technology that distinguishes individual organisms (generally, and here specifically, people) by their biological characteristics.

Step b: deleting other pictures except the portrait outline in each call picture, so as to shield the other pictures except the portrait outline of the first user in the call pictures and obtain first video call data.

After the first terminal identifies the portrait outline of the first user in each call picture, the first terminal deletes the picture content outside that outline in each call picture, thereby shielding it. The video call data obtained after this picture processing is recorded as the first video call data.
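Steps a and b can be illustrated with a minimal Python sketch. It assumes that the recognition step has already produced, for each call picture, a boolean mask that is True inside the portrait outline. Overwriting the remaining pixels with a fill color is one possible reading of "deleting". The names `shield_background` and `shield_call_data` and the pixel values are hypothetical.

```python
def shield_background(frame, portrait_mask, fill=(0, 0, 0)):
    """Keep pixels where portrait_mask is True; overwrite all other pixels with `fill`."""
    if len(frame) != len(portrait_mask):
        raise ValueError("frame and mask must have the same height")
    shielded = []
    for row, mask_row in zip(frame, portrait_mask):
        if len(row) != len(mask_row):
            raise ValueError("frame and mask must have the same width")
        shielded.append([px if keep else fill for px, keep in zip(row, mask_row)])
    return shielded

def shield_call_data(frames, masks):
    """Apply the shielding to every call picture (frame) of the original call data."""
    return [shield_background(frame, mask) for frame, mask in zip(frames, masks)]

# Toy 2x2 RGB frame: left column is background, right column is the portrait.
frame = [[(10, 10, 10), (200, 150, 120)],
         [(12, 12, 12), (210, 160, 130)]]
mask = [[False, True],
        [False, True]]

first_video_frames = shield_call_data([frame], [mask])
```

The original frame is left unmodified, so the first terminal can still use it, for example for its own local display.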

Further, the first terminal may also identify the portrait outline of the first user in each call picture corresponding to the original video call data through a deep learning algorithm. The deep learning algorithm includes, but is not limited to, LSTM (Long Short-Term Memory) and RNN (Recurrent Neural Network). Specifically, the first terminal can obtain a certain number of images containing portraits, train on these images through the deep learning algorithm to obtain a deep learning model capable of identifying portrait outlines in images, and then input the collected original video call data into the trained deep learning model to identify the portrait outline in each picture of the original video call data.

Step S30: the first video call data is sent to the second terminal, so that a second user of the second terminal can carry out video call with the first user through the first video call data.

After the first terminal obtains the first video call data, the first terminal sends the first video call data to the second terminal. After the second terminal receives the first video call data, the second terminal outputs the first video call data on its screen so that the second user can carry out the video call with the first user through the first video call data. It should be noted that the screen of the first terminal may be set to display either the first video call data or the original video call data. When the first terminal screen is set to display the first video call data, the first user's own video picture on the first terminal screen does not contain the background picture of the first user's location; in this case, the background seen by the first user in the video picture can be a white canvas preset by the first terminal. When the first terminal screen is set to display the original video call data, the video picture seen by the first user on the first terminal screen still contains the background picture of the first user's current location.
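The two local display options can be sketched as follows. This is a minimal illustration that assumes a per-frame boolean portrait mask is available and that the preset white canvas is applied pixel by pixel. The name `local_preview_frame` and the sample pixel values are hypothetical.

```python
WHITE = (255, 255, 255)  # preset white canvas color for the local preview

def local_preview_frame(original_frame, portrait_mask, show_processed):
    """Return the frame shown on the first terminal's own screen."""
    if not show_processed:
        # The first user sees the original picture, including the real background.
        return original_frame
    # The first user sees only the portrait, on a white canvas.
    return [[px if keep else WHITE for px, keep in zip(row, mask_row)]
            for row, mask_row in zip(original_frame, portrait_mask)]

wall, face = (90, 90, 90), (220, 180, 160)
frame = [[wall, face, wall]]
mask = [[False, True, False]]

preview_processed = local_preview_frame(frame, mask, show_processed=True)
preview_original = local_preview_frame(frame, mask, show_processed=False)
```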

In this embodiment, after the first terminal detects a call instruction for a video call with the second terminal, video call data corresponding to the first user is acquired through the camera of the first terminal, and other pictures outside the portrait outline of the first user in the video call data are shielded, so that only the image of the first user remains in each frame corresponding to the video call data. The processed video call data is then sent to the second terminal, so that the second user of the second terminal can carry out the video call with the first user through the processed video call data.

Further, a second embodiment of the method for optimizing video call data of the present invention is provided. The second embodiment of the method for optimizing video call data is different from the first embodiment of the method for optimizing video call data in that the step S20 further includes:

Step c, shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data, and acquiring a pre-stored virtual background.

Step d, setting the virtual background as a background picture of the call picture to obtain first video call data.

When the first terminal shields the pictures other than the portrait outline of the first user in each call picture corresponding to the original video call data, it acquires a virtual background stored in advance. The virtual background can be set by the first user according to specific needs; specifically, it can be a video, a picture, a slide and the like. After obtaining the virtual background, the first terminal sets it as the background picture of each call picture to obtain the first video call data. For example, if the virtual background is a landscape image, the second user sees, through the first video call data, the corresponding landscape behind the first user.

Further, if the virtual background is not preset by the first user, and a default background picture is set in the first terminal, the first terminal acquires the default background picture after shielding other pictures except the portrait outline of the first user in each call picture corresponding to the original video call data, and sets the default background picture as the background picture of each call picture.
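The background selection and replacement described in steps c and d, together with the default fallback, can be sketched as below. This is an illustrative sketch only: it assumes the background is a single picture with the same dimensions as the call picture and that a portrait mask is already available. The function names `select_background` and `apply_virtual_background` are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), assumed representation
Frame = List[List[Pixel]]
Mask = List[List[bool]]        # True where the pixel belongs to the portrait


def select_background(user_background: Optional[Frame],
                      default_background: Optional[Frame]) -> Optional[Frame]:
    """Prefer the background preset by the first user, else the terminal default."""
    return user_background if user_background is not None else default_background


def apply_virtual_background(frame: Frame, portrait_mask: Mask, background: Frame) -> Frame:
    """Keep portrait pixels and take every other pixel from `background`.

    Assumes frame, mask and background all share the same dimensions.
    """
    return [
        [px if inside else bg for px, inside, bg in zip(row, mask_row, bg_row)]
        for row, mask_row, bg_row in zip(frame, portrait_mask, background)
    ]
```

For a video or slide background, one plausible extension is to pick a different background frame for each call picture; that is not shown here.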

In this embodiment, after shielding the pictures other than the portrait outline of the first user in each call picture corresponding to the original video call data, the first terminal sets the pre-stored virtual background as the background picture of the call picture to obtain the first video call data. In this way, the first user can choose, according to preference, the background picture carried in the video call data sent to the second user, which makes the video call more engaging.

Further, a third embodiment of the method for optimizing video call data of the present invention is provided. The third embodiment of the method for optimizing video call data is different from the first or second embodiment of the method for optimizing video call data in that, referring to fig. 4, the method for optimizing video call data further includes:

Step S40, identifying the sound data belonging to the first user in the first video call data.

After the first terminal obtains the first video call data, the first terminal identifies sound data belonging to the first user in the first video call data. Specifically, the first terminal may store voiceprint information of the first user in advance, and then the first terminal identifies, according to the stored voiceprint information of the first user, voice data belonging to the first user in the first video call data by using a voiceprint identification technology. Further, the first terminal may also store voiceprint information of multiple users, for example, voiceprint information of the first user and a parent of the first user, at which time, when the first terminal identifies the voice data in the first video call data through a voiceprint identification technology, the first terminal may identify the voice data of the multiple users.
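The description does not state how a stored voiceprint is compared with incoming sound. As one possible illustration, the sketch below assumes that each voiceprint and each sound segment has already been reduced to a fixed-length feature vector, and it matches them by cosine similarity against a threshold. The vector representation, the threshold value of 0.8, and the names `cosine_similarity` and `identify_speaker` are assumptions made for this example, not details from the source.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(segment_features: Sequence[float],
                     enrolled: Dict[str, Sequence[float]],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the enrolled user whose voiceprint best matches the segment.

    Returns None when no enrolled voiceprint reaches `threshold`, in which
    case the segment would be treated as sound not belonging to a known user.
    """
    best_name, best_score = None, threshold
    for name, voiceprint in enrolled.items():
        score = cosine_similarity(segment_features, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Because `enrolled` may hold several voiceprints, the same routine covers the case in which the terminal stores voiceprint information for multiple users.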

Further, step S40 includes:

Step e, determining sound data with the highest decibel in the first video call data.

Step f, determining the sound data with the highest decibel as the sound data belonging to the first user in the first video call data.

Further, in the process that the first terminal collects the original video call data corresponding to the first user through the camera, the first terminal can collect the sound data through the earphone of the first terminal, and it can be understood that the collected sound data is the sound data in the original video call data. After the first terminal obtains the first video call data, the first terminal determines sound data with the highest decibel from the first video call data, and determines the sound data with the highest decibel as sound data belonging to a first user in the first video call data. It will be appreciated that during a video call between a first user and a second user, there is other noise in the video call data in addition to the voice data of the first user.
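Steps e and f can be sketched as follows, under the assumption that the sound in the first video call data has already been separated into candidate sources, each given as a list of samples in the range [-1, 1]. How that separation is done is not described in the source. The level is computed here as an RMS value in decibels relative to full scale, which is one common way to express loudness; the names `level_db` and `loudest_source` are hypothetical.

```python
import math
from typing import Dict, Sequence


def level_db(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale; -inf for silent or empty input."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def loudest_source(sources: Dict[str, Sequence[float]]) -> str:
    """Return the name of the source with the highest level (steps e and f)."""
    if not sources:
        raise ValueError("no sound sources to compare")
    return max(sources, key=lambda name: level_db(sources[name]))
```

The source picked by `loudest_source` would then be treated as the first user's sound data, on the premise that the speaker is closest to the terminal and therefore loudest.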

Step S50, deleting other sound data except the sound data of the first user in the first video call data to obtain second video call data.

And when other sound data except the first user sound data in the first video call data are determined, the first terminal deletes the other sound data except the first user sound data in the first video call data to obtain the first video call data after sound processing. In the embodiment of the present invention, for convenience of distinguishing, the first video call data after the sound processing is recorded as the second video call data.

The step S30 includes:

Step S31, sending the second video call data to the second terminal, so that a second user of the second terminal can carry out video call with the first user through the second video call data.

And after the second video call data is obtained, the first terminal sends the second video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the second video call data.

It should be noted that, in the embodiment of the present invention, the original video call data is subjected to the picture processing first and then the sound processing is performed, and in other embodiments, the original video call data may be subjected to the sound processing first and then the picture processing is performed.

In the embodiment, the optimized video call data is obtained by identifying the sound data belonging to the first user in the first video call data and then deleting the sound data except the sound data of the first user in the first video call data, so that the video call data sent to the second user is prevented from carrying other noises except the sound data of the first user, and the video call quality of the first user and the second user in the video call process is improved.

Further, in order to increase the interest of the video call, step S50 includes:

Step g, deleting other sound data except the first user sound data in the first video call data, and acquiring pre-stored virtual sound.

Step h, setting the virtual sound as the background sound of the first video call data to obtain second video call data.

Further, when the first terminal deletes the sound data other than the sound data of the first user from the first video call data, the first terminal obtains the virtual sound pre-stored in its database. The virtual sound can be set by the first user according to specific needs, and includes but is not limited to music and sound clips recorded by the first user. After the first terminal acquires the pre-stored virtual sound, the first terminal sets the virtual sound as the background sound of the first video call data to obtain the second video call data. It is understood that the second user can hear the background sound during the video call with the first user through the second video call data.

Further, in order to ensure the call quality of the video call, the second terminal may set the output volume of the background sound while outputting the second video call data, so that the output volume of the background sound is lower than the output volume of the sound data of the first user. The difference between the two output volumes may be set according to specific needs and is not specifically limited in this embodiment.
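The volume relationship described above can be illustrated with a short mixing sketch. It assumes both the first user's voice and the virtual sound are available as lists of samples in [-1, 1], and it scales the background so that its RMS level sits a chosen number of decibels below the voice before adding the two. The 12 dB default, the looping of a short background clip, and the name `mix_background_sound` are assumptions for this example; the source leaves the difference unspecified.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude; 0.0 for empty input."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_background_sound(voice: Sequence[float],
                         background: Sequence[float],
                         difference_db: float = 12.0) -> List[float]:
    """Mix `background` under `voice`, about `difference_db` dB quieter.

    The background is looped if it is shorter than the voice. If either
    signal is silent, the voice is returned unchanged.
    """
    if difference_db <= 0:
        raise ValueError("the background must be quieter than the voice")
    voice_rms, background_rms = rms(voice), rms(background)
    if voice_rms == 0 or background_rms == 0:
        return list(voice)
    # Gain that brings the background to the target level below the voice.
    gain = voice_rms * 10 ** (-difference_db / 20) / background_rms
    mixed = []
    for i, v in enumerate(voice):
        sample = v + gain * background[i % len(background)]
        mixed.append(max(-1.0, min(1.0, sample)))  # clamp to the valid range
    return mixed
```

Measuring both levels, rather than applying a fixed attenuation, keeps the background below the voice even when the stored virtual sound was recorded much louder than the call audio.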

Further, a fourth embodiment of the method for optimizing video call data of the present invention is provided. The fourth embodiment of the method for optimizing video call data is different from the first, second, or third embodiment of the method for optimizing video call data in that, referring to fig. 5, the method for optimizing video call data further includes:

Step S60, encoding the first video call data to obtain the encoded first video call data.

After the first terminal obtains the first video call data, the first terminal encodes the first video call data to obtain encoded first video call data. The encoding algorithms for encoding the first video call data include, but are not limited to, MPEG (Moving Picture Experts Group), H.261, H.263, H.264, and the like.

The step S30 further includes:

Step S32, sending the encoded first video call data to the second terminal, so that a second user of the second terminal can perform video call with the first user through the first video call data.

After the first terminal obtains the encoded first video call data, it sends the encoded first video call data to the second terminal. When the second terminal receives the encoded first video call data, it decodes the data to recover the first video call data and outputs the decoded first video call data on its screen, so that the second user can carry out the video call with the first user through the first video call data. It should be noted that the encoding algorithm used by the first terminal is negotiated with the second terminal in advance, and the decoding algorithm adopted by the second terminal corresponds to the encoding algorithm adopted by the first terminal. For example, if the first terminal encodes the first video call data using the H.264 algorithm, the second terminal correspondingly decodes it using the H.264 algorithm.
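The description states only that the encoding algorithm is negotiated in advance, not how. One simple way to picture such a negotiation is shown below: the first terminal lists its codecs in order of preference and the first one that the second terminal also supports is chosen. The function name `negotiate_codec` and the case-insensitive matching are assumptions for this sketch; no actual encoding or decoding is performed.

```python
from typing import Optional, Sequence


def negotiate_codec(sender_preferences: Sequence[str],
                    receiver_supported: Sequence[str]) -> Optional[str]:
    """Return the sender's most preferred codec that the receiver supports.

    Names are compared case-insensitively and returned in upper case.
    Returns None when the two terminals share no codec.
    """
    supported = {codec.upper() for codec in receiver_supported}
    for codec in sender_preferences:
        if codec.upper() in supported:
            return codec.upper()
    return None
```

For example, a first terminal preferring H.264, then H.263, then H.261 and a second terminal supporting H.263 and H.264 would settle on H.264, so both sides encode and decode with the same algorithm.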

It should be noted that, before the first terminal sends the second video call data to the second terminal, the first terminal may also encode the second video call data.

In this embodiment, before sending video call data to the second terminal, the first terminal encodes the video call data and sends the encoded video call data to the second terminal, so that the transmission rate of the video stream between the first terminal and the second terminal is improved through the encoding technology.

Further, in order to improve the intelligence of optimizing the video call data, the method for optimizing the video call data further includes:

Step i, detecting whether the first terminal starts a video call data optimization function.

Step j, if the first terminal starts the video call data optimization function, executing the step of shielding other pictures except the portrait outline of the first user in all call pictures corresponding to the original video call data to obtain first video call data.

Further, after the first terminal acquires the original video call data corresponding to the first user in the video call process through its camera, the first terminal detects whether the video call data optimization function is started. If the function is started, the first terminal shields the pictures other than the portrait outline of the first user in each call picture corresponding to the original video call data to obtain the first video call data. If the function is not started, the first terminal directly sends the acquired original video call data to the second terminal. Specifically, if the first terminal detects that the state identifier of the video call data optimization function is the start identifier, the first terminal determines that the function is started; if the first terminal detects that the state identifier is the closing identifier, the first terminal determines that the function is not started and is in the closed state. In the embodiment of the present invention, the form of the start identifier and the closing identifier is not limited; for example, the start identifier may be set to "11" and the closing identifier to "00".

A function button corresponding to the video call data optimization function is provided in the first terminal. When the first user wants to start the function, the first user can click the function button while the function is in the closed state, which starts the video call data optimization function of the first terminal and adds the start identifier to it. When the first user wants to close the function, the first user can click the function button while the function is in the open state, which closes the video call data optimization function of the first terminal and adds the closing identifier to it.
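The state identifier and the function button can be sketched together as follows. The values "11" and "00" are the examples given in the description; the class name `OptimizationSwitch`, the method names, and the `data_to_send` helper are hypothetical, and the `optimize` callable stands in for the shielding step.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

START_FLAG = "11"    # example start identifier from the description
CLOSING_FLAG = "00"  # example closing identifier from the description


class OptimizationSwitch:
    """Holds the state identifier of the video call data optimization function."""

    def __init__(self, flag: str = CLOSING_FLAG) -> None:
        if flag not in (START_FLAG, CLOSING_FLAG):
            raise ValueError("unknown state identifier")
        self.flag = flag

    def is_enabled(self) -> bool:
        return self.flag == START_FLAG

    def click(self) -> None:
        """Simulate a click on the function button: flip the current state."""
        self.flag = CLOSING_FLAG if self.is_enabled() else START_FLAG


def data_to_send(original: T, switch: OptimizationSwitch,
                 optimize: Callable[[T], T]) -> T:
    """Optimize the data only when the function is started (steps i and j)."""
    return optimize(original) if switch.is_enabled() else original
```

With the switch in the closed state, `data_to_send` returns the original video call data unchanged, which matches the case in which the first terminal sends the original data directly.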

In addition, the embodiment of the invention also provides a computer readable storage medium.

The computer readable storage medium has stored thereon an optimization program of video call data, which, when executed by a processor, implements the steps of the optimization method of video call data as described above.

The specific implementation manner of the computer-readable storage medium of the present invention is substantially the same as that of each embodiment of the above-mentioned video call data optimization method, and is not described herein again.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for optimizing video call data is characterized in that the method for optimizing the original video call data comprises the following steps:

sending the first video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the first video call data;

at the same time,

determining sound data with the highest decibel in the first video call data;

determining the sound data with the highest decibel as the sound data belonging to the first user in the first video call data;

setting the virtual sound as the background sound of the first video call data to obtain second video call data;

sending the second video call data to the second terminal so that a second user of the second terminal can carry out video call with the first user through the second video call data;

wherein the volume of the virtual sound is lower than the volume of the sound of the first user.

2. The method for optimizing video call data according to claim 1, wherein the step of masking, from among the call pictures corresponding to the original video call data, other pictures than the silhouette of the first user portrait to obtain first video call data comprises:

3. The method for optimizing video call data according to claim 1, wherein the step of masking, from among the call pictures corresponding to the original video call data, other pictures than the silhouette of the first user portrait to obtain first video call data comprises:

4. The method for optimizing video call data according to claim 1, wherein the step of sending the first video call data to the second terminal for a second user of the second terminal to make a video call with the first user through the first video call data further comprises:

encoding the first video call data to obtain encoded first video call data;

5. The method according to any one of claims 1 to 4, wherein after the step of acquiring, by the camera of the first terminal, original video call data corresponding to the first user in the video call after the first terminal detects a call instruction for performing a video call with the second terminal, the method comprises:

detecting whether a video call data optimization function is started;

6. A mobile terminal, characterized in that it comprises a memory, a processor and an optimization program of video call data stored on said memory and executable on said processor, said optimization program of video call data implementing the steps of the method of optimization of video call data according to any one of claims 1 to 5 when executed by said processor.

7. A computer-readable storage medium, on which an optimization program of video call data is stored, which when executed by a processor implements the steps of the optimization method of video call data according to any one of claims 1 to 5.