WO2023025020A1 - Video call method, user terminal, data server, computer device, and computer-readable storage medium - Google Patents

Info

Publication number
WO2023025020A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
user terminal
video
privacy protection
video call
Prior art date
Application number
PCT/CN2022/113236
Inventor
杨海城
周金星
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023025020A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • the present disclosure relates to the technical field of communication, and specifically relates to a video call method, a user terminal, a data server, a computer device, and a computer-readable storage medium.
  • 5G new calls use H.265 and EVS (Enhanced Voice Services) coding instead of H.264 and AMR (Adaptive Multi-Rate) coding to achieve high-definition audio and video calls.
  • an embodiment of the present disclosure provides a video call method applied to a first user terminal, including: after a video call connection in which the first user terminal is the called party is established, obtaining video data of the video call, where the calling party of the video call is a second user terminal; and, when an image privacy protection function is enabled, either performing privacy protection processing on images in the video data and sending the processed video data to the second user terminal, or sending a data processing request carrying the video data to a data server through a first data channel between the first user terminal and the data server, where the data processing request is configured to enable the data server to perform privacy protection processing on images in the video data and to send the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
  • an embodiment of the present disclosure provides a video call method applied to a data server, including: receiving a data processing request sent by a first user terminal through a first data channel, and obtaining the video data carried in the data processing request, where the data processing request is sent after a video call connection in which the first user terminal is the called party is established and the first user terminal determines that the image privacy protection function is enabled, and the calling party of the video call is a second user terminal; performing privacy protection processing on images in the video data; and sending the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
  • an embodiment of the present disclosure also provides a user terminal, including a communication module, an acquisition module, an image processing module, and a data channel management module; the communication module is configured to establish a video call connection in which the first user terminal is the called party, and to send the processed video data to the second user terminal;
  • the acquisition module is configured to obtain video data of the video call, where the calling party of the video call is the second user terminal;
  • the image processing module is configured to perform privacy protection processing on images in the video data when the image privacy protection function is turned on;
  • the data channel management module is configured to send a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server when the image privacy protection function is enabled, where the data processing request is configured to make the data server perform privacy protection processing on images in the video data and send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
  • an embodiment of the present disclosure further provides a data server, including a data channel management module and an image processing module; the data channel management module is configured to receive a data processing request sent by a first user terminal through a first data channel, obtain the video data carried in the data processing request, and send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal, where the data processing request is sent after a video call connection in which the first user terminal is the called party is established and the first user terminal determines that the image privacy protection function is enabled, and the calling party of the video call is the second user terminal;
  • the image processing module is configured to perform privacy protection processing on images in the video data.
  • an embodiment of the present disclosure further provides a computer device, including: at least one processor; and a storage device on which at least one computer program is stored; when the at least one computer program is executed by the at least one processor, the at least one processor implements the aforementioned video call method.
  • an embodiment of the present disclosure further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the aforementioned video calling method is implemented.
  • FIG. 1 is a system architecture diagram provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of a video call method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic flow diagram of establishing a video call connection provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic flow diagram of selecting a resolution in the process of establishing a video call connection provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of a video call method provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a user terminal provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of a user terminal provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a data server provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
  • Embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the present disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances. Therefore, the embodiments are not limited to the ones shown in the drawings but include modifications of configurations formed based on manufacturing processes. Therefore, the regions illustrated in the drawings have schematic properties, and the shapes of the regions shown in the figures illustrate the specific shapes of the regions of the elements, but are not restrictive.
  • An embodiment of the present disclosure provides a video call method, which is applied to the system shown in FIG. 1.
  • the system includes a data server (data channel server) and the user terminals participating in the video call.
  • two user terminals making a video call are used as an example: the first user terminal is the called party of the video call, and the second user terminal is the calling party of the video call.
  • the first user terminal and the second user terminal can respectively establish a Bootstrap data channel with the data server, and transmit video data and audio data of the video call through the Bootstrap data channel.
  • alternatively, the first user terminal and the second user terminal may transmit the audio and video data directly over the IMS data channel, without using the Bootstrap data channel or routing the data through the data server.
  • the video call method provided by the embodiment of the present disclosure is applied to a first user terminal, and includes the following steps S21 and S22.
  • Step S21: after a video call connection in which the first user terminal is the called party is established, video data of the video call is obtained, and the calling party of the video call is the second user terminal.
  • the second user terminal initiates an IMS-type video call request to the first user terminal.
  • the IMS-type video call may include, but is not limited to, VoNR (Voice over New Radio), VoLTE (Voice over LTE), and VoWiFi (Voice over WiFi).
  • audio data and video data are collected in real time.
  • audio data and video data may be collected by using a sound collection device (such as a microphone), an image collection device (such as a front camera) and other equipment built into the first user terminal.
  • Step S22: when the image privacy protection function is turned on, either perform privacy protection processing on the images in the video data and send the processed video data to the second user terminal, or send a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server, where the data processing request is configured to enable the data server to perform privacy protection processing on images in the video data and to send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
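  • The two branches of step S22 can be sketched as a small routing decision. This is a minimal illustration, not the disclosed implementation: the names `Route`, `TerminalState`, and `choose_route`, and the `can_process_locally` capability flag, are assumptions introduced here, as is the behaviour when the function is disabled (the video is assumed to be sent unprocessed).

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    DIRECT_UNPROCESSED = "direct_unprocessed"   # assumed behaviour when the switch is off
    LOCAL_THEN_DIRECT = "local_then_direct"     # terminal processes, then sends to the caller
    OFFLOAD_TO_SERVER = "offload_to_server"     # data processing request over the first data channel


@dataclass
class TerminalState:
    privacy_enabled: bool       # state of the image privacy protection switch
    can_process_locally: bool   # hypothetical capability flag, not named in the source


def choose_route(state: TerminalState) -> Route:
    """Decide where privacy protection processing happens for outgoing video."""
    if not state.privacy_enabled:
        return Route.DIRECT_UNPROCESSED
    if state.can_process_locally:
        return Route.LOCAL_THEN_DIRECT
    # Limited image processing performance: let the data server do the work.
    return Route.OFFLOAD_TO_SERVER
```

  • Keeping the decision in one function makes it easy to swap the capability flag for a measured criterion, such as a frame-rate budget, without touching the sending code.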
  • the first user terminal is provided with an image privacy protection function switch, and the user can manually set to enable or disable the image privacy protection function based on needs.
  • when the image privacy protection function is turned on and the user uses the first user terminal to conduct a video call, privacy protection processing can be performed on the images in the video data, so as to prevent the other party of the video call from viewing images of sensitive areas.
  • the operation of performing privacy protection processing on the images in the video data may be performed by the first user terminal.
  • in that case, the video data after the privacy protection processing is sent to the second user terminal through the IMS data channel. Because privacy protection processing of images places high demands on the image processing performance of the user terminal, when the image processing performance of the first user terminal is limited, the privacy protection processing of the images in the video data can instead be performed by the data server.
  • the video data of the video call is transmitted between the first user terminal and the data server, and between the data server and the second user terminal through the Bootstrap data channel.
  • the data sent by the first user terminal to the second user terminal or the data server may include not only video data but also audio data; the audio data is the original audio data, without privacy protection processing.
  • the video call method provided by the embodiment of the present disclosure is applied to the first user terminal: after the video call connection in which the first user terminal is the called party is established, the video data of the video call is obtained; when the image privacy protection function is enabled, either the first user terminal performs privacy protection processing on the images in the video data and sends the processed video data to the calling party of the video call, that is, the second user terminal, or the video data is sent to the data server through the first data channel, and the data server performs privacy protection processing on the images in the video data and sends the privacy-protected video data to the second user terminal through the second data channel.
  • the video call method provided by the embodiment of the present disclosure performs privacy protection processing on the images in the video call, so that the peer user of the video call sees only the privacy-protected images; this strengthens the protection of the user's privacy and can effectively improve the security of video calls.
  • the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biological information feature recognition region and a private body part region. That is, the privacy protection processing is performed on the biological information feature recognition region and/or the private body part region of the person image in the video data.
  • the biological information feature recognition area may be a human face area, a human eye area, a hand area including fingerprints, and the like.
  • the privacy protection processing may include at least one of the following: occluding with a preset pattern, replacing with an avatar, superimposing augmented reality (AR) expressions, and reducing resolution. That is, one or more of these operations are performed on one or more regions of the image in the video data.
  • avatars and AR expressions can be provided on the “image privacy protection function” interface of the first user terminal for the user to choose from. Replacing regions with avatars and superimposing AR expressions also makes video calls more fun.
  • the preset pattern may be a pattern of a mask, glasses, a wig, and the like.
  • for example, if the user is wearing a formal suit on the upper body and home clothes on the lower body, the lower body can be occluded so that only the formally dressed upper body is displayed, which avoids privacy exposure and is more user-friendly.
  • the images that undergo privacy protection processing are not limited to person images and may also include non-person images, for example background images, including environmental background images, images of surrounding items, and the like.
  • for non-person images, the privacy protection processing may include occluding with preset patterns and/or reducing resolution, for example, reducing the resolution of the background image area or occluding surrounding objects with cartoon patterns.
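  • One of the listed operations, reducing the resolution of a region, can be illustrated by block averaging (pixelation). The sketch below is an assumption about how such an operation might look, not the method of the disclosure: it treats a frame as a row-major list of grayscale values, and the name `pixelate_region` and its parameters are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major (illustrative frame format)


def pixelate_region(img: Image, top: int, left: int,
                    height: int, width: int, block: int) -> Image:
    """Return a copy of img in which the given region is averaged in block x block tiles."""
    out = [row[:] for row in img]                     # leave the input frame untouched
    bottom = min(top + height, len(img))              # clip the region to the frame
    right = min(left + width, len(img[0]) if img else 0)
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            total = sum(img[y][x] for y in ys for x in xs)
            mean = total // (len(ys) * len(xs))       # integer mean of the tile
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

  • A larger `block` value removes more detail; pixels outside the region keep their original values.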
  • the image privacy protection function can be enabled or disabled through user configuration. Therefore, in some implementation manners, the video call method may further include the following step: enabling an image privacy protection function in response to receiving a privacy protection instruction.
  • the privacy protection instruction is initiated by the user of the first user terminal, and the user can enable the image privacy protection function in the setting menu interface of the first user terminal.
  • in this case, the image privacy protection function is enabled for all video calls; the user can also enable the image privacy protection function on the video call interface, in which case it is enabled for the current video call.
  • the first user terminal and the second user terminal perform video call resolution negotiation, so as to establish the video call connection according to the negotiated resolution.
  • the video call method may further include the following steps S31 and S32.
  • Step S31 in response to receiving the video call request sent by the second user terminal, acquire multiple sets of resolutions carried in the video call request.
  • the first user terminal receives the call invitation (INVITE) message sent by the second user terminal, and obtains the list of video resolutions supported by the second user terminal from the image attribute (imageattr) field in the call invitation message; the list of video resolutions includes multiple sets of resolutions.
  • call invitation (INVITE) message log is as follows.
  • Step S32 selecting a set of resolutions from multiple sets of resolutions, and establishing a video call connection with the second user terminal according to the selected resolutions.
  • the first user terminal selects a set of resolutions from multiple sets of resolutions according to a preset policy, and replies to the second user terminal through a call invitation (INVITE) response message.
  • call invitation (INVITE) response message log is as follows.
  • the selection of a set of resolutions from multiple sets of resolutions includes the following steps S321 to S324.
  • Step S321 querying the communication number of the second user terminal in the first user terminal.
  • the communication number of the second user terminal may be a mobile phone number.
  • the first user terminal may query the mobile phone number of the second user terminal locally to determine whether the mobile phone number has been stored locally.
  • Step S322 judging whether the communication number of the second user terminal is found, if found, execute step S323; if not found, execute step S324.
  • if the first user terminal finds the communication number of the second user terminal, the communication number has already been stored in the first user terminal (for example, in the address book), so the user of the second user terminal is likely an acquaintance (for example, a relative or friend) of the user of the first user terminal, and a higher resolution can be used for the video call with the second user terminal; if the first user terminal does not find the communication number, the communication number has not been stored in the first user terminal (for example, in the address book), so the user of the second user terminal is likely a stranger to the user of the first user terminal, and a lower resolution can be used for the video call with the second user terminal.
  • Step S323 selecting a first set of resolutions from multiple sets of resolutions.
  • Step S324 selecting a second set of resolutions from multiple sets of resolutions, the second set of resolutions being smaller than the first set of resolutions.
  • the second set of resolutions may be the lowest set of resolutions among the multiple sets of video resolutions, and the first set of resolutions may be the highest set of resolutions among the multiple sets of video resolutions, or a set randomly selected from the sets that are not the lowest.
  • the video call method may also include the following step: when the video call connection is established according to the second set of resolutions and the image privacy protection function is turned off, turning on the image privacy protection function. That is, if the first user terminal does not find the communication number of the second user terminal locally, the user of the second user terminal is a stranger to the user of the first user terminal. In this case, the first user terminal not only makes the video call at a lower resolution but also, if the image privacy protection function is turned off, automatically turns it on, so as to achieve multi-faceted privacy protection.
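  • Steps S321 to S324, together with the automatic enabling of the privacy function for unknown callers, can be sketched as follows. This is an illustrative sketch under stated assumptions: each offered resolution is modelled as a (width, height) pair, "higher" and "lower" are compared by pixel count, the highest offer is used for known callers, and the names `select_resolution` and `pixel_count` are hypothetical.

```python
from typing import List, Set, Tuple

Resolution = Tuple[int, int]  # (width, height); an assumed representation of one offered set


def pixel_count(res: Resolution) -> int:
    """Ordering key: the source does not define the comparison, pixel count is assumed."""
    return res[0] * res[1]


def select_resolution(offered: List[Resolution], caller_number: str,
                      contacts: Set[str], privacy_enabled: bool) -> Tuple[Resolution, bool]:
    """Return the chosen resolution and the resulting privacy switch state."""
    if not offered:
        raise ValueError("the video call request carried no resolutions")
    if caller_number in contacts:                 # S321/S322: number found locally
        chosen = max(offered, key=pixel_count)    # S323: first (higher) set
    else:
        chosen = min(offered, key=pixel_count)    # S324: second (lowest) set
        privacy_enabled = True                    # unknown caller: turn the function on
    return chosen, privacy_enabled
```

  • For a known caller the switch state is returned unchanged, so a user who has already enabled the function keeps it enabled.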
  • the video data of the video call is transmitted between the first user terminal and the data server, and between the data server and the second user terminal through the Bootstrap data channel.
  • the Bootstrap data channel can be established before or after the video call connection is established.
  • the video call method may further include the following step: before the video call connection is established, establishing a first data channel with the data server; or, after the video call connection is established and before a data processing request is sent to the data server, establishing the first data channel with the data server.
  • the second data channel between the data server and the second user terminal may also be established before the video call connection is established, or after the video call connection is established and before the data processing request is sent.
  • the process of establishing the Bootstrap data channel will be described in detail below by taking the establishment of the first data channel between the first user terminal and the data server as an example.
  • the first user terminal sends a SIP (Session Initiation Protocol) UPDATE message carrying the dcmap field to the data server.
  • after receiving the SIP UPDATE message, the data server establishes a Hypertext Transfer Protocol (HTTP) connection with the first user terminal, thereby establishing the first data channel.
  • the value of the dcmap field is the SDP media stream identifier (Stream ID), and the meaning of the value of the dcmap field is shown in Table 1:
  • “content source” represents the originator who establishes the data channel
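  • For orientation, the sketch below parses an SDP `a=dcmap:` attribute line of the general form defined in RFC 8864 (a stream identifier followed by optional `key=value` parameters separated by semicolons). It does not reproduce Table 1, which is not included in this text; the function name `parse_dcmap` and the sample line used in the usage note are hypothetical, and the parser assumes that quoted values contain no semicolons.

```python
from typing import Dict, Tuple


def parse_dcmap(line: str) -> Tuple[int, Dict[str, str]]:
    """Split an 'a=dcmap:' line into its stream identifier and its options."""
    prefix = "a=dcmap:"
    if not line.startswith(prefix):
        raise ValueError("not a dcmap attribute line")
    body = line[len(prefix):].strip()
    stream_id_text, _, options_text = body.partition(" ")
    options: Dict[str, str] = {}
    for item in options_text.split(";"):
        item = item.strip()
        if not item:
            continue                              # no options, or a trailing semicolon
        key, _, value = item.partition("=")
        options[key] = value.strip('"')           # drop surrounding double quotes
    return int(stream_id_text), options
```

  • With a hypothetical line such as `a=dcmap:0 subprotocol="http"`, the function returns the stream identifier `0` and the option `subprotocol` with value `http`.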
  • a data channel may be directly established between the first user terminal and the second user terminal without going through the data server, and the data channel between the two user terminals is a non-Bootstrap data channel.
  • the first user terminal may also send the privacy-protected video data to the second user terminal over the data channel established directly between the first user terminal and the second user terminal (which is different from the IMS data channel).
  • An embodiment of the present disclosure also provides a video call method applied to a data server. As shown in FIG. 5 , the video call method includes the following steps S51 to S53.
  • Step S51 receiving the data processing request sent by the first user terminal through the first data channel, and obtaining the video data carried in the data processing request.
  • the data processing request is sent after the video call connection in which the first user terminal is the called party is established and the first user terminal determines that the image privacy protection function is enabled, and the calling party of the video call is the second user terminal.
  • the data processing request carrying the video data is sent to the data server through the first data channel, so as to request the data server to perform privacy protection processing on the images in the video data.
  • the data server receives the data processing request carrying video data through the first data channel with the first user terminal, and obtains the video data in the data processing request; the video data is the original video data, that is, video data that has not undergone privacy protection processing.
  • Step S52 performing privacy protection processing on images in the video data.
  • the data server performs privacy protection processing on the images in the video data, so as to prevent the initiator of the video call from seeing images of sensitive areas.
  • Step S53 sending the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
  • the data sent by the data server to the second user terminal includes not only video data, but also audio data, except that the audio data is original audio data without privacy protection processing.
  • first data channel and the second data channel are Bootstrap data channels.
  • the video call method provided by the embodiments of the present disclosure is applied to the data server: the data server receives the data processing request carrying the video data sent by the first user terminal through the first data channel, performs privacy protection processing on the images in the video data, and sends the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal; the data processing request is sent after the video call connection in which the first user terminal is the called party is established and the first user terminal determines that the image privacy protection function is enabled, and the calling party of the video call is the second user terminal.
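  • The server-side flow of steps S51 to S53 can be outlined as a single handler. This is a minimal sketch, not the disclosed implementation: the request layout (`DataProcessingRequest`), the `protect` callback standing in for the image processing, and the `send_to_caller` callback standing in for the second data channel are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

Frame = bytes  # placeholder for one encoded video or audio frame


@dataclass
class DataProcessingRequest:
    video_frames: List[Frame]   # original video, without privacy protection processing
    audio_frames: List[Frame]   # original audio, forwarded unchanged


def handle_request(request: DataProcessingRequest,
                   protect: Callable[[Frame], Frame],
                   send_to_caller: Callable[[List[Frame], List[Frame]], None]) -> int:
    """Process one request and return the number of video frames forwarded."""
    frames = request.video_frames                   # S51: obtain the carried video data
    protected = [protect(f) for f in frames]        # S52: privacy protection processing
    send_to_caller(protected, request.audio_frames) # S53: forward over the second data channel
    return len(protected)
```

  • Passing the processing and sending steps in as callbacks keeps the handler independent of any particular image library or transport.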
  • the video call method provided by the embodiment of the present disclosure performs privacy protection processing on the images in the video call, so that the peer user of the video call sees only the privacy-protected images; this strengthens the protection of the user's privacy and can effectively improve the security of video calls.
  • the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biological information feature recognition region and a private body part region. That is, the privacy protection processing is performed on the biological information feature recognition region and/or the private body part region of the person image in the video data.
  • the biological information feature recognition area may be a human face area, a human eye area, a hand area including fingerprints, and the like.
  • the privacy protection processing may include at least one of the following: occluding with a preset pattern, replacing with an avatar, superimposing augmented reality (AR) expressions, and reducing resolution. That is, one or more of these operations are performed on one or more regions of the image in the video data.
  • avatars and AR expressions can be provided on the “image privacy protection function” interface of the first user terminal for the user to choose from. Replacing regions with avatars and superimposing AR expressions also makes video calls more fun.
  • the preset pattern may be a pattern of a mask, glasses, a wig, and the like.
  • for example, if the user is wearing a formal suit on the upper body and home clothes on the lower body, the lower body can be occluded so that only the formally dressed upper body is displayed, which avoids privacy exposure and is more user-friendly.
  • the images that undergo privacy protection processing are not limited to person images and may also include non-person images, for example background images, including environmental background images, images of surrounding items, and the like.
  • for non-person images, the privacy protection processing may include occluding with preset patterns and/or reducing resolution, for example, reducing the resolution of the background image area or occluding surrounding objects with cartoon patterns.
  • the video calling method also includes the following steps:
  • the Bootstrap data channel (including the first data channel and the second data channel) may be established before the video call connection between the first user terminal and the second user terminal is established, or after the video call connection is established, It is established before receiving the data processing request sent by the first user terminal.
  • the first user terminal, as the called party, collects audio data and video data in real time and performs privacy region detection on the collected video data. If a privacy area is identified, privacy protection processing is performed on the privacy area in the image, for example, adding occluders such as masks, glasses, and wigs to the face area, reducing the resolution of the face area, replacing the face area with an avatar, or superimposing an AR expression; the audio data and the privacy-protected video data are then sent to the second user terminal, the calling party of the video call.
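  • The detect-then-protect flow described above can be sketched as a per-frame pipeline. The sketch is illustrative only: the detector is passed in as a callback because the disclosure does not specify a detection algorithm, a solid fill stands in for a preset occluding pattern, and the names `occlude` and `protect_frame` are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]            # grayscale pixels, row-major (illustrative frame format)
Box = Tuple[int, int, int, int]    # (top, left, height, width) of a detected privacy area


def occlude(img: Image, box: Box, fill: int = 0) -> Image:
    """Return a copy of img with the box overwritten by a solid fill value."""
    top, left, height, width = box
    out = [row[:] for row in img]
    for y in range(max(top, 0), min(top + height, len(out))):
        for x in range(max(left, 0), min(left + width, len(out[y]))):
            out[y][x] = fill
    return out


def protect_frame(img: Image, detect: Callable[[Image], List[Box]]) -> Image:
    """Occlude every privacy area reported by the detector; pass clean frames through."""
    out = img
    for box in detect(img):        # hypothetical detector, for example a face detector
        out = occlude(out, box)
    return out
```

  • When the detector reports no privacy area, the frame is returned unchanged, which matches the conditional wording of the description.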
  • the video call method provided by the embodiments of the present disclosure can reduce risks to user privacy and security and is especially suitable for video calls with strangers. It can prevent high-definition face images from being misused, for example to defeat vision-based access control, thereby eliminating personal safety hazards; it can also prevent strangers from obtaining information such as the user's identity, personal preferences, environment, and location from the images of the video call, which strengthens the protection of the user's privacy.
  • the embodiment of the present disclosure also provides a user terminal.
  • the user terminal includes a communication module 101, an acquisition module 102, an image processing module 103, and a data channel management module 104.
  • the communication module 101 is configured to establish a video call connection in which the first user terminal is the called party, and send processed video data to the second user terminal.
  • the acquisition module 102 is configured to obtain video data of the video call, where the calling party of the video call is the second user terminal.
  • the image processing module 103 is configured to perform privacy protection processing on the images in the video data when the image privacy protection function is turned on.
  • the data channel management module 104 is configured to send a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server when the image privacy protection function is enabled; the data processing request is configured to make the data server perform privacy protection processing on images in the video data and send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
  • the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biological information feature recognition region and a private body part region.
  • the image processing module 103 is configured to perform at least one of the following privacy protection processing on the images in the video data: use preset patterns to block, replace with avatars, superimpose augmented reality expressions, and reduce resolution.
  • the user terminal further includes a setting module 105 configured to enable the image privacy protection function in response to receiving a privacy protection instruction.
  • the communication module 101 is further configured to obtain multiple sets of resolutions carried in the video call request in response to receiving the video call request sent by the second user terminal before the video call connection is established. , selecting a set of resolutions from the plurality of sets of resolutions, and establishing the video call connection with the second user terminal according to the selected resolutions.
  • the communication module 101 is further configured to query the communication number of the second user terminal in the first user terminal, and in response to querying the communication number of the second user terminal, select from the multiple groups Selecting a first group of resolutions from the resolutions, and in response to not finding the communication number of the second user terminal, selecting a second group of resolutions from the multiple groups of resolutions, the second group of resolutions is smaller than the Describe the first set of resolutions.
  • the setting module 105 is further configured to enable the image privacy protection function when the video call connection is established according to the second set of resolutions and the image privacy protection function is disabled.
  • the data channel management module 104 is further configured to send a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server , before establishing the video call connection, establishing the first data channel with the data server; or, after establishing the video call connection and before sending the data processing request to the data server, communicating with the data server The data server establishes the first data channel.
  • Based on the same concept, an embodiment of the present disclosure further provides a data server. As shown in FIG. 8, the data server includes a data channel management module 201 and an image processing module 202.
  • The data channel management module 201 is configured to receive, through the first data channel, the data processing request sent by the first user terminal, obtain the video data carried in the data processing request, and send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
  • The data processing request is sent after the video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that the image privacy protection function is enabled; the calling party of the video call is the second user terminal.
  • The image processing module 202 is configured to perform privacy protection processing on the images in the video data.
  • In some embodiments, the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region.
  • In some embodiments, the image processing module 202 is configured to perform at least one of the following kinds of privacy protection processing on the images in the video data: occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution.
  • In some embodiments, the data channel management module 201 is further configured to establish the first data channel with the first user terminal and the second data channel with the second user terminal before the video call connection is established; or to establish the first data channel with the first user terminal and the second data channel with the second user terminal after the video call connection is established and before the data processing request sent by the first user terminal is received.
  • An embodiment of the present disclosure also provides a computer device.
  • As shown in FIG. 9, the computer device includes at least one processor 301 (only one is shown in FIG. 9) and a storage device 302. The storage device 302 stores at least one computer program which, when executed by the at least one processor 301, causes the at least one processor 301 to implement the aforementioned video call method.
  • The processor 301 and the storage device 302 may be connected via a bus, for example.
  • An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the aforementioned video call method.
  • Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the apparatuses, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to a division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information, such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

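The present disclosure provides a video call method applied to a first user terminal. After a video call connection in which the first user terminal is the called party is established, video data of the video call is acquired. When an image privacy protection function is enabled, either the first user terminal performs privacy protection processing on the images in the video data and sends the processed video data to the calling party of the video call, namely a second user terminal, or the video data is sent over a first data channel to a data server, which performs privacy protection processing on the images in the video data and sends the privacy-protected video data to the second user terminal over a second data channel. The present disclosure further provides a user terminal, a data server, a computer device, and a computer-readable storage medium.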

Description

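Video call method, user terminal, data server, computer device, and computer-readable storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202110974205.X, filed on August 24, 2021, the content of which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of communication technology, and in particular to a video call method, a user terminal, a data server, a computer device, and a computer-readable storage medium.
Background
At the 2021 MWC (Mobile World Congress) Shanghai exhibition, a 5G (5th Generation Mobile Communication Technology) New Calling solution was released. The solution is an enhanced form of calling service built on the IMS (IP Multimedia Subsystem) data channel architecture. It is high-definition, visual, and interactive, and is expected to improve the calling experience of users in the 5G era.
5G New Calling uses H.265 and EVS (Enhanced Voice Services) coding in place of H.264 and AMR (Adaptive Multi-Rate) coding to deliver high-definition audio and video calls. However, high-definition video calls may expose users to privacy and security risks.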
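Summary
In a first aspect, an embodiment of the present disclosure provides a video call method applied to a first user terminal, including:
after a video call connection in which the first user terminal is the called party is established, acquiring video data of the video call, where the calling party of the video call is a second user terminal; and
when an image privacy protection function is enabled, performing privacy protection processing on the images in the video data and sending the processed video data to the second user terminal, or sending a data processing request carrying the video data to a data server through a first data channel between the first user terminal and the data server, where the data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
In another aspect, an embodiment of the present disclosure provides a video call method applied to a data server, including:
receiving, through a first data channel, a data processing request sent by a first user terminal, and obtaining the video data carried in the data processing request, where the data processing request is sent after a video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that an image privacy protection function is enabled, and the calling party of the video call is a second user terminal;
performing privacy protection processing on the images in the video data; and
sending the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
In another aspect, an embodiment of the present disclosure further provides a user terminal including a communication module, an acquisition module, an image processing module, and a data channel management module, where the communication module is configured to establish a video call connection in which a first user terminal is the called party and to send processed video data to a second user terminal;
the acquisition module is configured to acquire video data of the video call, where the calling party of the video call is the second user terminal;
the image processing module is configured to perform privacy protection processing on the images in the video data when an image privacy protection function is enabled; and
the data channel management module is configured to send, when the image privacy protection function is enabled, a data processing request carrying the video data to a data server through a first data channel between the first user terminal and the data server, where the data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
In another aspect, an embodiment of the present disclosure further provides a data server including a data channel management module and an image processing module, where the data channel management module is configured to receive, through a first data channel, a data processing request sent by a first user terminal, obtain the video data carried in the data processing request, and send the privacy-protected video data to a second user terminal through a second data channel between the data server and the second user terminal, where the data processing request is sent after a video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that an image privacy protection function is enabled, and the calling party of the video call is the second user terminal; and
the image processing module is configured to perform privacy protection processing on the images in the video data.
In another aspect, an embodiment of the present disclosure further provides a computer device including: at least one processor; and a storage device storing at least one computer program which, when executed by the at least one processor, causes the at least one processor to implement the video call method described above.
In another aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video call method described above.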
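Brief Description of the Drawings
FIG. 1 is a system architecture diagram provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a video call method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of establishing a video call connection provided by an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of selecting a resolution while establishing a video call connection provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of a video call method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a user terminal provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a user terminal provided by an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a data server provided by an embodiment of the present disclosure; and
FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.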
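Detailed Description
Example embodiments are described more fully below with reference to the accompanying drawings. They may, however, be embodied in different forms, and the present disclosure should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey its scope to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will further be understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the present disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances. The embodiments are therefore not limited to those shown in the drawings, but include modifications of configurations formed on the basis of manufacturing processes. The regions illustrated in the drawings are thus schematic in nature, and their shapes illustrate specific shapes of regions of elements but are not intended to be limiting.
Unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art. It will further be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.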
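An embodiment of the present disclosure provides a video call method, where the video call takes place in the system shown in FIG. 1. As shown in FIG. 1, the system includes a data server (data channel server) and the user terminals of the video call. The method is described using a video call between two user terminals as an example: the first user terminal is the called party of the video call, and the second user terminal is the calling party. The first user terminal and the second user terminal may each establish a Bootstrap data channel with the data server and transmit the video data and audio data of the video call over the Bootstrap data channels. It should be noted that the first user terminal and the second user terminal may also transmit audio and video data directly over an IMS data channel, without using the Bootstrap data channels or routing through the data server.
As shown in FIG. 2, the video call method provided by the embodiment of the present disclosure is applied to the first user terminal and includes the following steps S21 and S22.
Step S21: after a video call connection in which the first user terminal is the called party is established, acquire video data of the video call, where the calling party of the video call is the second user terminal.
The second user terminal initiates an IMS-type video call request to the first user terminal. IMS-type video calls may include, but are not limited to, VoNR (Voice over New Radio), VoLTE (Voice over LTE), and VoWiFi (Voice over WiFi).
In step S21, after the user of the first user terminal answers the video call and the video call connection is established, the user's video data and audio data are captured in real time. In some embodiments, the audio and video data may be captured with devices built into the first user terminal, such as a sound capture device (for example, a microphone) and an image capture device (for example, a front-facing camera).
Step S22: when the image privacy protection function is enabled, perform privacy protection processing on the images in the video data and send the processed video data to the second user terminal; or send a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server, where the data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
In the video call method provided by the embodiment of the present disclosure, the first user terminal is provided with an image privacy protection switch, which the user can manually turn on or off as needed. When the function is enabled and the user makes a video call with the first user terminal, privacy protection processing can be performed on the images in the video data so that the other party to the call does not see images of sensitive regions.
In the video call method provided by the embodiment of the present disclosure, the privacy protection processing of the images in the video data may be performed by the first user terminal, which then sends the privacy-protected video data to the second user terminal over the IMS data channel. Because privacy protection processing of images places high demands on the image-processing performance of the user terminal, the processing may instead be performed by the data server when the image-processing performance of the first user terminal is limited. When the data server performs the privacy protection processing, the video data of the video call is transmitted over Bootstrap data channels between the first user terminal and the data server and between the data server and the second user terminal.
It should be noted that, in step S22, the data that the first user terminal sends to the second user terminal or to the data server includes not only video data but may also include audio data; the audio data is the original audio data, without privacy protection processing.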
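The choice in step S22 between on-device processing and server-side processing can be sketched as a small routing function. This is a minimal illustration rather than part of the disclosure: the names `Route`, `choose_route`, and the `terminal_can_process` flag are hypothetical, and the behaviour when the privacy function is off (sending unprocessed video over the ordinary call path) is an assumption, since step S22 only specifies the enabled case.

```python
from enum import Enum


class Route(Enum):
    # Assumption: with the function off, video is sent as in an ordinary call.
    UNPROCESSED = "send original video over the normal call path"
    LOCAL_THEN_IMS = "process on the terminal, send over the IMS data channel"
    SERVER_VIA_BOOTSTRAP = "send to the data server over the first data channel"


def choose_route(privacy_enabled: bool, terminal_can_process: bool) -> Route:
    """Pick where privacy protection processing happens for one call."""
    if not privacy_enabled:
        return Route.UNPROCESSED
    if terminal_can_process:
        return Route.LOCAL_THEN_IMS
    # Terminal image-processing performance is limited: delegate to the server.
    return Route.SERVER_VIA_BOOTSTRAP
```

How a terminal would decide that its own performance is "limited" is not specified in the text, so the sketch takes it as an input.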
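The video call method provided by the embodiment of the present disclosure is applied to the first user terminal. After a video call connection in which the first user terminal is the called party is established, video data of the video call is acquired. When the image privacy protection function is enabled, either the first user terminal performs privacy protection processing on the images in the video data and sends the processed video data to the calling party of the video call, namely the second user terminal, or the video data is sent over the first data channel to the data server, which performs privacy protection processing on the images in the video data and sends the privacy-protected video data to the second user terminal over the second data channel. By applying privacy protection processing to the images of the video call, the method ensures that the user at the other end sees only the processed images, which strengthens the user's privacy protection and can effectively improve the security of video calls.
In some embodiments, the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region. That is, privacy protection processing is performed on the biometric feature recognition region and/or the private body part region of the person image in the video data. By way of example, the biometric feature recognition region may be a face region, an eye region, a hand region that includes fingerprints, and so on.
In some embodiments, the privacy protection processing may include at least one of the following: occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution. That is, one or more of these operations (occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality (AR) expression, reducing resolution) are applied to one or more regions of the images in the video data. By way of example, the face region may be replaced with an avatar, an AR expression may be superimposed on the face region, the resolution of the face region may be reduced, the face region may be occluded with a preset pattern, the private body part region may be occluded, and so on.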
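Two of the listed operations, occlusion and resolution reduction, can be sketched on a toy frame. This is an illustrative sketch under stated assumptions, not the disclosure's implementation: the frame is assumed to be a rectangular list of rows of grayscale values, the region is assumed to have been located already (detection is out of scope here), a solid fill stands in for the "preset pattern", and the function names are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # assumed representation: rows of grayscale pixel values


def occlude_region(frame: Frame, top: int, left: int,
                   height: int, width: int, fill: int = 0) -> Frame:
    """Return a copy of frame with the rectangle painted a solid value."""
    out = [row[:] for row in frame]
    for y in range(max(top, 0), min(top + height, len(out))):
        for x in range(max(left, 0), min(left + width, len(out[y]))):
            out[y][x] = fill
    return out


def reduce_region_resolution(frame: Frame, top: int, left: int,
                             height: int, width: int, block: int = 4) -> Frame:
    """Return a copy of frame with the rectangle averaged in block x block tiles."""
    out = [row[:] for row in frame]
    if not frame:
        return out
    # Clip the rectangle to the frame before tiling it.
    bottom = min(top + height, len(frame))
    right = min(left + width, len(frame[0]))
    top, left = max(top, 0), max(left, 0)
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

Both functions leave the input frame untouched, so the original data stays available to the caller, for example if the function is switched off mid-call.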
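It should be noted that avatars and AR expressions may be offered on the "image privacy protection function" settings screen of the first user terminal for the user to choose from; replacing with an avatar or superimposing an AR expression can also make video calls more fun.
In some embodiments, by way of example, the preset pattern may be a face mask, glasses, a costume mask, a wig, or a similar pattern.
In some embodiments, in certain video conference scenarios, the user wears formal attire on the upper body and loungewear on the lower body. By occluding the private body part region, the user's lower body can be hidden so that only the formally dressed upper body is shown, which avoids exposing private details and is more user-friendly.
Because the user's surroundings and nearby objects may also reveal private information such as location, personal preferences, and occupation, in some embodiments the images subject to privacy protection processing are not limited to person images and may include non-person images, for example background images, including images of the surrounding environment and of nearby objects. For non-person images, privacy protection processing may include occluding with a preset pattern and/or reducing resolution, for example reducing the resolution of the background region or occluding nearby objects with a cartoon pattern.
It should be noted that, in the video call method provided by the embodiment of the present disclosure, the image privacy protection function can be turned on or off through user configuration. Therefore, in some embodiments, the video call method may further include the following step: in response to receiving a privacy protection instruction, enabling the image privacy protection function. The privacy protection instruction is initiated by the user of the first user terminal. The user may enable the function in the settings menu of the first user terminal, in which case it may be enabled for all video calls; the user may also enable the function on the video call screen, in which case it may be enabled for the current video call.
In some embodiments, before the video call connection is established, the first user terminal and the second user terminal negotiate the video call resolution so that the video call connection can be established at the negotiated resolution. Accordingly, as shown in FIG. 3, before the video call connection is established, the video call method may further include the following steps S31 and S32.
Step S31: in response to receiving the video call request sent by the second user terminal, obtain the multiple sets of resolutions carried in the video call request.
In step S31, the first user terminal receives the call invitation (INVITE) message sent by the second user terminal and, from the image attribute (imageattr) field in that message, obtains the list of video resolutions supported by the second user terminal. The list includes multiple sets of resolutions.
By way of example, the INVITE message log is as follows.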
a=imageattr:114 send[x=480,y=640][x=240,y=320][x=144,y=176]recv[x=480,y=640][x=240,y=320][x=144,y=176]
a=rtpmap:113 H265/90000
a=imageattr:113 send[x=480,y=640][x=240,y=320][x=144,y=176]recv[x=480,y=640][x=240,y=320][x=144,y=176]
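The log above shows that the second user terminal supports two H265 payload formats, 113 and 114, and that the video resolution list for each format contains three sets of resolutions: [x=480,y=640], [x=240,y=320], and [x=144,y=176], where x is the number of pixels along the x axis of the video image and y is the number of pixels along the y axis.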
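Extracting the offered resolutions from such a line can be sketched with a regular expression. This is a minimal sketch that handles only the simple `[x=N,y=N]` form seen in the log above; the SDP imageattr attribute also permits ranges and extra parameters, which this sketch does not attempt to parse. The function name `parse_imageattr` is hypothetical.

```python
import re

_SIZE = re.compile(r"\[x=(\d+),y=(\d+)\]")
_DIRECTION = re.compile(r"(send|recv)\s*((?:\[[^\]]*\]\s*)+)")


def parse_imageattr(line):
    """Parse 'a=imageattr:<pt> send[...]...recv[...]...' into
    (payload_type, {'send': [(x, y), ...], 'recv': [(x, y), ...]})."""
    m = re.match(r"a=imageattr:(\d+)\s+(.*)$", line.strip())
    if m is None:
        raise ValueError("not an imageattr line")
    payload_type = int(m.group(1))
    result = {"send": [], "recv": []}
    for direction, body in _DIRECTION.findall(m.group(2)):
        result[direction] = [(int(x), int(y)) for x, y in _SIZE.findall(body)]
    return payload_type, result
```

Applied to the payload type 114 line of the log, this yields three (x, y) pairs for each direction, in the order the caller listed them.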
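Step S32: select one set of resolutions from the multiple sets of resolutions, and establish the video call connection with the second user terminal according to the selected resolution.
In step S32, the first user terminal selects one set of resolutions from the multiple sets according to a preset policy and replies to the second user terminal with an INVITE response message.
By way of example, the INVITE response message log is as follows.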
a=imageattr:114 send[x=480,y=640]recv[x=480,y=640]
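The log above shows that the first user terminal and the second user terminal have negotiated payload format 114 and a resolution of [x=480,y=640] for the video call.
In some embodiments, as shown in FIG. 4, selecting one set of resolutions from the multiple sets of resolutions (that is, step S32) includes the following steps S321 to S324.
Step S321: look up the communication number of the second user terminal in the first user terminal.
In some embodiments, the communication number of the second user terminal may be a mobile phone number. In step S321, the first user terminal may look up the mobile phone number of the second user terminal locally to determine whether that number has been stored locally.
Step S322: determine whether the communication number of the second user terminal was found; if it was found, perform step S323; if it was not found, perform step S324.
In step S322, if the first user terminal finds the communication number of the second user terminal, that number has previously been stored in the first user terminal (for example, in the contact list), so the user of the second user terminal is probably an acquaintance of the user of the first user terminal (for example, a relative or friend), and a higher resolution may be used for the video call with the second user terminal. If the first user terminal does not find the communication number, that number has not been stored in the first user terminal (for example, in the contact list), so the user of the second user terminal is probably a stranger to the user of the first user terminal, and a lower resolution may be used for the video call with the second user terminal.
Step S323: select a first set of resolutions from the multiple sets of resolutions.
Step S324: select a second set of resolutions from the multiple sets of resolutions, where the second set of resolutions is lower than the first set of resolutions.
In some embodiments, the second set of resolutions may be the lowest of the multiple sets of video resolutions, and the first set of resolutions may be the highest of the multiple sets, or a set chosen at random from those that are not the lowest.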
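Steps S321 to S324 can be sketched as a lookup followed by a selection. The sketch below makes several assumptions that the text leaves open: resolutions are compared by pixel count (x times y), the contact list is modelled as a set of number strings, and the first set is taken to be the highest offered resolution (the text also allows a random non-lowest choice). `select_resolution` and `format_imageattr_answer` are hypothetical names.

```python
def select_resolution(offered, caller_number, contacts):
    """Return the (x, y) pair to answer with.

    offered: list of (x, y) pairs from the caller's imageattr list.
    Known caller (number found in contacts): highest resolution (S323).
    Unknown caller: lowest resolution (S324).
    """
    if not offered:
        raise ValueError("no resolutions offered")
    by_pixels = sorted(offered, key=lambda r: r[0] * r[1])
    return by_pixels[-1] if caller_number in contacts else by_pixels[0]


def format_imageattr_answer(payload_type, resolution):
    """Build an answer line in the same simple form as the response log."""
    x, y = resolution
    return f"a=imageattr:{payload_type} send[x={x},y={y}]recv[x={x},y={y}]"
```

With the three resolutions from the INVITE log, a caller found in the contact list gets the [x=480,y=640] answer shown in the response log, while an unknown caller would get [x=144,y=176].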
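To further strengthen the user's privacy protection and further improve the security of video calls, in some embodiments the video call method may further include the following step: when the video call connection is established according to the second set of resolutions and the image privacy protection function is disabled, enabling the image privacy protection function. That is, if the first user terminal does not find the communication number of the second user terminal locally, the user of the second user terminal is a stranger to the user of the first user terminal. In that case, the first user terminal not only uses a lower resolution for the video call but also, if the image privacy protection function is disabled, can enable it automatically, providing privacy protection on several fronts.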
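The switch behaviour described so far (a setting for all calls, a setting for the current call, and automatic enabling when a call is set up at the second set of resolutions) can be sketched as a small state holder. The class and method names are hypothetical, and modelling the per-call state as an override keyed by a call identifier is an assumption made for illustration.

```python
class PrivacySwitch:
    """Tracks whether image privacy protection applies to a given call."""

    def __init__(self):
        self.global_enabled = False   # set from the settings menu
        self._per_call = {}           # call_id -> bool, set from the call screen

    def enable_globally(self):
        self.global_enabled = True

    def enable_for_call(self, call_id):
        self._per_call[call_id] = True

    def is_enabled(self, call_id):
        # A per-call setting, if present, takes precedence over the global one.
        return self._per_call.get(call_id, self.global_enabled)

    def on_call_connected(self, call_id, used_second_group):
        """Auto-enable when the call used the lower (second) resolution set
        and the function is currently off for this call."""
        if used_second_group and not self.is_enabled(call_id):
            self.enable_for_call(call_id)
```

Scoping the automatic enabling to the current call, rather than flipping the global setting, is a design choice of this sketch; the text does not say which scope applies.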
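In the scheme where the data server performs privacy protection processing on the images in the video data, the video data of the video call is transmitted over Bootstrap data channels between the first user terminal and the data server and between the data server and the second user terminal. A Bootstrap data channel may be established before the video call connection is established or after it.
Therefore, in some embodiments, where the data processing request carrying the video data is sent to the data server through the first data channel between the first user terminal and the data server, the video call method may further include the following step: before the video call connection is established, establishing the first data channel with the data server; or, after the video call connection is established and before the data processing request is sent to the data server, establishing the first data channel with the data server.
It should be noted that the second data channel between the data server and the second user terminal may likewise be established before the video call connection is established, or after the video call connection is established and before the data processing request is sent.
The process of establishing a Bootstrap data channel is described in detail below, taking the first data channel between the first user terminal and the data server as an example.
The first user terminal sends the data server a SIP (Session Initiation Protocol) UPDATE message carrying a dcmap field. After receiving the SIP UPDATE message, the data server establishes a Hypertext Transfer Protocol (HTTP) connection with the first user terminal, thereby establishing the first data channel. The value of the dcmap field is the SDP media stream identifier (Stream ID); the meanings of the dcmap values are shown in Table 1:
Table 1
Figure PCTCN2022113236-appb-000001
In Table 1, "content source" denotes the initiator of the data channel: dcmap=0 means the data channel is initiated by the data server of the local network; dcmap=10 means it is initiated by a user terminal of the local network (for example, the first user terminal in FIG. 1); dcmap=100 means it is initiated by the data server of the remote network; and dcmap=110 means it is initiated by a user terminal of the remote network (for example, the second user terminal in FIG. 1).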
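The four stream ID values described for Table 1 can be captured in a lookup table. This sketch only encodes the meanings stated in the text; the names `BOOTSTRAP_INITIATORS` and `describe_initiator` are hypothetical, and treating any other value as "not a Bootstrap stream ID" is an assumption.

```python
# Stream ID -> (network, initiating entity), as described for Table 1.
BOOTSTRAP_INITIATORS = {
    0: ("local", "data server"),
    10: ("local", "user terminal"),
    100: ("remote", "data server"),
    110: ("remote", "user terminal"),
}


def describe_initiator(stream_id):
    """Return a readable description of who initiates the data channel,
    or None if the value is not one of the four listed stream IDs."""
    entry = BOOTSTRAP_INITIATORS.get(stream_id)
    if entry is None:
        return None
    network, entity = entry
    return f"{entity} of the {network} network"
```

For example, `describe_initiator(10)` returns "user terminal of the local network", which corresponds to the first user terminal in FIG. 1.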
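It should be noted that the first user terminal and the second user terminal may also establish a data channel directly, without going through the data server; a data channel between the two user terminals is a non-Bootstrap data channel. In the scheme where the first user terminal performs privacy protection processing on the images in the video data, the first user terminal may also send the privacy-protected video data to the second user terminal over a data channel established directly with the second user terminal (different from the IMS data channel).
An embodiment of the present disclosure further provides a video call method applied to a data server. As shown in FIG. 5, the video call method includes the following steps S51 to S53.
Step S51: receive, through the first data channel, the data processing request sent by the first user terminal and obtain the video data carried in the data processing request, where the data processing request is sent after the video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that the image privacy protection function is enabled, and the calling party of the video call is the second user terminal.
After the user of the first user terminal answers the IMS-type video call request initiated by the second user terminal and the video call connection is established, the user's video data and audio data are captured in real time. After determining that the image privacy protection function is enabled, the first user terminal sends a data processing request carrying the video data to the data server through the first data channel, requesting the data server to perform privacy protection processing on the images in the video data.
In step S51, the data server receives the data processing request carrying the video data through the first data channel with the first user terminal and obtains the video data from the request. This video data is the original video data, that is, video data without privacy protection processing.
Step S52: perform privacy protection processing on the images in the video data.
In step S52, the data server performs privacy protection processing on the images in the video data so that the initiator of the video call does not see images of sensitive regions.
Step S53: send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
In step S53, the data that the data server sends to the second user terminal includes not only video data but may also include audio data; the audio data is the original audio data, without privacy protection processing.
It should be noted that the first data channel and the second data channel are Bootstrap data channels.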
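Steps S51 to S53 can be sketched as a single handler on the data server. This is an illustration under stated assumptions: the request is modelled as a plain data class, the privacy processing and the second data channel are passed in as callables, and all names (`DataProcessingRequest`, `handle_request`, `protect`, `send_to_caller`) are hypothetical. Transport details of the Bootstrap data channels are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DataProcessingRequest:
    video_frames: List[Any]   # original frames, not yet privacy-protected
    audio: bytes = b""        # original audio, forwarded unchanged


def handle_request(request: DataProcessingRequest,
                   protect: Callable[[Any], Any],
                   send_to_caller: Callable[[List[Any], bytes], None]) -> List[Any]:
    """S51: take the video data out of the request.
    S52: apply privacy protection processing to every frame.
    S53: forward processed video plus original audio on the second channel."""
    protected = [protect(frame) for frame in request.video_frames]
    send_to_caller(protected, request.audio)
    return protected
```

Passing `protect` and `send_to_caller` as parameters keeps the sketch independent of any particular image-processing or transport library.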
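The video call method provided by the embodiment of the present disclosure is applied to the data server. The data server receives, through the first data channel, the data processing request carrying the video data sent by the first user terminal, performs privacy protection processing on the images in the video data, and sends the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal. The data processing request is sent after the video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that the image privacy protection function is enabled; the calling party of the video call is the second user terminal. By applying privacy protection processing to the images of the video call, the method ensures that the user at the other end sees only the processed images, which strengthens the user's privacy protection and can effectively improve the security of video calls.
In some embodiments, the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region. That is, privacy protection processing is performed on the biometric feature recognition region and/or the private body part region of the person image in the video data. By way of example, the biometric feature recognition region may be a face region, an eye region, a hand region that includes fingerprints, and so on.
In some embodiments, the privacy protection processing may include at least one of the following: occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution. That is, one or more of these operations (occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality (AR) expression, reducing resolution) are applied to one or more regions of the images in the video data. By way of example, the face region may be replaced with an avatar, an AR expression may be superimposed on the face region, the resolution of the face region may be reduced, the face region may be occluded with a preset pattern, the private body part region may be occluded, and so on.
It should be noted that avatars and AR expressions may be offered on the "image privacy protection function" settings screen of the first user terminal for the user to choose from; replacing with an avatar or superimposing an AR expression can also make video calls more fun.
In some embodiments, by way of example, the preset pattern may be a face mask, glasses, a costume mask, a wig, or a similar pattern.
In some embodiments, in certain video conference scenarios, the user wears formal attire on the upper body and loungewear on the lower body. By occluding the private body part region, the user's lower body can be hidden so that only the formally dressed upper body is shown, which avoids exposing private details and is more user-friendly.
Because the user's surroundings and nearby objects may also reveal private information such as location, personal preferences, and occupation, in some embodiments the images subject to privacy protection processing are not limited to person images and may include non-person images, for example background images, including images of the surrounding environment and of nearby objects. For non-person images, privacy protection processing may include occluding with a preset pattern and/or reducing resolution, for example reducing the resolution of the background region or occluding nearby objects with a cartoon pattern.
In some embodiments, the video call method further includes the following steps:
before the video call connection is established, establishing the first data channel with the first user terminal and the second data channel with the second user terminal; or, after the video call connection is established and before the data processing request sent by the first user terminal is received, establishing the first data channel with the first user terminal and the second data channel with the second user terminal. That is, the Bootstrap data channels (the first data channel and the second data channel) may be established before the video call connection between the first user terminal and the second user terminal is established, or after that connection is established and before the data processing request sent by the first user terminal is received.
In the video call method provided by the embodiments of the present disclosure, after the video call between the first user terminal and the second user terminal is established, the first user terminal, as the called party, captures audio data and video data in real time and runs private-region detection on the captured video data. If a private region is identified, privacy protection processing is applied to that region of the image, for example adding an occluding item such as a face mask, glasses, a costume mask, or a wig to the face region, reducing the resolution of the face region, replacing the face region with an avatar, or superimposing an AR expression. The audio data and the privacy-protected video data are then sent to the second user terminal, the calling party of the video call. The method can reduce risks to user privacy and security. It is especially suited to video calls with strangers: it can prevent high-definition face images from being misused to defeat video-based access control, which removes a personal-safety hazard, and it can prevent strangers from learning the user's identity, personal preferences, environment, location, and similar information from the images of the video call, strengthening the user's privacy protection.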
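Based on the same technical concept, an embodiment of the present disclosure further provides a user terminal. As shown in FIG. 6, the user terminal includes a communication module 101, an acquisition module 102, an image processing module 103, and a data channel management module 104.
The communication module 101 is configured to establish a video call connection in which the first user terminal is the called party, and to send processed video data to the second user terminal.
The acquisition module 102 is configured to acquire video data of the video call, where the calling party of the video call is the second user terminal.
The image processing module 103 is configured to perform privacy protection processing on the images in the video data when the image privacy protection function is enabled.
The data channel management module 104 is configured to send, when the image privacy protection function is enabled, a data processing request carrying the video data to the data server through the first data channel between the first user terminal and the data server. The data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal.
In some embodiments, the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region.
In some embodiments, the image processing module 103 is configured to perform at least one of the following kinds of privacy protection processing on the images in the video data: occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution.
In some embodiments, as shown in FIG. 7, the user terminal further includes a setting module 105 configured to enable the image privacy protection function in response to receiving a privacy protection instruction.
In some embodiments, the communication module 101 is further configured to, before the video call connection is established and in response to receiving the video call request sent by the second user terminal, obtain the multiple sets of resolutions carried in the video call request, select one set of resolutions from the multiple sets, and establish the video call connection with the second user terminal according to the selected resolution.
In some embodiments, the communication module 101 is further configured to look up the communication number of the second user terminal in the first user terminal; in response to finding the communication number, select a first set of resolutions from the multiple sets of resolutions; and in response to not finding the communication number, select a second set of resolutions from the multiple sets of resolutions, where the second set of resolutions is lower than the first set of resolutions.
In some embodiments, the setting module 105 is further configured to enable the image privacy protection function when the video call connection is established according to the second set of resolutions and the image privacy protection function is disabled.
In some embodiments, where the data processing request carrying the video data is sent to the data server through the first data channel between the first user terminal and the data server, the data channel management module 104 is further configured to establish the first data channel with the data server before the video call connection is established; or to establish the first data channel with the data server after the video call connection is established and before the data processing request is sent to the data server.
Based on the same concept, an embodiment of the present disclosure further provides a data server. As shown in FIG. 8, the data server includes a data channel management module 201 and an image processing module 202.
The data channel management module 201 is configured to receive, through the first data channel, the data processing request sent by the first user terminal, obtain the video data carried in the data processing request, and send the privacy-protected video data to the second user terminal through the second data channel between the data server and the second user terminal. The data processing request is sent after the video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that the image privacy protection function is enabled; the calling party of the video call is the second user terminal.
The image processing module 202 is configured to perform privacy protection processing on the images in the video data.
In some embodiments, the image includes a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region.
In some embodiments, the image processing module 202 is configured to perform at least one of the following kinds of privacy protection processing on the images in the video data: occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution.
In some embodiments, the data channel management module 201 is further configured to establish the first data channel with the first user terminal and the second data channel with the second user terminal before the video call connection is established; or to establish the first data channel with the first user terminal and the second data channel with the second user terminal after the video call connection is established and before the data processing request sent by the first user terminal is received.
An embodiment of the present disclosure further provides a computer device. As shown in FIG. 9, the computer device includes at least one processor 301 (only one is shown in FIG. 9) and a storage device 302. The storage device 302 stores at least one computer program which, when executed by the at least one processor 301, causes the at least one processor 301 to implement the video call method described above.
The processor 301 and the storage device 302 may be connected via a bus, for example.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video call method described above.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the apparatuses, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to a division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor (such as a central processing unit, digital signal processor, or microprocessor), or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and are to be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.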

Claims (13)

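  1. A video call method applied to a first user terminal, comprising:
    after a video call connection in which the first user terminal is the called party is established, acquiring video data of the video call, wherein the calling party of the video call is a second user terminal; and
    when an image privacy protection function is enabled, performing privacy protection processing on images in the video data and sending the processed video data to the second user terminal, or sending a data processing request carrying the video data to a data server through a first data channel between the first user terminal and the data server, wherein the data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
  2. The method according to claim 1, wherein the image comprises a person image, and privacy protection processing is performed on at least one of the following regions of the person image in the video data: a biometric feature recognition region and a private body part region.
  3. The method according to claim 2, wherein the privacy protection processing comprises at least one of the following:
    occluding with a preset pattern, replacing with an avatar, superimposing an augmented reality expression, and reducing resolution.
  4. The method according to claim 1, further comprising:
    in response to receiving a privacy protection instruction, enabling the image privacy protection function.
  5. The method according to claim 1, further comprising:
    before the video call connection is established, in response to receiving a video call request sent by the second user terminal, obtaining multiple sets of resolutions carried in the video call request; and
    selecting one set of resolutions from the multiple sets of resolutions, and establishing the video call connection with the second user terminal according to the selected resolution.
  6. The method according to claim 5, wherein selecting one set of resolutions from the multiple sets of resolutions comprises:
    looking up a communication number of the second user terminal in the first user terminal;
    in response to finding the communication number of the second user terminal, selecting a first set of resolutions from the multiple sets of resolutions; and
    in response to not finding the communication number of the second user terminal, selecting a second set of resolutions from the multiple sets of resolutions, wherein the second set of resolutions is lower than the first set of resolutions.
  7. The method according to claim 6, further comprising:
    when the video call connection is established according to the second set of resolutions and the image privacy protection function is disabled, enabling the image privacy protection function.
  8. The method according to claim 1, further comprising:
    where the data processing request carrying the video data is sent to the data server through the first data channel between the first user terminal and the data server, establishing the first data channel with the data server before the video call connection is established; or
    establishing the first data channel with the data server after the video call connection is established and before the data processing request is sent to the data server.
  9. A video call method applied to a data server, comprising:
    receiving, through a first data channel, a data processing request sent by a first user terminal, and obtaining video data carried in the data processing request, wherein the data processing request is sent after a video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that an image privacy protection function is enabled, and the calling party of the video call is a second user terminal;
    performing privacy protection processing on images in the video data; and
    sending the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
  10. A user terminal, comprising a communication module, an acquisition module, an image processing module, and a data channel management module, wherein
    the communication module is configured to establish a video call connection in which a first user terminal is the called party, and to send processed video data to a second user terminal;
    the acquisition module is configured to acquire video data of the video call, wherein the calling party of the video call is the second user terminal;
    the image processing module is configured to perform privacy protection processing on images in the video data when an image privacy protection function is enabled; and
    the data channel management module is configured to send, when the image privacy protection function is enabled, a data processing request carrying the video data to a data server through a first data channel between the first user terminal and the data server, wherein the data processing request is configured to cause the data server to perform privacy protection processing on the images in the video data and to send the privacy-protected video data to the second user terminal through a second data channel between the data server and the second user terminal.
  11. A data server, comprising a data channel management module and an image processing module, wherein the data channel management module is configured to receive, through a first data channel, a data processing request sent by a first user terminal, obtain video data carried in the data processing request, and send privacy-protected video data to a second user terminal through a second data channel between the data server and the second user terminal, wherein the data processing request is sent after a video call connection in which the first user terminal is the called party has been established and the first user terminal has determined that an image privacy protection function is enabled, and the calling party of the video call is the second user terminal; and
    the image processing module is configured to perform privacy protection processing on images in the video data.
  12. A computer device, comprising:
    at least one processor; and
    a storage device storing at least one computer program;
    wherein the at least one computer program, when executed by the at least one processor, causes the at least one processor to implement the video call method according to any one of claims 1 to 9.
  13. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the video call method according to any one of claims 1 to 9.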
PCT/CN2022/113236 2021-08-24 2022-08-18 Video call method, user terminal, data server, computer device, and computer-readable storage medium WO2023025020A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110974205.XA CN115914527A (zh) 2021-08-24 2021-08-24 Video call method and apparatus, computer device, and readable medium
CN202110974205.X 2021-08-24

Publications (1)

Publication Number Publication Date
WO2023025020A1 true WO2023025020A1 (zh) 2023-03-02

Family

ID=85322502

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/113236 WO2023025020A1 (zh) 2021-08-24 2022-08-18 Video call method, user terminal, data server, computer device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN115914527A (zh)
WO (1) WO2023025020A1 (zh)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101296429A (zh) * 2007-04-25 2008-10-29 华为技术有限公司 一种可视终端和来电处理方法
US20110149014A1 (en) * 2009-12-18 2011-06-23 Foxconn Communication Technology Corp. Communication device and privacy protection method
CN104836977A (zh) * 2014-02-10 2015-08-12 阿里巴巴集团控股有限公司 即时通讯过程中的视频通讯方法及系统
CN104935860A (zh) * 2014-03-18 2015-09-23 北京三星通信技术研究有限公司 视频通话实现方法及装置
CN105046133A (zh) * 2015-07-21 2015-11-11 深圳市元征科技股份有限公司 一种图像显示方法及车载终端
CN109274919A (zh) * 2017-07-18 2019-01-25 福州瑞芯微电子股份有限公司 视频通话中隐私保护方法、系统、视频通话终端及系统
CN108600679A (zh) * 2018-01-25 2018-09-28 维沃移动通信有限公司 一种视频通话方法及终端

Also Published As

Publication number Publication date
CN115914527A (zh) 2023-04-04

Similar Documents

Publication Publication Date Title
EP2761809B1 (en) Method, endpoint, and system for establishing a video conference
US8917306B2 (en) Previewing video data in a video communication environment
US9603173B2 (en) Synchronizing mobile devices and displays
US11089266B2 (en) Communication processing method, terminal, and storage medium
TWI650976B (zh) 即時通訊過程中的視頻通訊方法及系統
RU2637469C2 (ru) Способ, устройство и система осуществления вызовов в видеоконференциях, основанных на унифицированном общении
EP2850816B1 (en) Communication system
US20100153858A1 (en) Uniform virtual environments
CN113055628A (zh) 显示视频通话数据
US11290659B2 (en) Physical object-based visual workspace configuration system
CN108574689B (zh) 一种可视通话的方法和装置
CN114449112B (zh) 电话会议的提醒方法、电子设备及存储介质
US11509695B1 (en) Management of controlled-environment facility resident image and/or background during video visitation
CN107172466A (zh) 共享图像数据显示方法和系统
KR101172268B1 (ko) 영상 통화중 객체 숨김 서비스 제공 방법 및 시스템
WO2023025020A1 (zh) 视频通话方法、用户终端、数据服务器、计算机设备和计算机可读存储介质
CN114915852A (zh) 视频通话交互方法、装置、计算机设备和存储介质
KR20210013923A (ko) 영상 통화 중개 장치, 방법 및 컴퓨터 판독 가능한 기록매체
CN106657533B (zh) 通话处理方法及装置
CN108366223A (zh) 基于移动设备的多方视频会议系统和方法
KR101189052B1 (ko) 영상통화 중 감정 전달 시스템 및 방법
WO2023051739A1 (zh) 图像显示方法及装置、存储介质及电子装置
KR20110026137A (ko) 영상통화 중 감정 전달 시스템 및 방법
WO2018010700A1 (zh) 图像处理方法及装置
CN117097860A (zh) 一种视频通话中背景虚化的实现方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22860366

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE