CN113938634A

CN113938634A - Multi-channel video call processing method and display device

Info

Publication number: CN113938634A
Application number: CN202010674682.XA
Authority: CN
Inventors: 路锋; 高琨; 张磊
Original assignee: Juhaokan Technology Co Ltd
Current assignee: Juhaokan Technology Co Ltd
Priority date: 2020-07-14
Filing date: 2020-07-14
Publication date: 2022-01-14

Abstract

The invention discloses a multi-channel video call processing method and a display device. The method comprises: acquiring the current call path number of a video call from a server; in response to the current call path number being greater than a preset path number, invoking the interface layout template corresponding to the preset path number, displaying the preset path number of video windows in a first area of the video call interface, and displaying, in a second area, voice windows whose number is the difference between the current call path number and the preset path number; and in response to the current call path number being less than or equal to the preset path number, invoking the interface layout template corresponding to the current call path number and displaying the current call path number of video windows on the video call interface, without displaying a voice window. In this way, a multi-channel video call is no longer limited by the configuration and model of the display device, and the user's video call experience is improved.

Description

Multi-channel video call processing method and display device

Technical Field

The invention relates to the technical field of display devices, and in particular to a multi-channel video call processing method and a display device.

Background

A video call application can be installed on a display device. When a video call is established, several members can join the session in the corresponding virtual room, and several video windows must be displayed on the call interface, forming a multi-channel video call. However, the configuration of the display device is limited. Older models in particular generally support only a small number of video call paths, so when many members join a virtual room, the demand for a multi-channel video call cannot be met.

Disclosure of Invention

In order to solve the technical problem, the invention provides a multi-channel video call processing method and a display device.

In a first aspect, the present invention provides a display device comprising:

a display;

a sound player;

a communicator for communicatively connecting the display device with the server;

a user interface for receiving an operation input by a user;

a controller respectively connected to the display, the sound player, the communicator, and the user interface, for performing:

acquiring the current call path number of the video call from a server;

in response to the current call path number being greater than the preset path number, invoking the interface layout template corresponding to the preset path number to control the display to display the preset path number of video windows in a first area of the video call interface, and to display, in a second area, voice windows whose number is the difference between the current call path number and the preset path number;

and in response to the current call path number being less than or equal to the preset path number, invoking the interface layout template corresponding to the current call path number to control the display to display the current call path number of video windows on the video call interface, without displaying a voice window.

In a second aspect, the present invention provides a method for processing a multi-channel video call, including:

acquiring the current call path number of the video call from a server;

in response to the current call path number being greater than the preset path number, invoking the interface layout template corresponding to the preset path number, displaying the preset path number of video windows in a first area of the video call interface, and displaying, in a second area, voice windows whose number is the difference between the current call path number and the preset path number;

and in response to the current call path number being less than or equal to the preset path number, invoking the interface layout template corresponding to the current call path number and displaying the current call path number of video windows on the video call interface, without displaying a voice window.

According to the above technical solution, when a video call is initiated, the server establishes a virtual room, and the display device obtains the current call path number from the server, that is, the number of call members currently connected in the virtual room. The display mode of the video call interface is determined by comparing the current call path number with the preset path number of the display device. The preset path number is the number of video call paths that the display device can support, and it depends on factors such as the configuration and model of the display device. When the current call path number is less than or equal to the preset path number, the current call path number of video windows are displayed according to the corresponding interface layout template. In this case, all call members who have joined the virtual room are connected by video, and the video window of each member is displayed on the video call interface.

When the current call path number is greater than the preset path number, the display device cannot support video for all current call paths. For this case, the application sets a first area and a second area on the video call interface. The upper limit on video windows displayed in the first area is the preset path number, and the calls beyond the preset path number are connected by voice, that is, they are automatically switched to voice windows in the second area. The number of voice windows in the second area is therefore the difference between the current call path number and the preset path number, and the total number of windows in the first and second areas equals the current call path number. The local user can chat by voice with the call objects in the second area, so the multi-channel video call is no longer limited by the configuration and model of the display device, and the user's video call experience is improved.
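By way of illustration only, the window allocation described above can be sketched as follows. The names plan_layout and CallLayout are hypothetical and do not appear in the application; the sketch assumes only that the current call path number and the preset path number are available as integers, and it is not a definitive implementation.

```python
from dataclasses import dataclass


@dataclass
class CallLayout:
    """Hypothetical result structure for the layout decision."""
    template_paths: int  # path number whose interface layout template is invoked
    video_windows: int   # number of video windows (first area when voice windows exist)
    voice_windows: int   # number of voice windows in the second area


def plan_layout(current_paths: int, preset_paths: int) -> CallLayout:
    """Compare the current call path number with the preset path number."""
    if current_paths < 0 or preset_paths < 1:
        raise ValueError("path numbers must be non-negative and preset >= 1")
    if current_paths > preset_paths:
        # Excess calls are shown as voice windows in the second area.
        return CallLayout(
            template_paths=preset_paths,
            video_windows=preset_paths,
            voice_windows=current_paths - preset_paths,
        )
    # Every member fits in a video window; no voice window is displayed.
    return CallLayout(
        template_paths=current_paths,
        video_windows=current_paths,
        voice_windows=0,
    )


if __name__ == "__main__":
    # Current call path number 4, preset path number 3.
    print(plan_layout(4, 3))
```

In this sketch the sum of video_windows and voice_windows always equals the current call path number, which matches the relationship stated above.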

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram illustrating an operation scenario between a display device 200 and a control apparatus 100;

Fig. 2 is a block diagram illustrating a hardware configuration of the display device 200 in Fig. 1;

Fig. 3 is a block diagram schematically showing a hardware configuration of the control apparatus 100 in Fig. 1;

Fig. 4 is a schematic diagram illustrating a software configuration in the display device 200 in Fig. 1;

Fig. 5 is a schematic diagram illustrating an icon control interface display of an application on the display device 200;

Fig. 6a is a schematic diagram illustrating an interface layout template;

Fig. 6b exemplarily shows a schematic view of a video call interface when the current call path number and the preset path number are both 3;

Fig. 6c exemplarily shows a schematic view of a video call interface when the current call path number is 4 and the preset path number is 3;

Fig. 7 exemplarily shows a schematic view of a video call interface when the actual number of paths is 6 and the preset path number is 3;

Fig. 8 is a diagram illustrating the display of a prompt popup upon initiation of a video call;

Fig. 9 is a schematic diagram illustrating a video call interface when a list of controls is opened;

Fig. 10 is a schematic diagram illustrating a video call interface when a first object is selected;

Fig. 11 is a schematic diagram illustrating a video call interface when a second object selection popup is displayed;

Fig. 12 is a schematic diagram illustrating an interface when switching of the audio/video window is completed;

Fig. 13 is a flowchart illustrating a processing method for switching an audio/video window in a multi-channel video call;

Fig. 14 is a logic diagram illustrating the process of inviting a third object to join a video call;

Fig. 15 is a logic diagram illustrating the process when a call partner hangs up a video call.

Detailed Description

To make the objects, embodiments, and advantages of the present application clearer, the exemplary embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described exemplary embodiments are only some, not all, of the embodiments of the present application.

All other embodiments that a person skilled in the art can derive from the exemplary embodiments described herein without inventive effort fall within the scope of the appended claims. In addition, while the disclosure herein is presented in terms of one or more exemplary examples, it should be appreciated that individual aspects of the disclosure may also each constitute a complete embodiment.

It should be noted that the brief descriptions of the terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.

The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used to distinguish between similar objects or entities and, unless otherwise indicated, are not necessarily intended to imply any particular order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.

Furthermore, the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.

The term "module," as used herein, refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the functionality associated with that element.

The term "remote control" as used in this application refers to a component of an electronic device (such as the display device disclosed in this application) that can typically control the electronic device wirelessly over a relatively short distance. It typically connects to the electronic device using infrared and/or Radio Frequency (RF) signals and/or Bluetooth, and may also include WiFi, wireless USB, Bluetooth, motion sensors, and the like. For example, a hand-held touch remote controller replaces most of the physical built-in hard keys of a common remote control device with a user interface on a touch screen.

The term "gesture" as used in this application refers to a user's behavior through a change in hand shape or an action such as hand motion to convey a desired idea, action, purpose, or result.

Fig. 1 is a schematic diagram illustrating an operation scenario between a display device and a control apparatus according to an embodiment. As shown in fig. 1, a user may operate the display device 200 through the mobile terminal 300 and the control apparatus 100.

In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes infrared protocol communication, Bluetooth protocol communication, and other short-distance communication methods, and the display device 200 is controlled wirelessly or by wired means. The user may input user commands through keys on the remote controller, voice input, control panel input, and so on, to control the display device 200. For example, the user can input corresponding control commands through the volume up/down keys, channel control keys, up/down/left/right keys, voice input key, menu key, power key, and so on, on the remote controller to control the display device 200.

In some embodiments, mobile terminals, tablets, computers, laptops, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device. The application, through configuration, may provide the user with various controls in an intuitive User Interface (UI) on a screen associated with the smart device.

In some embodiments, the mobile terminal 300 and the display device 200 may each install a software application, so that connection and communication are implemented through a network communication protocol for one-to-one control operation and data communication. For example, the mobile terminal 300 and the display device 200 can establish a control instruction protocol, synchronize a remote control keyboard to the mobile terminal 300, and control the display device 200 by operating the user interface on the mobile terminal 300. The audio and video content displayed on the mobile terminal 300 can also be transmitted to the display device 200 to implement a synchronous display function.

As also shown in fig. 1, the display device 200 also communicates data with the server 400 through various communication means. The display device 200 may be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. For example, by sending and receiving information and through Electronic Program Guide (EPG) interaction, the display device 200 receives software program updates or accesses a remotely stored digital media library. The server 400 may be one cluster or several clusters, and may include one or more types of servers. Other network service contents, such as video on demand and advertisement services, are also provided through the server 400.

The display device 200 may be a liquid crystal display, an OLED display, or a projection display device. The particular display device type, size, resolution, etc. are not limited, and those skilled in the art will appreciate that the performance and configuration of the display device 200 may be modified as desired.

In addition to the broadcast receiving TV function, the display device 200 may additionally provide a smart network TV function with computer support, including, but not limited to, network TV, smart TV, Internet Protocol TV (IPTV), and the like.

A hardware configuration block diagram of a display device 200 according to an exemplary embodiment is exemplarily shown in fig. 2.

In some embodiments, at least one of the controller 250, the tuner demodulator 210, the communicator 220, the detector 230, the input/output interface 255, the display 275, the audio output interface 285, the memory 260, the power supply 290, the user interface 265, and the external device interface 240 is included in the display apparatus 200.

In some embodiments, a display 275 receives image signals originating from the first processor output and displays video content and images and components of the menu manipulation interface.

In some embodiments, the display 275 includes a display component for presenting a picture and a driving component that drives the display of an image.

In some embodiments, the video content is displayed from broadcast television content, or alternatively, from various broadcast signals that may be received via wired or wireless communication protocols. Alternatively, various image contents received from the network communication protocol and sent from the network server side can be displayed.

In some embodiments, the display 275 is used to present a user-manipulated UI interface generated in the display apparatus 200 and used to control the display apparatus 200.

In some embodiments, a driver assembly for driving the display is also included, depending on the type of display 275.

In some embodiments, display 275 is a projection display and may also include a projection device and a projection screen.

In some embodiments, communicator 220 is a component for communicating with external devices or external servers according to various communication protocol types. For example: the communicator may include at least one of a Wifi chip, a bluetooth communication protocol chip, a wired ethernet communication protocol chip, and other network communication protocol chips or near field communication protocol chips, and an infrared receiver.

In some embodiments, the display apparatus 200 may establish control signal and data signal transmission and reception with the external control device 100 or the content providing apparatus through the communicator 220.

In some embodiments, the user interface 265 may be configured to receive infrared control signals from a control device 100 (e.g., an infrared remote control, etc.).

In some embodiments, the detector 230 is a component used by the display device 200 to collect signals from the external environment or to interact with the outside.

In some embodiments, the detector 230 includes a light receiver, that is, a sensor for collecting the intensity of ambient light, so that display parameters can be adaptively changed according to the collected ambient light.

In some embodiments, the detector 230 may further include an image collector, such as a camera, which may be used to collect external environment scenes and to collect attributes of the user or gestures used to interact with the user, so as to adaptively change display parameters and to recognize user gestures for interaction with the user.

In some embodiments, the detector 230 may also include a temperature sensor or the like, for example to sense the ambient temperature.

In some embodiments, the display device 200 may adaptively adjust the display color temperature of an image. For example, the display device 200 may be adjusted to display a cooler tone in a high-temperature environment, or a warmer tone in a low-temperature environment.
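Purely as an illustration, the temperature-dependent tone adjustment could be sketched as below. The function name pick_color_tone, the returned labels, and the two threshold values are assumptions made for this sketch; the application does not specify any thresholds or an intermediate "neutral" case.

```python
def pick_color_tone(ambient_celsius: float,
                    high_threshold: float = 28.0,
                    low_threshold: float = 15.0) -> str:
    """Choose a display tone from the sensed ambient temperature.

    The thresholds are hypothetical placeholders, not values from the
    application.
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if ambient_celsius >= high_threshold:
        return "cool"     # high-temperature environment: cooler tone
    if ambient_celsius <= low_threshold:
        return "warm"     # low-temperature environment: warmer tone
    return "neutral"      # assumed behavior between the two thresholds


if __name__ == "__main__":
    for reading in (35.0, 20.0, 5.0):
        print(reading, pick_color_tone(reading))
```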

In some embodiments, the detector 230 may also include a sound collector, such as a microphone, which may be used to receive the user's voice, for example a voice signal containing a control instruction from the user for controlling the display device 200, or to collect ambient sound for recognizing the type of ambient scene, so that the display device 200 can adapt to ambient noise.

In some embodiments, as shown in fig. 2, the input/output interface 255 is configured to allow data transfer between the controller 250 and external other devices or other controllers 250. Such as receiving video signal data and audio signal data of an external device, or command instruction data, etc.

In some embodiments, the external device interface 240 may include, but is not limited to, any one or more of a High-Definition Multimedia Interface (HDMI), an analog or digital high-definition component input interface, a composite video input interface, a USB input interface, an RGB port, and the like. A plurality of these interfaces may also form a composite input/output interface.

In some embodiments, as shown in fig. 2, the tuner demodulator 210 is configured to receive broadcast television signals in a wired or wireless manner, perform modulation and demodulation processing such as amplification, mixing, and resonance, and demodulate audio and video signals from a plurality of wireless or wired broadcast television signals. The audio and video signals may include the television audio and video signal carried on the television channel frequency selected by the user, as well as an EPG data signal.

In some embodiments, the frequency demodulated by the tuner demodulator 210 is controlled by the controller 250. The controller 250 can send a control signal according to the user's selection, so that the tuner demodulator 210 responds to the television signal frequency selected by the user and modulates and demodulates the television signal carried on that frequency.

In some embodiments, broadcast television signals may be classified, according to the broadcasting system of the television signal, into terrestrial broadcast signals, cable broadcast signals, satellite broadcast signals, internet broadcast signals, and the like; according to modulation type, into digitally modulated signals, analog modulated signals, and the like; or according to signal type, into digital signals, analog signals, and the like.

In some embodiments, the controller 250 and the tuner demodulator 210 may be located in separate devices; that is, the tuner demodulator 210 may be located in a device external to the main device in which the controller 250 resides, such as an external set-top box. The set-top box then outputs the television audio and video signals obtained by modulating and demodulating the received broadcast television signals to the main device, and the main device receives the audio and video signals through the first input/output interface.

In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in memory. The controller 250 may control the overall operation of the display apparatus 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 275, the controller 250 may perform an operation related to the object selected by the user command.

In some embodiments, the object may be any one of selectable objects, such as a hyperlink or an icon. Operations related to the selected object, such as: displaying an operation connected to a hyperlink page, document, image, or the like, or performing an operation of a program corresponding to the icon. The user command for selecting the UI object may be a command input through various input means (e.g., a mouse, a keyboard, a touch pad, etc.) connected to the display apparatus 200 or a voice command corresponding to a voice spoken by the user.

As shown in fig. 2, the controller 250 includes at least one of a Random Access Memory (RAM) 251, a Read-Only Memory (ROM) 252, a video processor 270, an audio processor 280, other processors 253 (e.g., a Graphics Processing Unit (GPU)), a Central Processing Unit (CPU) 254, a communication interface, and a communication bus 256 that connects the respective components.

In some embodiments, RAM 251 is used to store temporary data for the operating system or other programs that are running, and in some embodiments, ROM252 is used to store instructions for various system boots.

In some embodiments, the ROM 252 is used to store a Basic Input Output System (BIOS). The BIOS completes the power-on self-test of the system, the initialization of each functional module in the system, the loading of the basic input/output drivers, and the booting of the operating system.

In some embodiments, when the power-on signal is received, the display device 200 starts to power up, the CPU executes the system boot instruction in the ROM252, and copies the temporary data of the operating system stored in the memory to the RAM 251 so as to start or run the operating system. After the start of the operating system is completed, the CPU copies the temporary data of the various application programs in the memory to the RAM 251, and then, the various application programs are started or run.

In some embodiments, the CPU processor 254 is used to execute the operating system and application program instructions stored in memory, and to execute various application programs, data, and contents according to various interactive instructions received from the outside, so as to finally display and play various audio and video contents.

In some example embodiments, the CPU processor 254 may comprise a plurality of processors, which may include a main processor and one or more sub-processors. The main processor performs some operations of the display device 200 in a pre-power-up mode and/or displays the screen in a normal mode. The one or more sub-processors perform operations in a standby mode or the like.

In some embodiments, the graphics processor 253 is used to generate various graphics objects, such as icons, operation menus, and graphics displayed for user input instructions. It includes an arithmetic unit, which performs operations on the various interactive instructions input by the user and displays the various objects according to their display attributes, and a renderer, which renders the objects obtained from the arithmetic unit for display on the display.

In some embodiments, the video processor 270 is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal, so as to obtain a signal that can be displayed or played directly on the display device 200.

In some embodiments, video processor 270 includes a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like.

The demultiplexing module demultiplexes the input audio and video data stream. For example, if an MPEG-2 stream is input, the demultiplexing module demultiplexes it into a video signal and an audio signal.

The video decoding module processes the demultiplexed video signal, including decoding, scaling, and the like.

The image synthesis module superimposes and mixes the GUI signal, which the graphics generator produces itself or in response to user input, with the scaled video image to generate an image signal for display.

The frame rate conversion module converts the frame rate of the input video, for example from 60 Hz to 120 Hz or 240 Hz, typically by frame interpolation.

The display formatting module converts the frame-rate-converted video output signal into a signal that conforms to the display format, for example by outputting an RGB data signal.
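As a rough illustration of the frame interpolation mentioned for the frame rate conversion module, the sketch below doubles a frame sequence (for example, 60 Hz to 120 Hz) by inserting a linearly blended frame between each pair of neighbouring frames. The name double_frame_rate, the representation of a frame as a flat list of pixel values, and the choice of simple linear blending are all assumptions for this sketch; the application does not describe a particular interpolation technique, and practical implementations often use motion-compensated methods instead.

```python
from typing import List

Frame = List[float]  # hypothetical frame: a flat list of pixel intensities


def double_frame_rate(frames: List[Frame]) -> List[Frame]:
    """Return twice as many frames by inserting blended in-between frames."""
    out: List[Frame] = []
    for cur, nxt in zip(frames, frames[1:]):
        out.append(list(cur))
        # Inserted frame: per-pixel average of the two neighbouring frames.
        out.append([(a + b) / 2.0 for a, b in zip(cur, nxt)])
    if frames:
        # The last frame has no successor, so it is shown and then repeated
        # to keep the output exactly twice as long as the input.
        out.append(list(frames[-1]))
        out.append(list(frames[-1]))
    return out


if __name__ == "__main__":
    one_second_at_60hz = [[float(i)] for i in range(60)]
    print(len(double_frame_rate(one_second_at_60hz)))
```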

In some embodiments, the graphics processor 253 and the video processor may be integrated or configured separately. When integrated, they jointly process the graphics signals output to the display. When configured separately, they perform different functions, for example in a GPU + FRC (Frame Rate Conversion) architecture.

In some embodiments, the audio processor 280 is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform noise reduction, digital-to-analog conversion, and amplification processes to obtain an audio signal that can be played in a speaker.

In some embodiments, video processor 270 may comprise one or more chips. The audio processor may also comprise one or more chips.

In some embodiments, the video processor 270 and the audio processor 280 may be separate chips or may be integrated together with the controller in one or more chips.

In some embodiments, the audio output interface 285, under the control of the controller 250, receives the sound signals output by the audio processor 280. It includes, for example, the speaker 286 and, in addition to the speaker carried by the display device 200 itself, an external sound output terminal that can output to a sound-generating device of an external device, such as an external sound interface or an earphone interface. It may also include a near field communication module in the communication interface, for example a Bluetooth module for outputting sound to a Bluetooth speaker.

The power supply 290, under the control of the controller 250, supplies the display device 200 with power input from an external power source. The power supply 290 may include a built-in power supply circuit installed inside the display device 200, or may be a power supply installed outside the display device 200, with the display device 200 providing a power interface for the external power supply.

The user interface 265 receives a user's input signal and transmits it to the controller 250. The user input signal may be a remote controller signal received through an infrared receiver, and various user control signals may also be received through the network communication module.

In some embodiments, the user inputs a user command through the control apparatus 100 or the mobile terminal 300, the user input interface receives the user input, and the display device 200 responds to the user input through the controller 250.

In some embodiments, a user may enter user commands on a Graphical User Interface (GUI) displayed on the display 275, and the user input interface receives the user input commands through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.

In some embodiments, a "user interface" is a media interface for interaction and information exchange between an application or operating system and a user that enables conversion between an internal form of information and a form that is acceptable to the user. A commonly used presentation form of the User Interface is a Graphical User Interface (GUI), which refers to a User Interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, window, control, etc. displayed in the display of the electronic device, where the control may include a visual interface element such as an icon, button, menu, tab, text box, dialog box, status bar, navigation bar, Widget, etc.

The memory 260 stores various software modules for driving the display device 200. For example, the software modules stored in the first memory include at least one of a basic module, a detection module, a communication module, a display control module, a browser module, and various service modules.

The basic module is a bottom-layer software module used for signal communication among the hardware components of the display device 200 and for sending processing and control signals to upper-layer modules. The detection module is used for collecting various information from various sensors or user input interfaces, and the management module is used for performing digital-to-analog conversion and analysis management.

For example, the voice recognition module comprises a voice analysis module and a voice instruction database module. The display control module is used for controlling the display to display image content, and can be used for playing multimedia image content, UI interfaces and other information. The communication module is used for control and data communication with external devices. The browser module is used for data communication with browsing servers. The service module is used for providing various services and includes various application programs. Meanwhile, the memory 260 may store a visual effect map for receiving external data and user data, images of various items in various user interfaces, a focus object, and the like.

Fig. 3 exemplarily shows a block diagram of a configuration of the control apparatus 100 according to an exemplary embodiment. As shown in fig. 3, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface, a memory, and a power supply.

The control apparatus 100 is configured to control the display device 200 and may receive an input operation instruction of a user and convert the operation instruction into an instruction recognizable and responsive by the display device 200, serving as an interaction intermediary between the user and the display device 200. Such as: the user operates the channel up/down key on the control device 100, and the display device 200 responds to the channel up/down operation.

In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications that control the display device 200 according to user demands.

In some embodiments, as shown in fig. 1, a mobile terminal 300 or other intelligent electronic device may function similarly to the control apparatus 100 after an application for manipulating the display device 200 is installed. For example, by installing the application, the user may use the various function keys or virtual buttons of a graphical user interface available on the mobile terminal 300 or other intelligent electronic device to implement the functions of the physical keys of the control apparatus 100.

The controller 110 includes a processor 112 and RAM 113 and ROM 114, a communication interface 130, and a communication bus. The controller is used for controlling the operation of the control device 100, as well as the communication cooperation among the internal components and the external and internal data processing functions.

The communication interface 130 enables communication of control signals and data signals with the display apparatus 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display apparatus 200. The communication interface 130 may include at least one of a WiFi chip 131, a bluetooth module 132, an NFC module 133, and other near field communication modules.

A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touch pad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can realize a user instruction input function through actions such as voice, touch, gesture, pressing, and the like, and the input interface converts the received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the instruction signal to the display device 200.

The output interface includes an interface that transmits the received user instruction to the display apparatus 200. In some embodiments, the interface may be an infrared interface or a radio frequency interface. For example, when the infrared signal interface is used, the user input instruction needs to be converted into an infrared control signal according to an infrared control protocol, and the infrared control signal is sent to the display device 200 through the infrared sending module. For another example, when the radio frequency signal interface is used, the user input instruction needs to be converted into a digital signal, which is then modulated according to the radio frequency control signal modulation protocol and transmitted to the display device 200 through the radio frequency transmitting terminal.

In some embodiments, the control device 100 includes at least one of a communication interface 130 and an input-output interface 140. When the control device 100 is configured with a communication interface 130 such as a WiFi, bluetooth, or NFC module, the user input instruction may be encoded according to the WiFi protocol, the bluetooth protocol, or the NFC protocol and transmitted to the display device 200.

The memory 190 stores various operation programs, data and applications for driving and controlling the control apparatus 100 under the control of the controller. The memory 190 may store various control signal commands input by a user.

The power supply 180 provides operating power for each element of the control device 100 under the control of the controller, and may include a battery and associated control circuitry.

In some embodiments, the system may include a Kernel (Kernel), a command parser (shell), a file system, and an application program. The kernel, shell, and file system together make up the basic operating system structure that allows users to manage files, run programs, and use the system. After power-on, the kernel is started, kernel space is activated, hardware is abstracted, hardware parameters are initialized, and virtual memory, a scheduler, signals and interprocess communication (IPC) are operated and maintained. And after the kernel is started, loading the Shell and the user application program. The application program is compiled into machine code after being started, and a process is formed.

Referring to fig. 4, in some embodiments, the system is divided into four layers, which are an Application (Applications) layer (abbreviated as "Application layer"), an Application Framework (Application Framework) layer (abbreviated as "Framework layer"), an Android runtime (Android runtime) and system library layer (abbreviated as "system runtime library layer"), and a kernel layer from top to bottom.

In some embodiments, at least one application program runs in the application program layer, and the application programs can be Window (Window) programs carried by an operating system, system setting programs, clock programs, camera applications and the like; or may be an application developed by a third party developer such as a hi program, a karaoke program, a magic mirror program, or the like. In specific implementation, the application packages in the application layer are not limited to the above examples, and may actually include other application packages, which is not limited in this embodiment of the present application.

The framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions. The application framework layer acts as a processing center that decides to let the applications in the application layer act. The application program can access the resources in the system and obtain the services of the system in execution through the API interface.

As shown in fig. 4, in the embodiment of the present application, the application framework layer includes a manager (Managers), a Content Provider (Content Provider), a View System (View System), and the like, where the manager includes at least one of the following modules: an Activity Manager (Activity Manager) is used for interacting with all activities running in the system; the Location Manager (Location Manager) is used for providing the system service or application with the access of the system Location service; a Package Manager (Package Manager) for retrieving various information related to an application Package currently installed on the device; a Notification Manager (Notification Manager) for controlling display and clearing of Notification messages; a Window Manager (Window Manager) is used to manage the icons, windows, toolbars, wallpapers, and desktop components on a user interface.

In some embodiments, the activity manager is to: managing the life cycle of each application program and the general navigation backspacing function, such as controlling the exit of the application program (including switching the user interface currently displayed in the display window to the system desktop), opening, backing (including switching the user interface currently displayed in the display window to the previous user interface of the user interface currently displayed), and the like.

In some embodiments, the window manager is used to manage all window processes, such as obtaining display size, determining if there is a status bar, locking the screen, intercepting the screen, controlling display window changes (e.g., zooming out, dithering, distorting, etc.) and the like.

In some embodiments, the system runtime layer provides support for the upper layer, i.e., the framework layer, and when the framework layer is used, the android operating system runs the C/C++ library included in the system runtime layer to implement the functions to be implemented by the framework layer.

In some embodiments, the kernel layer is a layer between hardware and software. As shown in fig. 4, the kernel layer includes at least one of the following drivers: an audio driver, a display driver, a bluetooth driver, a camera driver, a WIFI driver, a USB driver, an HDMI driver, sensor drivers (such as a fingerprint sensor, a temperature sensor, a touch sensor, a pressure sensor, etc.), and so on.

In some embodiments, the kernel layer further comprises a power driver module for power management.

In some embodiments, software programs and/or modules corresponding to the software architecture of fig. 4 are stored in the first memory or the second memory shown in fig. 2 or 3.

In some embodiments, taking the magic mirror application (photographing application) as an example, when the remote control receiving device receives a remote control input operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the input operation into an original input event (including information such as the value of the input operation and the timestamp of the input operation). The original input event is stored at the kernel layer. The application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event according to the current position of the focus. Taking the input operation as a confirmation operation whose corresponding control is the magic mirror application icon, the magic mirror application calls an interface of the application framework layer to start the magic mirror application, and then calls the kernel layer to start the camera driver, so that a static image or a video is captured through the camera.

In some embodiments, for a display device with a touch function, taking a split screen operation as an example, the display device receives an input operation (such as a split screen operation) applied to the display by a user, and the kernel layer may generate a corresponding input event according to the input operation and report the event to the application framework layer. The activity manager of the application framework layer sets the window mode (such as multi-window mode) corresponding to the input operation, the position and size of the window, and the like. The window manager of the application framework layer draws the window according to the settings of the activity manager and then sends the drawn window data to the display driver of the kernel layer, and the display driver displays the corresponding application interfaces in different display areas of the display.

In some embodiments, as shown in fig. 5, the application layer containing at least one application may display a corresponding icon control in the display, such as: a live television application icon control, a Video On Demand (VOD) application icon control, a media center application icon control, an application center icon control, a game application icon control, and the like.

In some embodiments, the live television application may provide live television via different signal sources. For example, a live television application may provide television signals using input from cable television, radio broadcasts, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.

In some embodiments, a video-on-demand application may provide video from different storage sources. Unlike a live television application, video on demand provides video from a storage source. For example, the video on demand may come from the server side of cloud storage or from local hard disk storage containing stored video programs.

In some embodiments, the media center application may provide various applications for multimedia content playback. For example, the media center may provide services other than live television or video on demand, through which a user may access various images or audio.

In some embodiments, an application center may provide storage for various applications. The application may be a game, an application, or some other application associated with a computer system or other device that may be run on the smart television. The application center may obtain these applications from different sources, store them in local storage, and then be operable on the display device 200.

The hardware configuration, software configuration, and function implementation of the display device have been introduced above. When an initiator initiates a video call, the server establishes a corresponding virtual room. The display device is provided with a sound collector and an image collector: during the video call, the image collector collects the video of the local user in real time and the sound collector collects the audio of the local user in real time, and the audio and video data of the local user are then sent to the server. The audio and video data of every member in the virtual room can be uploaded to the server in this way, so that during the video call the display device acquires the audio and video data of the other members from the server and displays and plays them at the local end.

After the video call is started, the virtual room records the current call path number, that is, the number of call persons currently accessing the virtual room, and as the number of call members increases (for example, new call is answered, a new member is invited to join the video call, etc.) or decreases (for example, a member hangs up the video call), the current call path number recorded in the virtual room also needs to be updated accordingly.

In some embodiments, the display device needs to obtain the current call path number from the server to compare the relationship between the current call path number and the preset path number, so as to execute the corresponding multi-path video call logic. The display device may send an inquiry request to the server, and the server, in response to the inquiry request, may inquire the current number of call paths recorded in the virtual room and send the current number of call paths to the display device, or the server, when detecting that the current number of call paths recorded in the virtual room changes, needs to send the latest current number of call paths to the display device, thereby synchronizing the current number of call paths recorded in the virtual room to the display devices of each call member.
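
As an illustration only, the synchronization described above may be sketched as follows. The class name `VirtualRoom`, its methods, and the callback-based notification are hypothetical and are not taken from the application; the sketch merely assumes that the room keeps a member list, that the current call path number is the length of that list, and that each subscribed display device is told the new number whenever it changes.

```python
from typing import Callable, List


class VirtualRoom:
    """Hypothetical server-side room that records the current call path number."""

    def __init__(self) -> None:
        self.members: List[str] = []
        # Callbacks standing in for the display devices to be notified.
        self._listeners: List[Callable[[int], None]] = []

    @property
    def current_paths(self) -> int:
        # The current call path number is the number of members in the room.
        return len(self.members)

    def subscribe(self, listener: Callable[[int], None]) -> None:
        self._listeners.append(listener)

    def join(self, member_id: str) -> None:
        if member_id not in self.members:
            self.members.append(member_id)
            self._notify()

    def leave(self, member_id: str) -> None:
        if member_id in self.members:
            self.members.remove(member_id)
            self._notify()

    def _notify(self) -> None:
        # Push the latest path number to every subscribed display device.
        for listener in self._listeners:
            listener(self.current_paths)
```

A display device that prefers to send an inquiry request instead of waiting for a push would simply read `current_paths` in this sketch.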

Depending on configuration and model, different display devices have different preset path numbers. The preset path number is the number of video call paths that the display device can support, that is, the upper limit on the number of video windows that the video call interface can display. It is determined according to the processing capacity of the display device and empirical values, and the number of paths supported by each display device is fixed for its hardware model. In some embodiments, a display device with a higher configuration may support video calls of up to 9 paths. A display device that supports only 2-way video calls can carry out only one-to-one video calls, whereas a display device that supports 2-way to 9-way video calls can carry out many-to-many video calls.

The display device generally has interface layout templates, which set the size and position of each video call window during a call. When the current call path number is less than or equal to the preset path number, the interface layout template corresponding to the current call path number may be called to display the video windows; each interface layout template corresponds to a call path number. In some embodiments, fig. 6a shows different interface layout templates corresponding to call path numbers of 2 to 9. When the path number is 2, the video picture of the friend may be displayed in a full-screen window and the video picture of the local user in a small window; alternatively, the two video call windows may have the same size. When the path number is 3 to 9, the video windows may be arranged in an array. For example, if the preset path number of a certain display device is 6, video calls of 2 to 6 paths are supported; if the current call path number is 4, the video windows of the 4 call members are displayed according to the interface layout template for 4 paths, that is, the 2 × 2 window layout in fig. 6a. The interface layout template corresponding to each call path number is only one display layout for implementing the call method; those skilled in the art may provide other designs by adjusting the size and position of the video windows, and the size and position of the display windows do not affect the implementation of the scheme.
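
One possible way to express such templates in code is sketched below. The function `layout_template` and the `Rect` tuple are hypothetical. Only the 2-path layout (full-screen window plus small window) and the 2 × 2 layout for 4 paths follow the description; the near-square grid chosen for the other path numbers and the position of the small window are assumptions made for illustration and may differ from fig. 6a.

```python
import math
from typing import List, Tuple

# (x, y, width, height) expressed as fractions of the call interface.
Rect = Tuple[float, float, float, float]


def layout_template(paths: int) -> List[Rect]:
    """Return one window rectangle per call path (hypothetical sketch)."""
    if not 2 <= paths <= 9:
        raise ValueError("templates are defined for 2 to 9 paths")
    if paths == 2:
        # Peer in a full-screen window, local user in a small window.
        # The corner placement and size of the small window are assumptions.
        return [(0.0, 0.0, 1.0, 1.0), (0.75, 0.75, 0.25, 0.25)]
    # Assumed near-square grid for 3 to 9 paths, filled row by row.
    cols = math.ceil(math.sqrt(paths))
    rows = math.ceil(paths / cols)
    w, h = 1.0 / cols, 1.0 / rows
    return [((i % cols) * w, (i // cols) * h, w, h) for i in range(paths)]
```

With this sketch, `layout_template(4)` yields four windows of half the width and half the height of the interface, which matches the 2 × 2 arrangement mentioned above.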

However, in some embodiments, different display devices may support different numbers of call paths. When two display devices supporting different numbers of call paths start a call, or when a newly invited call party joins during the call and the number of call paths supported by a display device is exceeded, the display device supporting the lower number of paths still needs to be kept in the call, and its user must not be forced to quit.

In this application, the basic processing logic of the multi-channel video call is: acquiring the current call path number of the video call from a server; in response to the current call path number being greater than the preset path number, calling the interface layout template corresponding to the preset path number to control the display to display video windows of the preset path number in a first area on the video call interface, and displaying, in a second area, voice windows whose number is the difference between the current call path number and the preset path number; and in response to the current call path number being less than or equal to the preset path number, calling the interface layout template corresponding to the current call path number to control the display to display video windows of the current call path number on the video call interface, without displaying a voice window. The basic processing logic of the multi-channel video call is first described in detail below.
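
The comparison at the core of this logic can be written compactly. The following is a minimal sketch; the names `WindowPlan` and `plan_windows` are hypothetical and only the arithmetic comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowPlan:
    template_paths: int  # which interface layout template to call
    video_windows: int   # windows shown in the first area
    voice_windows: int   # windows shown in the second area (0 = no second area)


def plan_windows(current_paths: int, preset_paths: int) -> WindowPlan:
    """Decide how many video and voice windows to display."""
    if current_paths > preset_paths:
        # More members than the device supports: cap the video windows at
        # the preset number and show the remainder as voice windows.
        return WindowPlan(preset_paths, preset_paths,
                          current_paths - preset_paths)
    # Otherwise every member gets a video window and no voice window is shown.
    return WindowPlan(current_paths, current_paths, 0)
```

For example, `plan_windows(10, 9)` gives 9 video windows and 1 voice window, while `plan_windows(4, 6)` gives 4 video windows and no voice window.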

In some embodiments, when many members join the virtual room of a video call, the current call path number may exceed the preset path number. For example, communication group A has 10 members and user B initiates a video chat in the group. When all of the other 9 members answer the video call, the current call path number is 10, but the preset path number of the display device is 9, that is, the video call interface displays at most 9 video windows. One member therefore cannot be accessed by video and is accessed by voice instead: a mode combining video windows and voice windows is adopted, and 9 video windows and 1 voice window are displayed on the video call interface. For another example, when display device A supporting 2-way video is in a two-way video call with display device B supporting 3-way video and either party invites a third party, display device A determines from the call access times that the third-party display device C was the last to access and presents the audio and video data of display device C as a voice call, so that the current call can be maintained; display device B supporting 3-way video calls the interface layout template for a 3-way video call and continues the video call.

For another example, when display device A supporting 2-way video, display device B supporting 3-way video, and display device C supporting 4-way video are in a three-way video call and any of the three parties invites display device D, display device A supporting 2-way video determines from the call access times that display device D was the last to access and presents the audio and video data of display device D as a voice call, so that the current call can be maintained; display device B supporting 3-way video likewise determines from the call access times that display device D was the last to access and presents its audio and video data as a voice call, so that the current call can be maintained; and display device C supporting 4-way video displays the call windows using the interface layout template for a 4-way video call.
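
The access-time rule in these examples can be sketched as a simple sort followed by a split. The function name `split_by_access_time` is hypothetical, and the sketch assumes that the local user's own path is counted in the member list and against the preset path number, as in the 3-way example with user 1, user 2 and the user himself.

```python
from typing import Dict, List, Tuple


def split_by_access_time(access_times: Dict[str, float],
                         preset_paths: int) -> Tuple[List[str], List[str]]:
    """Split members into (video, voice) lists by call access time.

    The earliest `preset_paths` members keep video windows; members that
    accessed later are presented as voice only.
    """
    ordered = sorted(access_times, key=access_times.get)
    return ordered[:preset_paths], ordered[preset_paths:]


# Access order from the example above: A, B, C, then the invited D.
call = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}
on_device_a = split_by_access_time(call, 2)  # device supporting 2-way video
on_device_b = split_by_access_time(call, 3)  # device supporting 3-way video
on_device_c = split_by_access_time(call, 4)  # device supporting 4-way video
```

Each display device applies the same rule with its own preset path number, so the same call is presented differently on devices A, B and C.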

In some embodiments, the display device determines, according to the access time of each opposite-end display device participating in the call, which opposite-end display devices have their audio and video data presented as video and which have their data presented as audio: an opposite-end display device whose access brings the number of paths above the preset number has only its audio data presented.

In some embodiments, the display device sets audio and video data of the display device performing video presentation and audio and video data of the display device performing audio presentation only according to a switching operation of a user.

In some embodiments, as shown in fig. 6b and fig. 6c, the interface implementation during the call is explained by taking the display device supporting the 3-way video call as an example, and the switching process of other ways is similar to this example. In a window displayed on an interface, user identifiers are added on a voice window and a video window to represent different display devices corresponding to audio and video streams, in some embodiments, the identifiers can be generated according to an address book or a remark of the device, and in some embodiments, the identifiers can be generated according to a device identifier or an account identifier of an opposite terminal.

In some embodiments, the display device of user 1 is display device a, the display device of user 2 is display device B, the display device of user himself is display device C, and the display device of user 4 is display device D.

In some embodiments, in the scenario shown in fig. 6b, user 1, user 2, and the user himself are in a video call. The server establishes a virtual room for the video call, and display device A, display device B, and display device C each access the virtual room, upload locally acquired audio/video data to the server, and pull the audio/video data uploaded by the opposite-end display devices according to the device ID and/or the account ID. In some embodiments, the server may distinguish the audio/video data of each opposite end according to the device ID and/or the account ID and then notify the display device to pull it.

In some embodiments, in the scenario shown in fig. 6c, the current call path number becomes 4 when any of the three parties already in the call invites user 4 to join through an invitation control in the interface, when 4 call members join the call from the start but user 4 joins later than user 1, user 2 and the user himself, or when user 4 joins the call through some other automatic joining method. The display device then adds a floating layer (that is, the second area) on the original interface layout and sets a voice window in the second area. The voice window may show, on the control, preset text, the identifier of user 4 (the user identifier, obtained from the server, that corresponds to the audio and video data that joined later), and other prompt information.

In some embodiments, if the user 1 exits the call, the video window in the display interface that originally displayed the audio/video data of the user 1 may be used to show the audio/video data of the user 4 and cancel the second area that contains the voice window of the user 4.
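
A minimal sketch of this exit handling follows. The function `on_member_exit` is hypothetical. The description only covers the case of a single voice member (user 4) taking over the vacated window; promoting the first entry of the voice list when several voice members exist is an assumption made here for illustration.

```python
from typing import List, Tuple


def on_member_exit(video: List[str], voice: List[str],
                   member: str) -> Tuple[List[str], List[str]]:
    """Return updated (video, voice) lists after `member` leaves the call."""
    video, voice = list(video), list(voice)
    if member in voice:
        # A voice member left: just remove its voice window.
        voice.remove(member)
    elif member in video:
        slot = video.index(member)
        if voice:
            # Reuse the vacated video window for a voice member
            # (assumed here to be the first one in the voice list).
            video[slot] = voice.pop(0)
        else:
            del video[slot]
    # An empty voice list means the second area is no longer displayed.
    return video, voice
```

In the scenario above, `on_member_exit(["user1", "user2", "self"], ["user4"], "user1")` places user 4 in the window previously used by user 1 and leaves the voice list empty.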

In some embodiments, when the current number of call paths is greater than the preset number of paths, that is, when a video call interface in a video window + voice window mode is adopted, the interface mainly includes a first area and a second area, the first area displays video windows with the preset number of paths, and the second area displays voice windows with the number equal to the difference between the actual number of paths and the preset number of paths.

Taking the actual number of paths of the video call as 6, and the preset number of paths of the display device as 3 for example, as shown in fig. 7, since the preset number of paths is 3, the first region includes 3 video windows, each video window displays the video picture of the corresponding member, and meanwhile, the video window is also provided with a voice control for playing the audio of the corresponding member. The video windows may also identify user IDs to identify call members corresponding to each video window, where the user IDs may be user names, account or remark names, etc., for example, the user IDs corresponding to 3 video windows in fig. 7 are user 1, user 2 and user 3 (which are merely exemplary and do not represent representations of actual user IDs).

The second area in fig. 7 includes 3 voice windows for playing the audio of the corresponding member, and the voice windows may display a preset image because the real video frames of the friends cannot be seen, where the preset image may be a unified virtual avatar or an avatar set by the user in the video call application; prompt information such as 'voice access in progress' and the like can be displayed in the voice window and used for prompting that the friend has accessed voice and is in voice communication with a home terminal user; the voice windows may also identify user IDs to identify call members corresponding to each voice window, where the user IDs may be user names, accounts, or remark names, and the like, for example, the user IDs corresponding to 3 voice windows in fig. 7 are user 4, user 5, and self, respectively. The first region and the second region are positionally free of intersection, overlap, and occlusion.

In the video call interface, when the current call path number is greater than the preset path number, the number of video windows in the first area is equal to the preset path number, and the number of voice windows in the second area is not limited, and mainly depends on the call path number of the virtual room accessed in the call process, the number of new members invited and the number of hung-up voice access. Due to the limitation of the size of the video call interface and the voice window in the second area, the voice window may not be completely displayed, for example, there are 6 members with voice access, but the user can only see 4 members in the second area at most, so the user can slide left and right in the second area to view 2 voice windows hidden in the second area. When the current call path number is less than or equal to the preset path number, the video window is displayed only according to the interface layout template corresponding to the current call path number, and the second area and the voice window are not displayed under the condition.
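
The left and right sliding of the second area can be modeled as a clamped window over the list of voice members. The function `visible_voice_windows` and its `offset` parameter are hypothetical; the default of 4 visible windows follows the example above.

```python
from typing import List


def visible_voice_windows(voice_members: List[str], offset: int,
                          max_visible: int = 4) -> List[str]:
    """Return the voice windows currently visible in the second area.

    `offset` is how many windows the user has slid past; it is clamped so
    that the view never runs beyond either end of the list.
    """
    max_offset = max(0, len(voice_members) - max_visible)
    offset = min(max(offset, 0), max_offset)
    return voice_members[offset:offset + max_visible]
```

With 6 voice members and 4 visible windows, an offset of 0 shows the first four members and an offset of 2 reveals the two windows that were initially hidden.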

In practical applications, the video windows and the voice windows are relatively fixed. Referring to fig. 7, if user 4 is accessed by voice, the local user and user 4 can only carry out a voice chat and cannot carry out a video chat; similarly, if user 1 is accessed by video, the local user and user 1 can only maintain a video chat and cannot switch to a voice chat. The user can see only the call objects corresponding to the video windows in the first area and cannot see the call objects corresponding to the voice windows in the second area. Because the user cannot see the video pictures of all the members in the virtual room, the user may wish to select autonomously which member's video picture to watch.

To this end, for the mode in which the video call interface presents video calls in the first area and voice calls in the second area, the present application provides a scheme for switching audio and video windows during the video call. The processing logic for switching audio and video windows and the corresponding UI changes when a multi-channel video call is started are described first, taking a preset path number of 3 as an example.

In some embodiments, when a video call is started, firstly determining whether the current call path number is greater than a preset path number, if so, indicating that the current call path number exceeds the upper limit of the number of video windows in a first area, calling an interface layout template corresponding to the maximum number (namely the preset path number) in a call interface, playing audio and video data of accessed display equipment on the video window of the call interface according to call access time, and after the video windows of the call interface all display the audio and video data corresponding to the audio and video streams, enabling the added call members to only switch to voice access and display through the voice window on a second area; if the current call path number does not exceed the preset path number, the number of the video paths participating in the call does not exceed the upper limit of the number of the video windows in the first area, an interface layout template corresponding to the current call path number can be called on the call interface, the audio and video data of the display equipment accessed to the video call are displayed in the video window of the first area, and then the added call members can also access the video until the number of the video windows in the first area reaches the upper limit; if the actual number of the paths of the video call is equal to the preset number of the paths, the first area just reaches the upper limit after the home terminal user accesses, and then the added call members can only switch to voice access.

In some embodiments, when a new call party joins, the display device first determines whether the current call path number is greater than the preset path number. If so, the number of video windows in the first area has reached the upper limit, and call members who join afterwards can only be switched to voice access and displayed through voice windows in the second area. If the current call path number is less than the preset path number, the first area is not yet full; call members who join afterwards can use video access, and the audio and video data of the accessed display devices are displayed by calling different interface layout templates until the number of video windows in the first area reaches the upper limit. If the current call path number is equal to the preset path number, the first area of the local display device reaches its upper limit exactly when this party accesses, and call members who join afterwards can only be switched to voice access.
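The access decision described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the display device: the names `AccessMode` and `access_mode_for_new_member` are hypothetical, and the sketch assumes the device knows how many video windows of the first area are already in use.

```python
from enum import Enum


class AccessMode(Enum):
    VIDEO = "video"
    VOICE = "voice"


def access_mode_for_new_member(video_windows_in_use: int, preset_paths: int) -> AccessMode:
    """Decide how a newly joined member is shown on the local device.

    video_windows_in_use is the number of video windows already occupying
    the first area before the new member joins (hypothetical bookkeeping).
    """
    if video_windows_in_use < preset_paths:
        return AccessMode.VIDEO  # the first area still has room
    return AccessMode.VOICE      # the first area is full: fall back to voice


# With a preset path number of 3, the 4th and 5th members fall back to voice.
modes = [access_mode_for_new_member(n, 3) for n in range(5)]
```

Keeping the decision in one pure function makes the boundary case (the first area becoming full exactly at the preset path number) easy to check.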

In some embodiments, for example, the preset path number of the local display device is 3, but 5 members have currently accessed the video call. When a new user accesses, the new user generally defaults to voice access on the local display device, that is, a voice window is newly added in the second area; the new access situation is shown in fig. 6b and 6c. After receiving from the server a notification that a new video party has accessed, the corresponding display device C adds the second area and sets a voice window in it. The controller of the display device C pulls the audio and video data uploaded by the newly accessed display device D from the virtual room of the server and separates the received data into audio data and video data. In some embodiments, the controller of the display device C parses both the separated audio data and the separated video data, but sends only the parsed audio data to the sound output device of the display device C for audio output and discards the parsed video data, thereby implementing the voice access of the display device D.
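The separate-then-discard handling described above can be illustrated with a short Python sketch. The `Packet` type and its `kind` tag are hypothetical stand-ins for whatever the demultiplexer actually produces; the sketch only shows that audio payloads are kept for output while video payloads are dropped.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Packet:
    kind: str       # "audio" or "video" (hypothetical tag set by the demultiplexer)
    payload: bytes


def voice_only(packets: Iterable[Packet]) -> List[bytes]:
    """Keep audio payloads for the sound output device and drop video payloads."""
    return [p.payload for p in packets if p.kind == "audio"]


stream_from_device_d = [
    Packet("audio", b"a0"),
    Packet("video", b"v0"),
    Packet("audio", b"a1"),
]
audio_for_output = voice_only(stream_from_device_d)
```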

In some embodiments, the user identifier shown on the voice window may be taken from the user identifier carried in the new-video-access notification sent by the server, or from the user identifier in the pulled audio and video data.

In some embodiments, to give the user a prompt and guidance for switching audio and video windows, a prompt popup may be displayed on the upper layer of the video call interface when the current call path number is greater than the preset path number. The prompt popup may pop up when the video call is started or during the call. For example, the preset path number of the display device is 3 and the current call path number is 3; during the call, any party invites 1 new member to join, and after the new member accesses the virtual room the current call path number exceeds the preset path number, so the prompt popup may pop up after the display device receives the invitation success information. For another example, the local device supports 4-path calls and the local user answers an 8-person video call; 5 members have already accessed the virtual room before the local user answers, so the local user is the 6th to access. The path number at the time of access therefore already exceeds the preset path number supported by the local device, and the prompt popup may pop up after the local user successfully answers the incoming call. When the current call path number is less than or equal to the preset path number, the prompt popup is not displayed.

In some embodiments, the display interface of the prompt popup is shown in fig. 8. The prompt popup may display prompt information such as "Dear user, due to the configuration of your television, video calls joined beyond X paths are automatically switched to voice access", where X is the value of the preset path number. The local user thus learns that the path number has exceeded the preset path number, either when the local user accesses or after a new call party accesses, and that members who access the virtual room after the preset path number is exceeded are automatically switched to voice access.

Secondly, guidance information such as "pressing a key to open an operation button — selecting 'switching audio and video' to switch voice into video" can be displayed in the prompt pop-up window, a user can click a first control on an interface such as fig. 7 to open the operation button, the first control can be the "down key" in fig. 7 or can be in other forms, so that the operation button is expanded, the operation button includes but is not limited to a microphone state control, a camera state control, a hang-up control, an invitation control, an audio and video window switching control, a small window call control and the like, and the user can switch the voice chat mode of a certain object into a video chat mode by clicking the audio and video window switching control.

Operation controls such as 'no more prompt' and 'I know' may be provided at the bottom of the prompt popup. If the user clicks 'I know', the current prompt popup is destroyed, and the prompt popup is displayed again the next time the user starts a video call in which the path number exceeds the preset path number. If the user clicks 'no more prompt', the current prompt popup is likewise destroyed, but this is the last time it is displayed: in later calls the prompt popup is not displayed even if the path number exceeds the preset path number.
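The prompt popup rules above (shown only when the path number exceeds the preset path number, and never again once the user chooses not to be prompted) can be sketched as a small Python class. The class name `PromptPolicy` and its in-memory flag are assumptions made for illustration; how the display device actually persists the user's choice is not described in the text.

```python
class PromptPolicy:
    """Decides whether the prompt popup should be displayed (hypothetical helper)."""

    def __init__(self) -> None:
        # Set once the user chooses not to be prompted again.
        self.suppressed = False

    def should_show(self, current_paths: int, preset_paths: int) -> bool:
        return current_paths > preset_paths and not self.suppressed

    def dismiss(self, never_again: bool) -> None:
        # 'I know' passes never_again=False; 'no more prompt' passes True.
        if never_again:
            self.suppressed = True


policy = PromptPolicy()
shown_first_call = policy.should_show(current_paths=6, preset_paths=4)
policy.dismiss(never_again=False)   # 'I know': popup may return in a later call
shown_second_call = policy.should_show(current_paths=6, preset_paths=4)
policy.dismiss(never_again=True)    # 'no more prompt': popup is suppressed from now on
shown_third_call = policy.should_show(current_paths=6, preset_paths=4)
```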

After clicking the "i know" or "no more prompt" on the prompt pop-up window by the user, the prompt pop-up window is destroyed, and then the video call interface shown in fig. 7 can be entered, and after clicking the first control by the user, the user can open the operation button, as shown in fig. 9, which includes but is not limited to a microphone state control, a camera state control, a hang-up control, an invitation control, an audio/video window switching control, a small window call control, and the like. The microphone control is used for turning on or off the microphone when being triggered; the camera control is used for opening or closing the camera when being triggered; the hang-up control is used for hanging up the current video call when being triggered; the invitation control is used for inviting a new member to join the current video call when being triggered; the widget call control is used for switching the current full-screen video call interface into a widget when being triggered so as to support that a user can simultaneously carry out video call when watching other signal sources or application programs, namely, the widget call control realizes the chat while watching. On the basis of the conventional operation controls, the scheme is additionally provided with an audio/video window switching control so as to start audio/video switching logic when being triggered.

Because many operation buttons are provided for the video call, the control list in fig. 9 is displayed as a floating layer in the second area, and the operation controls can be switched by sliding left and right. In a specific implementation the display manner of the control list is not limited (for example, it may be displayed in a column or in an array), nor is the display position of the floating layer. The user can operate a remote controller, a mouse or a similar device to move the cursor to a control and press the confirm key to input an operation instruction. As the cursor moves over the control list, different controls acquire the focus. For example, when the cursor moves to the audio/video window switching control, that control acquires the focus, is enlarged to a certain extent, and displays tips information such as "click here to switch voice to video". If the user does not click any operation control, the control list can be hidden through certain operations. For example, clicking the first control once opens the control list and clicking it a second time hides it; or the user clicks or double-clicks an area of the video call interface outside the control list floating layer to hide it; or a time threshold is set, and the control list is hidden automatically when it has been displayed for longer than the time threshold without the user clicking any operation control. The hiding and displaying scheme of the control list can be set flexibly according to the actual application.
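The three hiding schemes for the control list can be combined in a small state holder, sketched below in Python. The class `ControlListState` and its method names are hypothetical, and timestamps are passed in explicitly so that the behavior does not depend on a real clock.

```python
class ControlListState:
    """Tracks visibility of the floating control list (hypothetical helper)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.visible = False
        self.shown_at = 0.0

    def click_first_control(self, now: float) -> None:
        # One click shows the list; the next click hides it.
        self.visible = not self.visible
        if self.visible:
            self.shown_at = now

    def click_outside(self) -> None:
        # A click outside the floating layer hides the list.
        self.visible = False

    def tick(self, now: float) -> None:
        # Auto-hide once the list has been displayed longer than the threshold.
        if self.visible and now - self.shown_at > self.timeout_s:
            self.visible = False


state = ControlListState(timeout_s=5.0)
state.click_first_control(now=0.0)   # list is shown
state.tick(now=3.0)                  # still within the threshold
visible_at_3s = state.visible
state.tick(now=6.0)                  # threshold exceeded, list is hidden
visible_at_6s = state.visible
```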

And responding to the operation of switching the audio and video windows, acquiring a target voice window corresponding to the first object from the second area, and acquiring a target video window corresponding to the second object from the first area, namely firstly determining two sides needing to be replaced after starting the audio and video switching logic.

Specifically, in response to a click operation on the first control, the display is controlled to display the control list on a floating layer of the video call interface. When the current call path number is greater than the preset path number, in response to the user clicking the audio/video window switching control in the control list, the control list is automatically hidden and the audio/video switching logic is started in the display device. The user can select a target voice window from the second area; the call member corresponding to the target voice window is the replacing object, namely the first object, who takes the place of a member in the first area and is thereby switched to a video window. On the UI layer, as shown in fig. 10, the user can move the focus left and right in the second area by pressing the direction keys on the remote controller or by moving the remote controller, mouse or similar device. When a voice window acquires the focus, it is enlarged to a certain extent and displays tips information such as "press the confirm key O to select the friend to be switched to video". In fig. 10, the voice window corresponding to user 4 has the focus. If the user presses the confirm key, the voice window corresponding to user 4 is confirmed as the target voice window, and in response to this confirmation operation the display device records the first object as the first user ID corresponding to the target voice window, that is, records the first object as user 4.

After the first object is confirmed, the user can select a target video window from the first area, and the call member corresponding to the target video window is a replaced object, namely a second object, and is replaced into the second area by the first object so as to switch to the voice window. In an implementation manner of selecting the second object, the display device may further control the display to display a second object selection popup on an upper layer of the call interface in response to a confirmation operation on the target voice window, and when the display is embodied on a UI layer, as shown in fig. 11, the second object selection popup is displayed on the upper layer of the video call interface, and prompt information may be displayed in the second object selection popup, for example, "please select a friend to be replaced with voice", and user IDs corresponding to all video windows included in the first area are displayed in the second object selection popup, in examples in fig. 7 to 10, the first area includes 3 video windows corresponding to the user 1, the user 2, and the user 3, respectively, and then the user 1, the user 2, and the user 3 are displayed on the second object selection popup, and the user ID and the application avatar may be displayed for the user to select.

The display device acquires the second user ID clicked by the user in the second object selection popup and records the second object as the second user ID; the video window corresponding to the second user ID is the target video window. In fig. 11, if user 1 is clicked, the second user ID is user 1 and the second object is recorded as user 1. The call windows of the first object and the second object are then exchanged. After the switching is completed, as shown in fig. 12, user 4 occupies the video window previously used by user 1 (the target video window) and user 1 occupies the voice window previously used by user 4 (the target voice window), so that user 4 is switched from voice chat to video chat and user 1 from video chat to voice chat. Information telling the local user that the switch succeeded, such as "the video window was switched successfully", may be displayed on the interface at the same time, which completes the audio/video window switching process.

The display device simultaneously pulls the audio and video data of the user 4 from the virtual room of the server, and associates the audio and video data of the user 4 with the target video window for playing. Meanwhile, the display equipment stops pulling the video data of the user 1 from the virtual room, only pulls the audio data of the user 1, and associates the audio data of the user 1 with the target voice window for playing; or, the display device still pulls the audio and video data of the user 1 from the virtual room at the same time, but only analyzes and outputs the audio data of the user 1, but not analyzes and outputs the video data of the user 1, so that the voice access effect of the user 1 can be realized.
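The window exchange between the first object and the second object can be sketched in Python as a swap between two ordered lists. The function name `swap_windows` and the list representation of the two areas are assumptions for illustration; the sketch only reproduces the example of user 4 and user 1 given above.

```python
from typing import List, Tuple


def swap_windows(first_area: List[str], second_area: List[str],
                 first_object: str, second_object: str) -> Tuple[List[str], List[str]]:
    """Exchange a voice-window member (first object) with a video-window member (second object).

    Each member takes over the window position previously held by the other.
    """
    video_slot = first_area.index(second_object)   # position of the target video window
    voice_slot = second_area.index(first_object)   # position of the target voice window
    new_first, new_second = list(first_area), list(second_area)
    new_first[video_slot] = first_object
    new_second[voice_slot] = second_object
    return new_first, new_second


video_area, voice_area = swap_windows(
    ["user1", "user2", "user3"],        # first area before switching
    ["user4", "user5", "local"],        # second area before switching
    first_object="user4",
    second_object="user1",
)
```

Returning new lists instead of mutating the inputs keeps the previous layout available, for example if the stream switch fails and the UI needs to roll back.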

Of course, the manner of selecting the second object is not limited to that described in the embodiments of the present application, and other implementations may be adopted; for example, the target video window may be selected directly in the first area in a manner similar to selecting the first object. Until the video call is hung up, the local user can at any time choose to switch any friend, or the local user himself or herself, from voice access to the video window mode; for the user operation, UI display and audio/video switching logic, refer to the foregoing description of switching.

The display device provided in some embodiments may be the display device of the embodiments described above, or may be another display device capable of implementing the present application. The display device includes: a display 275 for displaying the video call interface and the aforementioned UI, a sound player for playing the audio of each call member, a communicator 220 for communicatively connecting the display device 200 with the server 400, a user interface 265 for receiving user input operations, and a controller 250 for processing the multi-channel video call and switching audio/video windows. The sound player may be the speaker 286 or an external sound device. As shown in fig. 13, the controller 250 is configured to perform the following multi-channel video call processing method:

step S0, the current call path number of the video call is acquired from the server.

The display device may send an inquiry request to the server, and the server may inquire the current number of call paths recorded in the virtual room in response to the inquiry request and send the current number of call paths to the display device, or the server sends the latest current number of call paths to the display device when detecting that the current number of call paths recorded in the virtual room is changed, so as to synchronize the current number of call paths recorded in the virtual room to the display devices of each call member.

In some embodiments, after the initiator initiates a video call, because the invited members respond with different times, the current number of call paths recorded in the virtual room is continuously increased before all the members answer, and when one member newly accesses the virtual room, the current number of call paths is cumulatively increased by 1 until the number of call paths finally reaches the number of call paths specified by the initiator, and at this time, the current number of call paths is not changed temporarily.

In the process of video call, because the invitation control is arranged in the control list, any party in the call can click the invitation control to invite a new member to join the call, the new member can be a user who is not in the invitation list when the initiator initiates the call, for example, the user A invites the user B and the user C to carry out video call, namely, 3-person call is initiated, the user B invites the user D to join the video call in the process of call, the user D is a new member, after the user D answers the invitation call, the user D is accessed into a virtual room, the recorded current call path number is increased by 1, namely, the current call path number is changed to 4. Every time 1 new member is successfully invited, the number of current call paths needs to be cumulatively increased by 1.

For another example, the local user A invites user B and user C to a video call, that is, a 3-person call is initiated. Before the video call ends, user C clicks the hang-up control to hang up, user C exits the virtual room, and the current call path number recorded in the virtual room is reduced by 1, that is, it changes to 2. Every time one member hangs up, the current call path number is reduced by 1. After user C hangs up, user A or user B may invite user C to join the video call again through the invitation control; in this case the new member is a user who was in the invitation list when the initiator initiated the call.

Step S10, determine whether the current call path number is greater than the preset path number. If yes, execute steps S20 to S40; otherwise, execute step S50.
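The path counting described above, together with the comparison made in step S10, can be sketched in Python as follows. `VirtualRoom` and `exceeds_preset` are hypothetical names; the real count is recorded by the server, and this sketch only mirrors the add-one and subtract-one rules from the examples.

```python
from typing import List


class VirtualRoom:
    """Minimal stand-in for the member record kept for a virtual room."""

    def __init__(self) -> None:
        self.members: List[str] = []

    @property
    def current_paths(self) -> int:
        return len(self.members)

    def join(self, user_id: str) -> int:
        # A member answering the call or an invitation raises the count by 1.
        if user_id not in self.members:
            self.members.append(user_id)
        return self.current_paths

    def leave(self, user_id: str) -> int:
        # A member hanging up lowers the count by 1.
        if user_id in self.members:
            self.members.remove(user_id)
        return self.current_paths


def exceeds_preset(current_paths: int, preset_paths: int) -> bool:
    """Step S10: True selects steps S20 to S40, False selects step S50."""
    return current_paths > preset_paths


room = VirtualRoom()
for user in ("A", "B", "C"):
    room.join(user)
after_hang_up = room.leave("C")     # user C hangs up
after_rejoin = room.join("C")       # user C is invited again
after_new_member = room.join("D")   # user D is invited as a new member
```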

Step S20, in response to determining that the current call path number is greater than the preset path number, invoking the interface layout template corresponding to the preset path number to control the display to display the preset path number of video windows in a first area on the video call interface; and displaying, in a second area, voice windows whose number is the difference between the current call path number and the preset path number.

When the actual number of the paths of the video calls is larger than the preset number of the paths, the video call interface is embodied as a mode of combining the video calls in the first area and the voice calls in the second area, wherein the preset number of the paths is the number of the video calls which can be supported by the display equipment, the video windows with the preset number of the paths are displayed in the first area at most, and the remaining number of the paths can be switched to the voice windows in the second area, namely the sum of the number of the windows in the first area and the second area is equal to the actual number of the paths.
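The window counts for the two areas follow directly from the current call path number and the preset path number. The Python sketch below (the function name `split_windows` is hypothetical) returns the number of video windows and voice windows, and their sum always equals the current call path number.

```python
from typing import Tuple


def split_windows(current_paths: int, preset_paths: int) -> Tuple[int, int]:
    """Return (video_windows, voice_windows) for the video call interface."""
    video_windows = min(current_paths, preset_paths)
    voice_windows = max(0, current_paths - preset_paths)
    return video_windows, voice_windows


over_limit = split_windows(current_paths=6, preset_paths=3)
at_limit = split_windows(current_paths=3, preset_paths=3)
under_limit = split_windows(current_paths=2, preset_paths=3)
```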

Each video window in the first area and each voice window in the second area may follow a preset window ordering rule. Referring to the layout of fig. 6, a window with a smaller sequence number is ranked earlier, and a window with a larger sequence number is ranked later. Various ordering rules are possible, such as random ordering, a user-specified priority, or ordering by the time at which members joined the virtual room.

In some embodiments, the controller is further configured to perform: after the video call is started, firstly, sequencing and displaying video windows in a first area according to the time sequence of the call members joining the virtual room; and responding to the fact that the total number of the video windows in the first area reaches the preset number, switching a mode that the following call members access the virtual room into voice access, and sequencing and displaying the voice windows in the second area according to the time sequence of the call members accessing the virtual room.

After the initiator initiates the video call, each call member answers at a different time, so each joins the virtual room at a different time. For example, the preset path number of the local device is 3, and user 1 initiates a 6-person video call; users 2, 3, 4, 5 and 6 (the local user) then join the virtual room in that order. Because user 1 is the initiator (the earliest to join the virtual room) and users 2 and 3 join next, the video call interface displayed by the local device shows the video windows of users 1, 2 and 3 in the first area in that order. The upper limit of the preset path number is reached when user 3 accesses, so user 4 and every later member are automatically switched to voice access: user 4, user 5 and user 6 (the local user) access by voice in turn, and their voice windows are displayed in the second area in the order user 4, user 5, local user, giving the ordering shown in fig. 7.

In some embodiments, the local user may want to see his or her own video picture in the first area. For example, the ordering rule may place the video window of the local user at the last position of the first area and the video window of the initiator at the first position of the first area, and order the other video windows in the first area and the voice windows in the second area by call access time.
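Both ordering rules discussed above (plain join-time ordering, and the variant that places the initiator first and the local user last in the first area) can be sketched with one Python function. The name `arrange_windows` and the `pinned_local` parameter are hypothetical, and the sketch assumes `join_order` is already sorted by the time members joined the virtual room, with the initiator first.

```python
from typing import List, Optional, Tuple


def arrange_windows(join_order: List[str], preset_paths: int,
                    pinned_local: Optional[str] = None) -> Tuple[List[str], List[str]]:
    """Split members into (first_area, second_area), both ordered by join time.

    If pinned_local is given, that member takes the last video window and the
    initiator (join_order[0]) keeps the first video window.
    """
    n_video = min(preset_paths, len(join_order))
    pin = (pinned_local is not None and pinned_local in join_order
           and n_video >= 2 and pinned_local != join_order[0])
    if not pin:
        return join_order[:n_video], join_order[n_video:]
    # Everyone except the initiator and the local user, still in join order.
    others = [u for u in join_order[1:] if u != pinned_local]
    first_area = [join_order[0]] + others[:n_video - 2] + [pinned_local]
    return first_area, others[n_video - 2:]


members = ["user1", "user2", "user3", "user4", "user5", "user6"]
by_join_time = arrange_windows(members, preset_paths=3)
local_pinned = arrange_windows(members, preset_paths=3, pinned_local="user6")
```

With the local user pinned, user 3 is displaced into the second area, where the voice windows remain sorted by join time.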

And step S30, responding to the operation of switching the audio and video windows, acquiring a target voice window corresponding to the first object from the second area, and acquiring a target video window corresponding to the second object from the first area.

In some embodiments, referring to fig. 8, the controller is further configured to perform: when the video call is started, if the current call path number is greater than the preset path number, controlling a display to display a prompt popup; the prompt popup window is used for prompting a user to display a control list by clicking a first control, switching a voice call into a video call by clicking a switching audio/video window control, and simultaneously prompting the user to execute corresponding processing logic by clicking other operation controls; and if the current call path number is less than or equal to the preset path number, the prompt popup window is not displayed.

When the current call path number is greater than the preset path number, a prompt popup window as shown in fig. 8 can be displayed, and after the user clicks 'i know' or 'no more prompt', the prompt popup window is destroyed and the user returns to the video call interface; when the current call path number is less than or equal to the preset path number, the user is still supported to access the video call in a video access mode at this time, and a prompt popup window is not required to be displayed.

In some embodiments, referring to fig. 9, the controller is further configured to perform: responding to the clicking operation of the first control, and controlling a display to display a control list on a floating layer of the video call interface; the control list comprises operation controls for a microphone state control, a camera state control, a hang-up control, an invitation control, an audio and video window switching control, a small window call control and the like.

In some embodiments, referring to fig. 10 and 11, for step S30, the controller is specifically configured to perform: when the number of the current call paths is larger than the preset number, responding to the click operation of the audio/video window switching control, and acquiring the target voice window selected by the user in a second area; responding to the confirmation operation of the target voice window, recording the first object as a first user ID corresponding to the target voice window, and controlling a display to display a second object selection popup on the upper layer of a video call interface; the second object selection popup displays user IDs corresponding to all video windows included in the first area; and acquiring a second user ID clicked in the second object selection popup by the user, and recording the second object as the second user ID.

After the control list is displayed, a user clicks a control for switching an audio/video window, that is, an operation for switching the audio/video window is input, and according to a selection and confirmation operation of the user in the second region, the controller may acquire and record the first object and a target voice window corresponding to the first object, for example, in fig. 10, the target voice window is a voice window with a sequence number of 1 in the second region, and a first user ID identified on the target voice window is user 4, that is, the first object is recorded as user 4; according to the selection operation of the user on the second object selection popup, the controller may acquire and record the second object and a target video window corresponding to the second object, for example, in fig. 10, the target video window is a video window with a sequence number of 1 in the first area, and the second user ID identified on the target video window is user 1, that is, the second object is recorded as user 1.

And step S40, associating the audio and video data of the first object to the target video window for playing, and associating the audio data of the second object to the target voice window for playing.

Referring to fig. 12, the audio and video data of user 4 is associated with the target video window for playing, so that the local user can both hear the voice of user 4 and see the video picture of user 4 in the target video window. For user 1, the replaced object, the local user can no longer see the video picture but can still hear the voice of user 1 through the target voice window. User 4 is thus switched from voice access to video access and user 1 from video access to voice access, which realizes the switching of the audio and video windows.

For the video window of each call object in the first area, because video and audio need to be played simultaneously, audio and video data of each call object in the first area need to be acquired from the server, that is, video streams and audio streams are pulled simultaneously from the server. For each voice window in the second area, only audio needs to be played, so that two modes can be involved, wherein the first mode is to pull only an audio stream but not a video stream; the second is to pull the video stream and the audio stream at the same time, but not decode and render the video data, so that the video data which is not analyzed cannot be played on a display, and the effect of independent voice access can also be realized. The call object is a call member except the home terminal user in the virtual room, and the home terminal user can acquire audio/video data through the home terminal sound/image acquisition device, so that the audio/video stream of the home terminal user does not need to be pulled from the server, and the pulled audio/video stream is the audio/video stream of the call object of the opposite terminal except the home terminal user.

In some embodiments, for the first case described above, the controller is further configured to: acquire from the server the audio and video data of the call objects included in the first area, and associate the data with the corresponding video windows for playing; acquire from the server only the audio data, and not the video data, of the call objects in the second area, and associate the audio data with the corresponding voice windows for playing; and, after responding to the operation of switching the audio and video windows, acquire the audio and video data of the first object from the server and stop acquiring the video data of the second object from the server.

And for the voice window in the second area, only the audio data of the call object is acquired, but the video data is not acquired, when the user selects to switch the voice window of the first object into the video window mode, because the first object is always in a voice access state before, namely the audio stream of the first object is always kept being pulled, only the transmission channel of the video data of the first object needs to be opened to start pulling the video stream of the first object from the virtual room of the server, and thus the controller acquires the audio and video data of the first object and associates the audio and video data with the target video window to play. And the second object is replaced to the voice access, so that the video data of the second object can be stopped being obtained from the server, the audio data of the second object can be kept being obtained from the server, namely, the video stream of the second object can be stopped being pulled only by closing the video data transmission channel of the second object, and then the audio data of the second object is associated to the target voice window to be played.
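The channel handling in the paragraph above (open the video channel of the first object, close the video channel of the second object, leave both audio channels untouched) can be sketched in Python with a per-member set of subscribed streams. The function name `switch_by_subscription` and the dictionary layout are hypothetical; no real streaming API is implied.

```python
from typing import Dict, Set


def switch_by_subscription(subscriptions: Dict[str, Set[str]],
                           first_object: str, second_object: str) -> Dict[str, Set[str]]:
    """Return the stream subscriptions after an audio/video window switch."""
    updated = {user: set(streams) for user, streams in subscriptions.items()}
    updated[first_object].add("video")       # start pulling the first object's video stream
    updated[second_object].discard("video")  # stop pulling the second object's video stream
    return updated


before = {
    "user1": {"audio", "video"},   # in the first area
    "user4": {"audio"},            # in the second area
}
after = switch_by_subscription(before, first_object="user4", second_object="user1")
```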

In some embodiments, for the second case described above, the controller is further configured to: acquire the audio and video data of all call objects in the virtual room from the server; associate the audio and video data of the call objects in the first area with the corresponding video windows for playing; leave the video data of the call objects in the second area unparsed, and associate only their audio data with the corresponding voice windows for playing; and, after responding to the operation of switching the audio and video windows, start parsing the video data of the first object and stop parsing the video data of the second object.

In this embodiment, for the speech window in the second region, the video stream and the audio stream are still pulled, but the video data is not analyzed. When a user selects to switch a voice window of a first object into a video window mode, audio and video streams of the first object and a second object are kept pulled, and analysis of video data of the first object is started, so that the video of the first object can be played in a target video window after being analyzed; the video data of the second object needs to be stopped from being analyzed, so that the video data of the second object which is not analyzed cannot be displayed on the display, the audio data of the second object is associated to the target voice window to be played, and the second object is switched to the voice access mode.
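When both streams stay pulled and only the parsing of video data changes, the switch reduces to flipping a per-member flag. The Python sketch below assumes a hypothetical `decode_video` mapping from user ID to a boolean; it is an illustration of the idea rather than the device's decoder control.

```python
from typing import Dict


def switch_by_decoding(decode_video: Dict[str, bool],
                       first_object: str, second_object: str) -> Dict[str, bool]:
    """Return the per-member video parsing flags after a window switch.

    Audio and video streams of every member remain pulled; only the
    decision to parse and render video changes.
    """
    updated = dict(decode_video)
    updated[first_object] = True    # start parsing the first object's video
    updated[second_object] = False  # stop parsing the second object's video
    return updated


flags_before = {"user1": True, "user2": True, "user3": True, "user4": False}
flags_after = switch_by_decoding(flags_before, first_object="user4", second_object="user1")
```

Compared with changing subscriptions, this variant trades extra bandwidth for a switch that needs no new request to the server.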

In some embodiments, associating audio data with a voice window for playing may mean: determining, according to the user corresponding to the voice window, that the video of that user is not displayed on the display, and playing only the audio data of that user through the sound player. In some embodiments, the audio and video data of the user may be pulled from the server or collected locally, with the audio data parsed but the video data not parsed; alternatively, only the audio data is pulled from the server and the video data is not pulled.

In some embodiments, associating audio data with a voice window for playing may also mean: determining, according to the user corresponding to the voice window, that the video of that user is not displayed on the display; playing only the audio data of that user through the sound player; and loading a dynamic control on the voice window. The dynamic control is configured to be displayed in its animated form when the audio data of the user corresponding to the voice window meets a preset condition, and to be displayed as a static image when the audio data of that user does not meet the preset condition.
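The description does not state what the preset condition is. The sketch below assumes, purely for illustration, that the condition is an audio level at or above a threshold; the function name `control_state` and the threshold value are hypothetical.

```python
def control_state(audio_level, threshold=0.1):
    """Return how the dynamic control should be drawn for one audio sample.

    Assumption: the "preset condition" is audio_level >= threshold.
    """
    return "animated" if audio_level >= threshold else "static"


# A short run of audio levels for the user shown in one voice window.
states = [control_state(level) for level in (0.0, 0.05, 0.4, 0.9, 0.02)]
```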

In some embodiments, the associating the audio and video data of the first object with the target video window for playing may refer to playing the video data in the audio and video data of the first object in the target video window, and playing the audio data in the audio and video data of the first object through a sound player. In some embodiments, the user's audiovisual data may be pulled from a server/collected locally.

In some embodiments, associating the audio and video data of the first object with the target video window for playing may also mean playing the video data of the first object in the target video window, playing the audio data of the first object through a sound player, and loading a dynamic control on the target video window. The dynamic control is configured to be displayed in its animated form when the audio data of the user corresponding to the target video window meets a preset condition, and to be displayed as a static image when that audio data does not meet the preset condition.

In some embodiments, the purpose of the association is to enable the user to see which video window corresponds to which accessing user and which voice window corresponds to which accessing user.

In some embodiments, each window (video window or voice window) is provided with a user identification control. The name displayed by the user identification control may be the ID of a user accessing the video call. In some embodiments, the user identification control may further be displayed in combination with an address book pre-stored in the display device: if the user accessing the video call has a remark name in the address book, that remark name is displayed on the user identification control; if no remark name is stored in the address book, a remark defined by the opposite-end user is displayed. The remark either corresponds uniquely to the user ID or is identical to it.

In some embodiments, the text presented by the user identification control in the window representing the local user is any one of "himself", "herself", and "me". This text is not derived from the user ID, the remark name, or the custom remark, but it is mapped to the user ID. The user identification control may also display the name of the access device of the opposite end.
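The labelling rules in the two preceding paragraphs amount to a lookup with a fixed priority. The sketch below is one possible reading; the function `display_label` and the two dictionaries are hypothetical names, and the fallback to the bare user ID reflects the statement that the displayed name may be the user ID.

```python
def display_label(user_id, local_user_id, address_book, peer_remarks):
    """Choose the text shown by the user identification control of one window."""
    if user_id == local_user_id:
        return "me"                       # the local user's own window
    remark = address_book.get(user_id)
    if remark:
        return remark                     # remark name stored in the local address book
    # Otherwise the remark defined by the opposite-end user, or the user ID itself.
    return peer_remarks.get(user_id, user_id)


address_book = {"u1001": "Mom"}
peer_remarks = {"u1002": "Li Lei"}
labels = [display_label(u, "u1000", address_book, peer_remarks)
          for u in ("u1000", "u1001", "u1002", "u1003")]
```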

In some embodiments, a mapping relationship exists between the serial number of the window and the user ID, and after the user pulls the audio/video stream from the server according to the user ID, the analyzed video data is displayed in the corresponding video window according to the mapping relationship.

In some embodiments, which audio/video data needs to be analyzed and which does not is determined according to the mapping relationship between the serial number of the video window and the user ID, or according to the mapping relationship between the serial number of the voice window and the user ID, so that the audio/video data mapped to a video window is analyzed and the audio/video data mapped to a voice window is not analyzed.
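A minimal sketch of how such a mapping could drive the decision is given below. The function `analysis_plan` and the dictionary layout are hypothetical. The sketch also assumes that for a voice window it is the video data that is left unanalyzed, while audio is still analyzed so that the user can be heard.

```python
def analysis_plan(video_windows, voice_windows):
    """Map each user ID to the streams that should be analyzed.

    video_windows / voice_windows: dict of window serial number -> user ID.
    """
    plan = {}
    for user_id in video_windows.values():
        plan[user_id] = {"audio": True, "video": True}
    for user_id in voice_windows.values():
        # Assumption: only the video data is skipped for voice windows.
        plan[user_id] = {"audio": True, "video": False}
    return plan


plan = analysis_plan({1: "user1", 2: "user2"}, {1: "user4"})
```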

In some embodiments, according to a mapping relationship existing between the serial number of the video window and the user ID, audio data corresponding to the window is detected, and the dynamic control is controlled according to the audio data.

In response to determining that the current call path number is less than or equal to the preset path number, in step S50, an interface layout template corresponding to the current call path number is called to control the display to display a video window of the current call path number on the video call interface, and not to display a voice window.

When the current call path number is less than or equal to the preset path number, the interface layout templates shown in fig. 6a may be referred to: the interface layout template corresponding to the current call path number is selected to display each video window, and the audio and video data of each call object pulled from the server are associated with the corresponding video window for playing. Since no voice window is displayed, the video call interface is not divided into a first area and a second area; equivalently, it can be understood as having only the first area and no second area.
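The two branches (current call path number greater than the preset path number, or not) can be summarized in a few lines. The function name `plan_windows` and the returned dictionary are hypothetical. The figures used in the usage lines (6 call paths, preset path numbers 9 and 3) are the ones the description itself uses when it compares two devices in the same call.

```python
def plan_windows(current_paths, preset_paths):
    """Decide which layout template to call and how many windows of each kind to show."""
    if current_paths > preset_paths:
        return {
            "template": preset_paths,               # template for the preset path number
            "video": preset_paths,                  # first area
            "voice": current_paths - preset_paths,  # second area
        }
    return {"template": current_paths, "video": current_paths, "voice": 0}


device_a = plan_windows(current_paths=6, preset_paths=9)
device_b = plan_windows(current_paths=6, preset_paths=3)
```

With these inputs, the first device shows 6 video windows and no voice window, while the second shows 3 video windows and 3 voice windows.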

According to the method and the device, the capacity of the display device for processing the multi-channel video call can be improved, when the number of the current call paths is larger than the preset number, free and flexible switching of the audio and video windows is achieved through the switching audio and video window control and the corresponding audio and video switching logic which are additionally arranged in the control list, so that a user can see video pictures of friends according to personal wishes, the multi-channel video call is not limited by the configuration and the model of the display device, and the application experience of the user video call is improved.

In the embodiments of the present application, considering that the video picture of a member with voice access cannot be seen, the switch is described from the viewpoint of switching the first object in the second area from audio to video: the first object is selected first, and then the second object with video access that is to be replaced by the first object is selected. In practical applications, the opposite viewpoint is also possible. For example, if the current video picture of user 1 in the first area is of poor quality or shows a black screen, the local user may not want to see the video picture of user 1, and user 1 may be switched to audio access from the viewpoint of switching an object in the first area from video to audio. After the user clicks the switching audio/video window control, user 1 may be selected in the first area as the object to be replaced, and then user 4 may be selected in the second area as the replacing object, so that user 1 is switched from video to voice and user 4 is switched from voice to video.

It should be noted that, for the display devices used by each call member in the virtual room, although the multi-channel video call processing logic described in the foregoing embodiments may be configured, because the configuration, performance, and model of the display devices themselves are different, the UI and window layouts of the video call interfaces on these display devices may be different, for example, the actual number of paths of the current video call is 6, the preset number of paths of the display device of user 1 is 9, and the preset number of paths of the display device of user 2 is 3, then 6 video windows are displayed on the video call interface watched by user 1 without a voice window, and 3 video windows and 3 voice windows are displayed on the video call interface watched by user 2, so it can be seen that each display device processes the multi-channel video call logic based on its own configuration; in addition, each call member switches the audio and video window according to the watching desire of the call member, and the UI change generated by switching the audio and video window by the local user is only presented on the local terminal device and cannot be synchronized to the display devices of other call members, so that the other call members cannot be influenced, and the other call members cannot sense the operation of switching the audio and video window by the local user.

In summary, although the display device of each member in the virtual room is configured with the same multi-way video call processing logic, the process of processing the multi-way video call by each device is independent, and has no influence and interference with each other, and each display device can adaptively process the multi-way video call of the local end according to the processing logic based on the factors such as the self configuration and the watching desire of the local end user.

In some embodiments, the user may choose to invite a new member to join the current call at will during the video call, and for this, as shown in fig. 14, the controller is further configured to: responding to the click operation of the invitation control, and acquiring a third user ID of a third object invited by the user; and sending the third user ID to the server so that the server sends invitation information to the display device of the third object after receiving the third user ID.

After the user clicks the invitation control, the invited third object can be selected in the contact list, or information such as an account of the third object, a mobile phone number or a mailbox bound by the video call application is input to search the third object, so that the third object is invited to join the current video call, at this time, the controller can automatically acquire the third user ID of the third object, or the home terminal user can manually input the third user ID of the third object, so that the controller acquires the third user ID. The controller sends the third user ID to the server, and the server receives the third user ID and sends invitation information to the display device used by the third object according to the third user ID, where the invitation information may carry information of an inviter and related information of a video call (including, for example, an XX chat group and a member of a current call in the chat group); the display device of the third object, in response to the invitation information, may control the video call application to generate a call interface and a prompt tone, where the call interface may display, for example, "user a invites you to join XX chat group", and the call interface may further be provided with an answering control and a cancelling control. If the third object clicks the cancel control, the invitation fails, and the video call maintains the current state; if the third object clicks the answering control, the third object answers the video call and joins the virtual room, so that the invitation is successful; and the server responds to the information that the third object is answered, adds 1 to the current call path number to obtain the updated current call path number if the third object is newly added to the virtual room, and then sends invitation success information to the controller.

In some embodiments, as shown in fig. 14, the controller is further configured to perform: in response to receiving the invitation success information sent by the server, when the updated current call path number is greater than the preset path number, controlling the display to display a newly added voice window in the second area, acquiring audio data of the third object from the server, and associating the audio data of the third object with the newly added voice window for playing. The invitation success information is sent by the server after it adds 1 to the current call path number recorded in the virtual room, when the display device of the third object responds to the invitation information and answers the incoming video call.

The controller receives and responds to the invitation success information, and needs to compare the size relationship between the updated current call path number and the preset path number, where the updated current call path number is greater than the preset path number, which indicates that when the third object is not accessed, the number of video windows in the first area has reached the upper limit, for example, the current call path number is 3, the preset path number of the local device is 3, that is, the number of video windows in the first area is just the upper limit path number, at this time, the second area is not displayed, when the third object is successfully invited, the updated current call path number is changed to 4, the local device needs to switch the third object into voice access, add 1 new voice window of the third object to the second area for display, receive only the audio data of the third object from the server, or receive the audio and video data of the third object, but not analyze and output the video data of the third object, and the audio data of the third object is associated to the newly added voice window to be played, so that the home terminal user can hear the voice message sent by the third object. For another example, the number of current call paths is 5, the number of preset paths of the home device is 3, that is, the number of video windows in the first area reaches the upper limit, the second area includes 2 voice windows, since the third object is accessed to the virtual room last, the newly added voice windows of the third object can be sorted at the last position of the second area for display, and the updated second area includes 3 voice windows. In this embodiment, the newly invited third object is a voice access, and if the home-end user needs to switch the voice access to the video access, the aforementioned audio/video switching logic may be adopted, which is not described herein again.

In some embodiments, as shown in fig. 14, the controller is further configured to perform: in response to receiving the invitation success information sent by the server, when the updated current call path number is less than or equal to the preset path number, controlling the display to display a newly added video window in the first area, calling the interface layout template corresponding to the updated current call path number, and refreshing the video call interface using a preset window sorting rule; and acquiring audio and video data of the third object from the server and associating them with the newly added video window for playing. The invitation success information is sent by the server after it adds 1 to the current call path number recorded in the virtual room, when the display device of the third object responds to the invitation information and answers the incoming video call.

The controller receives and responds to the invitation success information, and the relationship between the updated current call path number and the preset path number needs to be compared, in this embodiment, the updated current call path number is less than or equal to the preset path number, which indicates that when the third object is not accessed, the number of video windows in the first area does not reach the upper limit, for example, the current call path number is 5, the preset path number of the local terminal device is 6, that is, the number of video windows in the first area is 5, at this time, the second area is not displayed, when the invitation of the third object is successful, the updated current call path number is changed to 6, that is, the number of video windows in the first area just reaches the upper limit after the third object is accessed, and the local terminal device can support the video access of the third object; for another example, the current number of call paths is 6, the preset number of call paths of the home device is 9, and when the invitation of the third object is successful, the updated current number of call paths is changed to 7, that is, the number of video windows in the first area does not reach the upper limit after the invitation of the new member, so that the home device supports video access of the third object.
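The comparison performed on receipt of the invitation success information can be sketched as below. The function `on_invite_success` and its list-based window model are hypothetical. The updated call path number is passed in as an argument because, in the description, it is the server that maintains it.

```python
def on_invite_success(video_ids, voice_ids, third_id, updated_paths, preset_paths):
    """Place the newly joined third object in the first or the second area.

    video_ids / voice_ids: user IDs currently shown in the first / second area.
    Returns the new (video_ids, voice_ids, access_mode).
    """
    if updated_paths > preset_paths:
        # First area is already full: the third object joins with voice access,
        # appended at the last position of the second area.
        return list(video_ids), list(voice_ids) + [third_id], "voice"
    # There is still room: the third object joins with video access.
    return list(video_ids) + [third_id], list(voice_ids), "video"


# Current call path number 5, preset path number 3: the second area grows to 3 voice windows.
full = on_invite_success(["u1", "u2", "u3"], ["u4", "u5"], "u6", updated_paths=6, preset_paths=3)
# Current call path number 5, preset path number 6: the sixth window is a video window.
room = on_invite_success(["u1", "u2", "u3", "u4", "u5"], [], "u6", updated_paths=6, preset_paths=6)
```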

When a video of a third object is accessed, 1 video window of the third object needs to be added newly, at this time, since the number of video windows in the first area is increased, and the corresponding window layout may need to be adjusted appropriately, for example, the preset number of paths of the local device is 9, 8 video windows are displayed in the original first area, and the number of video windows is increased to 9 after the third object is accessed, referring to fig. 6a, an interface layout template used in the first area needs to be changed from 2 rows and 4 columns to an array of 3 rows and 3 columns, then the video windows of each member are sorted according to a preset window sorting rule, for example, according to the time sequence of accessing calls of the 9 call members in the virtual room, and since the third object is accessed into the virtual room last, the video window of the newly added third object can be sorted at the last position of the first area for displaying, and the video call interface is refreshed, and then, a 3 x 3 interface which is sequenced after updating can be obtained, and simultaneously, the audio and video data of the third object are received from the server and are associated to the newly added video window to be played, so that the home terminal user can see the video picture of the third object and hear the voice message of the third object.
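Template selection followed by sorting can be illustrated as follows. Fig. 6a is not reproduced here, so the `TEMPLATES` table contains only the grids that the text names explicitly (1 x 3, 2 x 2, 2 x 4 and 3 x 3); the table, the `arrange` function, and the numeric join times are hypothetical.

```python
# Window count -> (rows, columns). Only the grids mentioned in the text are listed.
TEMPLATES = {3: (1, 3), 4: (2, 2), 8: (2, 4), 9: (3, 3)}


def arrange(members):
    """Lay out video windows row by row, ordered by the time each member joined.

    members: list of (user_id, join_time) tuples.
    """
    rows, cols = TEMPLATES[len(members)]
    ordered = [uid for uid, _ in sorted(members, key=lambda m: m[1])]
    return [ordered[r * cols:(r + 1) * cols] for r in range(rows)]


members = [(f"user{i}", i) for i in range(1, 9)]   # 8 members: 2 rows x 4 columns
before = arrange(members)
after = arrange(members + [("third", 9)])          # third object joins last: 3 x 3
```

Because the third object has the latest join time, it ends up in the last cell of the 3 x 3 grid.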

The above is an example of inviting a new member to access a virtual room, and the processing logic and UI adjustment when inviting a new member are described. When a plurality of new members are invited to add at one time, because the time for each new member to respond and answer the invitation call is possibly inconsistent, which is equivalent to accessing one new member at intervals until all the invited new members access the virtual room, the change of the current call path number, the UI interface and other contents can be adaptively adjusted according to the answering sequence of each new member, the processing logic is basically similar to the invitation logic, and the description is omitted here.

In some embodiments, before the video call ends, a call object may click the hang-up control to quit the video call early. This may include, but is not limited to, the following cases: case (A), the call object that hangs up has video access, and the updated current call path number after the hang-up is less than the preset path number; case (B), the call object that hangs up has video access, and the updated current call path number after the hang-up is equal to the preset path number; case (C), the call object that hangs up has video access, and the updated current call path number after the hang-up is greater than the preset path number; case (D), the call object that hangs up has voice access, and the updated current call path number after the hang-up is equal to the preset path number; and case (E), the call object that hangs up has voice access, and the updated current call path number after the hang-up is greater than the preset path number.
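The five cases reduce to two inputs: the access mode of the object that hung up, and how the updated call path number compares with the preset path number. A small classifier makes this explicit; the function name `classify_hangup` is hypothetical. A voice-access object can only exist when the call path number exceeded the preset path number before the hang-up, so the updated number cannot fall below the preset for that mode.

```python
def classify_hangup(was_video, updated_paths, preset_paths):
    """Return the case letter (A-E) for a hang-up event."""
    if was_video:
        if updated_paths < preset_paths:
            return "A"
        if updated_paths == preset_paths:
            return "B"
        return "C"
    if updated_paths == preset_paths:
        return "D"
    if updated_paths > preset_paths:
        return "E"
    # Voice windows are only shown when the call path number exceeds the preset.
    raise ValueError("a voice-access object cannot leave fewer paths than the preset")


cases = [
    classify_hangup(True, 3, 4),
    classify_hangup(True, 3, 3),
    classify_hangup(True, 5, 3),
    classify_hangup(False, 3, 3),
    classify_hangup(False, 4, 3),
]
```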

In some embodiments, as shown in fig. 15, for case (A), the controller is further configured to perform: receiving hang-up indication information sent by the server, where the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks the hang-up control; in response to the hang-up indication information, if the fourth object has video access, controlling the display to display hang-up prompt information of the fourth object and destroying the video window of the fourth object; and when the updated current call path number is determined to be less than the preset path number, calling the interface layout template corresponding to the updated current call path number and refreshing the video call interface using a preset window sorting rule.

When any call object other than the local user in the virtual room (namely, the fourth object) clicks the hang-up control and quits the video call early, the server, in response to the message that the fourth object has hung up, learns that a user has left the virtual room. It subtracts 1 from the recorded current call path number to obtain the updated current call path number, and then sends hang-up indication information to the controller of the display device. The hang-up indication information may carry the fourth user ID of the fourth object and indicates to the controller that the call object corresponding to that ID has hung up the video call. Because the fourth object, which has video access relative to the local end, has hung up, the local device can no longer pull the audio/video stream of the fourth object, so no valid video picture can be displayed in the video window of the fourth object; at this time the video window may show a black screen or a gray screen. The controller receives and responds to the hang-up indication information and controls the display to display hang-up prompt information, for example "the fourth object has hung up the call". The display position of the hang-up prompt information is not limited; for example, it may be displayed on the black-screen video window of the fourth object. The video window of the fourth object then needs to be destroyed, so that the video window of the party that hung up is no longer displayed on the local video call interface.

When the fourth object hangs up the call in the video access mode, the video window of the fourth object needs to be destroyed, at this time, since the number of the video windows is reduced, the corresponding window layout may need to be properly adjusted, if the updated current call path number is less than the preset path number, it is indicated that the number of the video windows when the fourth object is not hung up is less than or equal to the preset path number, for example, the preset path number of the local device is 4, the original video call interface displays 4 video windows, and the fourth object is reduced to 3 video windows after being hung up, referring to fig. 6a, the interface layout template used by the video call interface needs to be changed from 2 rows and 2 columns into an array of 1 row and 3 columns, and then 3 video windows in the new interface layout template are sorted according to the preset window sorting rule, for example, according to the time sequence of the call members remaining 3 call members in the virtual room accessing the call, and obtaining the updated ordered 1 x 3 interface.

In some embodiments, if the video windows are sorted according to the sequence of the access time when the fourth object is not hung up, after the fourth object is hung up, the other video windows sorted after the fourth object may be sequentially shifted up by 1 rank, and the rank of the video window sorted before the fourth object remains unchanged, for example, when the fourth object is not hung up, the video windows are sequentially sorted into user 1, user 2, user 3, and user 4 (self) according to the sequence of the access time, when the user 2 hangs up the video call, that is, the user 2 is the fourth object, the user 1 remains at the first rank in the new template, the user 3 is shifted up to the 2 nd rank of the new template, the user 4 is shifted up to the third rank of the new template, and the video windows are sorted into user 1, user 3, and user 4 in sequence after refreshing.
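The reordering in this example is an order-preserving removal: windows before the hung-up object keep their position and windows after it each move up by one. A minimal sketch, with the hypothetical function name `remove_hung_up`:

```python
def remove_hung_up(ordered_ids, hung_up_id):
    """Drop the hung-up object while keeping the relative order of the others."""
    return [uid for uid in ordered_ids if uid != hung_up_id]


remaining = remove_hung_up(["user1", "user2", "user3", "user4"], "user2")
```

The result has three entries, which is why the layout in this case changes from the 2 x 2 template to the 1 x 3 template.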

In some embodiments, as shown in fig. 15, for case (B), the controller is further configured to perform: receiving hang-up indication information sent by a server, wherein the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks a hang-up control; responding to the hang-up indication information, if the fourth object is in video access, controlling a display to display hang-up prompt information of the fourth object, and destroying a video window of the fourth object; when the updated current call path number is determined to be equal to the preset path number, audio and video data of a fifth object in the second area are obtained from the server, the display of the second area is cancelled on a video call interface, and a new video window is generated according to the association of the audio and video data of the fifth object; and according to a preset window sorting rule, sorting and displaying all video windows in the current video call interface.

If the updated current call path number is equal to the preset path number, the call path number before the fourth object hung up was the preset path number + 1; that is, the first area contained a preset number of video windows (including that of the fourth object) and the second area contained 1 voice window, which corresponds to the fifth object. After the fourth object hangs up, the fifth object can be automatically switched from voice access to video access: the voice window of the fifth object is destroyed, the display of the second area is cancelled on the video call interface, and pulling of the video stream of the fifth object from the server is started. Together with the audio stream, which has been pulled all along, the audio and video data of the fifth object are thus obtained, and a new video window is generated in association with them, so that the local user can see the video picture and hear the voice of the fifth object in the new video window. The current video call interface then displays a preset number of video windows, without the fourth object but including the fifth object.

In some embodiments, since the local device cannot pull the audio/video stream of the fourth object after the fourth object is hung up, the video window of the fourth object may not be destroyed, but the audio/video data of the fifth object received from the server is associated with the video window of the fourth object to be played, so that a new video window does not need to be created for the fifth object, that is, the fourth object is replaced by the fifth object.

Before the fourth object hung up, the number of video windows in the first area was the preset path number; after the hang-up, the updated current call path number is equal to the preset path number. The interface layout template used by the video call interface therefore does not need to change and remains the template corresponding to the preset path number; the video windows of the current video call interface only need to be sorted and displayed according to the preset window sorting rule. For example, under a rule based on the time order of access to the virtual room, the video windows sorted after the fourth object are each moved up by 1 position, the positions of the video windows sorted before the fourth object remain unchanged, and the video window of the fifth object is placed at the last position of the template. Once the sorting is finished, the update of the video call interface is complete.

In some embodiments, as shown in fig. 15, for case (C), the controller is further configured to perform: receiving hang-up indication information sent by a server, wherein the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks a hang-up control; responding to the hang-up indication information, if the fourth object is in video access, controlling a display to display hang-up prompt information of the fourth object, and destroying a video window of the fourth object; when the updated current call path number is determined to be larger than the preset path number, selecting a sixth object from the second area according to a preset screening rule, acquiring audio and video data of the sixth object from a server, generating a new video window in the first area according to the audio and video data of the sixth object in a correlation mode, and destroying a voice window of the sixth object in the second area; and according to a preset window sorting rule, sorting and displaying each video window in the current first area, and sorting and displaying each voice window in the current second area.

If the updated current call path number is greater than the preset path number, the call path number before the fourth object hung up was the preset path number + N, where N is greater than or equal to 2; that is, the first area contained a preset number of video windows (including that of the fourth object) and the second area contained N voice windows. Since the fourth object has hung up and the second area contains multiple voice windows, one object (named the sixth object in this embodiment) must be selected from the second area and switched to video access. The sixth object may be selected from the second area according to a preset screening rule. For example, according to the ordering of the windows in the second area, the call member corresponding to the first voice window in the second area is preferentially switched to the first area; or, if the local user (self) has voice access, the local user may be automatically switched to video access. The preset screening rule may be set according to the actual application and is not limited in this embodiment.

After the sixth object is selected, pulling of its video stream is started; since its audio stream has been pulled all along, the audio and video data of the sixth object can be obtained. A new video window is generated in the first area in association with the audio and video data of the sixth object, and the voice window of the sixth object in the second area is destroyed, so that the user can see the video picture and hear the voice of the sixth object in the new video window. The current video call interface then appears as follows: the first area displays a preset number of video windows, without the fourth object but including the sixth object; the second area displays N-1 voice windows; and the current call path number is the preset path number + N-1, where N is greater than or equal to 2.
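Promotion of a voice-access object after a video-access object hangs up can be sketched as below. The function `on_video_hangup` and its parameters are hypothetical. The two screening rules shown (the head of the second area, or the local user when the local user has voice access) are the two examples the description gives, and the promoted object is appended at the last position of the first area as one possible sorting rule.

```python
def on_video_hangup(video_ids, voice_ids, fourth_id, local_id=None, prefer_local=False):
    """Remove the fourth object and promote one voice-access object to video access.

    Returns the new (video_ids, voice_ids), each in display order.
    """
    video = [uid for uid in video_ids if uid != fourth_id]  # later windows move up
    voice = list(voice_ids)
    if voice:
        if prefer_local and local_id in voice:
            promoted = local_id   # screening rule: switch the local user to video
        else:
            promoted = voice[0]   # screening rule: head of the second area
        voice.remove(promoted)    # voice windows after it move up by one
        video.append(promoted)    # new video window at the last position
    return video, voice


first_rule = on_video_hangup(["user1", "user2", "user3"], ["user4", "user5", "me"], "user2")
local_rule = on_video_hangup(["user1", "user2", "user3"], ["user4", "user5", "me"], "user2",
                             local_id="me", prefer_local=True)
```

In both calls the first area keeps its preset number of 3 video windows and the second area shrinks from N = 3 to N - 1 = 2 voice windows.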

In some embodiments, since the local device cannot pull the audio/video stream of the fourth object after the fourth object is hung up, the video window of the fourth object may not be destroyed, but the audio/video data of the sixth object received from the server is associated with the video window of the fourth object to be played, so that a new video window does not need to be created for the sixth object, that is, the sixth object replaces the fourth object.

Before the fourth object hung up, the number of video windows in the first area was the preset path number, and after the fourth object hung up, the updated current call path number is still greater than the preset path number. The interface layout template adopted by the video call interface therefore does not need to be changed and remains the interface layout template corresponding to the preset path number; the video windows of the current video call interface only need to be sorted and displayed according to the preset window sorting rule. For example, the window sorting rule is based on the time sequence of accessing the virtual room: the video windows sorted after the fourth object are each moved up by 1 position in sequence, the order of the video windows sorted before the fourth object is kept unchanged, and the video window of the sixth object is arranged at the last position of the template. After the sorting is finished, the updating of the first area is completed.
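The example window sorting rule above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the first area is modelled as a list of member IDs in display order; the function name `reorder_first_area` and the string IDs are hypothetical and do not come from the embodiments.

```python
from typing import List


def reorder_first_area(first_area: List[str], fourth: str, sixth: str) -> List[str]:
    """Return the first-area order after `fourth` hangs up and `sixth` is promoted.

    Assumption: `first_area` lists member IDs in display order
    (time sequence of accessing the virtual room).
    """
    idx = first_area.index(fourth)
    before = first_area[:idx]      # order kept unchanged
    after = first_area[idx + 1:]   # each window moves up by 1 position
    # The sixth object takes the last position of the unchanged template.
    return before + after + [sixth]
```

With a preset path number of 3, `reorder_first_area(["user1", "user2", "user3"], "user2", "user4")` yields the order user1, user3, user4, so the number of video windows, and hence the interface layout template, stays the same.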

Referring to fig. 7, for example, if the preset screening rule is to switch user 4, sorted at the head of the second area, to video access, then user 4 is the sixth object. After the voice window of user 4 is destroyed, the voice windows of user 5 and the local user each move up by 1 position, that is, the voice window of user 5 is arranged at the head and the voice window of the local user at the 2nd position; in other words, the voice windows after the sixth object move forward by 1 position in sequence. For another example, if the preset screening rule is to switch the local user to video access, then the local user is the sixth object, and the order of users 4 and 5, sorted before the sixth object, is unchanged. If there is no voice window after that of the local user, the voice window of the local user is simply destroyed; if there are other voice windows after it, those voice windows move up by 1 position in sequence.
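The two example screening rules can be sketched in Python as interchangeable functions. This is only an illustrative sketch under stated assumptions: the second area is modelled as a list of member IDs in display order, and the names `pick_head`, `pick_local_user_first` and `promote_from_second_area` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A screening rule maps (second area, local user ID) to the sixth object.
ScreeningRule = Callable[[List[str], Optional[str]], str]


def pick_head(second_area: List[str], local_user: Optional[str] = None) -> str:
    """Rule 1: promote the member whose voice window is sorted at the head."""
    return second_area[0]


def pick_local_user_first(second_area: List[str], local_user: Optional[str] = None) -> str:
    """Rule 2: promote the local user if in voice access, else fall back to the head."""
    if local_user is not None and local_user in second_area:
        return local_user
    return second_area[0]


def promote_from_second_area(
    second_area: List[str], local_user: Optional[str], rule: ScreeningRule
) -> Tuple[str, List[str]]:
    """Select the sixth object and return it with the re-sorted second area."""
    sixth = rule(second_area, local_user)
    # Destroying the sixth object's voice window moves later windows up by 1.
    remaining = [m for m in second_area if m != sixth]
    return sixth, remaining
```

For the second area of fig. 7 (user 4, user 5, local user), rule 1 leaves user 5 and the local user in that order, while rule 2 leaves user 4 and user 5 unchanged.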

In some embodiments, referring to fig. 15, for case (D), the controller is further configured to perform: receiving hang-up indication information sent by the server, wherein the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks a hang-up control; and in response to the hang-up indication information, if the fourth object is in voice access and the updated current call path number is determined to be equal to the preset path number, controlling the display to display the hang-up prompt information of the fourth object, cancelling the display of the second area, and maintaining the current display state of the first area unchanged.

That the fourth object hung up while in the voice access mode indicates that the call path number before the fourth object hung up was greater than the preset path number. Let M = (call path number before the fourth object hung up) - (preset path number); then M is greater than or equal to 1, that is, the second area included at least 1 voice window before the fourth object hung up, and the call path number before the hang-up was the preset path number + M. An updated current call path number equal to the preset path number therefore indicates that M is equal to 1.

When M is 1, referring to fig. 6c, when user 4, as the fourth object, hangs up the call, the hang-up prompt information of the fourth object may be displayed and the voice window of the fourth object is destroyed. At this time, no voice window remains in the second area, so the display of the second area is cancelled. No one has hung up in the first area, so the first area maintains its current display state, that is, the interface layout template, the audio/video playing of each call member in the first area (user 1, user 2 and the local user), the window sorting and so on remain unchanged, and the update of the video call interface is completed.

In some embodiments, referring to fig. 15, for case (E), the controller is further configured to perform: receiving hang-up indication information sent by the server, wherein the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks a hang-up control; in response to the hang-up indication information, if the fourth object is in voice access and the updated current call path number is determined to be greater than the preset path number, controlling the display to display the hang-up prompt information of the fourth object and destroying the voice window of the fourth object; and sorting and displaying each voice window in the updated second area according to a preset window sorting rule, while keeping the current display state of the first area unchanged.

For case (E), the updated current call path number being greater than the preset path number indicates that M is greater than 1. When M is greater than 1, referring to fig. 7, when user 5 hangs up the call, that is, user 5 is the fourth object, the local user may be prompted that "user 5 has hung up the call", and at the same time the voice window of the fourth object is destroyed. Because M is greater than 1, the second area still has at least 1 voice window after the voice window of the fourth object is destroyed. In the example of fig. 7, after the voice window of user 5 is destroyed, 2 voice windows remain, those of user 4 and the local user, and they need to be re-sorted according to the window sorting rule. If user 4, user 5 and the local user were sorted according to the time sequence of accessing the virtual room before user 5 hung up, then after the voice window of user 5 is destroyed, the position of the voice window of user 4, sorted before user 5, is unchanged, and the voice window of the local user, sorted after user 5, is moved up by 1 position, that is, moved forward to the 2nd position for display, thereby completing the update of the second area. As in case (D), the fourth object hangs up in the voice access mode in the second area, which has no influence on the display state of the first area.

It should be noted that, if the fourth object hangs up in the voice access mode, the updated current call path number is necessarily greater than or equal to the preset path number; the case where the updated current call path number is less than the preset path number does not arise.
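Cases (D) and (E) differ only in whether any voice window remains, which the following Python sketch makes explicit. It assumes both areas are modelled as lists of member IDs in display order; the function name `on_voice_member_hangup` and the returned flag are hypothetical illustrations, not part of the described controller.

```python
from typing import List, Tuple


def on_voice_member_hangup(
    first_area: List[str], second_area: List[str], fourth: str
) -> Tuple[List[str], List[str], bool]:
    """Handle a hang-up by a voice-access member (cases (D) and (E)).

    Returns (first area, updated second area, whether the second area is shown).
    """
    # Destroy the fourth object's voice window; later windows move up by 1.
    remaining = [m for m in second_area if m != fourth]
    # Case (D): M == 1, nothing remains, so the second area is no longer displayed.
    # Case (E): M > 1, the remaining voice windows stay in their relative order.
    show_second_area = len(remaining) > 0
    # The first area is returned untouched in both cases.
    return first_area, remaining, show_second_area
```

Because the fourth object is removed only from the second area, the updated call path number can never drop below the number of video windows in the first area, which matches the note above.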

In some embodiments, after the video call is started, the interface layout template of the video call and the window display in the first area and the second area may be adaptively adjusted according to the number of people currently accessing the virtual room. For example, the preset path number of the device of user 1 (as the initiator and the local user) is 3. After user 1 initiates a 5-person video call, the video window of the initiator is first displayed on the current interface. After user 2 accesses, the video call interface is displayed in the arrangement order of user 1 and user 2 using the interface layout template of 1 row and 2 columns. If user 3 accesses after user 2, the video call interface is adjusted, in the arrangement order of user 1, user 2 and user 3, to the interface layout template of 1 row and 3 columns corresponding to the preset path number. If user 4 accesses after user 3, the first area keeps the display state it had when user 3 accessed, and the voice window of user 4 is displayed in the second area. If user 5 accesses after user 4, the display state of the first area is kept unchanged and the voice window of user 5 is displayed at the position following the voice window of user 4. At this point all 5 people have accessed, the video call interface having been adjusted correspondingly as each person accessed.
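The incremental adjustment in this example can be traced with a short Python sketch. It assumes the two areas are lists of member IDs and that an interface layout template is identified simply by the number of video windows it holds; `on_member_join` is a hypothetical name used only for illustration.

```python
from typing import List


def on_member_join(
    first_area: List[str], second_area: List[str], member: str, preset: int
) -> int:
    """Place a newly accessed member and return the template key to use.

    Assumption: the template key is the number of video windows in the first area.
    """
    if len(first_area) < preset:
        # Still within the preset path number: add a video window.
        first_area.append(member)
    else:
        # Preset path number reached: the member is shown as a voice window.
        second_area.append(member)
    return len(first_area)
```

Replaying the 5-person call with a preset path number of 3 gives template keys 1, 2, 3, 3, 3, with users 1 to 3 in the first area and users 4 and 5 in the second area.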

In some embodiments, the initiator initiates a video call with a target path number, and the server establishes a virtual room supporting the target path number. When the server then sends a call request to the other members, the call request carries the target path number, so that when the display device of each call member starts the video call, the window layout of the video call interface corresponding to the target path number can be constructed directly: if the target path number is less than the preset path number, the interface layout template corresponding to the target path number is called, and video windows of the target path number are displayed; if the target path number is equal to the preset path number, the interface layout template corresponding to the preset path number is called, and video windows of the preset path number are displayed; if the target path number is greater than the preset path number, the interface layout template corresponding to the preset path number is called, video windows of the preset path number are displayed in the first area, and voice windows numbering the difference between the target path number and the preset path number are generated and displayed in the second area.
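The three branches above reduce to two small formulas, shown in the following Python sketch. The function name `plan_layout` and the dictionary keys are hypothetical; the sketch only counts windows and does not model the actual templates.

```python
from typing import Dict


def plan_layout(target: int, preset: int) -> Dict[str, int]:
    """Compute window counts from the target path number and the preset path number."""
    video_windows = min(target, preset)     # the template corresponds to this count
    voice_windows = max(target - preset, 0) # shown in the second area, if any
    return {
        "template_paths": video_windows,
        "video_windows": video_windows,
        "voice_windows": voice_windows,
    }
```

For a preset path number of 3, a target of 2 gives 2 video windows and no voice window, a target of 3 gives 3 video windows and no voice window, and a target of 5 gives 3 video windows plus 2 voice windows.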

In some embodiments, on the initial video call interface of the initiator, the video picture of the local user can be seen at the first position, while the video windows of the other call objects are displayed as grey screens with prompt information such as 'waiting for access'. Then, according to the time sequence in which the call objects access the virtual room, the audio and video data of the call objects are pulled in turn and associated with the corresponding video windows for playing, and/or the audio data of the call objects beyond the preset path number is pulled and associated with the corresponding voice windows for playing, until all the windows on the initial video call interface finish data loading. For example, user 1 initiates a 4-person video call, and the preset path number of the display device of user 1 is 3. When user 1 initiates the video call, the constructed initial video call interface is presented as video windows in a 1 × 3 layout in the first area and 1 voice window in the second area, and the video window at the 1st position displays the video picture of user 1. Then user 2 accesses; the audio and video data of user 2 is obtained from the server and associated with the 2nd video window for playing. If user 3 accesses after user 2, the audio and video data of user 3 is pulled and associated with the 3rd video window for playing. If user 4 accesses after user 3, user 4 is automatically switched to voice access at this time; the audio data of user 4 is pulled and associated with the 1st voice window of the second area for playing, whereupon all paths of the target path number of the video call have accessed.

In some embodiments, for example, user 1 initiates a 5-person video call and simultaneously invites user 2, user 3, user 4 and user 5. The preset path number of the display device of user 3 is 4, so when user 3 accesses the video call, the constructed initial video call interface is presented as video windows in a 2 × 2 layout in the first area and 1 voice window in the second area. Before user 3 accesses, user 1 initiated the call, user 4 accessed first, and then user 2 accessed. When user 3 accesses, the device of user 3 can receive the audio and video data of user 1, user 4 and user 2 from the server: the 1st video window plays the audio and video data of user 1, the 2nd video window plays that of user 4, the 3rd video window plays that of user 2, the 4th video window plays the video picture collected locally by the device of user 3, and the voice window in the second area displays a grey screen with prompt information such as 'waiting for access'. If user 5 accesses after user 3, the audio data of user 5 is received and played in the 1st voice window, whereupon all paths of the target path number of the video call have accessed.
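The pre-built interface with placeholder windows can be sketched in Python as follows. The sketch assumes each window is a slot holding either a member ID or a placeholder string; `WAITING`, `build_initial_interface` and `fill_next_window` are hypothetical names introduced for illustration only.

```python
from typing import List, Tuple

WAITING = "waiting for access"  # placeholder shown on a grey-screen window


def build_initial_interface(target: int, preset: int) -> Tuple[List[str], List[str]]:
    """Create placeholder slots for the video windows and voice windows."""
    video = [WAITING] * min(target, preset)
    voice = [WAITING] * max(target - preset, 0)
    return video, voice


def fill_next_window(video: List[str], voice: List[str], member: str) -> None:
    """Bind a member to the first free slot, video windows first, in access order."""
    for slots in (video, voice):
        for i, slot in enumerate(slots):
            if slot == WAITING:
                slots[i] = member
                return
    raise ValueError("all windows are already in use")
```

For the device of user 3 (target path number 5, preset path number 4), filling in the access order user 1, user 4, user 2, user 3 occupies the four video windows and leaves the single voice window waiting until user 5 accesses.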

It should be noted that, in the process of accessing the video call of the target number of channels, contents such as a specific UI presentation form and processing logic are not limited to those described in the embodiments of the present application.

In some embodiments, the controller is further configured to perform: in response to a click operation on the hang-up control, controlling the display to destroy the current video call interface. When the local user clicks the hang-up control, that is, the local user actively hangs up the video call and exits the virtual room, the local device no longer collects and uploads the audio and video data of the local user and no longer receives the audio and video data of other call objects from the server; the video call interface is destroyed and the video call is terminated.

In some embodiments, the controller is further configured to perform: and controlling the display to destroy the current video call interface in response to receiving the call object hang-up information sent by the server. The local end user does not actively hang up the call, but all call objects for the call with the local end user are hung up, namely, only 1 person of the local end user exists in the virtual room, the video call does not need to be continuously maintained, which is equivalent to that the local end user passively hangs up the video call. When the server receives hang-up messages of all other call objects except the home terminal user in the virtual room, the hang-up messages of the call objects are sent to the display equipment of the home terminal user, and the controller of the home terminal device receives and responds to the hang-up messages of the call objects, so that hang-up logic can be started, and a call interface is destroyed.

The controller can be preset with a video call application (APP). The APP can implement the creation and transformation of the UI interfaces for initiating, answering, inviting, opposite-end hang-up and local-end hang-up of a video call, and supports the user's operations on the UI interface as well as processing logic such as controlling the stream-pulling state of each call path. Other functions of the video call can also be implemented through the APP, and the description is omitted here. It should be noted that the UI display of the interface layout template and the related processing logic for the multi-channel video call are not limited to those shown in the drawings, and the processing logic when the user operates based on the UI and the related controls may be adaptively changed.

In some embodiments, the present application further provides a multi-channel video call processing method, where the method includes program steps executed by a controller configured in each embodiment of the display device, and details are not repeated here. The same and similar parts in the various embodiments are referred to each other in this specification.

Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. In particular implementations, the present invention also provides a computer storage medium, where the computer storage medium may store a program, and when the computer storage medium is located in a display device, the program may include program steps included in a multi-way video call processing method that the controller 250 is configured to execute. The computer storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a Random Access Memory (RAM).

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The specification and examples are to be regarded as illustrative only and are not intended to limit the scope of the present invention, with a true scope and spirit of the invention being indicated by the following claims.

Claims

1. A display device, comprising:

a display;

a sound player;

a user interface for receiving an input operation;

acquiring the current call path number of the video call from a server;

2. The display device according to claim 1, wherein the controller is further configured to perform:

when the number of the current call paths is larger than the preset number, responding to the operation of switching the audio and video windows, and acquiring a target voice window corresponding to a first object from the second area and acquiring a target video window corresponding to a second object from the first area;

and associating the audio and video data of the first object to the target video window for playing, and associating the audio data of the second object to the target voice window for playing.

3. The display device according to claim 2, wherein the controller is further configured to perform:

responding to the clicking operation of the first control, and controlling a display to display a control list on a floating layer of the video call interface;

the control list comprises a microphone state control, a camera state control, a hang-up control, an invitation control, an audio and video window switching control and a small window call control.

4. The display device of claim 3, wherein the controller is specifically configured to perform:

when the number of the current call paths is larger than the preset number, responding to the click operation of the audio/video window switching control, and acquiring the target voice window selected by the user in a second area;

responding to the confirmation operation of the target voice window, recording the first object as a first user ID corresponding to the target voice window, and controlling a display to display a second object selection popup on the upper layer of a video call interface; the second object selection popup displays user IDs corresponding to all video windows included in the first area;

and acquiring a second user ID clicked in the second object selection popup by the user, and recording the second object as the second user ID.

5. The display device of claim 3, wherein the controller is further configured to obtain the current call path number of the video call from the server according to the following steps:

sending a query request to the server, wherein the query request is used for enabling the server to query the current call path number recorded in the virtual room of the video call; receiving the current call path number sent by the server;

or receiving the current call path number sent by the server, wherein the current call path number is sent when the server detects that the current call path number recorded in the virtual room is changed;

wherein, the current number of call paths is changed by: when any call object in the virtual room hangs up the video call, subtracting 1 from the current call path number recorded in the virtual room, wherein the call object is a call member except a home terminal user in the virtual room; and when any call member in the virtual room invites a new member to answer the video call, adding 1 to the current call path number recorded in the virtual room.

6. The display device according to claim 5, wherein the controller is further configured to perform:

responding to the click operation of the invitation control, and acquiring a third user ID of a third object invited by the user;

transmitting the third user ID to a server so that the server transmits invitation information to the display device of the third object after receiving the third user ID;

responding to the received invitation success information sent by the server, controlling a display to display a newly added voice window in a second area when the updated current call path number is greater than the preset path number, acquiring audio data of a third object from the server, and associating the audio data of the third object with the newly added voice window for playing;

or, in response to receiving invitation success information sent by the server, when the updated current call path number is less than or equal to the preset path number, controlling the display to display a newly-added video window in the first area, calling an interface layout template corresponding to the updated current call path number and refreshing the video call interface by using a preset window sorting rule; acquiring audio and video data of a third object from a server, and associating the audio and video data of the third object to the newly added video window for playing;

wherein the invitation success information is sent by the server after adding 1 to the current call path number recorded in the virtual room when the display device of the third object, in response to the invitation information, answers the incoming video call.

7. The display device according to claim 5, wherein the controller is further configured to perform:

receiving hang-up indication information sent by a server, wherein the hang-up indication information is sent after the server subtracts 1 from the current call path number recorded in the virtual room when a fourth object in the virtual room clicks a hang-up control;

responding to the hang-up indication information, if the fourth object is in video access, controlling a display to display hang-up prompt information of the fourth object, and destroying a video window of the fourth object;

and when the updated current call path number is determined to be smaller than the preset path number, calling an interface layout template corresponding to the updated current call path number and refreshing the video call interface by using a preset window sorting rule.

8. The display device according to claim 5, wherein the controller is further configured to perform:

when the updated current call path number is determined to be equal to the preset path number, audio and video data of a fifth object in the second area are obtained from the server, the display of the second area is cancelled on a video call interface, and a new video window is generated according to the association of the audio and video data of the fifth object;

and according to a preset window sorting rule, sorting and displaying all video windows in the current video call interface.

9. The display device according to claim 5, wherein the controller is further configured to perform:

when the updated current call path number is determined to be larger than the preset path number, selecting a sixth object from the second area according to a preset screening rule, acquiring audio and video data of the sixth object from a server, generating a new video window in the first area according to the audio and video data of the sixth object in a correlation mode, and destroying a voice window of the sixth object in the second area;

and according to a preset window sorting rule, sorting and displaying each video window in the current first area, and sorting and displaying each voice window in the current second area.

10. The display device according to claim 5, wherein the controller is further configured to perform:

responding to the hang-up indication information, if the fourth object is accessed by voice, and when the updated current call path number is determined to be equal to the preset path number, controlling a display to display the hang-up prompt information of the fourth object, canceling the display of the second area, and maintaining the current display state of the first area unchanged;

or responding to the hang-up indication information, if the fourth object is in voice access, and when the updated current call path number is determined to be greater than the preset path number, controlling a display to display the hang-up prompt information of the fourth object and destroying a voice window of the fourth object; and sequencing and displaying each voice window in the updated second area according to a preset window sequencing rule, and keeping the current display state of the first area unchanged.

11. The display device according to any one of claims 2 to 10, wherein the controller is further configured to perform:

acquiring audio and video data of all call objects in the virtual room from a server; respectively associating the audio and video data of the call object in the first area with corresponding video windows for playing; the video data of the call object in the second area is not analyzed, and only the audio data of the call object in the second area are respectively associated to the corresponding voice windows for playing;

or acquiring audio and video data of a call object included in the first area from the server, and respectively associating the audio and video data with corresponding video windows for playing; only acquiring audio data of the call object in the second area from the server without acquiring video data, and respectively associating the audio data of the call object in the second area with corresponding voice windows for playing;

and the call object is a call member except the home terminal user in the virtual room.

12. The display device according to any one of claims 1 to 10, wherein the controller further presets a window sorting rule as follows:

after the video call is started, firstly, sequencing and displaying video windows in a first area according to the time sequence of the call members joining the virtual room;

and responding to the fact that the total number of the video windows in the first area reaches the preset number, switching a mode that the following call members access the virtual room into voice access, and sequencing and displaying the voice windows in the second area according to the time sequence of the call members accessing the virtual room.

13. The display device according to claim 3, wherein the controller is further configured to perform:

if the current call path number is larger than the preset path number, controlling a display to display a prompt popup window; the prompt popup window is used for prompting a user to display a control list by clicking a first control and switching a voice window of a first object into a video window by clicking a switching audio and video window control;

and if the current video call path number is less than or equal to the preset path number, the prompt popup window is not displayed.

14. A multi-channel video call processing method is characterized by comprising the following steps:

acquiring the current call path number of the video call from a server;